MHTML is a neat format; it's unfortunate it never picked up much steam. I think it could have been more popular if web browsers had defaulted to it when saving pages, rather than this weird .html file + _files/ directory pair (which on Windows is mysteriously linked so that when you delete one, you delete the other - no idea how they do that!).
What I've read of EPUB is also pretty disappointing. Seeing as it's a compiled format, once again, instead of going the zipfile + bunch of HTML inside + specific layout route, we could have had a subset of HTML in .mhtml.gz with, like, metadata in a <script type="application/json" id="x-epub-metadata">. And then, guess what, web browsers could have been able to read it natively…
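For what it's worth, here is a rough Python sketch of that hypothetical ".mhtml.gz book" idea, using only the standard library. It assumes the page is the root part of a multipart/related message and that the metadata lives in the script tag with the id suggested above. The function names, the `cover` content ID, and the format itself are made up for illustration; this is not an existing standard, and real browser-written MHTML carries more headers (such as Content-Location) than this does.

```python
import gzip
import json
from email import message_from_bytes, policy
from email.message import EmailMessage
from html.parser import HTMLParser

# id proposed in the comment above; hypothetical, not part of any standard
METADATA_ID = "x-epub-metadata"


def build_book(html: str, cover_png: bytes) -> bytes:
    """Pack an HTML page plus one image into gzip-compressed multipart/related."""
    msg = EmailMessage()
    msg["Subject"] = "book"
    msg.set_content(html, subtype="html")  # root part: text/html
    # Turns the message into multipart/related and attaches the image,
    # which the page can reference as <img src="cid:cover">.
    msg.add_related(cover_png, maintype="image", subtype="png", cid="<cover>")
    return gzip.compress(msg.as_bytes())


class _MetadataParser(HTMLParser):
    """Collects the text inside <script id="x-epub-metadata">."""

    def __init__(self):
        super().__init__()
        self._capturing = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == METADATA_ID:
            self._capturing = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._capturing = False

    def handle_data(self, data):
        if self._capturing:
            self.chunks.append(data)


def read_metadata(book: bytes) -> dict:
    """Decompress the book, find the HTML root part, and parse its JSON metadata."""
    msg = message_from_bytes(gzip.decompress(book), policy=policy.default)
    page = msg.get_body(preferencelist=("html",))
    parser = _MetadataParser()
    parser.feed(page.get_content())
    parser.close()
    return json.loads("".join(parser.chunks))
```

The point is just that nothing exotic is needed: MIME multipart handles the bundling, gzip the compression, and the metadata is ordinary JSON sitting in the page.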
> (which on Windows is mysteriously linked so that when you delete one, you delete the other - no idea how they do that!)
Probably just some special code in Windows Explorer watching for the combo of .htm(l) file plus similarly-named folder – via the command line I can delete just the HTML file or the folder separately without problems.
I've been doing a lot of research that applies here. The answer comes down to a few things:
1. using vector graphics wherever possible and then encoding it as SVG
2. if bitmap graphics are absolutely required and they can be procedurally generated, then do that
3. if large photographic data, video, or any other kind of data is required that can't be handled using the above steps, then separate that data set as you normally would using file system directories, place the data set subtree into a ZIP archive, write your code so it references items by file paths relative to the ZIP, and then put your page into the root of the ZIP file, too, e.g. as index.html. Your readers and reviewers follow along by using their system's native ZIP support to explore the archive, locate index.html, and double-click it; index.html then opens with an "open dataset" button, which they use to feed in its own parent ZIP archive
The last part might sound complicated, but it's not much different from asking someone to use MS Office or VS Code or an IDE to open a file/project. (It's just that instead of requiring them to already have that IDE installed, you're giving them the IDE they need at the same time that they're getting the document/dataset they're actually interested in.)
These approaches are robust enough that they're very unlikely to be broken by future browser changes. It's not that the tech is lacking right now, it's that human habits are lagging behind and we haven't yet established this as a cultural norm/protocol/expectation.
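To make step 3 concrete, here is a minimal Python sketch of the packaging side, standard library only. The function names and the example paths are placeholders I made up; in the actual convention the lookup would happen in the page's own script after the reader feeds the ZIP back in, so `read_item` only mirrors that lookup for illustration.

```python
import io
import zipfile


def build_package(index_html: str, dataset: dict[str, bytes]) -> bytes:
    """Return ZIP bytes with index.html at the root and each dataset item
    stored under its path relative to the ZIP root."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("index.html", index_html)
        for rel_path, payload in dataset.items():
            zf.writestr(rel_path, payload)
    return buf.getvalue()


def read_item(package: bytes, rel_path: str) -> bytes:
    """Look up one dataset item by its ZIP-relative path, as the page would
    after the reader hands it the parent archive."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.read(rel_path)
```

Usage would be something like `build_package(page, {"data/photos/a.jpg": jpeg_bytes})`, then writing the returned bytes to a .zip file. Since the page only ever refers to ZIP-relative paths, the archive can be moved or renamed without breaking anything.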
There are also other situations when the data is neither procedurally generated, nor large enough† to warrant this kind of treatment: photographs, video, (non-MIDI) sound…
†IMHO as long as your document doesn't cross 10 Mo, you shouldn't have to separate the data…
I don't understand your comment. It sounds like an argument against a process for manually creating these kinds of files, which is not at all what my comment was about. It was about accessibility, real-world engineering, and describing a file format/packaging convention.
The packaging convention I described is similar to the container formats used and created by MS Office apps. The difference is that DOCX, XLSX, etc. rely on XML, whereas HTML can be used without requiring a separate proprietary app. People create and exchange those files every day (even for things as trivial as a single-page flyer) without knowing or caring about whether it should "warrant this kind of treatment". Worrying about a purported edge case for <10 MB(?) of data sounds like an imaginary concern.
> Embedding a MP4 can work great, but your text editor will likely hate it.
Well, LibreOffice Writer deals with (multiple, 100 Ko < size < 10 Mo) MP4s just fine. It's when the ODT is converted to PDF that most(?) PDF readers seem to be unable to play those MP4s properly.
- GIF is obsolete (~100x heavier than MP4 in my use-case, so out of the question)
- MP4 has poor support in PDF readers
(- Besides, PDF is not appropriate for electronic documents.)
- EPUB doesn't seem to support MP4 at all
(- EPUB does support PNG, not sure about APNG, will have to try it out...)
- MHTML=EML support has been dropped from browsers, which is completely baffling to me. There are alternatives like SingleFile, but they feel like dirty hacks: https://addons.mozilla.org/en-US/firefox/addon/single-file/
- What is the future of AV1 support?