There are latex to HTML converters and they only work for a subset of latex func...

graycat · on Oct 26, 2014

Let me be more clear: Long ago a friend kept suggesting that I write a converter from TeX (maybe also LaTeX) to HTML. I kept telling him that that was essentially impossible because TeX is a programming language, likely Turing machine equivalent, complete with if-then-else, allocate-free, file read-write, while HTML is just a text markup language. No doubt JavaScript is Turing machine equivalent, but I'd have a tough time believing that HTML is.

So, my suggestion here was not to convert TeX input to HTML.

Instead my suggestion was just to convert TeX output, that is, a DVI file, to HTML. Why? Because a DVI file is essentially just text, or, as I outlined, it specifies put this character at these coordinates on the page, put that character there on the page, go to a new page, etc.

To be more clear, say, about the file reading-writing, that happens when the TeX program reads the user's TeX input and before the DVI file is generated. Given only the DVI file and displaying it, there is no file reading-writing.

So, it looks like one could convert TeX DVI output to HTML.

You pointed out that maybe HTML with a browser has more flexibility than TeX output. Okay, maybe. But I didn't claim that, given an HTML file, there would be a TeX input file and a corresponding TeX DVI output file that my envisioned converter would convert to the given HTML file. Instead, I just claimed that for a given TeX and DVI file, the converter would generate an HTML file.

Or the converter would be a function from the set of all TeX DVI files to the set of all HTML files. That is, for each TeX DVI file there would be a corresponding HTML file from the converter. But the function would not be onto the set of HTML files, that is, not all HTML files would be a value of the converter; not all HTML files could be obtained by using TeX input, the TeX program, the DVI file and the envisioned converter.

You also mentioned some ways in which HTML, say, with <div>, is more flexible than TeX. Fine. But I was discussing just converting TeX DVI to HTML.

And, again, I see no way to convert TeX input, which is a programming language, to HTML, which is not a programming language.

Whew!

More clear now?

Leszek · on Oct 26, 2014

The point isn't to convert a TeX program into an equivalent HTML program, it's to make HTML an output of a TeX program. For example, make \emph{foo} output "<i>foo</i>" instead of "/Times-Italic 12 selectfont (foo) show" or whatever the PS output would be.
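To make the \emph{foo} example concrete, here is a minimal Python sketch of the "subset" approach. It is not how any existing converter works: the function name `emph_to_html` is made up for illustration, and it assumes non-nested \emph{...} with no other macros involved. A real TeX run would expand macros first, which this does not attempt.

```python
import html
import re

# Hypothetical helper, not part of any real converter: it matches only
# non-nested \emph{...} and treats everything else as plain text.
EMPH = re.compile(r"\\emph\{([^{}]*)\}")

def emph_to_html(tex_source: str) -> str:
    """Replace each \\emph{text} with <i>text</i>, escaping HTML specials."""
    pieces = []
    last = 0
    for m in EMPH.finditer(tex_source):
        # Text before the match is copied through, HTML-escaped.
        pieces.append(html.escape(tex_source[last:m.start()]))
        # The argument of \emph becomes the content of an <i> element.
        pieces.append("<i>" + html.escape(m.group(1)) + "</i>")
        last = m.end()
    pieces.append(html.escape(tex_source[last:]))
    return "".join(pieces)

print(emph_to_html(r"say \emph{foo} here"))
```

Anything beyond this pattern (nested groups, user-defined macros, conditionals) falls outside what a regex can handle, which is why such converters cover only a subset.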

graycat · on Oct 27, 2014

> The point isn't to convert a TeX program into an equivalent HTML program.

You are correct, of course. And one of my main points is that such a conversion is essentially impossible. E.g., TeX can read and write files, but, thankfully for Internet security, HTML can't.

So, my solution and envisioned converter is to convert TeX output in a DVI file to an HTML file. Such a converter seems doable and to solve a concern the OP had.

Further, my envisioned converter from DVI to HTML would do just what you are describing.

Or, the DVI file has to put the 'f' of 'foo' at some coordinates (x,y) on the page in some font, say, some bold font. Fine. TeX can handle lots of fonts, the many standard ones and more if you want to make routine use of the ability of TeX to handle essentially any font given in the form TeX wants.

Want to create your own fonts? Knuth has been there, done that, and left a terrific tool MetaFont, open source, beautiful documentation. Create all the fonts you want and have TeX use them. Then create equivalent fonts for HTML and that a Web browser can use. Such work with fonts is just making routine use of what TeX has had for decades.

So, from the DVI, write to the HTML the markup string

     <b>f</b>

at a position given by absolute coordinates while also specifying the desired font. That's about all there is to it. Seems quite doable to me.
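A rough Python sketch of that step, under loud assumptions: the `Glyph` record and both functions are hypothetical, the coordinates are assumed to be already converted from DVI units to points measured from the top-left of the page, and the font name is assumed to be one the browser can resolve. Parsing the DVI file itself is not shown.

```python
import html
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Glyph:
    """One positioned character, as a DVI reader might report it (hypothetical)."""
    char: str
    x_pt: float         # distance from the left edge of the page, in points
    y_pt: float         # distance from the top edge of the page, in points
    font: str           # CSS font-family assumed to correspond to the TeX font
    bold: bool = False

def glyph_to_html(g: Glyph) -> str:
    """Emit one absolutely positioned <span> for a single glyph."""
    text = html.escape(g.char)
    if g.bold:
        text = f"<b>{text}</b>"
    style = (
        f"position:absolute;left:{g.x_pt:.2f}pt;top:{g.y_pt:.2f}pt;"
        f"font-family:{html.escape(g.font)}"
    )
    return f'<span style="{style}">{text}</span>'

def page_to_html(glyphs: Iterable[Glyph]) -> str:
    """Wrap the glyphs of one page in a positioned container."""
    body = "\n".join(glyph_to_html(g) for g in glyphs)
    return f'<div style="position:relative">\n{body}\n</div>'

print(glyph_to_html(Glyph("f", 72.0, 100.0, "serif", bold=True)))
```

One span per character keeps the sketch close to what the DVI describes. The price is that the resulting page is fixed in place and will not re-flow.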

Want to convert to PS? Okay, from the times I read the big, red Adobe books on PS, converting from DVI to PS is also quite doable. Indeed, there is likely a TeX device driver for that conversion now, as there is from DVI to PDF -- which I use heavily. Indeed, checking, my script for converting DVI to PDF uses EXE

     I:\protex1p2_run\miktex\bin\dvipdfm.exe

and that EXE is standard in the TeX world. It works fine.

Leszek · on Oct 27, 2014

Any reasoning about TeX being able to do things that HTML can't is irrelevant. TeX -> PDF can be done without an intermediate DVI stage using pdftex. There could therefore be a similar "htmltex" which could directly convert TeX -> HTML.

In the same way that pdftex has the advantage of knowing its output format (and can e.g. write pdf metadata), this hypothetical "htmltex" would know that its output is html, and could do things like allowing paragraph re-flow and embedding maths using MathJax.

Of course, this wouldn't be easy, you'd likely need to fork TeX to implement it correctly (or only support a subset of LaTeX features like the current TeX->HTML converters), but it's far from impossible.

graycat · on Oct 27, 2014

You are correct. And I am correct. But we are not talking about even a little bit of the same thing.

Once again I will try to be clear: Knuth's work resulted in a computer program, TeX, as an EXE file, say, tex.exe.

A user of TeX as a word processor types in a file with three letter extension TEX, say, my_math.tex. This file, my_math.tex, actually is a computer program, that is, has allocate-free storage, if-then-else, file read-write, arithmetic, string manipulations, etc. This computer program my_math.tex is not Knuth's program tex.exe.

Yes, maybe not all TeX users have their TeX input files, say, my_math.tex, do file reading or writing, but such file manipulations are just routine usage of TeX that I do nearly always. And I have some TeX macros I wrote that do storage allocation-freeing. Maybe not all TeX users do such things, but they are routine usage of TeX, and I do them.

To be more clear on just why file my_math.tex is a computer program, when Knuth's tex.exe runs file my_math.tex (interpretively), the program my_math.tex can read files. Then the output my_math.dvi can vary depending on what was in the file, say, my_math.dat that program my_math.tex read.

Well, there can be no file my_math.htm that will read a file my_math.dat, that is, read the file and process it like my_math.tex can.

So, if only for this reason, as a result, program my_math.tex can never be translated to a file my_math.htm. And program my_math.tex can't be translated to my_math.pdf or my_math.ps either.

But a file my_math.dvi, from my_math.tex and a particular my_math.dat, can be translated to a file my_math.pdf or my_math.ps.

And in this thread I have been suggesting that there could be a program that would translate my_math.dvi to my_math.htm.

> TeX -> PDF can be done without an intermediate DVI stage using pdftex.

Although this is a small point, for pdftex, I am quite sure that internally a DVI file is generated if only because that is what Knuth's program tex.exe generates and rewriting Knuth's TeX code, likely now in C, say, tex.c, would be both unnecessary and the difficult approach. Just generating the DVI file is the easy approach, even if the user isn't made aware of the intermediate DVI file.

What pdftex does, I do frequently by putting in the extra step of going to DVI and then from DVI to PDF. Fine.

I want the DVI file because I like the DVI preview program I have and like it much more than using a PDF viewer. When I get something that looks good with my DVI preview program, then usually I go ahead and make the PDF file.

However, what I am doing getting a PDF file and what you are talking about with pdftex are not, in the sense I am discussing, a translation of TeX to PDF. Not at all.

> Any reasoning about TeX being able to do things that HTML can't is irrelevant.

True for what you are talking about. False for my point that a file my_math.tex can't be translated to a file my_math.htm.

Or, for a short explanation, you are saying that a file my_math.dvi can be translated to file types PS and PDF and maybe also HTM, and I agree. But I am also saying that a file my_math.tex cannot ever be translated to a file my_math.htm.

To be still more clear, HTML is a mark-up language, and TeX looks like it is also a mark-up language, so one might try to translate TeX mark-up to HTML mark-up. Well, such a translation is just impossible, and will always be.

DanBC · on Oct 26, 2014

> HTML is free flowing; if the user resizes the window the layout should adapt, the layout has to work on mobile devices, etc. The HTML way asks for a completely different way of designing layout, and Latex is simply not the right tool for that job.

I agree that HTML should re-flow. Let's try with the OP article.

<http://imgur.com/LWrS6pn>