Highlight.js – Syntax highlighting for the Web

rlx0x · on Feb 20, 2015

It should be mentioned that this does not support and will indeed never support line numbers due to some strange opinions of the lead developer:

http://highlightjs.readthedocs.org/en/latest/line-numbers.ht...

dingdingdang · on Feb 20, 2015

I think this comment/discussion ends up over-emphasizing the 'strangeness' of the line-number situation with this library, the page linked ends with: 'This position is subject to discuss. Also it doesn’t stop anyone from forking the code and maintaining line-numbers implementation separately.', which seems pretty reasonable to me!

ajsharma · on Feb 21, 2015

Yeah, this is pretty much the behavior that I want to see from open source maintainers: make decisions and build conventions, but be open to having your mind changed.

pothibo · on Feb 20, 2015

http://prismjs.com/ is an alternative that supports line numbers.

luddypants · on Feb 20, 2015

And in particular prismjs lets you select lines of code without selecting the line numbers, which you probably do not want.

codexon · on Feb 21, 2015

Why wouldn't I want that?

If I am copying and pasting code from a website, it usually goes to my editor and I don't want to have to delete the line numbers from it.

d0vs · on Feb 21, 2015

You don't want the line numbers.

codexon · on Feb 21, 2015

Oh if that's what he meant then I read it the wrong way.

financequoll · on Feb 21, 2015

Prism is pretty damn good. Although the name sounds like an NSA plan to intercept all the javascript libraries in the world.

yellowapple · on Feb 20, 2015

I already like this one better, to be honest. It also already has Julia and Rust support. If it were to include Elixir support as well, it'd be perfect.

ericHosick · on Feb 20, 2015

Line numbers are really helpful when presenting code at an event, in a meeting, etc.

gelatocar · on Feb 20, 2015

Genuine question, why?

zyxley · on Feb 20, 2015

"So if you look at line 7, you'll see..."

shironinja · on Feb 20, 2015

... that the presenter should buy a laser pointer!

alblue · on Feb 20, 2015

Laser pointers don't work if the video is being recorded or screencast at the same time. They only work when the viewer is in the same room, and even then, they are typically pretty poor.

hayfield · on Feb 20, 2015

Out of several dozen different uses of laser pointers, I've only ever been able to (easily) see perhaps 3 or 4 of them, plus another small portion if focusing entirely on the game of spot-the-laser-pointer. It's related to my colourblindness.

As such, verbal cues (or a long stick) are preferable to a laser pointer. Of course, use a laser pointer to assist those who can see it, but it's better to assume half the people in the room can't and use words to this effect.

luddypants · on Feb 20, 2015

This also applies to comments within the presentation, not just the verbal part of the presentation.

jimmytucson · on Feb 20, 2015

Are line numbers considered "highlighting"? I mean, I get why you would want them, but I never would have thought to ask for them.

awhitty · on Feb 20, 2015

It does seem more appropriate to have a separate module for line numbers. I would definitely understand the separation of concerns argument, but the maintainer instead makes some sort of emotional appeal about clutter and simplicity to explain why there is no line number support.

jnbiche · on Feb 21, 2015

> the maintainer instead makes some sort of emotional appeal about clutter and simplicity to explain why there is no line number support

Clutter and simplicity is not an emotional argument. It becomes verifiably more difficult to maintain a code base with each new line. And since he's the one doing the maintaining, his position seems quite reasonable.

Secondly, he seems to have a strong aesthetic aversion to line numbers, and I can totally respect that. Frankly, among popular open source projects, those whose developers have the strongest sense of style and design are usually (not always) the best.

lelandbatey · on Feb 20, 2015

Many popular syntax highlighting libraries do include this functionality. For example, the popular Python syntax highlighter "Pygments" supports line numbers[0].

[0] - http://pygments.org/docs/formatters/#HtmlFormatter

pelhage · on Feb 20, 2015

I personally use Prism.js for my blogging, which is what Smashing Magazine and CSS-Tricks use. Prism includes line numbers.

I like Prism so far and don't have any issues with it yet

Octplane · on Feb 20, 2015

Of course, YMMV

http://uu.zoy.fr/p/T4RwhQ#x=2zHnAK7/HQAxDmEA

yellowapple · on Feb 20, 2015

I understand the author's point-of-view on this, but disabling line numbers sounds like something that should be user-controllable. Right now it feels like a misfeature, whereas "we don't implement line numbers by default, but if you want them, do this and that and you can have them" would feel way more like a good feature to have.

cbhl · on Feb 20, 2015

Has anyone actually tried submitting a patch that implements it yet?

https://groups.google.com/d/msg/highlightjs/UVJaQcQNC1c/1C6U...

aw3c2 · on Feb 20, 2015

It's free software so that's no problem if you feel the need.

chromakode · on Feb 20, 2015

This is my favorite syntax highlighting JS library, by far.

The approach this library takes to building the highlighted DOM tree is very clever. It represents the original DOM and the highlighted code output as separate streams of tag open / close events and then merges them together. This allows the highlighter to maintain the pre-existing DOM structure of the highlighted code! The language auto-detection and sub-language handling are also very neat.

If you like interesting codebases, I'd recommend giving it a quick read. :)
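For readers who want a feel for the merging idea described above, here is a minimal sketch. It is not highlight.js's actual code: it assumes each tag is given as a hypothetical `{start, end, open, close}` interval over the plain text instead of a real stream of open/close events, and the function name `mergeStreams` is only borrowed for illustration. When an original tag and a highlight span overlap, the sketch closes the inner tags and reopens them so the output stays well formed.

```javascript
// Escape plain text so it can be embedded in HTML.
function escapeHtml(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Merge two sets of tags that annotate the same plain text.
// Each node is {start, end, open, close} and wraps text.slice(start, end).
function mergeStreams(text, originalNodes, highlightNodes) {
  const nodes = [...originalNodes, ...highlightNodes];
  // Every offset where a tag opens or closes, plus both ends of the text.
  const offsets = [...new Set(
    nodes.flatMap(n => [n.start, n.end]).concat([0, text.length])
  )].sort((a, b) => a - b);

  let out = '';
  let pos = 0;
  let stack = []; // currently open nodes, outermost first

  for (const offset of offsets) {
    out += escapeHtml(text.slice(pos, offset));
    pos = offset;

    // Close everything from the outermost node that ends here,
    // then reopen the nodes that were only closed to keep nesting valid.
    const firstClosing = stack.findIndex(n => n.end === offset);
    if (firstClosing !== -1) {
      const reopened = stack.slice(firstClosing).filter(n => n.end !== offset);
      for (let i = stack.length - 1; i >= firstClosing; i--) out += stack[i].close;
      stack = stack.slice(0, firstClosing);
      for (const n of reopened) { out += n.open; stack.push(n); }
    }

    // Open nodes starting here; longer ones first so they end up outermost.
    const opening = nodes
      .filter(n => n.start === offset && n.end > offset)
      .sort((a, b) => b.end - a.end);
    for (const n of opening) { out += n.open; stack.push(n); }
  }
  return out;
}

// Usage: the page author wrapped "var x" in <b>; the highlighter found a keyword and a number.
const bold = [{start: 0, end: 5, open: '<b>', close: '</b>'}];
const spans = [
  {start: 0, end: 3, open: '<span class="hljs-keyword">', close: '</span>'},
  {start: 8, end: 9, open: '<span class="hljs-number">', close: '</span>'},
];
console.log(mergeStreams('var x = 1;', bold, spans));
// <b><span class="hljs-keyword">var</span> x</b> = <span class="hljs-number">1</span>;
```

The close-and-reopen step is the part that lets pre-existing markup survive: neither set of tags has to know about the other.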

ioquatix · on Feb 21, 2015

Sounds exactly like what I implemented 4 years ago.

Here is where you extract the existing HTML tags: https://github.com/ioquatix/jquery-syntax/blob/97e38d08924d7...

When you brush the code (extract highlighting information) you provide the set of initial matches which are converted into a tree: https://github.com/ioquatix/jquery-syntax/blob/97e38d08924d7... - there is no merge step, they are merged in place in the tree which is then used to generate DOM (or whatever you like really) output.

I never implemented language auto-detection, but I'm sure I've seen it implemented before Prism; Google's Prettify might have been one of the first to do it. It basically highlights using all available brushes and looks at which one matches the most tokens - not sure if this is how it is currently implemented though. My feeling is that you usually know what language you are highlighting, the auto-highlighter wouldn't always get it right, and the cost of loading all brushes/languages is pretty high (imagine for example you have 100 different languages supported, you'd have to run all of them).
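A rough sketch of the "try every brush and count matches" idea from the paragraph above. The `brushes` table and its patterns are made up for illustration and are far smaller than any real language definition; this is not how Prettify or highlight.js is known to score languages, only the naive version of the approach.

```javascript
// Toy "brushes": each language is just a handful of token patterns (illustrative only).
const brushes = {
  javascript: [/\b(?:function|var|const|let|return)\b/g, /=>/g, /===/g],
  python: [/\b(?:def|import|lambda|elif|None)\b/g, /^\s*#.*$/gm, /:\s*$/gm],
  ruby: [/\b(?:def|end|puts|require|nil)\b/g, /@\w+/g],
};

// Count how many tokens a brush finds in the code.
function countMatches(code, patterns) {
  return patterns.reduce((total, re) => total + (code.match(re) || []).length, 0);
}

// Run every brush and keep the one with the most matches.
function detectLanguage(code) {
  let best = {language: null, score: 0};
  for (const [language, patterns] of Object.entries(brushes)) {
    const score = countMatches(code, patterns);
    if (score > best.score) best = {language, score};
  }
  return best;
}

console.log(detectLanguage('function add(a, b) { return a + b; }').language);
console.log(detectLanguage('def add(a, b):\n    return a + b\n').language);
```

The cost concern is visible even here: every snippet is scanned once per brush, so the work grows with the number of supported languages.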

The sub-language handling in jQuery.Syntax is as precise as possible: https://github.com/ioquatix/jquery-syntax/blob/97e38d08924d7...

Basically, if you match some sub-portion of the code, you then highlight that using a different brush. Sometimes it's almost impossible to know though (e.g. diffs which may be incomplete).

accatyyc · on Feb 21, 2015

In my opinion it's irresponsible to use highlighters like these. At least when you use them like advertised. Why does no one bat an eyelid at letting the clients handle the highlighting?

The highlighting (lexing) could (and should) be done ONCE at the server. Imagine some high traffic blogs with hundreds of millions of hits combined - all these clients downloading the highlighting libs + lexers for 50 different languages that will never be used on the blog anyway. The wasted energy here is probably even measurable.

Cons when letting the clients handle syntax highlighting:

- Much slower load times

- Wasted energy

- Can fail because of JS crashing

- Noscript users won't get any highlighting (and your blog targets technical users, right? They are more likely to use no script)

- Slow mobile devices will have a harder time loading the site (or depending on javascript engine - failing highlight)

- (if no site optimization is made) the client will download lexers for languages not used on the blog

Cons when the server does the lexing ONCE:

- NOTHING

Note that this rant isn't targeted at Highlight.js. It's a good highlighter - I use it myself at my blog. Except I use it on the server via a CGI script. It took me no more than 2 minutes to set up.
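A minimal sketch of the "highlight once on the server" approach argued for above. This is not the CGI setup mentioned in the comment. `prerenderHighlighting` is a hypothetical helper, and `highlight` is a callback you would supply (for example a wrapper around whatever highlighter runs on the server); the sketch only shows replacing code blocks at publish time and reusing the result for identical snippets.

```javascript
// Replace every <pre><code class="language-xyz">...</code></pre> block in an HTML
// string with markup highlighted ahead of time, so clients receive plain HTML + CSS.
// `highlight(language, code)` is an assumed callback that returns highlighted HTML.
function prerenderHighlighting(html, highlight) {
  const cache = new Map(); // identical snippets are highlighted only once
  const block = /<pre><code class="language-([\w-]+)">([\s\S]*?)<\/code><\/pre>/g;
  return html.replace(block, (match, language, code) => {
    const key = language + '\n' + code;
    if (!cache.has(key)) cache.set(key, highlight(language, code));
    return `<pre><code class="hljs language-${language}">${cache.get(key)}</code></pre>`;
  });
}

// Usage with a stand-in highlighter that only marks numbers.
const markNumbers = (language, code) =>
  code.replace(/\d+/g, '<span class="hljs-number">$&</span>');

console.log(prerenderHighlighting(
  '<p>Example:</p><pre><code class="language-js">a = 23</code></pre>',
  markNumbers
));
```

Run at publish time, the output can be stored and served as static HTML, which is what removes the per-visitor JavaScript cost.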

WA · on Feb 21, 2015

> Except I use it on the server via a CGI script. It took me no more than 2 minutes to set up.

How about you write a tutorial about this and submit it to HN? This is probably a better way to make people adopt your approach than a rant, no matter how valid your points are :)

accatyyc · on Feb 21, 2015

That's a good idea! I should make a blog post about it. And maybe be less ranty. ;)

sneak · on Feb 21, 2015

Computing is cheap. Setting up and maintaining that is more human seconds than doing it clientside. I dare you to calculate the amount of energy this wastes on a blog that gets a billion hits.

This is a battle you cannot win and should not attempt to fight.

accatyyc · on Feb 21, 2015

Sure, the energy argument is probably the least interesting of the ones listed above. Don't ignore the rest of them!

Confusion · on Feb 22, 2015

In my opinion it is irresponsible to post strongly worded opinions such as these, because you forget you may not be seeing the whole picture.

  Cons when the server does the lexing ONCE:
  - NOTHING

You are assuming a specific use case, where static code originally exists on the server and the highlighted version is included in a page that is viewed many times.

My use case for these libraries is different: users actually write 'code' in my app, which I wish to highlight as they type it. Of course many users will write the same fragments and energy could be saved by caching the highlighting of those on the server. However, the latency of that solution (roundtrip to server needed to highlight code) and the code complexity (cache expiration is one of the two 'hard' problems in SE) are much worse than for the client-side highlighting solution.

accatyyc · on Feb 22, 2015

That's an entirely different problem I'd say, and the perfect use case for these libraries on a client! If you'd do it server side it would be like a local IDE sending the code to be lexed remotely...

My rantiness (I apologize for that) targets programming blogs/static pages.

civilian · on Feb 20, 2015

At one point I got really into the idea of ASTs for languages. I've used pygments, the python syntax highlighter, with the intention of parsing the meaning of code.

These syntax highlighters are great, but I think they underline the lack of a defined and accessible AST parse definition for most languages. Highlight.js and others kind of just rely on regexes: https://github.com/isagalaev/highlight.js/blob/master/src/la...

It'd be great if we could parse through programming languages to get their meaning. I want to get tuples back!

`a = 23` would give

(variable(name=a), assignedTo, number(23))

This does exist for some languages, but mostly compiled ones. But it would make the syntax highlighting even more robust! Antlr is the best one around at the moment. http://www.antlr.org/
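To make the tuple idea above concrete, here is a toy sketch that handles only a single numeric assignment. `parseAssignment` is a hypothetical helper, and it cheats by using a regex itself; a real solution would be generated from a grammar with a tool such as ANTLR. It only shows the shape of the structured result being asked for.

```javascript
// Parse a statement like "a = 23" into a small structured tuple.
// Only handles: identifier = number. Anything else is rejected.
function parseAssignment(source) {
  const match = /^\s*([A-Za-z_]\w*)\s*=\s*(-?\d+(?:\.\d+)?)\s*$/.exec(source);
  if (!match) {
    throw new SyntaxError(`not a simple numeric assignment: ${source}`);
  }
  return [
    {type: 'variable', name: match[1]},
    'assignedTo',
    {type: 'number', value: Number(match[2])},
  ];
}

console.log(parseAssignment('a = 23'));
```

With a result like this, a highlighter could style by meaning (variable, operator, literal) instead of by whichever pattern happened to match.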

ioquatix · on Feb 21, 2015

I believe CodeMirror does something like this. It uses restartable parsers (or it used to) to make parsing fast while editing.

I experimented editing code using jQuery.Syntax. I used the match tree it generated to figure out where to restart parsing and it was pretty fast, it would only re-evaluate the current line in most cases.

cben · on March 1, 2015

CodeMirror still uses the restartable parsers. But they produce a flat sequence of styles, not a hierarchical AST. E.g. `foo bar* baz` in markdown becomes foo bar baz*. Also, most parsers reuse a small set of style names (that are covered by themes) without much regard to semantic appropriateness. E.g. markdown lists cycle through 'variable-2', 'variable-3', 'keyword'.
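The restart trick discussed in this sub-thread can be sketched roughly as follows. This is a toy illustration, not CodeMirror's or jQuery.Syntax's code: the only "state" is whether a line starts inside a `/* ... */` comment, the output is a flat list of `{text, style}` tokens per line, and the class name `IncrementalHighlighter` is made up. After an edit, tokenizing restarts at the changed line and stops as soon as a line ends in the same state it ended in before.

```javascript
// Tokenize one line, given whether it starts inside a block comment.
// Returns a flat list of styled tokens plus the state at the end of the line.
function tokenizeLine(line, inComment) {
  const tokens = [];
  let i = 0;
  while (i < line.length) {
    if (inComment) {
      const end = line.indexOf('*/', i);
      const stop = end === -1 ? line.length : end + 2;
      tokens.push({text: line.slice(i, stop), style: 'comment'});
      inComment = end === -1;
      i = stop;
    } else {
      const start = line.indexOf('/*', i);
      if (start === -1) {
        tokens.push({text: line.slice(i), style: 'code'});
        i = line.length;
      } else {
        if (start > i) tokens.push({text: line.slice(i, start), style: 'code'});
        const end = line.indexOf('*/', start + 2);
        const stop = end === -1 ? line.length : end + 2;
        tokens.push({text: line.slice(start, stop), style: 'comment'});
        inComment = end === -1;
        i = stop;
      }
    }
  }
  return {tokens, inComment};
}

class IncrementalHighlighter {
  constructor(lines) {
    this.lines = lines.slice();
    this.results = []; // per line: tokens and the state at the end of that line
    this.retokenize(0);
  }

  // Re-run the tokenizer starting at line `from`; returns how many lines were redone.
  retokenize(from) {
    let inComment = from === 0 ? false : this.results[from - 1].inComment;
    let count = 0;
    for (let n = from; n < this.lines.length; n++) {
      const previous = this.results[n];
      const result = tokenizeLine(this.lines[n], inComment);
      this.results[n] = result;
      inComment = result.inComment;
      count++;
      // Same end state as before the edit: later lines cannot have changed.
      if (previous && previous.inComment === inComment) break;
    }
    return count;
  }

  editLine(n, text) {
    this.lines[n] = text;
    return this.retokenize(n);
  }
}

const h = new IncrementalHighlighter(['var a = 1;', '/* start', 'still comment */ var b;', 'var c;']);
console.log(h.editLine(3, 'var d;'));     // only the edited line is re-tokenized
console.log(h.editLine(1, 'no comment')); // the edit removes a comment opener, so the next line is redone too
```

Caching the end-of-line state is what makes the common case cheap: most edits do not change it, so only the current line is re-evaluated.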

sarciszewski · on Feb 20, 2015

We use highlight.js at Resonant Core for our blog posts. After reading the comments here, I'm considering forking it to support line numbers :)

ericHosick · on Feb 20, 2015

It would be really helpful when presenting at events.

A lot of events record the presenter's screen and the presenter but expect the presenter to stand in one spot so they don't need a cameraman.

So, it does not work very well to point at a screen when presenting.

sarciszewski · on Feb 20, 2015

It should be a trivial patch. I'll see about getting it done tonight.

Gigablah · on Feb 21, 2015

Famous last words ;)

ioquatix · on Feb 21, 2015

jQuery.Syntax supports line numbers and is generally pretty awesome *

* I'm the author so of course I'd say that :)

greggman · on Feb 21, 2015

Anyone know of one of these that supports some kind of callouts?

Basically I want to be able to annotate groups of 1 or more lines, something like this, probably using some kind of inline comment

http://reference.bitreactive.com/reference/images/tutorial/s...

ioquatix · on Feb 21, 2015

jQuery.Syntax does support preserving the original DOM elements, so you could do this.

lukevers · on Feb 20, 2015

I've tried Highlight.js before and I was never really a big fan of it. It seemed too heavy last time I tried it. I've been using rainbow.js for a year or two now.

http://craig.is/making/rainbows/

vonklaus · on Feb 21, 2015

Rainbow doesn't seem to have line numbers. Is there another module available that allows integration of line numbers?

robbles · on Feb 21, 2015

What's the recommended way of using something like this or prism.js to highlight a simple textarea containing code?

I'm not talking a full-on programming editor, that's a much larger scope. Rather, I have an admin tool with a <textarea> that allows entering short code snippets, and it would be handy to see highlighting as you type. Even at this low level of complexity, is it simpler to just embed something like Ace, or is there an easy way to use a highlighting library?

SloopJon · on Feb 21, 2015

Have you looked at CodeMirror? It might be a better fit for highlighting editable text.

lfpa2 · on Feb 20, 2015

I like it. However, is it possible to make a "click all" checkbox for the "other" languages? You know, I mean: "click, click, click, click ..." ;)

infogulch · on Feb 20, 2015

Paste in console: $('input[type=checkbox]').prop('checked', true);

dstroot · on Feb 20, 2015

You sir are brilliant!

lfpa2 · on Feb 20, 2015

yea! thx!

adamkittelson · on Feb 21, 2015

This isn't really related to the act of highlighting syntax, but I noticed the example code for Elixir at https://highlightjs.org/static/demo/ is quite out of date. Records generally aren't used anymore, and the syntax to send a message to a process has changed from `pid <- message` to `send(pid, message)`

coolmitch · on Feb 20, 2015

Used this in a project[0] a little while ago and found it really easy to work with. The syntax detection was an added bonus that I had expected to need another library for.

[0] http://cmdv.io

edit: I should mention that this might be a cool example for someone looking to use hljs in React/Flux-- I tried to make it fairly clean. There's a github link on the bottom-left of the website.

mmebane · on Feb 20, 2015

Out of curiosity, are there any decent libraries you looked at for doing language detection? That's the main reason I'm using Highlight.js in one of my projects instead of Prism or Rainbow.

Unfortunately, the tradeoff is no support for generating a display with line numbers. In my case, that was the lesser of two evils, but I wouldn't mind using two libraries if it meant I could get everything I want.

coolmitch · on Feb 21, 2015

Just reading out of my notes, the only non-highlighting detector I was looking at was https://github.com/blakeembrey/node-language-detect, but it's not widely used or developed as far as I can tell.

If you have Ruby in your stack, it's easy -- use github's own https://github.com/github/linguist

You might also be able to use just the language definition files from https://github.com/syntaxhighlighter

tzm · on Feb 20, 2015

Here's a hosted version of the HighlightJS developer tool: http://highlightjs-developer.appspot.com/

Planning to open source an http API (similar to http://markup.su/highlighter).

esprehn · on Feb 21, 2015

This is my favorite syntax highlighting library. The best feature is that it returns a continuation of the code being highlighted which means you can highlight code that's being streamed to you line by line. I couldn't find another library that does that.

dougbarrett · on Feb 20, 2015

I use this on my personal site, seen here:

https://www.dougcodes.com/go-lang/building-a-web-application...

Dead easy to implement, just set it and forget it.

adamkochanowicz · on Feb 20, 2015

Is this news? I've been using this for quite a while.

jrvarela56 · on Feb 20, 2015

So, you using it seems like validation that this is a good contribution?

Kiro · on Feb 20, 2015

It's like posting jQuery.

stavros · on Feb 21, 2015

Why prefer this over something server-side, like Pygments, when you can use the latter? I don't see many advantages to the client-side approach.

Confusion · on Feb 22, 2015

You may have users actually writing 'code' on the client side. For those cases a client-side solution gives a better experience that requires less code (no roundtrip to server needed).

ausjke · on Feb 21, 2015

Used this along with Markdown-to-HTML conversion; it works pretty well. Will try prismjs.com, mentioned here, which I was unaware of.

smoyer · on Feb 20, 2015

How does this compare with Prettify?

jongalloway2 · on Feb 20, 2015

I did a comparison of Syntax Highlighter, Prettify and Highlight.js for weblogs.asp.net. I ended up preferring Highlight.js. My notes:

• SyntaxHighlighter

Generates tons of nested tables, doesn’t work well with Bootstrap (overflows into right rail)

• Google Prettify

Worked okay, but kind of ugly. Had to use jQuery to apply “prettyprint” class to all pre elements.

• Highlight.js

Easy to set up, lots of themes, seems pretty quick, regular updates

Sample post: http://weblogs.asp.net/jongalloway/looking-at-asp-net-mvc-5-...

Highlight.js + styles are hosted on cdnjs, which makes it easy to host on a blog.

tomahunt · on Feb 20, 2015

This looks really useful. I'm wondering if there is a way to get modern Fortran into the fold.