PDF Viewing (github.com/blog)
192 points by gjtorikian on March 17, 2015 | 60 comments



Now they are going to show json, xml, yaml, graphviz, audio files, video files, display docx, xlsx, pptx, prettify your minified code, build a database for you from SQL backups and let you query it, set up an environment from your .env files and give you a shell prompt, show you a rendered DOM from your HTML files and virtual dom declarations, display structured data from microformats, microdata and json-ld, let you query structured data from inside files so GitHub can be really used as a database, deploy your app, run your code and search for actual and potential errors, find your lost TV remote, wash your car and cook your dinner.

This is partly serious and I think GitHub is awesome.


Zawinski's law - "Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can." Humorous and relevant.


Is that the same Jamie Zawinski of the "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." [1] fame?

[1] http://regex.info/blog/2006-09-15/247


Yes. https://en.wikiquote.org/wiki/Jamie_Zawinski

Following his blog is a unique (but not always work-safe) pleasure. http://www.jwz.org/blog/


I see the future of GitHub as a new type of operating system. Git as the file system for an OS isn't too surprising but starting with an open, highly collaborative network of users with many forks and interdependent repos at the file-system level makes for an OS that could barely have been imagined on paper.

If they integrate Atom online like Cloud9, then with automatic build/hosting it's basically the full stack for developing and using the web, where the current GitHub repo viewers are the equivalent of Finder/Explorer. The Google/MS capitulation to GitHub will be seen as a catastrophic handover of power if they can build a moat around this. Maybe the distributed and open nature of both git and the web will prevent any moat? Interesting times.


>Git as the file system


I'm not sure whether this is due more to the efforts of the PDF.js team or the Spidermonkey team, but I'm very pleased at the progress the project has made over the past year. Upon reading this story I happened to realize that I've had a 57-page PDF open in a separate tab for several days now on this five-year-old netbook with no performance or memory problems whatsoever (something that couldn't be said a year ago). I can't even remember the last time that I've had to resort to Foxit. Here's hoping that Shumway can do the same for Flash!


Woo!


That gif went by way too fast.

If anyone is confused like me: It looks like GitHub will now show PDFs inline.

It took me a minute to figure out that the GIF was a demo (and this had nothing to do with rendering PDFs into realspace using a 3D printing robot), and that the page was from GitHub's official blog.


There really needs to be a standard "pausable GIF" format for stuff like this.


Isn't that a benefit and reason to use html5 video over gif formats?


You mean webm? :)


I actually mean a format where (1) the number of frames is quite small and (2) the pauses between frames are there by default and transitions have to be triggered by a user action.


Sorry about that. I've updated it with a slowed down version :)


If GitHub continues on this track of making content available directly in the browser, the use cases for it as a service are endless. When someone makes git accessible to the general public, and someone will, features like these will propel GitHub into the general consumer space.


As it stands, git is apparently the only barrier to GitHub being used widely as a backup, file transfer, and website hosting service.


That's what we attempted to solve with LetsGit[1], a hacky way for non-developers/beginners to use Git to track their work.

[1] http://LetsGit.herokuapp.com https://github.com/xasos/LetsGit


That, and the terms of service.


I've been using GitHub a lot lately for storing my class notes and this is really awesome. I use an app which allows Markdown/mathematical LaTeX and renders it to a PDF and previously they were impossible to view on mobile because of how raw assets are served.


Which app do you use?



I use MacDown for the same purpose - keeping class notes. I write in markdown and export to pdf. I would definitely recommend it to anyone.


Check out the pdf.js test suite if you want to try it:

https://github.com/mozilla/pdf.js/tree/master/test/pdfs


I don't understand it... what's the benefit of this as opposed to just embedding an iframe? And what's the role of pdf.js here? They seem to just show images of the pages, non-interactive and non-selectable. When you are using pdf.js, you can actually get a nice embedded viewer exactly like Firefox has (and similar to Chrome). And the embedded pdf.js viewer would also be more secure than the iframe, if that's your concern.


My guess is that it can then show deltas. I like how they show changes in graphics in e.g. PNG files


No, it's actually pdf.js rendered on a canvas [0]. But the text is non-selectable, it seems?

[0] see https://github.com/papers-we-love/papers-we-love/blob/master...


Wow, I don't know if it's because the PDF here is big, but my computer froze when I opened this link.


That should definitely not happen. Does that consistently happen when you click that link?


Perhaps PDF.js is being used on the backend to render the images of each page.


If that were so, Poppler or MuPDF would be better choices, with a better rendering engine and much better performance.

But GitHub uses PDF.js on the client side, which draws the content of the parsed PDF file onto a canvas.


I was not aware of GitHub's new STL viewing ability. Viewing models from your tools (such as from the source code that mentions them) is something TempleOS has been able to do for a long time now [1].

1. https://youtu.be/naORDnsiht4?t=41s


Any sane reason to store PDF files in git repositories? Why would you version binary files? Version the LaTeX, or whatever's generating your PDFs.


I'm writing my dissertation (music theory) with LaTeX; the text is all stored in git, and all of my examples are PDF files in a submodule (so the main repo doesn't get huge). I need to have the examples tracked, and the submodule allows me to pin a particular version of the examples with a particular version of the text. If I need the PDF of a chapter I sent to my advisor, for example, I can just check out that tag and run xelatex again to generate the real document.

The example PDF files aren't easily generated with TeX, so I make them in Illustrator and store them with Git. They change a little less frequently than the text itself, and usually once an example is done it doesn't get updated very frequently. In my case, at least, I can't easily version the "raw source" for the examples, as the Illustrator binary files are ~3x larger than the resulting PDFs.

I suppose I could use something like git-annex, but every time I've tried I haven't really been able to get the hang of it, and the time I'd spend learning it is better spent actually writing the thing.

(Incidentally, this new Github feature makes it much easier to browse my examples repo!)


Versioning binary files is wasteful, but people still frequently need to do it (or at least put a static, unchanging PDF into a repository). For example, I might get a set of feature requirements in PDF form and include that in my repository just so anyone who clones it can have the same requirements file that I do.

In cases where you have to include a PDF, it's very helpful to have it be viewable inline on a page.


The alternative is to store pdf-from-client-v1.pdf, pdf-from-client-v2.pdf, etc


That doesn't address the proposed alternative: version the actual source file (e.g. LaTeX?) rather than the PDF file.


I was really hoping for them to implement this, because I like to keep relevant scientific papers next to the code. If for example I were to implement an algorithm described in a publicly available paper, I'd much rather have it sit right there in the repo instead of having a link to some other site (which may or may not be reachable at any future point in time).

Also I view PDFs as only "half binary". It's just a bunch of text streams, some of which may be compressed with deflate. (Granted, there's also stuff like embedded images, fonts, etc.)
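For anyone curious what "half binary" looks like in practice, here's a minimal sketch. It hand-builds a stand-in for a single PDF stream object (not a complete, valid PDF file) and pulls the deflate-compressed text back out with Python's standard `zlib`. The object number, the content string, and the `extract_flate_stream` helper are made up for illustration; a real parser has to deal with the xref table, other filters, and indirect lengths.

```python
import re
import zlib

# Hypothetical page content: plain-text PDF drawing operators.
content = b"BT /F1 12 Tf 72 720 Td (Hello, PDF) Tj ET"
compressed = zlib.compress(content)  # FlateDecode streams use zlib/deflate data

# A hand-built stand-in for one stream object, as it would sit inside a PDF.
pdf_object = (
    b"4 0 obj\n"
    b"<< /Length " + str(len(compressed)).encode("ascii") + b" /Filter /FlateDecode >>\n"
    b"stream\n" + compressed + b"\nendstream\nendobj\n"
)


def extract_flate_stream(obj: bytes) -> bytes:
    """Return the decompressed payload of a single FlateDecode stream object."""
    # Assumes /Length is a direct integer, which is not always true in real files.
    length = int(re.search(rb"/Length (\d+)", obj).group(1))
    start = obj.index(b"stream\n") + len(b"stream\n")
    return zlib.decompress(obj[start:start + length])


decoded = extract_flate_stream(pdf_object)
print(decoded.decode("ascii"))
```

The dictionary and the operators are readable text; only the payload between `stream` and `endstream` is opaque until it is inflated.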


Design comps/wireframes are frequently shared as pdfs. Designers do not like storing their source files in git.


They seem to be rendering the documents at 2x, which is good for retina and 4k displays, but means that it's slower on low-res displays, and the text's anti-aliasing is at the mercy of the image scaling algorithm used.

Instead, they should render the documents at the actual resolution they're being displayed at.
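To put rough numbers on that, here's a small sketch of the arithmetic (Python only for illustration; in a browser the ratio would come from something like `window.devicePixelRatio`). The `canvas_size` helper and the 800x1035 page size are assumptions of mine, not anything taken from GitHub's code.

```python
import math


def canvas_size(css_width, css_height, device_pixel_ratio):
    """Backing-store size in device pixels for a page shown at a given CSS size."""
    return (
        math.ceil(css_width * device_pixel_ratio),
        math.ceil(css_height * device_pixel_ratio),
    )


# Hypothetical page displayed at 800x1035 CSS pixels.
fixed_2x = canvas_size(800, 1035, 2.0)  # always rendering at 2x
native = canvas_size(800, 1035, 1.0)    # matching a low-res display

# How many times more pixels the fixed 2x render has to fill.
pixel_ratio = (fixed_2x[0] * fixed_2x[1]) / (native[0] * native[1])
print(fixed_2x, native, pixel_ratio)
```

On a 1x display the fixed 2x render fills four times as many pixels and then gets scaled back down, which is where both the slowdown and the scaling-dependent anti-aliasing come from.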


This seems potentially useful for people on some platforms, but speaking as someone who uses Safari on OS X, I'd really much rather have the PDF be embedded as an <object> or <iframe> and let OS X do the rendering.


It would be nice if I could make the README a PDF file, so it would render the PDF on the front page of the repo. Like so: https://github.com/brenoc/testrepo

Spoiler: It didn't work


Did you really just grab an HTML file and rename it to .pdf? Of course that doesn't work.


Unfortunately, I can't get a single paper to load on https://github.com/papers-we-love/papers-we-love, at least on my laptop. I guess PDF.js is limited to smaller PDF files or it just requires more GBs of RAM to support all file sizes.


https://github.com/papers-we-love/papers-we-love/blob/master... loaded fine here. My computer is almost 10 years old.


No bueno! Any specific PDF file? We only render a small number of pages by default to ensure browsers can handle it :)


I'm curious, which PDFs are you referring to?


Works perfectly here. Firefox 36.0.1.


Works on mac too.


This is unrelated to the news, but what's the best way to create this sort of screencast in OS X?



With ffmpeg. It's the swiss army knife of media tools, and it can capture from your desktop [1].

[1] https://trac.ffmpeg.org/wiki/Capture/Desktop


I used Quicktime to do a screen recording before taking that to GifBrewery to make the gif.


I do this too- works great and GifBrewery is a nice tradeoff of quality and usability.


Glad to see that they've adopted pdf.js!


Congrats to the PDF.js team for the adoption.

I'm curious as to how the pdf2htmlEX project [0] compares to PDF.js. Instead of client-side conversion this is a server-side one-time conversion from PDF to HTML5.

[0] https://github.com/coolwanglu/pdf2htmlEX


I tried it in the past, and I can tell you that pdf2htmlEX doesn't render accurately, even though it's based on Poppler: it generates absolutely positioned HTML that looks different in every browser, and it has problems with the ordering of layers.


Is there a way to render a PDF into HTML that can be viewed in a browser?


pdf.js. It's the same thing GitHub is using (as stated in the article), and it's also what Firefox uses to render PDFs (yes, Firefox renders PDFs with HTML+JS).


Doesn't Chrome also use html+js to render PDFs? (but something different from pdf.js)


Chrome uses a pdf plugin embedded in Chrome.



