You are right on the loss: it's purposefully introduced as a quantization step after performing the DCT, and before losslessly compressing the resulting coefficients with Huffman and encoding to the final JPEG bitstream.
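Roughly speaking, that lossy step is just a divide-and-round of each 8x8 block of DCT coefficients by the chosen quantization table; a minimal C sketch of the idea (illustrative only, not SnappyCam's actual code):

```c
#include <math.h>
#include <stdint.h>

/* Quantize one 8x8 block of DCT coefficients.
 * coeffs: raw DCT output; qtable: the chosen quantization matrix.
 * This divide-and-round is the only lossy step; everything after it
 * (Huffman/entropy coding) is lossless. */
static void quantize_block(const float coeffs[64], const uint16_t qtable[64],
                           int16_t out[64])
{
    for (int i = 0; i < 64; i++)
        out[i] = (int16_t)lrintf(coeffs[i] / (float)qtable[i]);
}
```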
Despite all of that, JPEG has now become computationally tractable. I remember the days when it took tens of seconds to encode a JPEG on a commodity machine. Now, with the help of SIMD, we can encode a high-quality image in milliseconds on a mobile device.
Fortunately you can choose the quantization matrix that determines the amount of loss. Even with a unity (all-ones) matrix, where the only loss is rounding, no human, not even Superman with his laser eyes, can "detect" the quantization noise.
For SnappyCam, I chose to invest in JPEG a little more because it's a ubiquitous standard for still image compression.... and with the right hardware and algorithms, quite tractable.
I'll consider adding a JPEG "quality setting" so you can choose the amount of loss introduced... sounds like a great idea to me.
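(For reference, the conventional libjpeg/IJG way to expose a quality setting is to scale the base quantization tables by a quality factor; here's a sketch of that convention, assuming the standard IJG formula rather than anything SnappyCam-specific:)

```c
#include <stdint.h>

/* IJG-style quality scaling: quality 50 uses the base table as-is,
 * higher quality shrinks the divisors (less loss), lower quality
 * grows them (more loss). Entries are clamped to [1, 255]. */
static void scale_qtable(const uint16_t base[64], int quality, uint16_t out[64])
{
    if (quality < 1)   quality = 1;
    if (quality > 100) quality = 100;
    int scale = (quality < 50) ? 5000 / quality : 200 - quality * 2;

    for (int i = 0; i < 64; i++) {
        int q = (base[i] * scale + 50) / 100;
        if (q < 1)   q = 1;
        if (q > 255) q = 255;
        out[i] = (uint16_t)q;
    }
}
```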
The idea behind SnappyCam was also to code each picture independently, and not rely on motion prediction or video codecs. If you try and pull a single frame from a HD video you might be disappointed: they compress the YUV dynamic range (studio swing) and it looks washed out, even if you land on an i-frame.
Lastly, as far as I can tell, the image sensor is yielding complete scans with each frame. I'd hazard a guess to say that any motion prediction or frame deltas might actually slow the whole chain down.
Man, I hope I'll know as much as you do one day... Congrats on the app! I just bought it and I'm really liking it! One suggestion: would outputting to a GIF file be too difficult? I could really see people using this to create GIFs of their lives that they could share to Tumblr or Facebook.
Thanks! :-) To be honest, none of my formal training prepared me for what went into SnappyCam, only for how to go about learning it. I'm sure you could pick up a few tricks in the same way by working on a cool pet project or two.
With so many recent requests for AGIF, I'm absolutely going to add it to the app. (It's been on my list for a while, but lower priority than getting up a solid core product and the social sharing that is in development at the moment.)
I'm Aussie, so I did my schooling in Melbourne, Australia.
I don't recommend the same path for everyone, but my ugrad was at RMIT University: dual bachelors in EE and CS. I then went on to do a PhD at the University of Melbourne in EE. My dissertation was on mathematical optimization of wireless and wireline DSL. (Prof Jamie Evans is an awesome guy if you're on the lookout for an advisor!)
I've been in SFO for just over 5.5 years, and started SnappyLabs after winning the "greencard lottery".
SnappyCam, you might say, is the embodiment of both a very practical ugrad and a somewhat applied but very theoretical pgrad.
I run a two-person web development business and have worked out of Adelaide for myself for 15 years. Always seems to be enough work around and a decent amount of variety.
The other SnappyCam purchaser I mentioned is an iOS/Android developer who is a sub-tenant in my office. We actually have a cheap, spare desk in this room at the moment if you wanted to visit and work for a while. Email me if you want to ask any questions.
Adelaide is good. Plenty of gov & defence work if that is what you are after, but I've worked here nearly 15 years and never done either (ok, my current place is quasi-government, but still..)
Motion estimation/prediction is used in video coding because it minimises the compressed size. However, it is incredibly expensive to perform the motion search; a typical video encoder spends well over half its CPU time in this stage. After motion estimation, the residual image still has to be encoded in the usual way, so speed-wise the motion search is pure added cost.
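For a sense of where that time goes: a brute-force search evaluates something like the sum of absolute differences below for every 16x16 block, against a whole window of candidate offsets in the reference frame (a toy C sketch, not any particular encoder's code):

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between a 16x16 block in the current
 * frame and a candidate block in the reference frame. A full search
 * evaluates this for every offset in a +/- search window, which is why
 * motion estimation dominates video-encoder CPU time. */
static unsigned sad16(const uint8_t *cur, const uint8_t *ref, int stride)
{
    unsigned sad = 0;
    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            sad += abs(cur[y * stride + x] - ref[y * stride + x]);
    return sad;
}
```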
One thing I've gotten very into recently is multishot techniques, where I take multiple shots in burst mode, align them, and then average them to reduce sensor noise. It's similar to what http://www.photoacute.com/ does, though they do more advanced superresolution stuff that your invisible noise might preclude. Even a simple average, or a median in areas where there isn't too much motion, really improves the quality quite dramatically in some cases, particularly in low-light situations. If your frame rate is that high it probably wouldn't be hard to get a really good alignment between frames, so I thought I'd bring this up as a thought for a future feature: you select the frame you like, then the program grabs a few frames immediately before and after and uses them to increase the image quality of the final output.
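The averaging half of that is the easy part once frames are registered; roughly something like this toy C sketch (alignment and motion masking, the hard parts, are omitted):

```c
#include <stddef.h>
#include <stdint.h>

/* Average N already-aligned 8-bit frames into one output frame to
 * reduce sensor noise (roughly sqrt(N) improvement in static areas).
 * Real multishot pipelines also align frames and mask moving regions. */
static void average_frames(const uint8_t *const frames[], size_t n_frames,
                           size_t n_pixels, uint8_t *out)
{
    for (size_t i = 0; i < n_pixels; i++) {
        uint32_t sum = 0;
        for (size_t f = 0; f < n_frames; f++)
            sum += frames[f][i];
        out[i] = (uint8_t)(sum / n_frames);
    }
}
```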
I first tried using the GPU, using old school GPGPU textures and OpenGL ES 2.0 shaders, but unfortunately the performance wasn't there for a variety of reasons given in [1].
SnappyCam has since been making extensive use of ARM NEON for the JPEG codec and a bunch of other image signal processing operations, like digital zoom. It's a great instruction set!
Just curious; I know nothing about low-level ARM stuff. I was wondering: is this iPhone/Apple-specific tech, or is the work you've done portable to other mobile platforms? Congrats on what you've done. I couldn't quite work out whether you've optimized the hell out of the standard DCT algorithms or whether you've come up with new algorithms. If it's the latter, would you be able to publish them, or would that give away too much secret sauce? ;-)
No, it's not Apple-specific: NEON is an ARM technology, and is widely available and used, e.g. on Android smartphones.
It's like MMX/SSE in the x86 world: a set of extra instructions that process many small integers in parallel in one instruction. Since image data are usually independent 3-byte pixels (or 3 planes of 1-byte subpixels, one per color channel), NEON is great for many image-processing tasks.
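As a tiny illustration (not SnappyCam's actual code), here's what averaging two rows of 8-bit pixels looks like with NEON intrinsics, 16 pixels per instruction:

```c
#include <arm_neon.h>
#include <stddef.h>
#include <stdint.h>

/* Average two rows of 8-bit pixels, 16 pixels per instruction, using
 * NEON's rounding halving add. The scalar equivalent touches one byte
 * at a time. Assumes n is a multiple of 16. */
static void average_rows_neon(const uint8_t *a, const uint8_t *b,
                              uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(out + i, vrhaddq_u8(va, vb));
    }
}
```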
Re video's "studio swing" dynamic range: the YUV components do have a different encoding range to those in JPEG (nominally 16-235 for luma), but if you expand them back out to 0-255 the image is in fact the same; you lose a little fraction of your bit depth but no dynamic range.
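The expansion is just a linear remap of that nominal 16-235 luma range (16-240 for chroma) back out to 0-255, roughly like this C sketch:

```c
#include <stddef.h>
#include <stdint.h>

/* Expand "studio swing" (video range) luma, nominally 16..235, back to
 * full range 0..255. Chroma uses 16..240. You lose a little bit-depth
 * granularity but recover the full dynamic range. */
static void expand_luma_range(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int v = ((in[i] - 16) * 255 + 109) / 219;  /* rounded scale by 255/219 */
        if (v < 0)   v = 0;
        if (v > 255) v = 255;
        out[i] = (uint8_t)v;
    }
}
```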
I think you definitely made the right choice though - it's interesting that the obvious delta-coding and motion compensation tricks to reduce bandwidth are rarely used for video acquisition apart from the most limited devices like phones, stills cameras and the GoPro. Everything that can afford to uses per-frame coding like ProRes, REDCODE, AVC-Intra, DNxHD, Cineform... being able to seek quickly is important!
In fact Canon's 1DC 4k camera uses (dun dun duhhhhh...) motion JPEG :)
It's just bizarre that you would be doing the complete JPEG process at the instant you get the image from the sensor. As you note, there are a plethora of steps that JPEG performs: color space conversion, the DCT transform (essentially a gigantic matrix multiplication), quantization, entropy coding (Huffman or arithmetic), and encoding as the JPEG bitstream.
The only reason would be that you are pressed for memory or bandwidth, but certainly you have the resources to store one full frame and produce deltas, or just apply part of the JPEG chain, enough to remedy memory pressure. You can always encode it to an actual JPEG after the process.
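(The delta idea here would amount to something like the toy C sketch below: subtract the stored frame per pixel and encode the residual. Whether that extra pass actually pays for itself on this hardware is exactly what's being debated.)

```c
#include <stddef.h>
#include <stdint.h>

/* Naive frame delta: subtract the previous frame from the current one.
 * The residual is cheaper to entropy-code when frames are similar, but
 * computing and later undoing it adds a full extra pass over the data. */
static void frame_delta(const uint8_t *cur, const uint8_t *prev,
                        int16_t *residual, size_t n_pixels)
{
    for (size_t i = 0; i < n_pixels; i++)
        residual[i] = (int16_t)cur[i] - (int16_t)prev[i];
}
```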
And yes, pulling single frames from a completely encoded video isn't helpful, because they can get away with more compression. But there are very sophisticated algorithms for eliminating the redundancy between frames, which would have been my first avenue in attempting to do something like this.
> "which would have been my first avenue in attempting to do something like this."
Have you attempted to do something like this? Because he not only has attempted, he's done it. Therefore I think you should stop talking down to him ("completely pointless", "it's just bizarre", "my first avenue"). It comes across as wanting to prove how smart you are instead of seeking to learn from someone who has done incredible work and—lucky for us—is bursting at the seams with enthusiasm to share it.
Oh, and congratulations jpap on what's looking like the most successful and most technically solid HN launch in quite some time! I hope your hard work pays off.
Why is this comment getting downvoted? It adds insight to the subject and makes some good points, which the OP even acknowledges. The written complaint is that revelation is not being deferential enough, which is bullshit in an in-depth technical discussion.
There's a difference between "not being deferential" (which I don't think the complaint is) and talking down to someone who's clearly built something very cool. Put another way, there's a difference between dismissing someone's approach and asking questions to try to understand it better.
"Deferential" has nothing to do with it. In a technical discussion, the focus should be purely on the content. Inserting one's self into it (such as by being supercilious) detracts from that.