Scientific Breakthrough Lets SnappyCam App Take 20 Full-Res Photos Per Second (techcrunch.com)
460 points by Osiris on Aug 1, 2013 | 253 comments



DCT is already lossy [1], so the statements around 8 megapixels are completely pointless, and worst of all, it's 1990s lossy technology. Wavelet transformations completely destroy any DCT.

That said, if their emphasis is on producing pictures with minimal time delta at highest resolution, algorithms used for still pictures are out of place. Video compression algorithms still use DCT and wavelets, but they do so only after they have reduced redundancies between series of pictures, a process that tends to work significantly better than anything you can get out of these lossy transformations when you want to preserve quality.

Of course, eliminating redundancy in a series of pictures might have tipped them off to the fact that the image sensor isn't actually producing fresh pictures at the rate they want.

1: as used in JPEG. The transformation itself is perfectly invertible, assuming infinite precision arithmetic.


You are right on the loss: it's purposefully introduced as a quantization step after performing the DCT, and before losslessly compressing the resulting coefficients with Huffman coding and writing out the final JPEG bitstream.
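If it helps make that concrete, here's a minimal sketch (not SnappyCam's actual code) of that quantize-and-round step for one 8x8 block; qtable stands in for whichever 64-entry table the quality setting selects:

    #include <math.h>
    #include <stdint.h>

    /* Quantize one 8x8 block of DCT coefficients.
     * coeffs: output of the forward DCT for this block.
     * qtable: 64-entry quantization table chosen by the quality setting.
     * This rounding is where JPEG's loss is actually introduced; the
     * zig-zag scan and Huffman coding that follow are lossless. */
    static void quantize_block(const float coeffs[64], const uint16_t qtable[64],
                               int16_t out[64])
    {
        for (int i = 0; i < 64; i++)
            out[i] = (int16_t)lrintf(coeffs[i] / qtable[i]);
    }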

Despite all of that, JPEG has now become computationally tractable. I remember the days when it took tens of seconds to encode a JPEG on a commodity machine. Now, with the help of SIMD, we can encode a high-quality image in milliseconds on a mobile device.

Fortunately you can choose the quantization matrix that determines the amount of loss. Even if you were to choose an all-ones matrix, no human, not even Superman with his laser eyes, can "detect" the quantization noise.

For SnappyCam, I chose to invest in JPEG a little more because it's a ubiquitous standard for still image compression.... and with the right hardware and algorithms, quite tractable.

I'll consider adding a JPEG "quality setting" so you can choose the amount of loss introduced... sounds like a great idea to me.

The idea behind SnappyCam was also to code each picture independently, and not rely on motion prediction or video codecs. If you try and pull a single frame from an HD video you might be disappointed: they compress the YUV dynamic range (studio swing) and it looks washed out, even if you land on an I-frame.

Lastly, as far as I can tell, the image sensor is yielding complete scans with each frame. I'd hazard a guess that any motion prediction or frame deltas might actually slow the whole chain down.


Man, I hope I'll know as much as you do one day... Congrats on the app! I just bought it and I'm really liking it! One suggestion: would outputting to a GIF file be too difficult? I could really see people using this to create GIFs of their lives that they could share on Tumblr or Facebook.


Thanks! :-) To be honest, none of my formal training prepared me for what went into SnappyCam---only the "how" to go about learning to do so. I'm sure you could pick up a few tricks in the same way by working on a cool pet project or two.

With so many recent requests for AGIF, I'm absolutely going to add it to the app. (It's been on my list for a while, but lower priority than getting up a solid core product and the social sharing that is in development at the moment.)


Out of curiosity, what kind of educational background do you have, if you don't mind the question? As a high school dropout it interests me. :)


I'm Aussie, so I did my schooling in Melbourne, Australia.

I don't recommend the same path for everyone, but my ugrad was at RMIT University, a dual bachelor's in EE and CS. I then went on to do a PhD in EE at the University of Melbourne. My dissertation was on mathematical optimization of wireless and wireline DSL. (Prof Jamie Evans is an awesome guy if you're on the lookout for an advisor!)

I've been in SFO for just over 5.5 years, and started SnappyLabs after winning the "greencard lottery".

SnappyCam, you might say, is the embodiment of both a very practical ugrad and a somewhat applied but very theoretical pgrad.


Congrats fellow Australian. Two of us here in an office in Adelaide just bought the app.

Guy behind me said "Hey, this is pretty cool. Have you seen... oh, you're looking at it already."


Hi, what is Adelaide like to work in? I assume you guys are in IT? From what I hear, it is mostly government work.


I run a two-person web development business and have worked out of Adelaide for myself for 15 years. Always seems to be enough work around and a decent amount of variety.

The other SnappyCam purchaser I mentioned is an iOS/Android developer who is a sub-tenant in my office. We actually have a cheap, spare desk in this room at the moment if you wanted to visit and work for a while. Email me if you want to ask any questions.


Adelaide is good. Plenty of gov & defence work if that is what you are after, but I've worked here nearly 15 years and never done either (ok, my current place is quasi-government, but still..)


I think all us Australians are coming out right now to congratulate him, haha. /Brisbane


Onya mate! That's bloody awesome. :D


Impressive stuff! And just in time to capture some action shots of my puppy :-) -Appreciative fellow Australian, in Switzerland :)


Oi oi oi! :D


high five

Another fellow Australian and Melburnian.

Well done on the app John!


Hah, I told you getting your PhD was a waste of time ;-)


LOL! Thanks @danpat. :-) How are you going these days? I heard you might be moving to the Big Apple?


Motion estimation/prediction is used in video coding because it minimises the compressed size. However, it is incredibly expensive to perform the motion search. A typical video encoder spends well over half its CPU time in this stage. After motion estimation, the residual image is still encoded in the usual way, so speed-wise the motion search is pure overhead.
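For a sense of where that time goes, a bare-bones full-search block matcher (purely illustrative, not any particular encoder) looks like this; real encoders use smarter diamond/hexagon searches, but the SAD inner loop still dominates:

    #include <limits.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Sum of absolute differences between a 16x16 block in the current
     * frame and a candidate block in the reference frame. */
    static unsigned sad16(const uint8_t *cur, const uint8_t *ref, int stride)
    {
        unsigned sad = 0;
        for (int y = 0; y < 16; y++)
            for (int x = 0; x < 16; x++)
                sad += abs(cur[y * stride + x] - ref[y * stride + x]);
        return sad;
    }

    /* Exhaustive +/-range search around the co-located block; 'ref' must
     * have a valid margin of at least 'range' pixels around it. */
    static void full_search(const uint8_t *cur, const uint8_t *ref, int stride,
                            int range, int *best_dx, int *best_dy)
    {
        unsigned best = UINT_MAX;
        for (int dy = -range; dy <= range; dy++)
            for (int dx = -range; dx <= range; dx++) {
                unsigned s = sad16(cur, ref + dy * stride + dx, stride);
                if (s < best) { best = s; *best_dx = dx; *best_dy = dy; }
            }
    }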


One thing I've gotten very into recently is multishot techniques, where I take multiple shots in burst mode, align them, and then average them to reduce sensor noise. It's similar to what http://www.photoacute.com/ does, though they do more advanced superresolution stuff that your invisible noise might preclude. Even a simple average or median in areas where there isn't too much motion improves the quality quite dramatically in some cases, particularly in low-light situations. If your frame rate is that high it probably wouldn't be hard to get a really good alignment between frames, so I thought I'd bring this up as a thought for a future feature (as in: you select the frame you like, then the program grabs a few frames immediately before and after and uses them to increase the image quality of the final output).
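The averaging step itself is nearly free once the frames are registered; a toy sketch (assuming n already-aligned 8-bit frames of the same size) is just a rounded per-pixel mean:

    #include <stddef.h>
    #include <stdint.h>

    /* Average n aligned 8-bit frames; noise drops roughly as sqrt(n). */
    static void average_frames(const uint8_t *const *frames, int n,
                               size_t npixels, uint8_t *out)
    {
        for (size_t i = 0; i < npixels; i++) {
            unsigned sum = 0;
            for (int f = 0; f < n; f++)
                sum += frames[f][i];
            out[i] = (uint8_t)((sum + n / 2) / n);   /* rounded mean */
        }
    }

The hard part is the alignment and the motion masking, not the averaging.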


SIMD, you say? Are you relying mainly on NEON optimizations or are you also doing encoding stuff on the GPU? Very impressive performance I must say!


Thanks! :-)

I first tried the GPU, using old-school GPGPU textures and OpenGL ES 2.0 shaders, but unfortunately the performance wasn't there, for a variety of reasons given in [1].

SnappyCam has since been making extensive use of ARM NEON for the JPEG codec and a bunch of other image signal processing operations, like digital zoom. It's a great instruction set!

[1] http://www.snappylabs.com/blog/snappycam/2013/07/31/iphone-k...


Just curious, I know nothing about low-level ARM stuff. I was wondering: is this iPhone/Apple-specific tech, or is the work you've done portable to other mobile platforms? Congrats on what you've done. I couldn't quite work out whether you've optimized the hell out of the standard DCT algorithms or whether you've come up with new algorithms. If it's the latter, would you be able to publish them or would that give away too much secret sauce? ;-)


No, NEON is an ARM-specific tech (not Apple-specific), and is widely available and used e.g. on Android smartphones.

It's like the MMX / SSE of the x86 world: a set of extra instructions to process many small integers in parallel in one instruction. Since image data is usually independent 3-byte pixels (or 3 planes of 1-byte subpixels, one per color channel), NEON is great for many image-processing tasks.

See e.g. https://en.wikipedia.org/wiki/ARM_architecture#Advanced_SIMD...
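As a toy illustration of the style (not SnappyCam's code), here's a NEON intrinsics loop that brightens an 8-bit channel 16 subpixels per instruction, with saturation:

    #include <arm_neon.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Add 'delta' to each byte, 16 at a time; vqaddq_u8 saturates at 255. */
    void brighten(uint8_t *px, size_t n, uint8_t delta)
    {
        uint8x16_t d = vdupq_n_u8(delta);
        size_t i = 0;
        for (; i + 16 <= n; i += 16)
            vst1q_u8(px + i, vqaddq_u8(vld1q_u8(px + i), d));
        for (; i < n; i++) {                       /* scalar tail */
            unsigned s = px[i] + delta;
            px[i] = s > 255 ? 255 : (uint8_t)s;
        }
    }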


Re video's "studio swing" dynamic range, the YUV components do have a different encoding range to those in JPEG, but if you expand them back out to 0-255 the image is in fact the same - you lose a lil' fraction of your bit-depth but no dynamic range.

I think you definitely made the right choice though - it's interesting that the obvious delta-coding and motion compensation tricks to reduce bandwidth are rarely used for video acquisition apart from the most limited devices like phones, stills cameras and the GoPro. Everything that can afford to uses per-frame coding like ProRes, REDCODE, AVC-Intra, DNxHD, Cineform... being able to seek quickly is important!

In fact Canon's 1DC 4k camera uses (dun dun duhhhhh...) motion JPEG :)


It's just bizarre that you would be doing the complete JPEG process at the instant you get the image from the sensor. As you note, there is a plethora of steps that JPEG performs, from color space conversion, to the DCT (essentially a gigantic matrix multiplication), quantization, entropy coding (Huffman or arithmetic), and encoding as a JPEG bitstream.

The only reason would be that you are pressed for memory or bandwidth, but certainly you have the resources to store one full frame and produce deltas, or just apply part of the JPEG chain, enough to remedy memory pressure. You can always encode it to an actual JPEG after the process.

And yes, pulling single frames from a completely encoded video isn't helpful, because they can get away with more compression. But there are very sophisticated algorithms for eliminating the redundancy between frames, which would have been my first avenue in attempting to do something like this.


> which would have been my first avenue in attempting to do something like this.

Have you attempted to do something like this? Because he not only has attempted, he's done it. Therefore I think you should stop talking down to him ("completely pointless", "it's just bizarre", "my first avenue"). It comes across as wanting to prove how smart you are instead of seeking to learn from someone who has done incredible work and—lucky for us—is bursting at the seams with enthusiasm to share it.

Oh and congratulations jpap on what's looking like the most successful and technically solidest HN launch in quite some time! I hope your hard work pays off.


Thank you for saying what I'm sure a lot of us were thinking.


wild applause


This whole thread should be framed and hung in the HN lobby.


Oh, I actually do both---see the other thread on the topic.

I buffer as much as possible, while also encoding on any other cores that are available.

There's a reason why a "simple camera app" can total some 80 kLOC. :)


Why is this comment getting downvoted? I see that it adds insight to the subject and makes some good points, which OP even acknowledges. The written complaint is that revelation is not being deferential enough, which is bullshit in an in-depth technical discussion.


There's a difference between "not being deferential" (which I don't think the complaint is) and talking down to someone who's clearly built something very cool. Put another way, there's a difference between dismissing someone's approach and asking questions to try to understand it better.


"Deferential" has nothing to do with it. In a technical discussion, the focus should be purely on the content. Inserting one's self into it (such as by being supercilious) detracts from that.


> Wavelet transformations completely destroy any DCT.

It's not quite that simple. They have different strengths and weaknesses, so you can't say one is categorically better than the other. DCT has uniform frequency resolution, which sounds desirable but isn't once you start quantizing, due to ringing, etc. See slide 25 or so of [1]. Wavelets overcome this problem by adapting the resolution in opposite ways at the extremes. This works fantastically at medium signal rates, but can have severe low-passing at low rates. For low-rate coding, DCT with the bells and whistles (deblocking, etc.) will typically win.

[1]: http://people.xiph.org/~tterribe/pubs/lca2012/auckland/intro...



Wow, I'm usually pretty skeptical about these new niche social networks, but I've gotta say that is really really cool.

Great work on the "living photo" idea -- that kind of adds this magical sort of feeling to it, which, I've gotta say, I felt.

I hope your hard work pays off. You've built something special here :)


Thanks mate---comments like yours make my day. :-)

I'm not actually planning on creating a new niche social network. I think, given the vastness of some of the ones that've emerged of late, it would be more fruitful to leverage the ones that exist today.

Can't wait for you to see it!


Given that niche social/mobile networks for short videos are so hot right now (ie. Vine, Instagram Video), it might be more fruitful for you to plan on creating a new one.


+1 I think this could actually take off as a niche social/mobile network. (Despite how ridiculous that sentence sounds.)


The Cinemagram guys started off down that path, an Instagram clone for animated GIF cinemagraphs, then quickly pivoted toward short videos to compete with Vine and Instagram on similar territory.

I'm still not convinced. :-)


Get convinced. These could be a lot of fun. You just need some help with the design/interface, but the tech is great.


Yeah, I think there is a niche for this. One of the main criticisms of Vine is that there's a certain magic to pictures compared with video, which comes off as more real. We don't want to save the reality of our lives, just the filtered moments.

I really think this could fill a space in between the perfect single shot and the realness of video.


Glad to hear it. I feel the same on the format: it's closer to a photo on the left of the spectrum between photo and video, where Vine/Instagram/Cinemagram are on the far right.

You've put it very nicely--there's definitely some magic about a "silent moving picture".

It requires and sparks the imagination in the viewer, evoking emotion perhaps as easily as a carefully crafted cinematic short.


It's also a photo with more context. If the presentation allowed the owner to pick the starting point, and users to slide forward and backwards, commenters could suggest other funny/interesting frames.

The work required to set up a basic social network around this would surely be insignificant when compared with what you've done to date! Yell out if you need help on the design side of things.


It honestly looks like the photos from the Harry Potter movies. Love it! Great work!


Nice find! It's going to form part of the Facebook Timeline integration that's coming soon... as well as embeds, though Josh already linked to the "press samples" page in the article. :-)


Just for your info, the top link actually crashed my Windows Phone 8's Internet Explorer. Clearly that's an IE bug and nothing else, but I thought you should know :-)

Keep up the good work!


That's unfortunate. I suspect there's a JS memory leak in the viewer, as it will cause a crash on iOS Safari after prolonged use. It's on the list.

Thanks for letting me know!


Do these work for you? When I load the site, I see the photo scrub on the right hand side of the screen - and the slider moves up and down...but I don't see the big image loaded in the center of the screen where I expect the image to be.

Unless it takes a while to load - in which case I was just being uber impatient.


I saw this a couple of days ago. Are you using Chrome?

It might be yet another Chrome canvas bug. :-(

Try Safari and let me know if you can? :-)


Does not work in FF 22.0, nor in Chrome 28.0.1500.71 or Opera. (OS: Linux Mint)


Ouch. Looks like some work for me ahead.

The problem with Chrome 28.0.1500.x (.95 here) is troubling me. It seems to be a more recent problem that I'm convinced is another browser bug.

Thanks for the detailed version report, that's really going to help. :-)


Hm. Works on Chrome 30.0.1582.0 (Canary) - after repeated page refreshes. Was stuck on 98% several times. Disabling the cache doesn't allow me to repro it, emptying the cache doesn't either.

Works with 28.0.1500.95, too.

However, looking at the console, I see occasional instances of Resource interpreted as <blah> but transferred as MIME type <foo>. Not a big deal, but maybe a pointer.


Works fine in Firefox, FWIW.


OMG, it works on IE 10! Congrats!

But not working on Chrome or FF :|


What version(s) are you on? I spent a lot of time checking in more recent versions of FF after developing the site.

I do pretty much everything on OS X these days, but I did fire up a couple of VMs with Win XP, Win 7, and Win 8 to test.

I did my entire dev on Chrome, and apart from what appears to be a new canvas issue at present, I'm not having any major issues. (I'm looking into the canvas problem.)


It also doesn't work for me in Chrome on Win8. It does work in IE10, though.


I shudder to think that IE won over Chrome. :(

I saw an issue also reported on here earlier today, and a few days ago. Sounds to me like a (new) canvas bug in Chrome.

There are unfortunately several workarounds for browser bugs in the HTML5 viewer. The AS3 Flash port was surprisingly very solid. I can't wait to share more info on that when it's time for another major release.


Chrome on Mac here. I was also worried it might not work, as after 100% it still took a long time (didn't measure, but maybe 10 seconds?) to actually show anything. Almost closed the tab; lucky I did not.

Cool app + site, congrats! Would love to use this to analyze my disc golf throws, and share with my fellow disc golfers.


Chrome 28 on Win8 here - works great, thank you for making it! Even though I personally can't use it since I'm on Android.


Impressive job! But I confirm the webapp does not work for me either, it stays blocked at 0%. I'm using FF (Nightly) 25.0 on Ubuntu 13.04 x86_64.


This is neat tech and works pretty much as advertised, but man, this UI is pretty rough. The blue background and curvy borders are strangely superfluous; tapping the left-bottom corner controls pops up an intermediate selector but the right-bottom controls work in-place; taking a shot produces a big "infinity" symbol that fades in and out of view -- I don't know what it means.

Good work on tech, please hire a UX specialist :-)


Fair comments, and much appreciated.

I did all of the graphics design myself, in the app and on the web. :-)

The infinity sign you see does require an explanation. I'll take your advice and think about how it can be done more simply.

It's basically telling you, the user, that the capture buffer has filled, and you're now dropping (some) shots.


Have you thought about displaying a semi-transparent bar to show the buffer? Or maybe a one or 2 px white mark creeping up the side of the screen (turning to red as it gets towards the top)?

Just some thoughts. If you have a buffer, and I'm gonna get fubarr'd if I hit the limit, you should probably show me the buffer (not just a warning that it's too late).


Versions 1.x.x of SnappyCam had a linear buffer [1] but I felt it was distracting.

I generally can see the "end" of the circular buffer around the shutter button, so it doesn't seem to be an issue for me. Perhaps I tend to touch it on the lower-right instead of dead-center.

I made an effort to support lefties in the UI (see Advanced Settings), but the buffer doesn't spin the other way just yet. (To be honest, I've had to deprioritise that in favour of other features.)

Are you left handed?

[1] Yes, that's me jumping near the GG bridge. I'm quite good at it now, as you can imagine: http://a3.mzstatic.com/us/r1000/085/Purple/v4/c5/06/d5/c506d...


No, not left handed, but I didn't get the infinity visual cue (or the border) until you explained it here.

The red bar on the bottom would probably be fine if you could make it like 70% transparent until it gets toward the end, then vacillate it between 0% and 50% so it looks like it's flashing. Some visual indicator that I should be paying more attention to it.


I hadn't realized the line around the shutter button was supposed to change at all. That's definitely not where I'd put it.


Nice crotch shot jpap :-)


A red thing might look too much like a "Recording" dot, though.


I think it's fantastic that you've managed to turn a long, hard optimisation slog into a real product win. Add me to the list of Australians willing to buy you a beer - but not back home, I live in SF at the moment :)

I'm curious about the low-quality preview you get when scrolling through all the shots. Are you storing low-quality data separately or do you also have a fast, low-qual JPEG decoder? (Is the Huffman encoding between blocks independent?)


Hey Andrew, would love to catch up over a beer. :-) Drop me a note via email: jpap {at} snappylabs.com

You've got a good eye: as part of the JPEG image compression, I also generate a low-resolution thumbnail that's embedded into each file as Exif metadata (along with geotagging, and other camera settings that define the shot, like exposure).

They are used as a "first-in" placeholder for an image.

The full image is then downsampled and decompressed simultaneously [1], exploiting the fact that the (Retina) screen resolution is often much lower than the full JPEG resolution.

As soon as you start zooming, the image is decompressed yet again at the full resolution and replaced in-place as quickly as possible so hopefully you won't see it. :-)

[1] As outlined in http://jpegclub.org/djpeg/ the technique relies on the fact that the top-left NxN corner of an MxM block of DCT coefficients, N < M, can be inverted to form an NxN-pixel lower-resolution image of the original MxM block. When N is {1, 2, 4} a fast inverse DCT algorithm can be used with great success.

In fact, N == 1 is a trivial inversion and it might be tempting to use it as the low-resolution image instead of a thumbnail, but you still have to unpack all of the DCT coefficients to get to it, which can be expensive (Huffman).
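To put a number on the N == 1 case: with the standard JPEG DCT normalization, the dequantized DC coefficient is 8x the mean of the level-shifted samples, so the 1/8-scale pixel is simply (a sketch, ignoring rounding details):

    #include <stdint.h>

    /* One pixel of the 1/8-scale image from a block's dequantized DC term.
     * You still pay the Huffman decode of the whole block to reach the
     * next block's DC, which is the expensive part. */
    static uint8_t dc_to_pixel(int dc)
    {
        int v = dc / 8 + 128;            /* undo the 8x scale and -128 level shift */
        return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
    }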


I have a feeling that soon SnappyLabs is going to have Apple knocking on their door with a very nice offer.

Kudos to them, sounds like they deserve it.


Thanks!

I just hope Apple's engineers don't get pissed off by the press. SnappyCam is built on their hardware, which can do remarkable things.

Though we as app developers don't get access to a lot of their smarts, e.g. hardware JPEG codecs, I'm sure there's even more innovation in their work that often goes unacknowledged.


You'd think so, but they take 2-10s to recover from a photo and you ... don't. Clearly there's some room for optimization there. :)


Nice to see Jpap continuing to push the boundaries of what's possible.

Aussie maths whiz supercharges net http://www.smh.com.au/articles/2007/11/05/1194117915862.html


@jpap, are the results shown in the article being used today?


You'd have to ask Ericsson. ;-) I certainly hope so!


I vaguely remember reading that they (the Ericsson patents) were included in the latest VDSL2 specs.

Could be rolled out as part of the NBN pending Australia's election result.


Thanks Raphael! :D

It's been an awesome, surreal experience, very much reminiscent of 2007. Fun times, relived. :-)


Wow, amazing performance tuning, so rare these days!

However, you should be careful with this online ARM simulator. It simulates Cortex-A8 while iPhone 5 runs on Apple Swift, two generations ahead. It very likely has different instruction timings compared to Cortex-A8. I didn't have a chance to test Swift, but here is a list of what might be different, judging by Qualcomm Krait and ARM Cortex-A15, which are in the same generation:

- Instead of the 2-cycle latency on Cortex-A8, simple ALU instructions might have 3-cycle latency on Swift (this is the case on Krait and Cortex-A15).

- Cortex-A8 can issue only one 64-bit SIMD multiplication per cycle; Swift can probably do a 128-bit VMUL.Ix each cycle (Krait does).

- Cortex-A8 can issue only one SIMD ALU instruction per cycle, Swift probably can do more (Cortex-A15 can issue 3 128-bit VADD/VAND/etc in 2 cycles).

- Cortex-A8 could issue one SIMD ALU + one SIMD LOAD/SHUFFLE per cycle, Swift could be less restrictive (and probably even can issue 3 NEON instructions per cycle, like Cortex-A15).


That's really cool, Marat. Thanks for the additional info on the A15 and Swift.

It's a lot of work to optimize the assembly code for each ARM variant, but I'm glad to know that Swift will generally run the same code at the same speed as, or faster than, the Cortex-A8.

The 3-cycle latency on simple ALU instructions is a bummer, but fortunately I use them sparingly for computation as compared to NEON. (They're great for pointer arithmetic and computing image row strides.)

The multiple issue of an ALU + LOAD is awesome. That would definitely help some of my routines.


The 3-cycle latency refers to simple NEON ALU instructions (VADD.Ix, VORR, VAND, etc). Scalar ALU instructions are still single-cycle. Note that these numbers are from Cortex-A15 and Krait which are expected to be similar to Swift, but I didn't measure Swift itself to know for sure.


This looks fantastic. Watching people's reactions in that example image was really interesting, and it occupied me for a good few minutes. "Why can't you do the same thing with video?" Because rewinding video is really painful, especially online video.

Criticism:

I use my thinkpad's pointer stick to move the mouse cursor. It's impossible to keep the cursor inside the "control strip" while moving it up and down and also looking away from the strip (and at the image). Too much accidental x motion is introduced.

It would be better for me if you were to enable the scroll wheel (which I can simulate on my pointer) as an alternative time control, or perhaps let me click on the control strip and then hold down mouse1 for as long as I want my y motion to control the position in time.


@gosu, despite what Josh wrote, you can traverse your pointer across any part of the living photo online. :-)

Love that you picked up on the expressions! It wasn't until I got the photos out of the app that I was fascinated by the same thing. I really can't wait to enable this functionality for everyone soon. :-)

More elaborate mouse movements are possible, but only in HTML5 full-screen mode, which is required to "capture" the mouse (think of a game).

The problem with that, too, is that an instruction or a tutorial is required. (I'd try to make things as intuitive as possible, despite the failure in the other thread re: UX and the infinite shutter.)


facepalm

What was happening is that I was trying to keep the mouse in the control strip, and it would go off the right side of the image.

Thanks a lot, Josh.

Edit: By the way, the fullscreen functionality isn't launching. But I do have a weird browser (conkeror on xulrunner 22.0).


haha, no worries.

In the app, you need to start with your finger near the thumbnail strip. (But you can move it away for fine-grained scrubbing if you wish.)

It's no surprise that the learned behavior is transferring to the web viewer.


jpap, I don't quite fully understand the implementation (though I'd love to one day be proficient enough to). But maybe you can explain how the format compares to motion JPEG. Or maybe it's very similar? About 15 years ago I dabbled in live video recording on old Pentium II hardware with an old BT878 video input card. Motion JPEG was the only feasible option to obtain relatively high quality (for the time) results albeit at the cost of disk space.


There are a lot of similarities actually.

In SnappyCam, each photo is compressed to a separate JPEG file. There's no inter-frame compression, no motion vectors, etc. The same as mJPEG.

The main differences are:

* Each photo is stored in an individual file. This makes seeking through the living photo blindingly fast. (I guess you could do this with motion JPEG by utilizing an index.)

* Each photo also has full metadata. Try rotating the camera as you shoot. It will follow you. :-) Same goes for the geo-tagging: included are a bunch of timings that aren't normally included, so you can know the "precise" usec when you took the photo.

* Each photo has its own thumbnail. That allows me to cheat a little bit in the photo viewer: you will see a flash from blurry to clear as you scroll around.

(There are more cheats in the viewer for decoding and downsampling at the same time before you zoom, to make the photo load faster as well. One of the handful of reasons why I rolled my own decoder as well.)


This is amazing work. Could you explain why you decided to go with many individual stills rather than filling in the gaps in a video codec? It's a really counterintuitive approach.


Good question!

Several reasons:

* Video codecs are much more complex.

* Random access seek is a lot slower, unless you're using all I-frames. (That's now a codec option on iOS, but not when I started.)

* "Studio swing" reduces the dynamic range of the YCbCr components so the quality suffers.

* Each frame lacks its own thumbnail, unless you maintain an adjunct "thumbnail video".

* Each frame might(?) not be able to have attached separate metadata, like geotagging, sensor settings at time of capture, etc.

* Deleting one frame causes a "hole" and headache.

* Standards compliant JPEG means export is super easy.

* Anything above full HD video is difficult to deal with in 3rd party software.


Wow, I never thought I'd see a software optimization be talked about in such breathless amazement.


"discrete cosine transform JPG science"

Here's a more interesting link directly from the app developers: http://www.snappylabs.com/blog/snappycam/2013/07/31/iphone-k...


Thanks, that's a much better article.

> extended some of that research to create a new algorithm

> 10,000 lines of hand-tuned assembly code

> optimized out pipeline bubbles using a cycle counter tool

Color me impressed. It sounds like they really pulled out all the stops.


The Android bashing in that article is unfounded: http://www.eggwall.com/2011/09/android-arm-assembly-calling-...


It's not Android bashing - he's managed to make the older, slower iPhone hardware perform better than the current high-performance kings (the S3 & S4) through smart software optimisation.

There's no reason he couldn't do the same on Android and see similar gains. It would just be a lot of work..


He only implemented his optimized software on one platform and somehow starts to compare it with different software on another platform. How is that relevant to that other platform's performance?


Umm.. it shows the power of software optimisation?


I assume you never went to an Apple conference.


Yeah, I'm kinda skeptical of the "science" here.

Edit: A new algorithm counts as science, but the TechCrunch article really gave no justification for the claim.


I've given a bit more background to the fast JPEG codec on my engineering blog: http://www.snappylabs.com/blog/snappycam/2013/07/31/iphone-k...

If you like signal processing, fixed point arithmetic, SIMD cores, and assembly, then this is for you. :-)


So the summary is "JPEG encoder written in assembly with NEON instructions saves images faster than Apple's encoder."

That's a cool feat and is a little damning for Accelerate.framework, although the way TechCrunch writes it I expected a new kind of fast cosine transform.


Don't forget that SnappyCam pumps both CPU cores when available.

The actual DCT algorithm created and used in the app is different to the typical AAN (Arai, Agui, Nakajima) DCT algorithm that's used in JPEG codecs, at least all the ones I've seen.

It's all about doing as little work as possible to achieve the end result. That's why there's so much asm implementation, with carefully chosen NEON instructions for each step.

Think of it as a cross-layer optimization between algorithm and implementation... done by hand. :-)


Really interested in the nuts and bolts - are you optimizing specifically for one quality setting (in which case I'm guessing you could probably do the quantization as part of the DCT and throw away some calculations)? I played with a realtime JPEG compression implementation back in college on transputers (yes, I'm that old). Fun stuff; nice to see there are still places where going right down to the metal can make a real impact on a product...


Oh that's awesome and a lot of fun!

While SnappyCam has been the most difficult, complex, piece of software I've written since I started coding in my early teens, it's also been one of the most satisfying technically.

I'd love to disclose the many, many optimizations baked in, but as this is a commercial app I must keep much of it as a trade secret.

I will say though that a lot of precomputation was involved, both for the encoder and decoder. I jumped at the chance to avoid computation, memory reads, etc., as much as possible. :-)


One of my colleagues at work (Bart Smaalders) is known for the saying (paraphrasing?): "The easiest way to go faster is to do less work."

Well done on realising something that seems obvious in retrospect, but most people still miss.


haha, very cool. Smart man! :D


I find it amazing how you share your know-how so freely. This is the first app I ever saw that made me think of an iPhone as a potentially desirable thing... not enough to make me get one, but a big compliment to you. Never change (unless it's for the even more generous and clever of course :P)


Having shared a few beers with John Papandriopoulos (at an AusCTW workshop), I can vouch that he is capable of doing great things in signal processing. He's a smart guy [1].

G'day from across the ditch John and I'm glad to see things are going well! (from John D. in Sydney)

[1] http://www.rmit.edu.au/browse/Current%20students%2FAdmin%20e...


Hey, thanks!

The Australian Communications Theory Workshop was such a long time ago--what great memories. :-)

What are you up to these days?


Turning another incarnation of CSIRO's wireless research into a product, and still dreaming of the Free Space Optics stuff. Happy to buy you a beer if you are passing though SYD! Keep well.


Oh wow, that sounds really cool! :D

Will definitely look you up when I'm down next. Trying to drop by more consistently during the summer these days. I spent five weeks in Melbourne last Jan and loved every minute of warmth. :)


How come it doesn't let me downvote you?


This looks similar to what Microsoft Research's BLINK [42] does on Windows Phone. Alas I wasn't able to find any publications on what they are doing (which is strange for MSR). As I don't have my phone currently I can't even look whether they are doing full resolution too or whether they are dropping down to smaller sizes.

[42] http://research.microsoft.com/en-us/um/redmond/projects/blin...


Any chance of this coming to Android soonish? This is seriously cool!


The fast JPEG codec was written for the ARM NEON SIMD coprocessor found in the iPhone. Most Android devices also sport the same architecture, so it is indeed possible.

The code for the codec is written in mixed C and assembly, so it can be "easily" ported to Android by making use of the JNI.
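For the curious, the JNI glue itself is thin; a hypothetical bridge (every name here is illustrative, not SnappyCam's actual API) might look like:

    #include <jni.h>
    #include <stdint.h>

    /* Hypothetical entry point into the native C/NEON codec. */
    extern int encode_jpeg_nv21(const uint8_t *nv21, int w, int h);

    /* Hand a camera frame (NV21 byte[]) from Java to the native encoder. */
    JNIEXPORT jint JNICALL
    Java_com_example_fastcam_Encoder_encodeFrame(JNIEnv *env, jobject thiz,
                                                 jbyteArray frame, jint w, jint h)
    {
        jbyte *data = (*env)->GetByteArrayElements(env, frame, NULL);
        jint bytes = encode_jpeg_nv21((const uint8_t *)data, w, h);
        (*env)->ReleaseByteArrayElements(env, frame, data, JNI_ABORT); /* read-only */
        return bytes;
    }

The real work is everything around it: the camera pipeline, UI, buffering and sharing.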

While the R&D for the fast JPEG codec took about a year to perfect, the iOS app took just about the same time to get polished (including the NodeJS backend work, the HTML5 website and embeddable widgets in AngularJS).

Writing the rest of the app would take a few months of full time work, and it's not yet clear if that might pay off at this stage.

We'll see... and glad to hear there's interest! :D


Don't overlook the fact that the source for the stock Android camera is available under a commercial-use-friendly open source license and has a quite nice native Android UI. You don't have to reinvent all the wheels unless you're stubborn.

https://android.googlesource.com/platform/packages/apps/Came...

I would buy that in a heartbeat.

Unrelated, how quickly can you alter exposure settings? Can you get 30 pictures per second with three interleaved exposure brackets? (i.e. burst of 10 HDR photos / second) That would be very, very, very, very cool.


That's really interesting. I wasn't aware of that. I'll have a look at it once social sharing is out the door.

I did consider getting into other aspects of iPhoneography, like HDR, etc. The trouble with HDR in particular is that there's no API access to direct the sensor into each of the bracketing modes.

In the case of HDR, it might be more fruitful to attempt some kind of image signal processing, similar to "Clarity" on Camera+.

I looked into that for a while, and I figured that Camera+ might be using some version of the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm. In any case, what they've done is really neat from a DSP perspective. :D


Hi,

There's also a cool technology that allows you to save nearly the same JPEG with a much, much smaller file size.

https://news.ycombinator.com/item?id=2940505


If I were you, and you're unsure it might pay off, I'd go to the phone makers and offer them licenses for just the encoder.


There's been a similar app on Android for years:

https://play.google.com/store/apps/details?id=com.spritefish...

The claimed speed is 30 fps, the moar RAM the better I think.


I thought the app was pretty cool, just super slow to save out the JPEGs.

That's one of the reasons I spent a lot of time to make sure SnappyCam could compress these images, thumbnails and Exif metadata included, at a ridiculous speed.


Yes, SnappyCam has the advantage of a faster algorithm.

Android, on the other hand, has more open and better hardware, e.g. a larger full-res camera, much larger RAM, faster external storage or even OTG, more CPU/GPU cores. In theory, if SnappyCam were fully ported to Android you could make it faster than 8M pixels @ 20fps.


The possibilities are exciting for sure! :-)


I might point out that, depending on the device, you could have many more cores to work with in Android-land.


We should create a "marketing core" term the same way we have marketing HDD size. I have seen big.little advertised as 8 cores.

I think the main advantage of Android devices will be the fact that high-end devices generally have more RAM than their iOS counterparts. So even if the codec cannot be pushed as far as on iOS, the bigger possible buffers can help.


Yes, more RAM definitely helps.

On SnappyCam, I had to arbitrarily limit the size of the buffer to a fraction of the system memory because there's no way to know "how much" RAM can be allocated to avoid the dreaded memory warnings until you hit one; and then it's a three-strikes-out policy: you get two, and the third kills the app.

The first two are "soft" warnings, but I suspect have a lower threshold than the "hard" one that sends SIGKILL.

In setting the limit arbitrarily, I unfortunately have no choice but to select it rather conservatively: it might otherwise be (a lot?) higher.
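In code, the cap ends up being little more than this kind of guess (a sketch only; the fraction is a placeholder, not a documented limit):

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/types.h>
    #include <sys/sysctl.h>

    /* Cap the capture buffer at a conservative fraction of physical RAM,
     * since iOS never tells you how much you may allocate before memory
     * warnings (and eventually SIGKILL) arrive. */
    static size_t capture_buffer_cap(void)
    {
        uint64_t physmem = 0;
        size_t len = sizeof(physmem);
        sysctlbyname("hw.memsize", &physmem, &len, NULL, 0);
        return (size_t)(physmem / 4);    /* 1/4 is an arbitrary, conservative guess */
    }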


Is there any value in having the buffer size selectable for advanced users, so they can play with it and see where the sweet spot is on their hardware?


Purely wishful thinking: how about this and the latest Lumia camera?


I don't know much about their device (maybe Nokia might send me one?!? hehe)...

... but they must have an awesome JPEG encoder. I'd assume 41Mpx stills would need to be compressed in no more than 1 second for a reasonable UX. That there is a 41+Mpx/sec encoder.

I've also noticed they are using higher quality chroma sampling (4:2:2) so their encoder is actually doing a lot more work than say SnappyCam.

But I bet they're not doing it in software, either.


They will send you one if you send them 670 USD back...


I might try to (naively) implement something similar for Android just to see how fast it goes without too much low-level fiddling. It really is pretty darn cool.


Awesome! Drop me a line, would love to see how you go. :-)


Why not have a deferred compressor? I assume that just straight-up saving the raw data in memory would be much faster than compressing every frame as you get it.

Couldn't you get significant FPS increases (given that you still had free space/memory available)?


Actually, I do both on dual core devices.

One core is dedicated to host the capture/buffer, the other will encode shots in the background.

When you see the big circle percent animation, both cores are dedicated to compression to clear the encoder queue so you can take back-to-back living photos quickly.
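The shape of that split, very roughly (a generic pthreads sketch, not the app's actual code): the capture thread pushes raw frames into a ring buffer and the encoder thread drains it, with a full ring being exactly the "dropping shots" case.

    #include <pthread.h>

    #define RING 32
    typedef struct { int seq; /* ...plus the raw sensor data... */ } frame_t;

    static frame_t ring[RING];
    static int head, count;
    static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

    static void push_frame(frame_t f)            /* capture thread */
    {
        pthread_mutex_lock(&mu);
        if (count < RING) {                      /* else: drop the shot */
            ring[(head + count) % RING] = f;
            count++;
            pthread_cond_signal(&not_empty);
        }
        pthread_mutex_unlock(&mu);
    }

    static frame_t pop_frame(void)               /* encoder thread */
    {
        pthread_mutex_lock(&mu);
        while (count == 0)
            pthread_cond_wait(&not_empty, &mu);
        frame_t f = ring[head];
        head = (head + 1) % RING;
        count--;
        pthread_mutex_unlock(&mu);
        return f;
    }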


I've just gotta say:

This is one of the main reasons I keep coming back to HN. A story gets posted about some cool new tech, and the creator is in the comments answering questions. Simply awesome.


haha, cool. :-)

To be honest, I don't often post here because I'm busy working, but am enjoying the discussion on a baby I've nursed for two years now. :-) Thanks for your post!


I read the TC article, bought the app, and came here in the hope that the developer would be here. I was not disappointed!


Very cool :) Thanks for the quick reply.


Assuming approximately 8MB worth of uncompressed raw data from the image sensor, 20 frames per second would require writing images to flash storage at 160MB/sec, which no iPhone can do.


But writing to RAM (at 3200+MB/s) is certainly possible. You could cache around 2-3 seconds (on lower-end iPhones) as you compress. Again, the idea wouldn't be that you could indefinitely do this, but merely that the compressor would be deferred (and would lag at a ratio that would still yield x FPS).

That's how I would do it. But apparently they already do this :)


That's how it's done: on dual-core devices, a JPEG encode can run in parallel with capture.

Try it on an iPhone 5 with "infinite shutter" disabled and you will see the dropped frames in the filenames once you import them to your machine from iTunes App File Sharing (or over SSH if you're jailbroken).


Science vs engineering distinctions aside, it is pretty cool to see the attention to detail + effort put into solving this problem.


This is quite remarkable. I just tested it and it works even better than advertised. I hope you become rich and famous for this. And I really hope there's not a hidden gotcha I haven't seen yet.


Thanks! :D I'm just very happy to have more people try the app.

It's been a hard slog working 7-day weeks for just over two years now. Feels great to receive some kind of recognition for the work.


I too have purchased the app, and initial testing on a 4S seems to show it works exactly as advertised. This is a really great app, and at an astoundingly low price. You should be very proud, I think; I'll definitely be using the app as my go-to in the future.

(Hmm, after writing that, I somehow feel it sounds like it should have a reference to my rural, folksy, respected job that clearly makes me qualified to discuss such things. Unlike those Amazon reviews I'm referring to though, I mean every word.)


That's abs awesome to hear! Thanks for the wonderful compliment.

I played with price. Until now, most of my sales were word of mouth, and the $1.99 price hindered "growth".

It's been an interesting game. Many people whom I demo the app to in person love it; then, when they reach into their pocket to download it and realize it's a paid app, they place their phone back into their pocket.

Still have lessons to learn in sales and marketing... but am enjoying the schooling.


Well as a point of reference, I will never pay more than $0.99 for a camera app unless a friend has specifically shown me how it works. I have been burnt on too many photography style apps that end up either not doing what I expected from the pics and description, or just sucking in general.

For me $0.99 just breaks that psychological barrier into 'who cares if it sucks'.

I gotta say though, after playing with SnappyCam it's definitely worth it. I bet being cheaper will end up with easily more than twice the sales.


I understand where you're coming from; social proof removes a massive barrier to conversion, even in my own experience.

I found that having it at $1.99 most definitely improved sales after it had been at $0.99 for about a week; after another week, it started to degrade again.

You're spot on in saying that $0.99 is a good price to get "disconnected" users who might experiment. If they like the app, they might make the personal recommendation to their "connections" where price is less sensitive.

After a while, that social proof and networking effect wears off and it's time to reset the price down to the "discovery amount" of $0.99.

To be honest, I'd love to flip SnappyCam over to freemium; but I feel that can't happen until the social sharing is bolted in and the app has a chance to sell itself organically.


The TechCrunch title sounds like it was taken from an infomercial. Or "one weird trick..."


To be clear, using SIMD for JPEG encoding is not new. I'd be curious how this JPEG encoder compares to libjpeg-turbo's NEON encoder.

http://libjpeg-turbo.virtualgl.org/


Hey @jlebar, you're right--it's existed on the desktop for some time (MMX, SSE). When I first started, libjpeg-turbo didn't yet have an ARM port, which was part of the motivation to do it myself.

See my post in another thread here on the same topic.


I take a fair number of casual action shots – mostly of the kids. To get something to come out I often take a handful of pictures in a row; even that's often not enough, or the "right" scene happens in between these slowish frames. This could be cool for those cases.

Except... I also get annoyed sorting through those pictures afterwards. It would be interesting if with some post-processing it could sort through the pictures some for me, identifying distinct pictures, or filtering out ones that are clearly bad (mostly too blurry), or if fancier maybe doing eye or smile detection. I want to capture the moment a person looks up, before they think about the camera.

Another cool case would be taking photos of movement. If I can track the movement with the camera the picture can come out surprisingly well. But tracking movement is hard. If I had several seconds of pictures, over the course of that time probably I'd track the movement well enough for a few of the photos to come out.


If I remember correctly, automatically sorting through your pictures and picking the best is exactly what Google announced for Google+ at their last I/O keynote.


That's a cool feature, and not easy to implement. It generally ends up being a machine vision problem. (Google has both great talent and a lot more resources than a single-founder, self-funded engineer like me.)


Hi Guys. This is Fast Camera. I'm callin' out SnappyCam!

Are you up for an old fashioned DUEL to see which app can shoot the most "native camera quality" 8MP images per second in 60 seconds without crashing?

On an iPhone 5 with all apps closed, SnappyCam manages to save only about eight 8MP images per second over 10 seconds on average, and loses the other 12 per second. And these are not 8MP-quality images, at least as far as comparing resolution against the native camera app or Fast Camera. All of this technical discussion sounds great, but is anyone actually testing this like I am? Just download a stopwatch app with hundredths of seconds and burst for 10 seconds. You'll see. Then shoot something with a LOT of detail at 8MP in both SnappyCam and Fast Camera.

Fast Camera is capable of 10-12 native-quality 8MP images per second (more than SnappyCam). We throttle it back on purpose.

And what's with camera-shutter.caf John? ;)

Michael Zaletel, Founder, i4software (Fast Camera, Vizzywig, Video Filters)


Michael, thanks for making contact by e-mail, outside of these public forums.

As discussed over e-mail, I've created an in-depth report showing that SnappyCam indeed takes full-quality 8 Mpx shots on the iPhone 5.

With the amazing discussion and interest here on HN, I thought to share it with the community here as well:

http://www.snappylabs.com/blog/snappycam/2013/08/03/snappyca...

I'm off the grid on a hiking vacation for the next 2.5 weeks, back in late August and look forward to the discussion then.

jpap


Bug report -

On the first launch, if I quickly press the Settings button (bottom-right) it starts the flip animation and still shows the handwritten overlay explaining where to tap for manual focus and whatnot. After the animation is complete, the overlay is still shown, so it looks like a mess. And it's also not obvious how to get the overlay back, because I haven't seen what it actually said.

Congrats on the TC cover and a very nice app. Get rich! :)

(edit) A nitpick - "Warm-up", not "Warmup"

(edit) Report Usage = On. Seriously? Who on Earth in their sane mind would actually want this, except for you? Next thing you tell me is that you have some "app analytics" library linked in and it's always on. Please don't be evil.

(edit) The same goes for "Send Crash Reports = Always". It should be "Ask". Respect your users and they will help.


Thanks for the suggestions:

1. Looks like a race condition for the settings button tap. Does it happen if you wait a second before pressing the settings button?

2. You can re-enable the tutorial (overlay) screens from the bottom of the settings menu.

3. On the usage/reports, I hear you. I won't give you bullshit on "standard industry practices" here, but I will say that I had to hack a well-known closed-source library to give you that opt-out from usage reporting. I really do value your privacy. (I've already requested the library developer fix it, and will try and write a blog post on how other developers can provide a kill switch, too.)

4. The default is there because many people don't like to configure apps, they just use them as-is. In that light, the default configuration is the one I felt was best for general use.


"Ask" should be the default, I have to insist.

Just tried the actual functionality and it gives the machine gun sound effect, showing a counter going up to 50-60; then I release the button, the blue stripe around the button shrinks back, and it adds a photo to the bottom-left area, but when I tap it, there are just 3 frames. What am I missing? Is it adaptively trimming bad frames (I am shooting in low-light conditions)?

(edit) Just tried again, and this time after I released the shot button it showed a big circle overlay with "JPEG" in the middle that counted up to 100%, and the resulting photo had the right number of frames. It didn't do that on the first try. It's either a bug ... or you are missing a helpful hint that explains what's going on :)


Yes, that does require some explanation:

1. The receding circle is the capture buffer being processed. When you're tapping on the thumbnail, SnappyCam sees the start of the living photo being available and shows it. It does not, unfortunately, refresh the thumbnail list as more shots complete processing.

This is a (feature) bug and I'll work to address it.

2. The circle with percent progress is what I call "turbo rewind", where the camera is shut down so that all CPU cores can be applied to compression so that you can take back-to-back living photos quickly.

You can select the buffer "threshold" for when this kicks in under the advanced settings: look for Turbo Rewind.


It takes time and lots of effort, and I'll argue it's easier on a quasi-standard platform (processor-wise), but apps like this show how much juice can be squeezed out of the existing hardware by handcrafting the code.

Kudos, I just bought the app!


Thanks for the download! Let me know if you've got any feedback, I can be easily contacted through the app. :D

I'd just like to add that in addition to handcrafted code, choosing the right algorithm and always trying to "do less work" (fewer cycles, less data IO, better use of registers) makes a big difference.


I always say a good programmer has to be "lazy"!

Some feedback: the default exposure settings showed my room as pitch black (I have it very dim right now), while the native iPhone 5 camera adjusted automatically. I was able to snap a shot by pointing at the light. Personally, I prefer not to crank the gain on the sensor.


haha, yes, if only it pays to be lazy. :) Sometimes doing "less work" means more up-front planning and thinking. Not a bad thing necessarily.

Interesting on the native camera adjustment. SnappyCam will use the "low light boost" high ISO capabilities of the camera. I'll have a play around with it.

Otherwise, does the continuous flash help you much?


The continuous flash didn't fire. I'm running the stock settings.


Oh, it's a manual flash.

Enabling that automatically is an interesting problem in itself: I'd have to estimate the light level based on the camera preview... or perhaps from the preview metadata.

Will think about how that might be done. Thanks for the thought. :-)


I'm impressed as hell by all of this, the fast DCTs and the crafting of the entire process to build something so far beyond anything else on the market is great.

Bought this!


I'd love to make the jump from "just" web development to some proper embedded development. Any pointers?


Practice. A lot.

A good start is actually the ARM processor; since it's a RISC instruction set, it's quite simple.

I've done lots of assembly in my ugrad days, even writing a Motorola HC11 micro-controller emulator, but ARM would be a much better choice right now.

I found the "Tonic: Whirlwind Tour of Assembly" [1] site invaluable to get me started for SnappyCam, as it covers a lot of the ARM ISA.

For iOS-related assembly, I'd recommend [2].

And for a taste of ARM NEON SIMD, have a look at [3]. The one thing that "clicked" for me on SIMD is that you should look at each register "lane" as trying to unroll a loop. I initially dived in thinking I'd just make a sequential algorithm parallel, which is often too difficult to arrange.

[1] http://www.coranac.com/tonc/text/asm.htm [2] http://www.shervinemami.info/armAssembly.html [3] http://hilbert-space.de/?p=22
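To make the lane-as-unrolled-loop idea concrete, here's a toy contrast (not from the app): the scalar loop and its 16-lane NEON equivalent for averaging two rows of pixels.

    #include <arm_neon.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Scalar: one pixel per iteration. */
    void avg_rows_scalar(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            out[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
    }

    /* NEON: the same loop "unrolled" 16x, one pixel per lane.
     * vrhaddq_u8 is a rounding halving add: (a + b + 1) >> 1 per lane. */
    void avg_rows_neon(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n)
    {
        size_t i = 0;
        for (; i + 16 <= n; i += 16)
            vst1q_u8(out + i, vrhaddq_u8(vld1q_u8(a + i), vld1q_u8(b + i)));
        for (; i < n; i++)                         /* scalar tail */
            out[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
    }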


Just adding my vote for an Android version! Great job!


It looks great but I have a few questions/comments.

1. What is the difference in quality between using this and the video capture mode? I.e. if what I really want is a high quality video, would this get me a better result than the built in programs?

2. Seeing as how you've done all this work (and how Android apps can be compiled from C) how difficult is it to port this to Android so that the rest of us can get in on it?

3. Is it just me, or can anyone else not change the settings / look at the other demos on the samples page?


1. It really depends on what you're after: are you looking for a video sequence that plays back, or an individual still? Video is better for the former, SnappyCam for the latter.

2. It's a lot of work, hinted at in another thread here on HN. The entire "app" built on top of the JPEG codec needs to be rebuilt from scratch; new artwork is required, etc.

3. I just tried it from another machine and works for me.

My backend API is being hammered at the moment, which is awesome, but it doesn't appear to be overloaded. (Gotta love NodeJS!)


I have the same issue (3), on Safari and Chrome: mouse clicks in the menus after they're opened are ignored, but the keyboard works to select a video.


Weird. Could be a bug in the dropdown component I wrote in AngularJS. :( Glad the keyboard still works.

Will look into it...


> To put the speed in perspective, SnappyCam is about 4X faster than the normal iPhone 5 Camera app, and more than twice as quick as the Samsung Galaxy S4′s 7.5 shots per second.

Does it mean that the S4's hardware is faster than the iPhone 5's, given they're using similar algorithms? And if you made the same app for Android, could it get even better results?


It's unclear to me, as there's a lot more going on when taking a photo than you might think. :) (I originally thought I could knock together a basic SnappyCam app on top of the JPEG codec within a week or two, it took months.)

If SnappyCam can do it on hardware that is older than the S4, then I can't see why technically Samsung can't lift their game.

And judging by how quickly they've been chasing Apple, and sometimes stepping ahead, I wouldn't be surprised to see a bit of leap-frogging for some time to come.

Let's see what the 5S/C brings in a few months! I'm excited.


Looks like the most interesting part here is the "living photo" that instantly responds to interactions. Can this be standardized as a new video format? It would be very cool to have all cameras be able to save video in this format. @jpap should consider formalizing this format, producing viewers on different platforms, and licensing this tech to manufacturers of point-and-shoot cameras, GoPro, webcams, camcorders, etc. This feature could make a camera an instant hit. It is a real value add for customers. I can also envision movies getting recorded in this format and made available on Blu-ray so people can instantly interact with the cool fast-action videos in HD. I think the great insight here is the awesome coolness of instantly interactive video that is ready to be unlocked inside current camera hardware.


I'm really glad to read this! :-)

I had similar thoughts myself, and they form part of what I have in mind for the next major SnappyCam release (a taste is what you see on SnappyCam.com today). My thoughts are perhaps more web-focussed than what you describe, but the sentiment is really encouraging!


Some questions:

Instead of doing full resolution at 20 fps, can you do a smaller resolution at, say, 160 fps?

If the next generation iPhone processor is faster (a safe bet), do you think your software would allow at least 24 fps, and you could use the iPhone to shoot a 10+ megapixel movie?

Shouldn't Apple have hired you already?


It all comes down to what the hardware supports, ultimately.

I'm not performing any true miracles here: I'm just making best use of the hardware resources available, with some clever software tricks and algorithms.

The iPhone 5 actually supports 60 pictures/sec capture, for example, but Apple has decided, for whatever reason, to disable it on iOS 6. If the iPhone 5 ran iOS 5 (surprise?!) then it would likely run at 60 pictures/sec.

On iOS 7 that all changes: so you'll soon be able to capture at 60 pictures/sec, which is rad.

The rollerblader shown on the TC article was shot at Sunday Streets in the SF Mission District on my iPhone 4S at 60 pictures/sec. The photo quality is somewhat degraded for the web, but it still looks awesome full screen (from the SnappyCam website; the TC embed is in a restricted iframe and can't go full-screen).

I know a couple of great engineers that work at Apple, but haven't spoken with them for one or more years. Sounds like a cool place to work, but so can be working for yourself.

It's been a hard slog--I quit my last full-time job in March 2011--but I'd love to see SnappyCam through and bring to life another startup idea I have in mind. (Some of the YC partners have already seen me pitch it; SnappyCam has been a rather good distraction of late.)


At 20fps, could you make a 3D camera app? The user would move their camera through space, you'd correct for stabilization with the accelerometer etc. to know each point in space, and then treat the multiple viewpoints as individual cameras.


That's a really interesting machine vision problem and a lot more complex than a JPEG codec. :-)

I wonder how long before we start to see Kinect-like infrared cameras mounted on phones to make the depth problem easier to solve. That would be cool!


I wonder if you could use all the rapid frames, plus a variant of that cool Adobe image de-blurring tech [1] that was shown a while ago, to produce a clearer, sharper image during motion?

[1]: http://prodesigntools.com/photoshop-cs7-image-deblurring.htm...


There's still much innovation left in image signal processing... and fortunately much interest in taking good photos!

This reminds me of research into superresolution, an area that's "super interesting" :-) as well.

The guys who started Occipital (360 Panorama), I believe, tried to dabble in that with ClearCam many years ago... but I honestly don't know much about it. Anyone from Occipital here on HN?


Or you could just use individual frames and bundle adjustment.... even via free online software like 123D Catch. ;) http://www.123dapp.com/catch


You should port it to several platforms and license it as a library. I would think there are many companies interested in a fast JPEG encoder that is not embedded in an iPhone app :)


This is pretty cool. You got my buck!

I was kinda hoping I could also turn the speed down to multiple seconds per photo, since it talks about doing time-lapse shots. One of my major uses for my phone's camera is selfies for art reference, currently done with Genius - which annoyingly won't do repeated shots at anything less than 10 seconds. Being able to take one shot every 1-3 seconds would be pretty damn cool for me.


Thanks!

You can reduce the capture rate in the app settings, down to 1 photo per {1, 5, 10, 30, ... } seconds.

Move the slider toward the turtle under "Camera Lens".

jpap


Oh durf, I fail at exploring UIs. Thanks!


I must say that this is one of the most interesting apps I have found in the last few weeks. You should get yourself a beer, as this is a neat feat to accomplish :)

Also, some people were saying that the web app wasn't working for them on some Chrome versions. As for me - I've got 28.0.1500.95 - the culprit was the Disconnect extension, which, when disabled, allowed the whole application to behave as expected.


That really helps, thanks for letting me know about the Disconnect extension. I've never used it; will check it out.


That's fantastic, and a very cool demo.

How does the encoder performance compare to libjpeg-turbo? That also has some SIMD work for NEON.


Yes, Nokia contributed the NEON code for the DCT in libjpeg-turbo.

I haven't had a chance to do a side-by-side comparison as yet, but I suspect the SnappyCam encoder is faster for many reasons, including the choice of algorithm, the way they sometimes use two multiplies (low and high halves), and their row-by-row image processing with function call overhead in favour of code maintainability.


I was involved in some NEON work on libjpeg-turbo, and I can confirm that the image buffer management there is hell, as are some other aspects of the design. A from-scratch implementation with performance in mind should easily be quite a bit faster.


Looking forward to taking these pics and testing out http://research.microsoft.com/en-us/downloads/69699e5a-5c91-... Image Composite Editor with things like Photosynth.


[deleted]


1. Instagram is only shown if you actually have Instagram installed on your device. ;-) As you might know, Instagram guards their API carefully: we don't yet have general access to it.

2. E-Mail is also only shown if your device has built-in e-mail accounts set up.

3. iTunes App File Sharing is accessible by connecting your device to your Mac/PC via USB and using the iTunes app.

Drop me a line jpap {at} snappylabs.com if you're still having issues. I'd love to help! :D


It crashes for me every time I take somewhere between 60 and 75 frames with the main camera. With the front-facing camera, I can shoot forever. In the iPhone Settings (under Diagnostics & Usage), I have a bunch of LowMemory warnings. I'm using an iPhone 4S.


Thanks for reporting it in!

It seems I enthusiastically chose a large buffer size that appears to be causing issues on some devices under a lot of memory pressure.

If you reboot your phone, as awful as that sounds, it will likely fix the issue.

EDIT: I've just submitted an update to Apple that uses a more conservative buffer size.

This aspect is hard to get right: I once used an adaptive buffer size that heeded memory warnings, but that meant the buffer filled to lower levels than a conservatively sized buffer.

If only iOS had an opt-in for an *alloc returning 0 instead of these warnings, or at least told us how much space is left before we're SIGKILL'ed.
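
The conservative approach boils down to something like this fixed-capacity pool of frame slots (a toy sketch with made-up sizes, not the app's buffer): allocate everything up front and never grow, so a memory warning can't catch you mid-expansion.

    /* Toy sketch, made-up sizes: a fixed pool of compressed-frame slots,
     * allocated once before capture starts. */
    #include <stdlib.h>
    #include <string.h>

    #define SLOT_BYTES  (3u * 1024 * 1024)  /* generous bound for one 8Mpx JPEG */
    #define SLOT_COUNT  32                  /* conservative, fixed capacity     */

    typedef struct {
        unsigned char *data[SLOT_COUNT];
        size_t         len[SLOT_COUNT];
        int            count;               /* slots currently holding a frame */
    } frame_pool;

    int frame_pool_init(frame_pool *p)
    {
        memset(p, 0, sizeof *p);
        for (int i = 0; i < SLOT_COUNT; i++)
            if (!(p->data[i] = malloc(SLOT_BYTES)))
                return -1;                   /* fail early, before capture starts */
        return 0;
    }

    int frame_pool_push(frame_pool *p, const unsigned char *jpeg, size_t n)
    {
        if (p->count == SLOT_COUNT || n > SLOT_BYTES)
            return -1;                       /* full, or frame too big: drop it */
        memcpy(p->data[p->count], jpeg, n);
        p->len[p->count] = n;
        p->count++;
        return 0;
    }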


Thank you, I'll try rebooting the phone.

I recently read an article here on HN that briefly touched on memory management under iOS and especially the problem of apps getting killed, maybe it is interesting for you: http://sealedabstract.com/rants/why-mobile-web-apps-are-slow... (scroll down to "How much memory is available on iOS?")


Beautiful app, jpap. Well-done! I can't wait to do some side-by-side comparisons between this and video stills, and see what kind of image quality differences there are. My overall impression of the app itself is that it's incredibly solid. Keep building apps!


Thanks!! :D

Would love to see some real world comparison examples. Drop me a line when you've got something, would love to check it out. :)


I'm interested to know how their method compares to how dedicated digital cameras and DSLRs do it. Are cameras running dedicated hardware/firmware to achieve the same result? Or have they optimised their software in the same way that SnappyCam has done it?


I can't say, as SnappyCam is my first foray into image signal processing. (Though DSP isn't new to me.)

I'd guess that DSLRs use a combination of hardware acceleration on the "tricky" bits (like DCT) with firmware to control the compute hardware.

Huffman is a particularly difficult beast, as it can't be parallelized. The JPEG bitstream is inherently serial, though there have been some proposals to improve that.

If you run a SnappyCam JPEG that you pluck from iTunes File Sharing through djpeg (from libjpeg), you will notice that the YCbCr planes are not interleaved.

I once experimented with a parallel JPEG encoder, encoding the Y, Cb, and Cr planes in parallel, but the threading overhead cost more than just queuing up each JPEG encode separately on a multithreaded queue.

Bonus points if you notice another marker in the JPEG. That's intended for parallel JPEG decoding, but hasn't been implemented in SnappyCam as yet. (The existing decoder is fast enough for 8Mpx shots.)
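
If you want to poke at the bitstream yourself, here's a throwaway C scanner (not from the app, and it makes no guess about which marker is meant) that just lists every JPEG marker and its offset. In a typical baseline file you'd expect SOI (FFD8), APPn, DQT, SOFn, DHT, SOS, optionally RSTn inside the scan, then EOI (FFD9).

    /* List JPEG markers: a marker is 0xFF followed by a byte that is
     * neither 0x00 (stuffing) nor 0xFF (fill). */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) { fprintf(stderr, "usage: %s file.jpg\n", argv[0]); return 1; }

        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror("fopen"); return 1; }

        int prev = 0, c;
        long off = 0;
        while ((c = fgetc(f)) != EOF) {
            if (prev == 0xFF && c != 0x00 && c != 0xFF)
                printf("marker 0xFF%02X at offset %ld\n", c, off - 1);
            prev = c;
            off++;
        }
        fclose(f);
        return 0;
    }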


Interesting stuff, thanks for the info. Much respect for doing so much optimisation in assembly (my limit is C, even for embedded work).


When you read about 'DIGIC', 'BIONZ', or 'EXPEED' in dedicated digital cameras, you're on the money: those are custom processors (in some cases, more than one). They're often multi-chip modules that can be scaled depending on the level of camera and can be based on standard SoCs or embedded macros.

On top of that, you'll have the 'firmware', which is where projects such as Magic Lantern/CHDK make things fun.

Some reading on the Canon gear:

http://cpn.canon-europe.com/content/education/infobank/captu...

http://magiclantern.wikia.com/wiki/Datasheets


Love the "we'll iMessage you a download link" feature on the web page. Are you using a service for this? Note it doesn't seem to work for me in Germany, it doesn't change the country code, it leaves it at +1 (instead of +49)...


It's a webservice I hacked together that sends iMessages from my old MacBook Pro. :-)

It was my understanding that German mobile numbers are written locally starting with 01? [1]

e.g. in Australia, my mobile number would be 040x-xxx-xxx. The international version is +61 40x-xxx-xxx. When you select Australia, it will show 04.

(OK, I now see how this could be confusing; my apologies.)

[1] http://en.wikipedia.org/wiki/Telephone_numbers_in_Germany#No...


Phone numbers in Germany work the same way as you describe for Australia. Internationally you'd have +49 (area code without leading zero) (number), and within Germany you can use (area code with leading zero) (number).


Awesome. If you type in your number as if you were local, do you get the iMessage?

(Internally I add the international prefix. As you can imagine it took a while to find all of the local prefixes and create number masks!)
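
For what it's worth, the local-to-international mapping is the easy half; the per-country prefixes and masks are the tedious part. A tiny hedged sketch in C (not the actual backend; the function name and parameters are made up): strip non-digits, drop the trunk prefix ("0" in both AU and DE), and prepend the country calling code.

    /* Illustrative only: local mobile number -> E.164-style string. */
    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>

    void to_e164(const char *local, const char *cc, const char *trunk,
                 char *out, size_t outlen)
    {
        char digits[32];
        size_t n = 0;

        /* keep digits only: "0401-234-567" -> "0401234567" */
        for (const char *p = local; *p && n + 1 < sizeof digits; p++)
            if (isdigit((unsigned char)*p))
                digits[n++] = *p;
        digits[n] = '\0';

        /* drop the trunk prefix if present, then prepend +<country code> */
        const char *rest = digits;
        size_t tlen = strlen(trunk);
        if (strncmp(digits, trunk, tlen) == 0)
            rest = digits + tlen;

        snprintf(out, outlen, "+%s%s", cc, rest);
    }

    int main(void)
    {
        char e164[32];
        to_e164("0401-234-567", "61", "0", e164, sizeof e164);
        printf("%s\n", e164);   /* +61401234567 */
        return 0;
    }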


More technical details on SnappyLabs blog: http://www.snappylabs.com/blog/snappycam/2013/07/31/iphone-k...


There is a lot said about the assembly code. I wonder whether it makes sense to code it in LLVM IR?


There are definitely improvements being made to LLVM to automatically parallelize code (especially unrolling loops) to SIMD.

I haven't personally tried it, but would love for it to match the code quality of hand-cranked assembly... writing it is tedious and error prone, but you do get control over when you preload the cache and the stack, and you can do really cool things with the CPP and macros to "manually inline" things. :-)

And who doesn't like writing a good ol' fashioned jump table?!
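
In C the spirit of a jump table survives as an array of function pointers (a toy sketch, nothing from SnappyCam); in asm you'd index into a table of branch targets instead.

    /* Dispatch on an opcode via a table of function pointers. */
    #include <stdio.h>

    static void op_add(int *acc, int v) { *acc += v; }
    static void op_sub(int *acc, int v) { *acc -= v; }
    static void op_shl(int *acc, int v) { *acc <<= v; }

    typedef void (*op_fn)(int *, int);

    static const op_fn jump_table[] = { op_add, op_sub, op_shl };

    int main(void)
    {
        int acc = 1;
        int program[][2] = { {0, 4}, {2, 3}, {1, 5} };  /* add 4, shl 3, sub 5 */

        for (size_t i = 0; i < sizeof program / sizeof program[0]; i++)
            jump_table[program[i][0]](&acc, program[i][1]);

        printf("%d\n", acc);   /* ((1+4)<<3) - 5 = 35 */
        return 0;
    }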


Thanks, btw there was a thread on this topic recently: https://news.ycombinator.com/item?id=6096743


Amazing work, and the living photo thing could be a hit.

Out of curiosity, and a bit unrelated: I've been craving real raw capture on the iPhone (before Bayer interpolation, white balance, noise removal). Is it possible?


It may not be possible, even for Apple.

If you have a look at the data sheets, for example [1], you'll see YCbCr or RGB output formats being listed.

I guess it makes sense for as much signal processing as possible to be done as early in the chain as possible, in the interests of lowering power consumption. (Less data to transfer over a serial bus into more circuitry, at the very least.)

[1] http://www.ovt.com/products/sensor.php?id=134

(Sony also make the sensors for Apple, apparently.)


In general sensors output raw and the ISP does the image pipeline including demosaicing, so a phone could support raw with different ISP firmware. (Reportedly Nokia wrote custom ISP firmware for their fancy camera, for example.) I wouldn't be surprised if one of the upcoming Samsung or Sony phone-cameras supports raw.


Wait, how does this work with the Apple frameworks? I assume you can't go faster than what Apple gives you. If you were to discard every photo, how fast could you theoretically go?


They simply are not using what Apple gives them. They wrote their own JPEG encoder which side-steps any limitations that Apple's own implementation has.


jpap, this is the best HN thread I've seen in a while. I never comment, but I'm compelled to now, because it's not often I see a hack this mesmerizing and exciting. For a moment, I almost wanted to drop everything and dive into JPEG myself, something I don't think I've felt since reading about John Carmack and his game engine hacks. Even though I understand <10% of the details being discussed, I'm compelled to learn more. Thanks, jpap. :)


I'm so very happy to read your post! :-)

I'm not usually one to post publicly either, but with practice I'm finding it comes more naturally.

I do hope you dive in---the devilish details of image signal processing are really interesting.


Please tell me there's an iPad version in the pipeline. This is the second non-free app I have on my iPad - it doesn't disappoint at all. Great work!


I have it in mind at a lower priority. To be honest, my main impetus is for better discoverability on the App Store from an iPad device.

They really like to hide those iPhone apps! ;-)

It will be nice to play with the interactive living photos full-screen, though iOS 7 makes iPhone apps look amazing on the iPads in any case.


I do not agree with the use of the word "scientific" in this context. Especially since it appears to be a shameless plug for a product.


This app is a lot of fun! Thanks for making it.


Any papers on the subject? I'd love to dig deep into some of the technical details behind this.


I'd love to disclose the implementation details, and even release it on GitHub, but unfortunately I have to keep it as a trade secret for obvious reasons.

Perhaps one day! :D It was a tonne of work that I'd love for fellow engineers to take a look at. I learned a massive amount from reading the likes of libjpeg-turbo and other OSS implementations, though none of them use the same DCT as the one I developed for SnappyCam. (They weren't a good fit for the ARM NEON ISA.)


WHY DONT YOU CURE CANCER INSTEAD


Actually, cancer research can benefit (admittedly by a long shot, but still) from these kinds of improvements in image processing. Don't forget that medical imaging - and thereby, indirectly, the recognition/detection of cancer - is one of the first steps in curing said disease.


You are disrupting my narrative with facts.


Very well said.

I suspect Clarity on Camera+ is some form of Contrast Limited Adaptive Histogram Equalization (CLAHE).

CLAHE came about from digital image cleanup of medical scans for human analysis.


Now, a CLAHE implementation on top of SnappyCam .....


I considered doing this over a year ago. I was really intrigued by the algorithm, but decided it was a distraction from getting the basic app solid first.

I'd love to revisit it. It really works wonders. I've played with the CLAHE implementation in Fiji/ImageJ and while it can produce really good results, it does require some tuning.

I really admire the Camera+ guys for creating an auto-tuning algorithm that is quite similar. (I'm unsure if they're using CLAHE or a variation.)
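
For anyone who hasn't met it: CLAHE is plain histogram equalization done per tile, with the histogram clipped to limit contrast amplification, and interpolation between tiles to hide the seams. A minimal sketch of the plain, global version on an 8-bit grayscale buffer (illustrative only, and certainly not what Camera+ ships) looks like this:

    /* Global histogram equalization: remap levels so the cumulative
     * distribution becomes (roughly) linear. CLAHE applies the same idea
     * per tile with a clipped histogram. */
    #include <stdint.h>
    #include <stddef.h>

    void hist_equalize(uint8_t *px, size_t n)
    {
        if (n == 0) return;

        size_t hist[256] = {0};
        for (size_t i = 0; i < n; i++)
            hist[px[i]]++;

        /* cumulative distribution -> lookup table from old to new level */
        uint8_t lut[256];
        size_t cdf = 0;
        for (int v = 0; v < 256; v++) {
            cdf += hist[v];
            lut[v] = (uint8_t)((cdf * 255) / n);
        }

        for (size_t i = 0; i < n; i++)
            px[i] = lut[px[i]];
    }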


What's the diff between this and video shooting? Isn't that 25fps?


Video gives you still images with one quarter the pixels (1920 x 1080 = 2.07 megapixels vs. 8 megapixels) and presumably more heavily compressed.


Size and quality. We've touched on it in a few other threads here if you'd like some further background.


This may be a really cool technical achievement, but the title is misleading. It is not scientific - that is, the scientific method was not applied to increase our understanding of the Universe. No, it is just a really cool testament to how cool engineering really is.


This is a great app. What you need to do is market it to skateboarders.


Agreed! I've got a good friend who is a skater and he reminds me of this often.

Any good skating sites I might want to contact?


Good work, jpap. I wish this were posted during the day though ;)


This kind of functionality is standard on OMAP4 devices.


ARM actually publishes reference code under the library name "OpenMAX" for their mobile processors.

I've read ARM's source code and found that they too use the AAN algorithm for the DCT. (They provide a tonne of code for other multimedia-related stuff too.)

I learned a lot from their code, even though my implementation is completely different and original.

I would also dare to say that my asm source is maintainable. I had a very hard time understanding their code, as it wasn't very well documented or laid out... but it was nevertheless a valuable learning tool.


jpap, if you ever find yourself stranded in Canberra, I'll buy you a beer (a proper Australian one).


Cheers mate, I'm from Melbourne and might just take you up on that! (I live here in SFO at the moment, but try to get back as much as I can.)


I'm over in the States in a month-ish but not over your way (Seattle, NYC).


jpap,

Could you take the 'trimmed' section and create a looping GIF from that? (Can I do that already?)


The biggest problem with GIFs would be the colorspace: GIFs are usually limited to 256 colors. Color quantization from a JPEG to GIF would kill the photo.

The whole "living image" sounds nice. If it was possible to create a format, say "AJPEG", out of it, it would be awesome.


It's now on the list. Had a few requests for it, and agreed, it'd be cool. :)


Great. I'm not actually sure how the iPhone photo library handles GIFs... but I'd much rather be able to choose 'export as GIF' & 'export this frame' than save all 100 photos.

great app man, thanks


The Camera Roll will apparently host them, but not show them as animated.

Unless you roll your own viewer, many devs suggest using an embedded UIWebView to animate it.

Otherwise it kinda gets treated as pass-through for most of iOS. The Messages app apparently animates them nicely, and for some people that's a real motivation to include the feature. (So they can send animated GIFs to their friends.)

Glad you like the app! :D


What's the breakthrough? My GoPro can take 120 photos per second.

Furthermore, what happens if you point this at a device that can affect each pixel on the phone's image sensor 20 times a second? Is all the information preserved? If so, this is an interesting hardware hack. If not, this is an interesting shell game. But I don't see how it's a scientific breakthrough.

(It sure is good for sales when TechCrunch prints your press release verbatim, though!)


A GoPro can do 720p (1 megapixel) at 120 FPS; this app can do 8 MP at 20 FPS. Also, it's using JPEG instead of a video codec, so when you want to pick out a single frame it's already in the format you want. Likewise there are no motion artifacts because it doesn't use a video codec, so every frame should be of equal quality.
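
Rough pixel-throughput arithmetic, for what it's worth: the GoPro is moving about 1280 x 720 x 120 ≈ 111 megapixels/sec, while 8 megapixels x 20 shots/sec is about 160 megapixels/sec, with each of those frames being a standalone JPEG rather than a predicted video frame.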


The breakthrough is in some very fast signal processing to allow the images to be taken on an iPhone, not a GoPro. Both are impressive.


I agree that this is likely not a "scientific breakthrough". Every time I write some really cool software, I don't claim a new scientific breakthrough. I guess I could, but I'm pretty sure my coworkers would grow tired of my antics. Perhaps some graduates with CS degrees actually believe they are scientists.



