A few weeks ago, we released JPEC, a lightweight, minimalistic JPEG encoder written in C, which we use in our SDK to compress frames. There are already so many libraries dedicated to JPEG encoding and decoding that even finding an available name for it was not an easy task… so why bother implementing a new one?
The answer fits in one word: simplicity. We just didn’t need 90% of what most of these libraries offer. No decoding, no color, no ‘expert’ options such as progressive encoding or variable block sizes: we wanted to be able to compress grayscale images, no more, no less. Sure, we could have used one of them anyway, as they all do their job perfectly well… but we just didn’t want to burden our SDK with unused functions. The other option was to use platform-specific dynamic libraries such as Apple’s Image I/O framework, but they are, well… platform-specific. So we decided to implement our own encoder, tailored to this single use case. And yes, it was worth the effort:
- the result fits in about 600 LOC,
- compiled for the ARMv7 architecture using LLVM-GCC, it weighs no more than 12 kB,
- it compresses a 360×480 image on an iPhone 4 in 35 ms, versus 22 ms for Apple’s Image I/O framework… and can probably still be optimized using the NEON instruction set.
But let’s go back in time a little, and look back at the sneaky little traps that made this task so much fun.
JPEG compression can be split into two main steps: a more ‘mathematical’ part that turns an image into a series of numbers, and an ‘encoding’ part that stores these numbers as efficiently as possible in a file. The mathematical part was far from being the hardest to implement: following Wikipedia was nearly enough, and it even pointed to some quite efficient tricks, such as using Chen’s algorithm to compute a Discrete Cosine Transform with a reduced number of operations. The difficulty lay in the encoding part.
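To give an idea of what the ‘mathematical’ part computes, here is a minimal sketch of the 2-D DCT applied to one 8×8 grayscale block. It is deliberately the naive textbook formula, not Chen’s fast algorithm and not JPEC’s actual code; the names `cos_pi16` and `dct8x8_naive` are made up for this illustration, and the cosine lookup table is just one way to avoid depending on the math library.

```c
/* Illustrative sketch only: naive 8x8 DCT-II, not Chen's algorithm. */

/* Returns cos(k * pi / 16) for k >= 0, using a 9-entry table and symmetry. */
static float cos_pi16(int k) {
    static const float c[9] = {
        1.0f, 0.98078528f, 0.92387953f, 0.83146961f, 0.70710678f,
        0.55557023f, 0.38268343f, 0.19509032f, 0.0f
    };
    k %= 32;                            /* period 2*pi = 32 steps of pi/16 */
    if (k > 16) k = 32 - k;             /* cos(2*pi - t) =  cos(t) */
    return (k > 8) ? -c[16 - k] : c[k]; /* cos(pi - t)   = -cos(t) */
}

/* in:  64 grayscale samples (0..255), row-major.
 * out: 64 DCT coefficients, row-major; out[0] is the DC term. */
static void dct8x8_naive(const unsigned char in[64], float out[64]) {
    for (int v = 0; v < 8; v++) {
        for (int u = 0; u < 8; u++) {
            float sum = 0.0f;
            for (int y = 0; y < 8; y++) {
                for (int x = 0; x < 8; x++) {
                    /* Level shift: center the samples around zero. */
                    float s = (float)in[y * 8 + x] - 128.0f;
                    sum += s * cos_pi16((2 * x + 1) * u)
                             * cos_pi16((2 * y + 1) * v);
                }
            }
            float cu = (u == 0) ? 0.70710678f : 1.0f;
            float cv = (v == 0) ? 0.70710678f : 1.0f;
            out[v * 8 + u] = 0.25f * cu * cv * sum;
        }
    }
}
```

A flat block puts all of its energy in the DC coefficient `out[0]` and leaves the 63 others at (nearly) zero, which is why smooth image regions compress so well once the coefficients are quantized.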
Strangely, that’s where Wikipedia (as well as a good part of the other resources on JPEG available on the web) becomes far blurrier on how to proceed. What makes this step tricky is that, to minimize the size of JPEG files, data is stored using Huffman encoding bit after bit, regardless of the usual byte structure. It looks like a good idea (and it is, in the end, a very powerful idea), but it makes debugging your output JPEG files absolutely awful. One piece of data may take 3 bits and the next one 17, etc… which makes it nearly impossible to track down the origin of your bugs.
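To make the ‘bit after bit’ idea concrete, here is a minimal sketch of a bit writer of the kind a JPEG encoder needs. It is not JPEC’s actual code: the `bitwriter_t` type and the `bw_*` functions are hypothetical names, and the 16-bits-per-call limit is an assumption of this sketch (longer values would simply be written in two calls). The two JPEG-specific details it shows are that every `0xFF` byte in the entropy-coded data is followed by a stuffed `0x00`, and that the last partial byte is padded with 1 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit writer, for illustration only. */
typedef struct {
    uint8_t *buf;   /* output buffer */
    size_t   cap;   /* capacity of buf, in bytes */
    size_t   len;   /* number of bytes written so far */
    uint32_t acc;   /* pending bits, right-aligned */
    int      nacc;  /* number of pending bits (0..7 between calls) */
} bitwriter_t;

static void bw_init(bitwriter_t *bw, uint8_t *buf, size_t cap) {
    bw->buf = buf;
    bw->cap = cap;
    bw->len = 0;
    bw->acc = 0;
    bw->nacc = 0;
}

/* Emits one full byte, followed by a stuffed 0x00 if the byte is 0xFF. */
static void bw_put_byte(bitwriter_t *bw, uint8_t b) {
    assert(bw->len < bw->cap);
    bw->buf[bw->len++] = b;
    if (b == 0xFF) {
        assert(bw->len < bw->cap);
        bw->buf[bw->len++] = 0x00;
    }
}

/* Appends the nbits lowest bits of `bits`, most significant bit first. */
static void bw_write(bitwriter_t *bw, uint32_t bits, int nbits) {
    assert(nbits >= 0 && nbits <= 16);
    bits &= (1u << nbits) - 1u;
    bw->acc = (bw->acc << nbits) | bits;
    bw->nacc += nbits;
    while (bw->nacc >= 8) {
        bw->nacc -= 8;
        bw_put_byte(bw, (uint8_t)((bw->acc >> bw->nacc) & 0xFFu));
    }
    bw->acc &= (1u << bw->nacc) - 1u;  /* keep only the pending bits */
}

/* Pads the last partial byte with 1 bits and emits it. */
static void bw_flush(bitwriter_t *bw) {
    if (bw->nacc > 0) {
        int pad = 8 - bw->nacc;
        bw_write(bw, (1u << pad) - 1u, pad);
    }
}
```

Writing the 3-bit value `101` followed by the 5-bit value `11111` produces the single byte `0xBF`: nothing in the output tells you where one value stops and the next one starts, which is exactly what makes a hex editor such a painful debugging tool here.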
And then, after a few hours (not among the most pleasant of my life) of bit-by-bit debugging using a hexadecimal editor, I discovered JPEGsnoop. This small tool, whose biggest flaw is probably that it is only available for Windows, did in a handful of seconds what had been giving me headaches for hours. It analyzes the structure of your JPEG file and gives you some great hints on where and why it ended up corrupted. It simply saved my day, and in a matter of hours we had a functional library.
From ‘functional’ to ‘releasable’, there remained one last difficulty: clearly separating the different steps to make all of this understandable… and maintainable. Once the maths, the encoding and the actual bit-by-bit file writing were no longer a great soup of mixed functions, we had our minimalistic encoder. And now you can have it too, you lucky ones!