The goal was to support not only high write volumes (>10k images/s) but also fast lookup of similar images (around 1-2 s across more than 1B images). Though similar paid services and free image hashing libraries exist, this may be the first complete free open-source solution. Available at: https://github.com/ascribe/image-match
image-match started as an internal project. We needed a way, given some target image, to find similar images downloaded by our web crawler (think TinEye).
So we needed not only fast, accurate lookup across millions or even billions of images, but also very high-volume insertion: around 10k images per second.
In my talk, I will cover: