Author: Andrew Sumin
CTO of mail and portal services at Mail.ru.
Twitter: Follow Andrew Sumin on Twitter
We use ad-blockers as well, you know. We gotta keep those servers running though. Did you know that we publish useful books and run friendly conferences — crafted for pros like yourself? E.g. upcoming SmashingConf San Francisco, dedicated to smart front-end techniques and design patterns.
When the Russian ruble's exchange rate slumped two years ago, it drove us to think of cutting hardware and hosting costs for the Mail.Ru email service. First, we had to take a look at what email consists of. Indexes and bodies account for only 15% of the storage size, whereas 85% is taken up by files. So, optimization of files (that is, attachments) is worth exploring in more detail.
At the time, we didn't have file deduplication in place, but we estimated that it could shrink the total storage size by 36%, because many users receive the same messages, such as price lists from online stores and newsletters from social networks that contain images and so on. In this article, I'll describe how we implemented a deduplication system under the guidance of PSIAlt.Read more...