Four years ago, Jason Grigsby asked a surprisingly difficult question: How do you pick responsive image breakpoints? A year later, he had an answer: Ideally, we’d set responsive image performance budgets to achieve “sensible jumps in file size.”
Cloudinary built a tool that implements this idea, and the response from the community was universal: “Great! Now, what else can it do?” Today, we have an answer: art direction!
Since its release earlier this year, the Responsive Image Breakpoints Generator has been turning high-resolution originals into responsive <img>s with sensible srcsets at the push of a button. Today, we’re launching version 2, which allows you to pair layout breakpoints with aspect ratios, and generate art-directed <picture> markup, with smart-cropped image resources to match.
Responsive images send different people different resources, each tailored to their particular context; a responsive image is an image that adapts. That adaptation can happen along a number of different axes. Most of the time, most developers only need adaptive resolution — we want to send high-resolution images to large viewports and/or high-density displays, and lower-resolution images to everybody else. Jason’s question about responsive image breakpoints concerns this sort of adaptation.
When we’re crafting images that adapt to various resolutions, we need to generate a range of different-sized resources. We need to pick a maximum resolution, a minimum resolution and (here’s the tricky bit) some sizes in between. The maximum and minimum can be figured out based on the page’s layout and some reasonable assumptions about devices. But when developers began implementing responsive images, it wasn’t at all clear how to size the in-betweens. Some people picked a fixed-step size between image widths:
Others picked a fixed number of steps and used it for every range:
Some people picked common display widths:
At the time, because I was lazy and didn’t like managing many resources, I favored doubling:
All of these strategies are essentially arbitrary. Jason thought there had to be a better way. And eventually he realized that we shouldn’t be thinking about these steps in terms of pixels at all. We should be aiming for “sensible jumps in file size”; these steps should be defined in terms of bytes.
For example, let’s say we have the following two JPEGs:
The biggest reason we don’t want to send the 1200-pixel-wide resource to someone who only needs the small one isn’t the extra pixels; it’s the extra 296 KB of useless data. But different images compress differently; while a complex photograph like this might increase precipitously in byte size with every increase in pixel size, a simple logo might not add much weight at all. For instance, this 1000-pixel-wide PNG is only 8 KB larger than the 200-pixel-wide version.
Sadly, there haven’t been any readily useable tools to generate images at target byte sizes. And, ideally, you’d want something that could generate whole ranges of responsive image resources for you — not just one at a time. Cloudinary has built that tool!
So, we had built a solution to the breakpoints problem and, in the process, built a tool that made generating resolution-adaptable images easy. Upload a high-resolution original, and get back a fully responsive <img> with sensible breakpoints and the resources to back it up.
That basic workflow — upload an image, get back a responsive image — is appealing. We’d been focusing on the breakpoints problem, but when we released our solution, people were quick to ask, “What else can it do?”
Remember when I said that resolution-based adaptation is what most developers need, most of the time? Sometimes, it’s not enough. Sometimes, we want to adapt our images along an orthogonal axis: art direction.
Any time we alter our images visually to fit a different context, we’re “art directing.” A resolution-adaptable image will look identical everywhere — it only resizes. An art-directed image changes in visually noticeable ways. Most of the time, that means cropping, either to fit a new layout or to keep the most important bits of the image visible when it’s viewed at small physical sizes.
People asked us for automatic art direction — which is a hard problem! It requires knowing what the “most important” parts of an image are. Bits and bytes are easy enough to program around; computer vision and fuzzy notions of “importance” are something else entirely.
For instance, given this image:
A dumb algorithm might simply crop in on the center:
What you need is an algorithm that can somehow “see” the cat and intelligently crop in on it.
Here’s how it works: When you specify that you want to crop your image with “automatic gravity” (g_auto), the image is run through a series of tests, including edge-detection, face-detection and visual uniqueness. These different criteria are then all used to generate a heat map of the “most important” parts of the image.
A frame with the new proportions is then rolled over the image, possible crops are scored, and a winner is chosen. Here’s a visualization of the rolling frame algorithm (using a different source image):