1. Introduction
The Responsive Images community has been trying for some time to come up with a syntax for specifying responsive images that satisfies multiple use-cases. Their most popular attempt so far, the <picture> element, only hits 2 of the 3 major use-cases, and has certain aspects that implementors seem to be rather unhappy with.
This document defines a new attempt at the problem, sketched out in collaboration between me and John Mellor, which solves all three of the major use-cases, while avoiding implementor concerns, and hopefully being very easily usable.
2. The Problem Statement
To make things somewhat clearer, I’ll outline here the three use-cases that the responsive images community has been attempting to address.
- Resolution-based discrimination - providing the same image in multiple resolutions, so that high-res devices can get the prettiest picture, while low-res or low-bandwidth devices can avoid wasting time and bandwidth with overly-large files.
- Art-direction discrimination - as the screen size changes, so might your page’s design. A large, detailed image might be appropriate for the desktop design, but just scaling it down for a small phone’s screen results in a tiny, cluttered image. Instead, you may want to provide alternate images, cropped differently to better fit the small screen and still show the most important parts of the image at an appropriate size.
- Variable-size resolution discrimination - sometimes the image you’re serving is variable-sized: either based on the size of the viewport or just different sizes at different breakpoints. A combination of the solutions to the above two problems can address this, but very verbosely - you have to do math to figure out breakpoints (not intrinsically related to your site’s breakpoints) and repeat urls multiple times. For example, a 1000px wide image might be appropriate as a 1x image when used to fill the background of the page on a desktop screen, but it’s far too large to use for the same purpose on a 320px wide screen. On a screen that small, it’s more like a 2x or 3x image.
In addition, there’s a strong requirement that the solution be friendly to the browser’s preloader, which scans the document for urls to start downloading quickly, as the connection delay is a huge factor in the feeling of "slow" sites on mobile devices, and so starting the connection as early as possible is a big win in perceived performance. This limits what kind of information you can rely on, as the preloader only has access to a small amount of data from the page.
3. The Syntax Proposal
Add a set of attributes to <img>, named src-1, src-2, etc. Collectively, these are the src-N attributes. When loading an image, these attributes are consulted first, in numerical order. If none of them are valid or match, then the plain src attribute is used to load the image.
Note: It’s possible to integrate srcset into this if necessary, but it would be nice to avoid doing that. This completely replaces the srcset functionality.
The grammar for the attributes is:
<src-n-attribute> = <media-query>? [ <x-based-urls> | <viewport-urls> ]
<x-based-urls> = [ <url> <resolution>? ]#
<viewport-urls> = <size-viewport-list> ; <size-based-urls>
<size-viewport-list> = <image-size> [ ( <viewport-size> ) <image-size> ]*
<image-size> = <length> | <percentage>
<viewport-size> = <length>
<size-based-urls> = [ <url> <integer> ]#
The above grammar must be interpreted per the grammar definition in [CSS3VAL]. For the purposes of the above grammar, the <url> production is simply any sequence of non-whitespace characters that does not end in a comma or semicolon. All other terminal productions are defined as per CSS.
The terms and use of this grammar are explained further in the following sections.
Note: Any explanations of how the attributes and their values are processed that appear in this section are non-normative. The normative definition of the processing model is in the "Processing" section.
3.1. Art Direction
To solve the basic art-direction use-case, the src-N attributes allow a media query to be provided at the beginning of their value.
I want just the second clause of the media_query
grammar in Media Queries 4,
where you have "(foo:bar) and (baz:qux)" and that’s it.
I should go fix the grammar section there to expose this more cleanly.
Each valid src-N attribute is checked in numerical order, and the first one to have a matching media query (or no media query at all) is chosen as the source of candidate urls for this <img>.
These could be referenced like:
<img src-1="(max-width: 400px) pic-small.jpg" src-2="(max-width: 1000px) pic-medium.jpg" src="pic-large.jpg" alt="Obama talking to a soldier in hospital scrubs.">
Note: Putting the final url into a src attribute like that isn’t required; in fact, if using any of the more advanced pieces of this feature, like multiple resolutions, it would have to be in a src-N attribute (using src-3 would be most appropriate here). Still, having a final fallback in src is a good idea, as it means that down-level browsers will still be able to correctly download the image.
Note: This feature is intended to be used with distinct images. Look at the "Viewport" subsection about choosing among multiple copies of the same image based on viewport size.
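The selection rule described in this section can be sketched in Python. This is only an illustration: `pick_source` is a hypothetical name, and the matcher understands nothing beyond a single `(max-width: Npx)` feature, which is far less than real Media Queries. Any value that does not start with such a query is treated as having no media query.

```python
import re

# Hypothetical, simplified matcher: only understands "(max-width: <N>px)".
_MAX_WIDTH = re.compile(r"^\(\s*max-width\s*:\s*(\d+(?:\.\d+)?)px\s*\)\s*(.*)$")


def pick_source(src_n, fallback_src, viewport_width_px):
    """Return the url list of the first src-N value whose media query matches.

    src_n maps an attribute index (1, 2, ...) to that attribute's value.
    """
    for index in sorted(src_n):  # attributes are checked in numerical order
        value = src_n[index].strip()
        m = _MAX_WIDTH.match(value)
        if m is None:
            # No (recognized) media query: this attribute always matches.
            return value
        if viewport_width_px <= float(m.group(1)):
            return m.group(2)
    # No src-N attribute matched: fall back to the plain src attribute.
    return fallback_src


attrs = {
    1: "(max-width: 400px) pic-small.jpg",
    2: "(max-width: 1000px) pic-medium.jpg",
}
```

With these attributes, a 320px viewport selects pic-small.jpg, an 800px viewport selects pic-medium.jpg, and anything wider falls through to the src value.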
3.2. Resolution
To solve the resolution use-case, the src-N attributes allow multiple urls to be provided, each with an indicator of their resolution (the ratio of image pixels to CSS pixels).
Instead of a single url, simply provide a comma-separated list of urls/resolution pairs, where each pair is a url, followed by whitespace, followed by a CSS <resolution> value. A url provided without a <resolution> is assumed to be at 1x resolution.
Assume that I’ve already edited Images 4 appropriately so that x is a valid resolution unit, equivalent to dppx.
<img src-1="pic.png, picHigh.png 2x, picLow.png .5x">
The choice of which image to load is left to the user agent, based on its knowledge of the screen’s pixel density, the device’s bandwidth, and whatever other factors it deems relevant to the decision.
The intrinsic size of the chosen image is equal to the actual number of image pixels in each dimension, divided by the chosen resolution multiplier, in CSS px units. For example, if "pic1.png 2x" is chosen, and is 100 pixels wide, its intrinsic size is 50px.
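As a rough illustration of the two rules above (a missing resolution means 1x, and intrinsic size is image pixels divided by the resolution), here is a minimal Python sketch. The function names are hypothetical, only the `x` unit is handled, and splitting on commas is a simplification of the <url> rule in the grammar.

```python
def parse_x_based_urls(value):
    """Split "url [resolution], ..." into (url, resolution) pairs."""
    candidates = []
    for item in value.split(","):
        parts = item.split()
        if not parts:
            continue  # ignore empty entries such as a trailing comma
        url = parts[0]
        # A missing resolution means 1x; "2x" and ".5x" become 2.0 and 0.5.
        resolution = float(parts[1].rstrip("x")) if len(parts) > 1 else 1.0
        candidates.append((url, resolution))
    return candidates


def intrinsic_width(image_pixels, resolution):
    """Intrinsic width in CSS px: image pixels divided by the resolution."""
    return image_pixels / resolution


candidates = parse_x_based_urls("pic.png, picHigh.png 2x, picLow.png .5x")
```

Here `intrinsic_width(100, 2.0)` gives 50.0, matching the 100-pixel-wide 2x image described above.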
3.3. Variable-Sized Images
The previous section on resolution discrimination had a hidden assumption which may not always be true: that the image being presented is meant to be a single, static size. That is, regardless of the size of the screen, the image will always be, say, 400px wide.
This assumption is not always true. There are two major reasons why this may be so:
- The image’s size may be specified as a percentage of the viewport’s width. For example, it may be screen-filling (100%), or it may fill a column in a two-column layout (50%).
- The image’s size may vary with your page’s breakpoints, as it gets placed in different layouts. For example, you may have a single-column layout on small screens, having the image fill that column (100%), but switch to a grid with fixed-size items in it on larger screens (400px).
Either of these issues can be addressed with Media Queries, but it gets complicated when resolution discrimination is mixed in - if you’re displaying the same image at a variety of sizes, a particular image file may be appropriate as a 1x image on large screens, but would also serve perfectly well as a 2x image on smaller screens. Dealing with this requires you to repeat urls multiple times, and can require some non-trivial math. (See the next example for a simple demonstration of the code bloat.) Further, the code so produced is not actually forward-compatible, either - it’ll act badly when even higher-density screens arrive, unless you further bloat the syntax by pre-emptively writing out higher-density versions.
To avoid these issues, this specification defines a shortcut syntax to address the case of a variable-sized image.
For the first case, of an image that is sized as a fraction of the viewport, simply provide the target <img> size as a percentage, followed by a semicolon, followed by a comma-separated list of image urls and the widths of the images in image pixels. Using this information, the browser can determine how wide the <img> will end up being, and convert the image widths into effective densities.
For example, if the author specified <img src-1="100%; url1 400, url2 800">, and the viewport’s width was 320px, this is equivalent to specifying <img src-1="url1 1.25x, url2 2.5x">. On the other hand, if the viewport’s width was 800px, it would be equivalent to specifying <img src-1="url1 .5x, url2 1x">.
Regardless of the viewport’s size, the browser will understand which url is appropriate to download without you having to do any of the math yourself.
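The conversion from image widths to effective densities is plain division. The following Python sketch shows it for a percentage-sized image; `effective_densities` is a hypothetical helper name, and the input is assumed to be already parsed into (url, width) pairs.

```python
def effective_densities(percentage, viewport_width_px, sized_urls):
    """Convert (url, width-in-image-pixels) pairs into (url, density) pairs."""
    # Width the <img> will occupy, in CSS px.
    img_width = viewport_width_px * percentage / 100
    # Density is image pixels per CSS px of rendered width.
    return [(url, width / img_width) for url, width in sized_urls]


urls = [("url1", 400), ("url2", 800)]
```

For a 100% image, `effective_densities(100, 320, urls)` yields densities of 1.25 and 2.5, while a viewport of 800px yields 0.5 and 1.0.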
<img src-1="100%; pic1.png 160, pic2.png 320, pic3.png 640, pic4.png 1280, pic5.png 2560">
With this one declaration, a high-res phone 320px wide can correctly choose to download pic3.png (an effective 2x resolution), while a large desktop screen with 96dpi will correctly choose to download pic4.png (approximately a 1x resolution). Anything at higher, lower, or in-between sizes and resolutions will also be appropriately catered for, without the author having to explicitly figure out reasonable breakpoints and categorize each image appropriately for each.
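The proposal leaves the final choice to the user agent, so the following Python sketch is just one plausible heuristic, not something the proposal mandates: pick the lowest-density candidate that still meets the device's pixel ratio, or the densest one available if none does. The function names and the 1280px desktop viewport are assumptions made for illustration.

```python
def densities_for_full_width(viewport_width_px, sized_urls):
    """Effective densities for an image declared as 100% of the viewport."""
    return [(url, width / viewport_width_px) for url, width in sized_urls]


def choose_candidate(candidates, device_pixel_ratio):
    """Hypothetical UA heuristic over (url, density) pairs."""
    sufficient = [c for c in candidates if c[1] >= device_pixel_ratio]
    if sufficient:
        # Smallest file that is still sharp enough for this screen.
        return min(sufficient, key=lambda c: c[1])[0]
    # Nothing is sharp enough: take the densest image on offer.
    return max(candidates, key=lambda c: c[1])[0]


pics = [("pic1.png", 160), ("pic2.png", 320), ("pic3.png", 640),
        ("pic4.png", 1280), ("pic5.png", 2560)]

phone = choose_candidate(densities_for_full_width(320, pics), 2)
desktop = choose_candidate(densities_for_full_width(1280, pics), 1)
```

Under this heuristic the 2x phone ends up with pic3.png and the 1x desktop with pic4.png, in line with the description above. A real browser may also weigh bandwidth and other factors.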
Using just Media Queries, the markup would instead look something like:
<img src-1="(max-width: 400px) pic1.png .5x, pic2.png 1x, pic3.png 2x" src-2="(max-width: 800px) pic2.png .5x, pic3.png 1x, pic4.png 2x" src-3="(max-width: 1600px) pic3.png .5x, pic4.png 1x, pic5.png 2x" src-4="pic4.png .5x, pic5.png 1x">
This example is obviously substantially more verbose, and also less powerful. For example, when screens reach 3x or 4x density, those devices will still be stuck downloading 2x resources, even though a 3x or 4x version exists for most screen sizes, unless the author comes back and updates every <img> element in their page.
Further, the breakpoints chosen above were simply guessed at, and are likely not optimal. Doing the math to find the optimal breakpoints isn’t hard, but is definitely non-trivial.
For the second case, when the size of the image varies based on breakpoints in your layout, the syntax is slightly more complicated. The first and second parts are still separated by a semicolon, and the second part is still a list of urls and image sizes.
The syntax of the first part, though, is slightly expanded. Rather than being simply an image size, it’s an alternating list of image sizes and viewport breakpoints, with the breakpoints in parentheses to help separate them visually. The breakpoints must be in ascending order, as the image size is chosen by finding which two breakpoints the viewport’s size sits between, and selecting the image size between those two.
Assuming that the same image is supposed to be used at all of these layouts (that is, you aren’t doing art-direction cropping to optimize the display of the image for a given size), then all of these cases can be addressed by a handful of images at various sizes, and the following code:
<img src-1="100% (30em) 50% (50em) calc(33% - 100px); pic100.png 100, pic200.png 200, pic400.png 400, pic800.png 800, pic1600.png 1600, pic3200.png 3200">
The first part of this attribute sets up the layout breakpoints at 30em and 50em, and declares the image sizes between and around these breakpoints to be 100%, 50%, or calc(33% - 100px).
The six images automatically cover every reasonable possibility. For small screens (phone size, or even smaller, like watches), anything from the 100 pixel wide image to the 800 pixel wide image may be downloaded, depending on screen size and density. For medium and large screens, anything from the 400 pixel wide image and up may be chosen. The author doesn’t have to do any math or complex figuring, just provide the image in enough sizes to cover everything they believe reasonable.
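To make the breakpoint lookup concrete, here is a small Python sketch of how the image size might be picked for the example above. It assumes the size list has already been parsed: each size is modelled as a function of the viewport width, and `em` is taken to be 16px, which is an assumption about the user's default font size. `image_width` is a hypothetical name.

```python
EM_PX = 16  # assumption: the user's default font size is 16px


def image_width(sizes, breakpoints_px, viewport_px):
    """Pick the image width for a viewport.

    sizes has one more entry than breakpoints_px, and breakpoints ascend.
    Each size maps a viewport width in px to an image width in px.
    """
    for size, breakpoint in zip(sizes, breakpoints_px):
        if viewport_px < breakpoint:
            return size(viewport_px)
    # Wider than every breakpoint: use the final, lone size.
    return sizes[-1](viewport_px)


sizes = [
    lambda vw: vw,               # 100%
    lambda vw: vw * 0.5,         # 50%
    lambda vw: vw * 0.33 - 100,  # calc(33% - 100px)
]
breakpoints = [30 * EM_PX, 50 * EM_PX]  # (30em) and (50em)
```

With these inputs a 320px viewport gives a 320px image, a 600px viewport gives 300px, and a 1000px viewport gives roughly 230px. Dividing each file's pixel width by that result gives its effective density.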
Again, doing the same thing just with Media Queries is much more verbose.
Note: Notice that the full set of CSS <length> values is actually available for image sizes, including things like calc(). Using this, you can get as close to the precise size that the <img> element will be as you wish, though just getting “close enough” as I did in these examples is more than sufficient in most cases. Similarly, viewport sizes can be specified with the full set of <length> values, which are interpreted in the same way that they would be in a Media Query like min-width. For example, em units are interpreted relative to the user’s default font size, etc.
Note: Also, all of the examples given here size the various images as powers of 2, doubling in size as they get larger. This is merely for convenience, as it’s easy to downsample an image by powers of 2, but is not a limitation - feel free to provide images at any size you desire. This will become more important in time, as 3x screens come into use and you wish to give them a well-targeted image to download, rather than having to decide between the 2x and 4x versions.
The intrinsic width of the image is equal to the image size chosen from the provided list. If the image has an intrinsic ratio, the intrinsic height of the image is its intrinsic width multiplied by the ratio. Otherwise, it has no intrinsic height.
4. Processing Model
This section describes the processing model for images.
As each <img> element is encountered on the page, run the following steps for it:
- Let candidates be the result of obtaining the image candidates from the element.
- If candidates is empty, abort this algorithm.
  Is this the right place to say "fire a load error" or whatever?
- Update the image data, choose the right candidate, etc, etc, hook up to the right terms in HTML. The actual choice is UA-specific.
4.1. Obtaining the Image Candidates
This section describes how to obtain the image candidates from an HTML <img> element. The input to this algorithm is an HTML <img> element. The output of this algorithm is a (possibly empty) list of image candidates, where each candidate is a pair composed of a url and a resolution.
- Let candidate attributes be the list of attributes on the element that satisfy the following conditions:
  - The attribute name is at least 5 characters long.
  - The first four characters of the attribute name are an ASCII case-insensitive match for "src-".
  - The fifth character of the attribute name is a non-zero digit (1-9).
  - The remaining characters of the attribute name are digits (0-9).
  - The attribute value matches the <src-n-attribute> production.
- If candidate attributes is empty, return the result of obtaining a candidate from src from the element and abort this algorithm.
- For each candidate attribute, let its index be the result of removing the first four characters from the attribute name, and interpreting the remaining characters as a base-10 number.
- Sort the candidate attributes by their index in ascending order.
- For each candidate attribute, in sorted order:
  - If the attribute’s value contains a media query, evaluate that query. If it returns true, let winning value be the value of this attribute following the media query and stop processing candidate attributes. Otherwise, if it returns false, continue with the next candidate attribute.
  - Otherwise, let winning value be this attribute’s value and stop processing candidate attributes.
- If there is no winning value, return the result of obtaining a candidate from src from the element and abort this algorithm.
- Let image candidates be an initially empty list.
- If the winning value conforms to the <x-based-urls> production, then for each set of values between commas:
  - Let candidate be an image candidate with its url being the <url> from the current set of values.
  - If the current set of values contains a <resolution>, let candidate’s resolution be that resolution.
  - Otherwise, let candidate’s resolution be 1x.
  - Append candidate to image candidates.
  Return image candidates, and abort this algorithm.
- Otherwise, the winning value conforms to the <viewport-urls> production.
- Let viewport data be the portion of winning value that conforms to the <size-viewport-list> production. Let unprocessed candidates be the portion of the winning value that conforms to the <size-based-urls> production.
- Divide viewport data into adjacent pairs of values, and a final lone value. For each pair of values in viewport data, in order:
  - Let candidate viewport width be the result of interpreting the second value as a <length>, using the same rules as a <length> in a min-width media feature.
  - If the viewport’s width is less than candidate viewport width, then:
    - If the first value is a <length>, let winning image width be that length.
    - Otherwise, the first value is a <percentage>. Let winning image width be a length equal to the given percentage of the viewport’s width.
    Abort this sub-algorithm.
  If no pair produced a winning image width, determine winning image width from the final lone value in the same way.
- For each set of values between commas of unprocessed candidates:
  - Let candidate be an image candidate with its url being the <url> from the current set of values.
  - Let candidate’s resolution be the result of dividing the <integer> from the current set of values by the winning image width, as an x unit.
  - Append candidate to image candidates.
  Return image candidates.
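The attribute-name conditions and the sort by index at the start of this algorithm can be sketched in Python as follows. This is only an illustration under stated assumptions: `ordered_candidate_attributes` is a hypothetical name, attributes are modelled as a plain dict, and validation of each value against the <src-n-attribute> production is left out.

```python
import re

# "src-" (ASCII case-insensitive), a non-zero digit, then any further digits.
_SRC_N = re.compile(r"src-([1-9][0-9]*)", re.IGNORECASE | re.ASCII)


def ordered_candidate_attributes(attributes):
    """Return [(index, value)] for src-N attributes, sorted by index.

    attributes maps attribute names to attribute values.
    """
    found = []
    for name, value in attributes.items():
        m = _SRC_N.fullmatch(name)
        if m:
            # The index is the part after "src-", read as a base-10 number.
            found.append((int(m.group(1)), value))
    return sorted(found, key=lambda pair: pair[0])


example = {
    "src": "fallback.png",
    "src-10": "c",
    "SRC-2": "b",
    "src-0": "ignored",
    "src-01": "ignored",
    "src-1": "a",
    "srcset": "ignored",
}
```

Sorting numerically rather than by name matters here: src-10 comes after src-2, which a plain string sort would get wrong.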
To obtain a candidate from src from an element, follow these steps:
- If the element has a src attribute, return a list consisting of a single image candidate, where that candidate’s url is the value of the src attribute and its resolution is 1x.
- Otherwise, return an empty list.
In the event that this proposal "wins", but browsers have already shipped "basic" srcset (just support for resolution discrimination) and uptake is high enough that they can’t take it back, then we can easily integrate srcset into this fallback behavior. In other words, srcset doesn’t hurt this proposal (though it would be ideal if it didn’t exist alongside).