The Web Content Accessibility Guidelines 1.0, published by the World Wide Web Consortium, states, as its first rule, that the web designer should:

Provide a text equivalent for every non-text element (e.g., via "alt", "longdesc", or in element content). This includes: images, graphical representations of text (including symbols), image map regions, animations (e.g., animated GIFs), applets and programmatic objects, ASCII art, frames, scripts, images used as list bullets, spacers, graphical buttons, sounds (played with or without user interaction), stand-alone audio files, audio tracks of video, and video. [Priority 1]

While this seems a relatively straightforward concept, practical application has proven to be quite difficult. Most authors are able to identify the non-text elements of a page, but many seem to have difficulty identifying what constitutes a text equivalent. In some cases, such as when a web page is created by importing a word processing file, it may not be clear that content is not text without careful examination.

The fundamental concept of equivalent-text is that a reader who does not have access to a non-text element of a document should have access to the information of the document. While the principles are specifically applied to documents on the World Wide Web, they apply to all electronic documents, and for those with specific learning disabilities, might apply to print documents as well. The reader may not have access to a non-text element for a number of reasons.  The reader may have a slow connection to the Internet, and have turned graphics off in the browser to allow faster document loading. The reader may be retrieving the document via a web-reader on the telephone, so that graphical elements are not available, or on a device that cannot generate sounds, so that narrations are not available.  Or, the reader may have sensory or cognitive deficits that make perception of visual or auditory elements impossible.

Text-equivalents are required to accommodate all of these conditions.  Electronic text has the compelling advantage that it can be delivered, without loss, to a variety of senses.  Text can be presented as print, which can be rendered in a variety of sizes, colors, and backgrounds.  This allows the text to be optimized for individuals with a variety of visual limitations.  The same text can be converted to speech, which can then be read faster or slower, louder or softer, and in a variety of synthesized voices.  Or text can be presented tactilely, as Braille, to accommodate the reader who has difficulty with both sight and sound (or who simply prefers Braille).  Because of the ability of text to be presented so that virtually anyone can benefit from it, it is the lingua franca of electronic information.

Classes of non-text elements

Non-text elements, which can be visual or auditory, are included in documents with a number of different purposes, and the text equivalents for each may be different.  In designing a text equivalent, it is important to know the purpose of the element, and its contribution to the document.

Decorative Elements

(Animation: small dog running back and forth, demonstrating the concept of eye-candy.)

Commonly called “eye candy” or “ear candy,” decorative elements are included primarily to produce a more pleasant reading experience.  This might include a decorative border, an “official seal” or a company logo.  In electronic documents, these can be static or dynamic, such as snowflakes drifting down a page.  “Ear candy” might include recorded sound effects or music, a spoken welcome message, or other sounds to attract the reader.

The key feature of decorative elements, for this discussion, is that they do not contribute information to the document, but are included to enhance the sensory experience. For example, the heading for this section includes an animation of a small, frenetic dog running back and forth on the page. It does not add to the information of the page, other than to serve as an example of the concept of eye-candy.

Document Information

For purposes of this discussion, we are considering electronic documents as a means of transferring information from the author to the reader.  This might be a technical treatise on materials hardness written for a professional journal, a family newsletter communicating important family events and achievements, or a diary entry written to a future self.  Regardless of the nature of the document, the author is attempting to communicate something to someone.

Non-text elements are included in documents because they allow some types of information to be conveyed more efficiently than text.  For example, a bride planning her wedding might want to inform other members of the wedding party of the colors being selected for the dresses.  While there are specialized descriptive methods that will allow this to be done through text (spoken or written), the inclusion of a fabric sample (either an image or actual cloth) is much more compact.  Manufacturing processes can be described in language, or demonstrated through animation. A heart pathology might best be conveyed by the sounds heard through a stethoscope and the EKG tracings.

One consideration in creating and linking appropriate text equivalents is the relationship of the non-text element to the document.

In-line information

Some types of non-text element are crucial to understanding the meaning of a sentence.  For example, in a document describing a mathematical relationship, it is common to include an equation, then several statements defining the terms of the equation.  These might say something like “φx is the force along the horizontal axis.” While “φx” is arguably text, it may not be interpreted when the document is rendered as spoken language.  A computer manual may include a statement such as “To undo the last action, hold down the ⌘ key, and press ‘Z.’” In all such cases, the meaning of the sentence is lost if the text equivalent is not provided immediately and seamlessly.

Supporting Information

Figure 1: Counselor Perception of Their Role in Providing RT and Perception of Their Capacity to Perform These Roles. Image from ProfResources/Publications/ Proceedings/2006/ Research/PP/Noll.php


Some non-text elements are compact ways of presenting complex information, and require significant mental effort to process, but are not part of the flow of the document.  These elements may expand on statements made in the flow of the narration, or may enhance the understanding of the reader.  A bar chart showing counselor perceptions of their role in providing rehabilitation technology might be included in a professional policy article.  Within the article, it might say, “Figure 1 provides a convenient comparison of the counselor's perception of their roles and the comfort they had in their competence to perform these roles.” Examination of the chart might show that the counselor's perception of responsibility is generally slightly higher than their perception of capacity, but the meaning of the sentence is not lost if the reader chooses to ignore the graph for the moment.  In magazines like National Geographic or Scientific American, examining the illustrations and their captions may provide all the information that the reader wants, and the text of the article may be ignored or postponed.  In either case, the information of the non-text elements enriches the document, but is not immediately essential to understanding the information presented.

Document meta-information

The meta-information of a document is information that describes or contributes to the document rather than the topic of the document. Some elements may be included to set the context for a document.  For example, many editorial or opinion documents include a picture of the author.  This does not provide information about the topic of the article (e.g. rent control), but may provide the reader with information about the author's probable attitude. Because we are able to recognize faces faster than we can read text, such images can facilitate finding articles by that author while flipping through pages of text. If the document includes an image of the facility housing the author, it provides information about the setting.

Constructing and Presenting Effective Text Equivalents

Decorative Elements

It has been argued that decorative elements do not contribute to the information of a document, and should therefore be presented with an empty text equivalent. This position has been argued by web designers and accessibility organizations, and is widely accepted. (Hear the heading including eye-candy read by JAWS with no alt-text, or read by ZoomText with a blank alt-text in place.)
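As a minimal sketch of the "empty text equivalent" convention described above, the following Python helper builds an image tag whose alt attribute is present but empty. The function name and the file name running-dog.gif are hypothetical, chosen only for illustration.

```python
from html import escape


def decorative_img(src: str) -> str:
    """Build an <img> tag for a purely decorative image.

    The alt attribute is present but empty, which is the widely accepted
    signal that the image carries no document information.
    """
    return f'<img src="{escape(src)}" alt="">'


# "running-dog.gif" is a hypothetical file name used for illustration.
print(decorative_img("running-dog.gif"))
```

Note that an empty alt attribute differs from a missing one: with no alt attribute at all, some screen readers fall back to announcing the file name.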

It is, however, possible to argue that, in at least some cases, eye-candy can be useful to the reader, by helping to locate the document rather than enhancing the document content.  Consider, for example, helping someone locate a book on your bookshelf, when you are not present.  If you are abnormally organized, you might have your shelves arranged in alphabetical order by author or title, and can simply tell the searcher that.  Or, you might say that the book is about two inches thick, and the back is dark red with gold print.  Similarly, a sighted person might say that the article that a friend is looking for has a picture of an elephant in the top right, and of a zebra about halfway down the page.

Now, consider the case of a blind person guiding a sighted person to the document.  If no information about the page decoration is available, such cues cannot be provided. 

Some, but not all, users who depend on text equivalents argue that they should have access to all of the information about a document, including descriptions of the decorative elements presented.  However, this information should not interfere with access to information of the document.  Further, since the decorative elements are often placed by the publisher rather than the author, the author should not be held responsible for such descriptions, unless the document is self-published.

These considerations suggest that a “fully accessible” electronic document should include a link to a “document description.”  The user who desires a description of the document would be able to follow this link to find a description of the appearance of the document.

Such a description might include page layout information, such as the margin size, line spacing, font, and other aspects of the document.  It would also include a description of the placement of non-text elements (and possibly text elements such as headings or tables) on the page.

Document Information

In-line information

Any inline non-text elements (those that are incorporated into the flow of a sentence) should be accompanied by a text equivalent that contains the words that would be used when reading the sentence aloud.  Thus, if the document includes the equation “a² + b² = c²” as created in an equation editor (and hence existing as a graphic in the document), the text alternative would be ‘… alt="aye squared plus bee squared equals sea squared"’. (Note: depending on the symbol, this might also be “a squared plus b squared equals c squared,” but many speech engines would then place the emphasis on the word “squared” rather than the symbol “a”, so that it would be spoken as “uh squared”.) (Hear this paragraph read without alt-text and with alt-text.)
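The phonetic spelling described above can be produced mechanically for simple cases. The sketch below is illustrative only: the token format (a letter optionally followed by "^2"), the lookup tables, and the file name pythagoras.gif are all assumptions, and a real equation would need a far richer vocabulary.

```python
from html import escape

# Hypothetical lookup tables: phonetic spellings keep a speech engine from
# reducing the letter "a" to the unstressed article "uh".
SPOKEN_LETTERS = {"a": "aye", "b": "bee", "c": "sea"}
SPOKEN_OPERATORS = {"+": "plus", "=": "equals"}


def spoken_alt(tokens):
    """Turn equation tokens such as "a^2" or "+" into words to be read aloud."""
    words = []
    for token in tokens:
        if token in SPOKEN_OPERATORS:
            words.append(SPOKEN_OPERATORS[token])
        elif token.endswith("^2"):
            base = token[:-2]
            words.append(SPOKEN_LETTERS.get(base, base) + " squared")
        else:
            words.append(SPOKEN_LETTERS.get(token, token))
    return " ".join(words)


alt = spoken_alt(["a^2", "+", "b^2", "=", "c^2"])
# "pythagoras.gif" is a hypothetical file name for the equation graphic.
print(f'<img src="pythagoras.gif" alt="{escape(alt)}">')
```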

One interesting limitation of many text-to-speech systems is that they do not provide all of the information even for keyboard input.  Many Greek letters used in mathematics are simply ignored or are named as their Roman equivalent, changing "alpha" to "aye."  Similarly, positional information such as superscripts and subscripts is not announced. Because these are not graphical images, not even technically non-text elements, there is neither a mandate nor a standard method to provide text equivalents, though the information is essential to understanding the document.  One “repair” strategy (a means of fixing omissions after the fact) that is available is to place an invisible graphic (usually a one-pixel, transparent .gif image) in the document flow, and then use the alt attribute for that element to provide the missing information.
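The repair strategy just described might be sketched as follows. This is an illustration under stated assumptions, not a recommended implementation: the table of Greek letter names is deliberately tiny, spacer.gif is a hypothetical one-pixel transparent image, and the input text is assumed to be HTML-safe already.

```python
# Hypothetical, deliberately small table of Greek letters and spoken names.
GREEK_NAMES = {"α": "alpha", "β": "beta", "φ": "phi"}


def repair_greek(text: str, spacer: str = "spacer.gif") -> str:
    """Insert an invisible image after each known Greek letter.

    The alt attribute of the inserted image carries the letter's name, so a
    speech engine that skips or misnames the letter still announces it.
    Assumes `text` is already HTML-safe; no escaping is done here.
    """
    parts = []
    for ch in text:
        parts.append(ch)
        if ch in GREEK_NAMES:
            parts.append(
                f'<img src="{spacer}" width="1" height="1" '
                f'alt="{GREEK_NAMES[ch]}">'
            )
    return "".join(parts)


print(repair_greek("φx is the force along the horizontal axis."))
```

A speech engine that does pronounce the letter would announce it twice under this scheme, which is one reason to treat it as an after-the-fact repair rather than a standard method.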

When constructing a fully accessible document, in-line information should be supplied without any overt action on the part of the document reader.

Supporting Information

For any non-text, out-of-flow element of the page that is intended to provide information about the topic, the author must provide the intended information in a text equivalent.  However, there are considerations that control how this is done.

Graph of the perceived role and capability of vocational counselors. Figure 1: Counselor Perception of Their Role in Providing RT and Perception of Their Capacity to Perform These Roles. Image from ProfResources/Publications/ Proceedings/2006/ Research/PP/Noll.php

When a person reading a document with full access to the sights and sounds of the page encounters a graph (such as Figure 1), animation, or other non-text element, s/he is free to skip over the element for future consideration. One of the ways that this is supported in web documents is to represent a graphic as a "thumbnail image" in the flow of the document, which acts as a link to the full-scale image, video, or other non-text element. This allows the reader to skip over the content for later examination in detail.  When traditional alt-text is used, the text equivalent is provided in the flow of the document, and no such choice is available.  This creates an inherent disadvantage (or so it would seem) for the consumer of alt-text.

The web content guidelines recommend that an “alt-text” be limited to 50 to 70 words, which may not provide adequate space to convey the information of the element to which it is attached.  We say that a picture is worth a thousand words, and a complex picture might require many more than a thousand to describe, but suggest that alt-text can do the job in just 50.  To allow for a more complete text equivalent, the HTML standard specifies a “long description” attribute that allows an arbitrarily long equivalent text to be supplied in a separate document.  Because the longdesc is not fully supported, many authors recommend the “d-link,” a letter “D” (upper or lower case) that is a link to the longer description.

In combination, the alt-text and longdesc/d-link offer a means of providing text equivalents for supporting information that is equivalent to access by those with full multimedia access.  The proposed approach is to include, for all supporting non-text elements (those that are not embedded into a sentence) an alt-text that describes, briefly, the non-text element.  This approach violates the usual rule that says that an alt-text that includes the words “picture,” “graph,” or “image” is probably not effective.  We are recommending that the attached alt-text inform the reader that the non-text element is a “graph of the perceived role and capability of vocational counselors,” but not try to provide the information of the graph. (Hear the above paragraph with appropriate alt-text)

The alt-text description of the purpose of the graph should be immediately followed by a link to a long-description conveying the information of the graph.  This long-description would include all of the information of the graph, as well as a summary of the graph.  Thus, for the “counselor perceptions” graph, the long description might include a listing of the categories rated and the average ratings, and a summary paragraph that is similar to the caption of the graphic in the main document. At the end of the long description, a link should be provided that returns the reader to the point of departure in the original document. In a complex document, returning the reader to the location of the non-text element rather than the top of the document is vital to usability. (Hear an appropriate long description for the above graph)

This strategy allows the reader to receive, in the flow of the document, the fact that a graph is available providing the counselor perceptions.  At that point, the reader has the option to follow the link to the long-description and receive the information, or to continue with the document, and return to the perception information at a later time.
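The combination of brief alt-text, longdesc, d-link, and return link described above might be assembled as in the following sketch. All file names, the anchor id, and the link wording are hypothetical, and the markup is only one plausible arrangement.

```python
from html import escape


def supporting_figure(src: str, alt: str, desc_url: str, anchor_id: str) -> str:
    """Build markup for an out-of-flow figure.

    The alt-text only names the figure; the longdesc attribute and the
    visible "D" link both point to the separate long description.
    """
    return (
        f'<a id="{escape(anchor_id)}"></a>'
        f'<img src="{escape(src)}" alt="{escape(alt)}" '
        f'longdesc="{escape(desc_url)}">'
        f' <a href="{escape(desc_url)}" title="Long description">D</a>'
    )


def return_link(document_url: str, anchor_id: str) -> str:
    """Build the link that ends the long description and returns the reader
    to the figure's position rather than to the top of the document."""
    return (
        f'<a href="{escape(document_url)}#{escape(anchor_id)}">'
        "Return to the article</a>"
    )


# Hypothetical file names and anchor id, for illustration only.
print(supporting_figure(
    "fig1.gif",
    "Graph of the perceived role and capability of vocational counselors",
    "fig1-desc.html",
    "fig1",
))
print(return_link("article.html", "fig1"))
```

The anchor placed just before the image is what lets the return link land at the point of departure.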

Element Meta-information

Just as information about the visual representation of a document can be beneficial to the reader who cannot see it, information about a non-text element can be useful, for many of the same reasons.  However, because this information is not desired by all readers, and constitutes cognitive noise to some, it should be provided in a way that allows the reader to access it electively.

We suggest that authors include an element description that is similar to the document description described earlier.  This element long-description might be provided through a link at the end of the element information long-description. 

The element meta-information might include a wide range of information.  In a description of a photograph where the document information is of an interaction in the foreground (“A man helping a child get a drink of water from a drinking fountain”), the meta-information might include the information that the drinking fountain is near the edge of the Grand Canyon, that it is a sunny day, and that a flock of birds is flying over.  This information might be useful to a person seeking a description of weather in the American Southwest, the migratory habits of birds, or the geology of river canyons.  None of this is relevant to the surrounding document on intergenerational assistance, which is the purpose of the image in the enclosing document.

One complex example of element meta-information can be found in the inclusion of institutional or corporate logos as decorative elements in a document.  Such an element might carry the meaning that “this document is supported/hosted by XYZ corporation,” or that it discusses XYZ corporation.  The logo might well have a great deal of symbolism to the company.  In such a case, the best text equivalent for a logo might well be the name of the company acting as a link to the corporation's “About Us” page.
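One way to realize the logo suggestion above is to wrap the logo image in a link, so that the company name in the alt attribute becomes the link text for readers who do not see the image. The helper name, file name, and URL below are hypothetical.

```python
from html import escape


def logo_link(src: str, company: str, about_url: str) -> str:
    """Wrap a logo image in a link to the company's "About Us" page.

    When the image is unavailable, the alt-text (the company name)
    serves as the link text.
    """
    return (
        f'<a href="{escape(about_url)}">'
        f'<img src="{escape(src)}" alt="{escape(company)}">'
        "</a>"
    )


# Hypothetical file name and URL, for illustration only.
print(logo_link("xyz-logo.gif", "XYZ Corporation", "about.html"))
```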


The proposed standard for text equivalents recognizes three categories of alt-text: Inline or Descriptive, Document Information, and element meta-information. 

Information that is required to interpret a sentence in a document should be provided seamlessly and immediately. The person using an alternative representation should be essentially unaware of the change.

Support information, such as graphics, images, sound-bites, or animations should, likewise, have a short description, provided as “alt-text” that provides the reader with enough information to decide if an immediate branch to the detailed description is needed, or if the information can wait.  The text equivalent of a non-text element should be provided in a linked document, using a d-link as well as the longdesc attribute for electronic documents, and other navigational aids for print documents.  This equivalent text should provide the information that the author wanted to convey by the inclusion of the image, animation, sound track, or other non-text element.  It should not include information that can be extracted from the non-text element, but that is not pertinent to the current document. 

Finally, a document/element description should be linked that provides a more complete description of the non-text element. For images, this might be a description of the setting, background actions, and location. For voice tracks, it might include descriptions of the accent of the reader, levels of background noise, or other aspects that might be of interest.  In many cases, a description of how the element was created might be significant.  This could include the camera settings, animation program, recording rate, or other information that a reader might find helpful in creating similar elements in their own work.

Information that is essential to the meaning of the document must be supplied at the time the document is created.  This includes the first two categories, and this information is the responsibility of the author.  The document description information has a shared responsibility.  Some can, and should, be provided by the author, but some can only be provided by the publisher.  Some information, such as logo meta-information, is best provided by the owner of the logo.