How to Handle Images, Video, and Non-Text Information: An Introduction to Designing Alternatives in the WCAG 2.2 Era
Discussions of web accessibility tend to focus on text, forms, and keyboard operation. Real websites, however, rely heavily on non-text information: images, diagrams, icons, videos, audio, and figures inside PDFs. When this content is not given the same consideration, important information may never reach users. For example, guidance text may exist only inside an image, a video may be impossible to understand without its audio, a conversation may be hard to follow because captions are missing, or the meaning of a chart may never be explained in the body text. In such situations, users are left behind not only because they “cannot see” or “cannot hear,” but because they “cannot grasp the meaning.” Part 6 focuses on accessibility for images, videos, and other non-text information, and organizes how to design alternatives from a practical perspective.
What You Will Learn in This Article
- Why non-text information needs alternatives
- How to think about alternative text according to the purpose of an image
- How to handle charts, icons, and decorative images differently
- How to think about captions, transcripts, and audio descriptions needed for video and audio
- How UUU Web Accessibility Service complements this work, and where its role differs
The first thing to understand is that providing alternatives does not mean “explaining every image entirely in words.” What matters is that the information or function users should receive from the content can also be received in another form. The W3C images tutorial states that images need text alternatives suited to their purpose. In other words, a photo is replaced by the meaning it conveys, a button by the function it provides, and a chart by the information it is meant to communicate. Rather than describing every visual detail, it is far more important to ask “why is this image placed here?” (w3.org)
A useful perspective for understanding this idea is to classify images by purpose. W3C organizes images into categories such as informative images, decorative images, functional images, and complex images. For example, product feature photos, maps, and warning illustrations are images that carry information. Background patterns and decorative lines that have no meaning are decorative images. Icons such as a magnifying glass for a search button or a PDF download icon are functional images when activating them performs an action. Charts, flowcharts, organization charts, and similar visuals that cannot be sufficiently expressed in a short sentence are treated as complex images. Simply being able to make this classification makes it much easier to organize how to create alternatives. (w3.org)
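The four categories can be written down as a small lookup, which some teams may find handy as a shared vocabulary during content review. The following is a minimal sketch in Python: the names ImagePurpose, ALT_APPROACH, and classify_image are hypothetical, the guidance strings paraphrase this article rather than quote W3C, and the order of the checks is a simplification, not W3C's decision tree.

```python
from enum import Enum


class ImagePurpose(Enum):
    INFORMATIVE = "informative"
    DECORATIVE = "decorative"
    FUNCTIONAL = "functional"
    COMPLEX = "complex"


# Hypothetical guidance strings paraphrasing this article (not official W3C wording).
ALT_APPROACH = {
    ImagePurpose.INFORMATIVE: "Short alt text stating the meaning the image conveys.",
    ImagePurpose.DECORATIVE: 'Empty alt attribute (alt="") so assistive technology skips it.',
    ImagePurpose.FUNCTIONAL: "Alt text that names the action, not the picture.",
    ImagePurpose.COMPLEX: "Brief alt text plus a fuller explanation in nearby body text.",
}


def classify_image(triggers_action: bool,
                   conveys_information: bool,
                   needs_long_explanation: bool) -> ImagePurpose:
    """Classify an image by purpose (illustrative ordering, an assumption of this sketch)."""
    if triggers_action:
        return ImagePurpose.FUNCTIONAL
    if not conveys_information:
        return ImagePurpose.DECORATIVE
    if needs_long_explanation:
        return ImagePurpose.COMPLEX
    return ImagePurpose.INFORMATIVE


# A magnifying glass icon used as a search button is functional.
purpose = classify_image(triggers_action=True,
                         conveys_information=True,
                         needs_long_explanation=False)
print(purpose.value, "->", ALT_APPROACH[purpose])
```

The three yes/no questions are answered by a person who knows the page context; the code only records the outcome and the matching approach.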
For images that carry information, the basic approach is to briefly and accurately express the meaning conveyed by the image. For example, for a photo of people, write something like “Three employees having a meeting in a conference room,” within the range needed by the context. For a product photo, it is more helpful to write toward the features users want to know, such as “A black folding umbrella with waterproof specifications.” A common misunderstanding here is the idea that everything visible must be explained. However, W3C emphasizes that alternative text for informative images should express the meaning or content conveyed by the image, rather than transcribing visual details word for word. In other words, the goal is for users to receive the necessary information; making the description long is not the goal. (w3.org)
Decorative images require different handling. If images that carry no meaning and exist only to improve appearance are exposed to screen readers, they increase the burden on users. W3C’s alt decision tree and image tips both describe hiding decorative images from assistive technologies with an empty alt attribute, alt="". Examples include decorative leaves next to headings, icons used as separators, and background pattern images. If such images are announced as “image” every time, it becomes harder to reach the information that is truly needed. In practice, it also helps to express decoration with CSS wherever possible and to avoid mixing meaningless images into the body content. (w3.org) (w3.org)
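The difference between an empty alt attribute and a missing one is easy to overlook in markup, so a quick automated pass can help. The sketch below uses Python's standard html.parser module to sort img elements into three groups; the class name AltAudit, the helper audit_images, and the sample file names are hypothetical. It only reports how alt is set and cannot judge whether a given image really is decorative.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Collects <img> elements by how their alt attribute is set."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []  # no alt attribute at all: needs human review
        self.empty_alt = []    # alt="": treated as decorative
        self.with_alt = []     # non-empty alt text

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or "(no src)"
        if "alt" not in attributes:
            self.missing_alt.append(src)
        elif attributes["alt"] in (None, ""):
            self.empty_alt.append(src)
        else:
            self.with_alt.append(src)


def audit_images(html_text):
    audit = AltAudit()
    audit.feed(html_text)
    audit.close()
    return audit


# Hypothetical sample markup for illustration.
sample = """
<h2><img src="leaf.png" alt=""> Seasonal notice</h2>
<img src="umbrella.jpg" alt="A black folding umbrella with waterproof specifications">
<img src="banner.png">
"""
report = audit_images(sample)
print(report.missing_alt)  # ['banner.png']
```

Images in the missing_alt group are the ones to look at first, since screen readers may fall back to announcing the file name for them.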
For functional images, it is necessary to communicate “what happens,” not what the image looks like. For example, if a magnifying glass icon image is used as a search button, the appropriate alternative text is “Search,” not “magnifying glass.” For a PDF icon, instead of simply reading “PDF,” it is more practical to use wording that explains the result of the action, such as “Open company brochure PDF” or “Download application form PDF.” What users want to know is not the picture itself, but what they can do by using that element. When using images as interactive components, it is essential to check not only the beauty of the design, but also whether the function name is conveyed correctly. (w3.org)
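As one way to make this rule hard to forget, a template helper can refuse to build an icon-only link without an action label. This is a minimal sketch, assuming markup is generated from Python; the function name icon_link and the file paths are hypothetical.

```python
from html import escape


def icon_link(href, icon_src, action_label):
    """Build a link whose only content is an icon image.

    The alt text becomes the link's accessible name, so it must
    describe the action rather than the picture.
    """
    if not action_label.strip():
        raise ValueError("A functional image needs alt text that names the action.")
    return (
        f'<a href="{escape(href)}">'
        f'<img src="{escape(icon_src)}" alt="{escape(action_label)}">'
        "</a>"
    )


# Hypothetical paths for illustration.
print(icon_link("/docs/application-form.pdf",
                "/img/pdf-icon.png",
                "Download application form PDF"))
```

The helper cannot tell “PDF” from “Download application form PDF,” so the wording still needs review by a person; it only guarantees that the label is never left empty.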
For complex images, it is important not to try to explain everything with short alt text alone. Graphs, comparison tables, organization charts, procedural diagrams, maps, and structural diagrams easily lose information if only a short explanation is provided. For these images, W3C recommends pairing concise alternative text with a sufficient explanation in the body text or nearby text. For example, for a bar chart, give a brief alt text such as “Bar chart showing sales trends for fiscal year 2025,” then write the main points in the body text, such as “Sales increased from April to August, temporarily declined in September, and then rose again.” For a route map, also provide the main routes and transfer conditions in text. This helps not only people who cannot read the image itself, but also people who have difficulty enlarging images or who want to understand the content quickly. The Digital Agency’s guidebook likewise advises conveying the necessary information in charts and diagrams fully through alternative text or body text. (w3.org) (digital.go.jp)
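The bar chart example can be expressed as markup that keeps the short alt text and the fuller explanation together. The sketch below assumes one possible pattern, a figure element whose figcaption carries the main points; the function name chart_figure and the image path are hypothetical, and placing the explanation in the surrounding body text is equally valid.

```python
from html import escape


def chart_figure(src, short_alt, summary):
    """Pair a concise alt text with a visible text summary of the chart."""
    return (
        "<figure>"
        f'<img src="{escape(src)}" alt="{escape(short_alt)}">'
        f"<figcaption>{escape(summary)}</figcaption>"
        "</figure>"
    )


# Hypothetical image path; the wording comes from the example above.
print(chart_figure(
    "/img/sales-2025.png",
    "Bar chart showing sales trends for fiscal year 2025",
    "Sales increased from April to August, temporarily declined in "
    "September, and then rose again.",
))
```

Because the summary is visible text rather than hidden metadata, it is available to every reader, including those who simply want the conclusion without studying the chart.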
One point to be careful about is “text inside images.” If important notices, campaign conditions, phone numbers, event dates and times, and similar information are embedded as images in banners or figures, that information will be missed unless alternatives are provided. In addition, when text is turned into an image, problems are more likely to occur: it is harder to enlarge, harder to translate, and harder to read depending on the display environment. Except for cases where it is unavoidable for expression, such as logos, important text information should ideally be provided as HTML text. Even when text is included inside an image, that text information must be obtainable through body text or alternative text. Images are convenient, but it is important not to make them the main container of information. (w3.org) (digital.go.jp)
With video and audio, the range of alternatives to design becomes broader. Guideline 1.2 of WCAG 2.2 requires alternatives for time-based media. W3C’s materials on media accessibility explain that videos need captions at a minimum and that, depending on the content, transcripts and audio descriptions that supplement visual information are also important. Captions are useful not only for people who have difficulty hearing audio, but also for people watching videos in silent environments, people who want to follow Japanese text while listening, and people who want to confirm technical terms. Transcripts also improve searchability and reuse. It is important not to assume that publishing a video completes information delivery, but to make the content available through other paths as well. (waic.jp) (w3.org)
Captions are often considered sufficient as long as the spoken words are converted into text, but in practice that may not be enough. W3C explains that captions should include not only speech but also the non-speech audio needed to understand the content. Examples include laughter, applause, warning sounds, a ringing phone, and sound effects that signal scene changes. Even in dialogue-centered videos, the content can be hard to follow if it is unclear who is speaking. In explainer videos, captions also need to stay in step with on-screen text and operations. Captioning is not merely “turning a script into text”; it is the task of appropriately converting the information received through audio into text. (w3.org)
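To make speaker identification and non-speech sounds concrete, here is a minimal sketch that writes caption cues in WebVTT, a caption format commonly used with HTML video. The Cue class, the helper names, and the sample dialogue are hypothetical. Speech cues use a WebVTT voice tag to name the speaker, and non-speech sounds are written in square brackets, which is a common convention rather than a requirement of the format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cue:
    start: float                   # seconds from the start of the video
    end: float                     # seconds from the start of the video
    text: str
    speaker: Optional[str] = None  # None marks a non-speech sound


def timestamp(seconds):
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues):
    """Serialize cues as WebVTT text. Escaping of '&' and '<' is omitted for brevity."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        lines.append(f"{timestamp(cue.start)} --> {timestamp(cue.end)}")
        if cue.speaker:
            lines.append(f"<v {cue.speaker}>{cue.text}")  # voice tag names the speaker
        else:
            lines.append(f"[{cue.text}]")                 # non-speech audio
        lines.append("")
    return "\n".join(lines)


# Hypothetical seminar excerpt for illustration.
print(to_webvtt([
    Cue(0.0, 2.5, "Welcome to the seminar.", speaker="Host"),
    Cue(2.5, 4.0, "applause"),
    Cue(4.0, 7.0, "Thank you for having me.", speaker="Guest"),
]))
```

Keeping the speaker as a separate field, instead of typing it into the caption text, makes it harder to forget and easier to check during review.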
Furthermore, if important information exists in the visuals, audio descriptions or descriptive transcripts should also be considered. For example, in a tutorial video, the narration may say “Click here” without saying where on the screen the user is clicking. In an interview video, titles may appear only on screen. In a promotional video, there may be many scene changes with no explanation. In such cases, audio alone does not sufficiently communicate the information. In its guidance on description of visual information, W3C explains the idea of supplementing visual information necessary for understanding, such as actions, scene changes, on-screen text, facial expressions, and positional relationships. Not every video needs large-scale audio description, but checking once whether the content can be followed with audio alone is very effective in practical video production. (w3.org)
Transcripts are valuable because they allow video and audio content to be reused as text. W3C’s explanation of transcripts notes that, beyond a transcription of the audio, descriptive transcripts that also include visual information as needed are important for certain users. For example, if seminar videos are transcribed, it becomes easier to search their content later. Podcasts can be re-edited into article summaries. Transcripts are also useful for people in environments where it is difficult to watch videos, and for people who understand better by reading. In other words, alternatives are not “extra things reluctantly added”; they are also a way to increase the value of information assets. (w3.org)
In practice, the difficult question is often “how much should we explain?” Here, it is effective to start from the purpose. For product photos, prioritize the features needed for decision-making. For employee photos on a recruitment page, it may be enough to convey atmosphere. For charts, restate conclusions and trends in the body text. For operation explanation videos, check whether the on-screen steps can be followed through audio alone. For event recording videos, check whether speaker names and slide titles appear only as on-screen text. In short, it is important to ask, “If users cannot see or hear this content, can they still obtain the information they need?” You do not need to copy every aspect of the expression, but you should avoid losing the meaning.
It is also worth clarifying how this work relates to UUU Web Accessibility Service. Services like UUU provide features such as text size adjustment, contrast adjustment, reading aloud, translation, and furigana display. They complement alternatives in the sense that they help users read and understand body explanations, captions, and nearby alternative text. For example, browsing support may make descriptions placed near images easier to read, improve caption readability, and make supplementary text easier to understand. It may also reduce the burden of reading the information around videos.
However, the alternatives themselves must be prepared in the original content design. Adding appropriate alt text to images, excluding decorative images from screen reader output, placing chart explanations in the body text, preparing captions and transcripts for videos, and supplementing necessary visual information through audio or text are not things browsing support tools can solve automatically. In other words, services like UUU are well suited to “making alternative information easier to receive,” but they cannot take on the role of “designing the alternative information itself.” Understanding this difference helps prevent overreliance on tools and makes it easier to carry out the necessary accessibility work during production.
This topic is especially useful for people involved in public relations, editing, design, and video production. Public relations staff can avoid trapping important information inside image-based notices and banners. Designers can organize the roles of icons and charts and more easily design explanations linked with body text. Video staff can incorporate captions, transcripts, and screen descriptions into the production process. Engineers can ensure that alternatives are correctly usable through HTML implementation and embedded media settings. Accessibility for non-text information is not a task for one person at the end of the process; it is much easier to implement smoothly when considered from the planning stage.
Here is the summary of Part 6. What matters in accessibility for images, videos, and non-text information is not explaining every visual detail, but making sure that the information or functions users should receive can also be received through another means. For images, consider their purpose: provide meaningful alternative text for informative images, avoid unnecessary reading for decorative images, and provide sufficient explanation in the body text for complex diagrams. For video and audio, captions, transcripts, and audio descriptions when necessary increase the ways users can understand the content. Services such as UUU Web Accessibility Service are compatible in helping users read and receive such alternative information more easily, but designing the alternatives themselves remains the responsibility of content creators. In the next part, we will look at how to create testing, improvement, and operational systems so that accessibility does not end at publication.
Reference Links
- W3C WAI: Images Tutorial
- W3C WAI: An alt Decision Tree
- W3C WAI: Informative Images
- W3C WAI: Tips and Tricks for Images
- W3C WAI: Making Audio and Video Media Accessible
- W3C WAI: Description of Visual Information
- W3C WAI: Transcripts
- WAIC: Web Content Accessibility Guidelines (WCAG) 2.2 Japanese Translation
- Digital Agency: Web Accessibility Introduction Guidebook
- Digital Agency: Web Accessibility Guidebook for Public Relations

