Formal definition of "descriptive identification" missing (SC 1.1.1) #1084
Comments
@brennanyoung, it's a best practice in standards writing not to define terms that carry their common or dictionary meaning. For your poor-quality audio, I would simply use what you wrote in your comment as the text alternative.
The Understanding document does explain what is needed, albeit briefly. It says, about time-based media (e.g. audio): "...it is important that users know what it is when they encounter it on a page so they can decide what action if any they want to take with it. A text alternative that describes the time-based media and/or gives its title is therefore provided." That's for the audio component itself, in addition to any transcript. The same document has Example 2 for audio tracks: "An audio recording of a speech: The link to an audio clip says, 'Chairman's speech to the assembly.' A link to a text transcript is provided immediately after the link to the audio clip." That explains it well.
However, I would say there is no reason why this audio clip cannot be given a transcription like any other audio, as required by SC 1.2.1. You say it is used in coaching about 911 call handling, and the object is for students to understand it. Presumably they will do this mostly by listening to the intelligible bits and deducing from those what the garbled bits are probably saying.
This is not what SC 1.1.1 means by a "test". It says "...a test or exercise that would be invalid if presented in text...". That means cases where the very act of describing something in text would provide the answer to the question. In other words, the answer is contained in the content, and if a screen reader announced that answer, the point of having the test would be lost. Examples include quizzes where you are shown shapes that you have to identify, or those eye tests for colour blindness where a shape is made up of multiple coloured blobs and you have to identify it. For a screen reader to state what the shape is would answer the question and invalidate the test. Another example is the famously and outrageously inaccessible Google reCAPTCHA, where you are asked to say which of twelve images contain a certain item, and the only way to describe the images in an alt text would be to name that item.
The reCAPTCHA audio alternative consists of distorted audio that is also a test which could not be described without giving the game away, and, incidentally, is one that even people with good hearing cannot understand!
None of this applies to your audio clip. It can be given a transcription for deaf people that covers all the intelligible bits, including any part-words discernible in it, with "[garbled]" or similar inserted in the places where it is not intelligible. Now the deaf person is in almost as good a position as anyone else: they can see the words that are available and from them can make deductions about what might be in the garbled bits, which I assume is the intention of this coaching. (Note I say "almost", because hearing people might still gain some slight advantage from emphasis and intonation in the recording, but that would not be sufficient reason for entirely excluding deaf people from the whole exercise.)
To make this clearer, suppose your audio clip and surrounding content are used in a class setting. From the transcription a deaf person can make the same logical deductions from the intelligible bits that others in the class may be making, and can participate in the discussion about it and see what everyone is referring to, even if they cannot get all the nuances of intonation that their classmates will hear. There is no reason to exclude them from the entire discussion, unable to take part! This does not seem to me to be a test that "would be invalid" if given a transcription, so it cannot be excepted under SC 1.1.1.
Thanks. We arrived at a similar position after a lot of discussion. What I think is missing is some mention that the content might be fragmentary and incomplete by its very nature, which may happen any time the more 'true' description is the ambiguous one. The docs do not really give any guidance about this kind of content, except in the discussion about how a text might make the test invalid.
I maintain that a text alternative which is too explicit would indeed invalidate the exercise in our case, so the notes about 'tests' are perfectly relevant. The coach really is testing the trainee's ability to gather information rapidly. So this is certainly a test, and certainly a "specific sensory experience", hence my effort to understand what a "descriptive identification" might be in such cases. To my mind, a kind of transcription is called for, but it should somehow capture the noise and chaos of the original content, rather than offer a false or 'groomed' clarity.
It would be helpful to know for sure that this kind of solution is viable, especially when we come to our accessibility audit. From my experience, I have low confidence that third-party auditors will understand this subtlety in the same way, unless I can point to an explicit item in the docs. With the current spec, it's our word against theirs about what "descriptive identification" really means.
I can appreciate that it's challenging for the docs to give content authors license to offer ambiguous transcriptions within certain appropriate limits without letting them off the hook in the majority(?) of situations where clarity is preferred, but the current docs strongly imply that there is only one meaning (e.g. "provide an equivalent", singular), which is simply not true in various cases. This thread by itself shows that standard dictionary words like "test" or "identification" can be interpreted in various ways!
One further implication (from the current docs) is that an unambiguous text alternative which would 'give the game away' may be replaced by a descriptive identification. Yet I still doubt that there is any consensus about the exact meaning of this supposedly transparent term. This guidance proved quite unhelpful, even misleading, so I hope it will be addressed.
From another perspective, something which is garbled might yet contribute to a correct interpretation once sufficient garbled fragments accumulate (a constructivist view, perhaps). In our case it would be counter-productive to use a blanket pseudo-transcription fallback such as "[garbled]", because we wouldn't even be offering the fragments. For the sake of a professional sheen, we'd be cheating deaf users of a training/coaching opportunity.
Anyway, we have decided to provide an ungroomed machine transcription of the audio, 'warts and all' (and we hope some of the warts will be fragments of meaningful content). This will inevitably be uneven, and will not express what the 911 caller 'really meant' to say, but it will model the live telecommunication sensory experience very closely. We then intend to explain the rationale behind this choice in accompanying docs.
A deaf user might still prefer to hook up their own preferred auto-transcription AT to the audio output, instead of relying wholly on whatever automatic transcription we choose to offer. They may even glean extra meaning from comparing their own AT output with the automatic transcription we have provided. (A Batesonian 'double description', providing additional qualities not found in either of the data sources, rather like a hologram or interference pattern.)
This seems like a rather obvious solution now, but I believe some gentle guidance from the WCAG docs on expressing ambiguity (specifically under SCs 1.1.1 and 1.2.x), perhaps even mentioning an example of this type, would have shortened the long journey and extensive head-scratching we undertook before we arrived at this decision.
I think this has been a very interesting discussion, but I am inclined to close this issue. I will not do so until Monday 4/6 at least.
@brennanyoung The difficulty is this: of all the hundreds of thousands of audio clips on the web, how many fit your scenario of teaching people to understand a clip with garbled content? Only a very few, which is not enough justification for amending documentation that otherwise covers audio well. I would suggest this query would probably attract more interest and discussion on the WebAIM discussion forum. Many of the same people take part there, but it is a different setting, where questions of how to do things, as opposed to updating WCAG, are discussed in depth.
@guyhickling ambiguous is much, much more important than garbled!
I have been unable to find a formal definition of the phrase "descriptive identification" (which appears in the spec for SC 1.1.1).
I am in a position where we need to make an exception for missing audio transcription (SC 1.2.x) on the grounds mentioned under SC 1.1.1 that the audio is used for a test and because a specific sensory experience forms the basis of that test. (The specific case is coaching 911 emergency call response, where audio quality is poor, speech is often unclear or garbled, and where the learning goal is to make correct interpretations given poor audio).
So, what exactly is a "descriptive identification"? Are there some examples somewhere? Is there a best practice for how this may be provided?