Tucked away in a small corner of Towson University’s Liberal Arts
Building, a team of students and faculty is changing the way radio is heard.
Or rather, seen.
It’s the latest research breakthrough in the Towson University/NPR accessible radio partnership known as the International Center for Accessible Radio Technology, or ICART. And it has big implications not just for the deaf and hard-of-hearing, but for anyone who relies on the Internet for news and information. In other words—everyone.
Called the ICART Captioning Center, the initiative delivers captioning and transcripts for radio broadcasts. The center currently provides captioning for pre-recorded programs such as Latino USA and the TED Radio Hour, and is expanding
into live-caption coverage. The center provided real-time captioning for NPR’s coverage of the 2012 presidential debates, and has recently hosted technical demonstrations for top-tier NPR programs such as Morning Edition.
Unlike closed captioning for television, which can often be scripted in advance, radio captioning is uniquely challenging because so much of radio programming is live. Pre-scripting works for most television; live radio doesn't have that luxury.
“In the past, your choices for live captioning were limited to stenographers, who are expensive and in short supply, or machine translation, which is often incomprehensible,” says Ellyn Sheffield, associate professor of psychology at Towson University and co-director of ICART. “We offer an approach that takes the best of both.”
Here’s how it works: Seated in sound-proof booths, specially trained student “voicewriters” listen to an audio stream and repeat everything they hear into speech-to-text software calibrated to their voices. The rough text is then patched to an editor outside the booth who cleans it up with proprietary software developed by NPR. The clean text can then be pushed to captioning devices in near real-time or further edited to create a transcript of the program.
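The three-stage flow described above can be sketched in code. This is a minimal illustration only: every function name and correction here is hypothetical, and the actual editing software developed at NPR is proprietary and not public.

```python
# Hypothetical sketch of the respeaking pipeline described in the article:
# voicewriter -> speech-to-text -> human editor -> caption device.

def voicewriter_respeak(rough_asr_output: str) -> str:
    """Stage 1: a trained voicewriter repeats the audio into speech-to-text
    software calibrated to their voice. Here we simply take simulated
    rough ASR text as input."""
    return rough_asr_output


def human_edit(rough_text: str, corrections: dict) -> str:
    """Stage 2: an editor outside the booth cleans up residual
    recognition errors before the text goes out."""
    clean = rough_text
    for wrong, right in corrections.items():
        clean = clean.replace(wrong, right)
    return clean


def push_caption(clean_text: str, caption_feed: list) -> None:
    """Stage 3: clean text is pushed to captioning devices in
    near real time (modeled here as appending to a feed)."""
    caption_feed.append(clean_text)


# Example using a misrecognition reported in the article:
captions = []
rough = voicewriter_respeak("President all my dinner John spoke today")
clean = human_edit(rough, {"all my dinner John": "Mahmoud Ahmadinejad"})
push_caption(clean, captions)
print(captions[0])  # President Mahmoud Ahmadinejad spoke today
```

The same cleaned text can either be pushed live, as above, or accumulated and edited further to produce a full program transcript.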
Of course, anyone who’s ever had a mix-up with Siri or another speech-to-text tool knows that voice recognition is far from perfect. According to Melinda Hines, lab manager for the Captioning Center and a graduate of Towson’s clinical psychology master’s program, the key is human intervention to correct the computer’s mistakes.
“The software is about 50 percent accurate alone. It becomes 95 percent accurate with voice writers. You need a human to follow along and clean up that remaining five percent,” says Hines.
For example, she explains that the software once interpreted Iranian President Mahmoud Ahmadinejad as “all my dinner John.” The phrase “jury trial and preliminary hearing” became “juice trail and blueberry period,” and Audie Cornish, host of NPR’s All Things Considered, was summarily renamed “Will.I.Am on the corn.”
Hilarious as they are, these mistakes can wreak havoc on comprehension in a caption or transcript. That’s why ICART has worked so hard to develop a better way.
Adds Sheffield, “If you’ve ever gone to the gym and watched TV news with captions on, you’ve probably seen that it can be riddled with errors. You can sometimes make sense of it with the visual context that TV provides, but we don’t have that in radio. Our captions have to be near-perfect.”
Captioning for the Classroom
Although ICART set out to provide emergency alerting and accessible radio programming across the country, the Captioning Center’s biggest current client is still close to home: the university’s Disability Support Services (DSS) office, which provides accommodations for students with disabilities on campus. And it turns out that the process developed at ICART is naturally applicable to distance learning courses accessible over the Internet.
“We’ve very successfully adapted our process for lecture transcription to help students with auditory disabilities,” explains Sheffield.
Before partnering with the Captioning Center, DSS worked with a third party that required up to 72 hours to deliver a transcript. Rush orders are expensive and, according to Sheffield, the costs may be difficult to afford for smaller colleges and community colleges that require captioning to comply with requirements of the Americans with Disabilities Act (ADA).
Now, transcripts take less than 24 hours, which means students receive their study aid much sooner.
Sheffield hopes to soon give professors microphones to record their audio in class so that voicewriters can caption classroom lectures in real time, just like the live radio model. She also sees the technology helping not just those with auditory impairments, but also students with learning disabilities and non-native English speakers who could benefit from a written transcript.
Ultimately, she hopes the center can export this process and technology to colleges and universities across the United States, much in the same way she hopes to distribute the radio model across a radio network.
“We’ve created a center that can caption for the classroom,” says Sheffield. “But it would be impossible to keep up with demand on a national scale. We hope to train other people and take our knowledge to other institutions. There’s a definite need, and we hope to fill it.”
She laughs, “Let’s call it franchising.”
By Dan Fox. Photos by DeCarlo Brown.
Cover photo: Steve Inskeep and Renee Montagne, NPR Morning Edition hosts (Stephen Voss/NPR)