Carnegie Mellon University (CMU), in collaboration with the Hear2Read project, has developed text-to-speech (TTS) software that lets visually impaired people listen to text in their native languages. Tamil is the first language on offer; releases for seven more major languages (Hindi, Bengali, Gujarati, Marathi, Kannada, Punjabi and Telugu) are expected over the remainder of 2016.
Image credit: CMU and Hear2Read
In a blog post on CMU's website, the team noted that four out of five people in India speak one of these eight languages, and that India as a whole has more than 62 million people who are visually impaired.
The goal and how the initiative came about
"We're looking to create speech output for as many languages as possible," said Suresh Bazaj, founder of the Hear2Read project, which is based in the San Francisco Bay area.
While TTS software is commonplace in the USA and many other parts of the world, Bazaj found that good-quality TTS for Indian languages is difficult to find, use or afford. Yet the need is great: only 10 percent of blind children in India get any education, and 90 percent of visually impaired Indians live in poverty.
Alan Black, a professor in the School of Computer Science's Language Technologies Institute (LTI), noted that making the app free, open-source software was a key goal.
Bazaj met Black, a scientist internationally known for his work in speech synthesis, through a former student of Black's two years ago and recruited him to the project. While LTI already had the knowledge and tools for creating TTS software, the Hear2Read project inspired Black and his students to develop a system that could do so repeatedly and efficiently, and that would produce user-friendly software.
How it works
The system developed by Black's research team enables the creation of a baseline TTS system after recording 2-3 hours of clear, consistent speech from a native speaker. The open-source text read by the speaker comes from various sources such as Wikisource, books and periodicals.
Bazaj told CMU, "Each language is different, and historically TTS systems have been done one at a time. We looked at the commonalities of Indian languages and developed tools to apply the same technology to multiple languages."
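One commonality that is easy to see in software is how the scripts themselves are encoded. The sketch below is an illustration only, not the Hear2Read or CMU tooling: it relies on the fact that Unicode lays out the major Indic script blocks in parallel, so the same consonant usually sits at the same offset within each 128-character block. The names `SCRIPT_BASE`, `script_of` and `shared_offset` are hypothetical helpers introduced for this example.

```python
# Illustration only: not the Hear2Read implementation.
# Start of the Unicode block for each script used by the article's
# eight languages (Hindi and Marathi both use Devanagari).
SCRIPT_BASE = {
    "devanagari": 0x0900,  # Hindi, Marathi
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,    # Punjabi
    "gujarati": 0x0A80,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
}
BLOCK_SIZE = 0x80  # each of these blocks spans 128 code points


def script_of(ch):
    """Return the script name for a character, or None if it is not Indic."""
    cp = ord(ch)
    for name, base in SCRIPT_BASE.items():
        if base <= cp < base + BLOCK_SIZE:
            return name
    return None


def shared_offset(ch):
    """Return the character's position inside its script block, or None."""
    name = script_of(ch)
    if name is None:
        return None
    return ord(ch) - SCRIPT_BASE[name]


# The letter KA in Devanagari, Tamil and Telugu shares one block offset.
ka_devanagari = "\u0915"
ka_tamil = "\u0B95"
ka_telugu = "\u0C15"
print(shared_offset(ka_devanagari), shared_offset(ka_tamil), shared_offset(ka_telugu))
```

A tool written against the shared offset rather than the raw code point could, in principle, reuse one set of letter-handling rules across several scripts, although pronunciation still differs from language to language.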
Though the machine learning process used to create the voice databases requires large-scale computing, the resulting database for each language is relatively small and can run on low-end Android phones or tablets that retail for under $100 (Rs 7,000). CMU and Hear2Read found that this price falls within guidelines established by the Government of India's Assistance for Disabled Persons programme, which helps people with disabilities purchase assistive devices based on income.
The conversion from text to speech is done in real time and without internet access, since most people in India either do not have continuous internet access or cannot afford it.
The Hear2Read app works with the Android TalkBack accessibility option, which lets people with low vision use basic applications such as web browsers, email, SMS, phone calls, word processors, spreadsheets and book readers.
Bazaj has had retinal detachments in both eyes that were successfully repaired. He realised that he was fortunate to have access to excellent medical care, which is not the case for most people in India. The cause therefore has personal meaning for him, and his belief that the ability to read is directly related to a good quality of life explains the determination with which he began this mission.
After meeting Black, he began supporting a CMU student to develop TTS for Indian languages. In addition, he has recruited more than 50 volunteers in the United States and India who are native speakers of Indian languages.
"This project couldn't have been accomplished without the dedication and support provided by our selfless volunteers," Bazaj said. The San Francisco Bay Area nonprofits Access Braille and Indians for Collective Action have provided funding to support the project.
Download the app here for Android.