Our current research involves developing and testing embellished eyeglasses (iGlasses; iGlasses is a trademark of Dominic W. Massaro), which will perform two simultaneous functions (Massaro, Carreira-Perpinan, & Merrill, 2009). First, real-time acoustic analysis of an interlocutor’s speech will track three speech-relevant acoustic features: voicing, frication, and nasality. Second, these acoustic features will be transformed into continuous visual cues displayed on the eyeglasses. By integrating these visual cues with lipreading (called speechreading, because it involves more than just the lips), the user is expected to gain nearly full perceptual access to the conversation.
The iGlasses are envisioned to be worn like a regular pair of eyeglasses, but will contain two small microphones and three colored LEDs. The wearer looks at the interlocutor, and the microphones deliver the interlocutor’s speech to a processing device, such as an iPhone, which analyzes the acoustic input. The input is analyzed for low-frequency voicing information, high-frequency frication energy, and nasal resonance, which are associated with the acoustic/phonetic properties of voicing, frication, and nasality, respectively. The three properties are then transformed in real time into simple visual cues displayed on the three vertically mounted LEDs, as shown in Figure 1. These particular phonetic properties were chosen because they are fairly easy to track in the speech signal and, more importantly, because they distinguish instances within a viseme category (i.e., a subset of phonemes that are highly confusable in speechreading). These cues also require no literacy, which widens the potential user population to include pre-literate children and other non-readers.
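The mapping from acoustic input to LED cues described above can be illustrated with a minimal sketch. It is not the iGlasses implementation: the band edges, the names `BANDS`, `band_energies`, and `led_levels`, and the use of relative band energy as an LED level are all assumptions made for illustration. In particular, tracking nasality reliably is likely to require more than the energy in a single band.

```python
import cmath
import math

# Hypothetical band edges in Hz. These are illustrative placeholders,
# not values taken from the iGlasses design.
BANDS = {
    "voicing": (50.0, 300.0),       # low-frequency voicing information
    "nasality": (300.0, 1000.0),    # crude stand-in for nasal resonance
    "frication": (3000.0, 8000.0),  # high-frequency frication energy
}


def band_energies(frame, sample_rate):
    """Return (energy per band, total energy) for one frame of samples.

    Uses a direct DFT over the positive-frequency bins, which is slow
    but keeps the sketch free of third-party dependencies.
    """
    n = len(frame)
    energies = {name: 0.0 for name in BANDS}
    total = 0.0
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        power = abs(coeff) ** 2
        total += power
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                energies[name] += power
    return energies, total


def led_levels(frame, sample_rate):
    """Map one frame to three LED levels in [0, 1] (assumed scaling).

    Each level is the fraction of the frame's energy that falls in the
    corresponding band; a silent frame turns all LEDs off.
    """
    energies, total = band_energies(frame, sample_rate)
    if total == 0.0:
        return {name: 0.0 for name in BANDS}
    return {name: min(1.0, e / total) for name, e in energies.items()}
```

Calling `led_levels` on successive short frames (for example, 20 ms of samples at 16 kHz) would yield a continuous stream of three values, one per LED. A practical system would presumably use a fast Fourier transform or a filter bank in place of the direct DFT, and would smooth the levels over time.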
Presenting auditory speech characteristics visually on LEDs is a form of sensory substitution (Bach-y-Rita & Kercel, 2003), although in our case it is supplementary because the presence of auditory speech is not precluded. Many studies have shown benefits of sensory substitution. Brain plasticity allows for adaptations in the central nervous system such that substituting one sensory modality for another can yield highly comparable perceptual experiences (Rosenblum, 2010). Sensory substitution can result in inputs from one sensory modality reaching brain structures related to another. For example, sign language can activate the auditory cortex in deaf individuals (Finney, Fine, & Dobkins, 2001).
Our preliminary research shows that users can learn to use the LED cues to assist speechreading comprehension and to better understand their conversational partners. An important question for the current research is whether the visual cues can be integrated with the facial information in the same manner that has been found for the integration of audible and visible speech (Massaro, 1998; Massaro & Cohen, 2000).