If computers could become ‘smart’ enough to recognize who is talking, that could allow them to produce real-time transcripts of meetings, courtroom proceedings, debates, and other important events. In the dissertation that will allow him to receive his Ph.D. at Commencement this year, Brian Reggiannini found a way to advance the state of the art for voice and speaker recognition.
We all do signal processing every day, even if we don’t call it that. With friends at a sports bar, we peer up at the TV to see the score, we turn our heads toward the crashing sound when a waitress drops a glass, and perhaps most remarkably, we can track the fast-paced banter of all the people in our booth, even if we’ve never met some of the friends-of-friends who have insinuated themselves into the scene.
Very few of us, however, could ever get a computer to do anything like that. That’s why doing it well has earned Brian Reggiannini a Ph.D. at Brown and a career in the industry.
In his dissertation, Reggiannini managed to raise the bar for how well a computer connected to a roomful of microphones can keep track of who among a small group of speakers is talking. Further refined and combined with speech recognition, such a system could lead to instantaneous transcriptions of meetings, courtroom proceedings, or debates among, say, several rude political candidates who are prone to interrupt. It could help the deaf follow conversations in real time.