Advanced Lecture Series in Pattern Recognition
TITLE: Text to Speech without the Text
SPEAKER: Prof. Alan W Black (Carnegie Mellon University)
CHAIR: Prof. Jianhua Tao
TIME: July 23 (Wednesday), 2014, 16:00
VENUE: No.1 Conference Room (3rd floor), Intelligence Building
The quality of data-driven text to speech has moved from being just about understandable to high quality, high understandability and near-natural quality for the world's major languages. However, there are still issues in building speech technologies for languages beyond the top 10 languages of the world. Once we go beyond even the top 100, many languages have ill-defined written forms, and some have no standard writing system at all. Although some of the speakers of those languages may be literate in other languages, if we are to provide speech systems to everyone on the planet, we need to address speech processing in environments where written forms do not exist.
This talk will describe initial steps in building text to speech systems for languages where no written form exists, or where only a poor standard exists. We show how new symbolic representations can be derived from acoustics alone, using current and novel statistical modeling techniques. The results are sufficient to build understandable synthesizers, and we show how that representation may be used in practical speech technologies.
Alan W Black is a Professor in the Language Technologies Institute at Carnegie Mellon University. He was born in Edinburgh, Scotland, and did his bachelor's degree in Coventry, England, and his master's and doctorate at the University of Edinburgh. Before joining the faculty at CMU in 1999, he worked in the Centre for Speech Technology Research at the University of Edinburgh, and before that at ATR in Japan. He is one of the principal authors of the free software Festival Speech Synthesis System, the FestVox voice building tools and CMU Flite, a small-footprint speech synthesis engine that is the basis for many research and commercial systems around the world. He also works on spoken dialog systems, the Let's Go Bus Information project and mobile speech-to-speech translation systems. Prof. Black is an elected member of the ISCA board (2007-2015). He has over 200 refereed publications and is one of the most highly cited authors in his field.