In the just-concluded Blizzard Challenge 2014, an international speech synthesis evaluation event, the system submitted by the USTC National Engineering Laboratory for Speech and Language Information Processing (NEL-SLIP) ranked first on 11 of the 25 test items, the best overall performance among all submitted systems. This is the ninth consecutive top result for the lab since 2006, and it demonstrates the lab's solid leading position in the international speech synthesis field.
As the largest and most influential speech synthesis evaluation event, the Blizzard Challenge has in previous years attracted many world-class research institutes and companies, such as Carnegie Mellon University, the University of Edinburgh, Nagoya Institute of Technology, Advanced Telecommunications Research Institute International (ATR), IBM Research Laboratory and Microsoft Research Asia. For the first time, this year's competition changed the main test language from English to non-English languages. The submitted systems were required to cover six Indian languages, including Hindi and Assamese. In addition, new test items were added to evaluate the synthesis of multilingual sentences that mix Indian-language and English words, which further increased the difficulty of system construction.
Led by Associate Professor Zhen-Hua Ling, the NEL-SLIP research team overcame a tight preparation schedule, multiple unfamiliar languages, and limited language resources and expert experience. By adopting the frameworks of statistical acoustic model based unit selection and parametric speech synthesis, the team completed the construction of all speech synthesis systems within two months. Furthermore, a novel speech synthesis method using deep neural networks was proposed and implemented, which further improved the quality of the synthetic speech. The organizers ultimately selected five of the Indian languages for centrally organized subjective listening tests. The test items for each language were similarity, naturalness, intelligibility, similarity for multilingual sentences, and naturalness for multilingual sentences. The system submitted by NEL-SLIP ranked first on 11 of these evaluation items, giving it the best overall performance among all submitted systems.
NEL-SLIP was approved for construction by the National Development and Reform Commission in June 2011. It is a research institute co-founded by USTC and iFLYTEK Co., Ltd., and its predecessor was the USTC iFLYTEK Speech Laboratory. NEL-SLIP combines strengths in supporting research and leading industrialization. It is the only national-level research and development platform for the speech industry in China.
(LING Zhenhua, School of Information Sciences)