2017 Speech Processing Courses in Crete
Towards Intelligible and Conversational Speech Synthesis Engines

24-28 July 2017    University of Crete, Heraklion, Crete, Greece


Alex Sorin received his M.Sc. degree in Applied Mathematics from the Automation and Computers Department of Moscow Oil and Gas Institute, USSR, in 1979. From 1979 to 1987 he worked as a researcher at the Center for Geophysical Data Processing, Moscow. Since 1988 he has worked at IBM Haifa Research Lab on numerous research and product development projects in the areas of speech and image processing. He is the author of numerous scientific publications and patented inventions. His research interests include vocoding techniques, speech synthesis, voice transformation, speech emotion recognition, and voice-based detection of cognitive deficit.

Spyros Raptis received his diploma in Electrical and Computer Engineering (1994) and his doctoral degree in Computational Intelligence (2001) from the National Technical University of Athens, Greece. Since 2006 he has been a member of the Institute for Language and Speech Processing (ILSP/R.C. "Athena"). He is now a Research Director at the Speech and Music Technology Department and coordinates the research and development activities of the Speech Synthesis Group. He has been actively involved in a number of national and European R&D projects, undertaking research, development and administrative roles. Dr. Raptis was part of the core development team for the "Ekfonitis" system, one of the first text-to-speech synthesisers for the Greek language, and for its follow-up versions. He is the author or co-author of more than 40 publications in scientific books, journals and international conferences in the areas of speech processing, computational intelligence, and music technology.

Raul Fernandez does research on text-to-speech systems in the speech group at the IBM T. J. Watson Research Center, which he joined after receiving his PhD at MIT. Formerly he was a Research Assistant at the MIT Media Lab, working on computational models for the automatic recognition of affect from spoken language. He has been actively involved in the development of various TTS systems that have achieved leading positions in international evaluations (e.g., the TC-STAR speech-to-speech evaluations and several editions of the Blizzard Challenge), and was also one of the leads for the text-to-speech component of Watson, the Deep-Question-Answering system developed at IBM to play Jeopardy. He currently works on improving the prosodic expressiveness of synthesis systems, with a focus on deep learning, dynamic models, and exemplar-based techniques.

Richard Sproat received his Ph.D. in Linguistics from the Massachusetts Institute of Technology in 1985. He worked at AT&T Bell Labs, at Lucent's Bell Labs and at AT&T Labs Research before joining the faculty of the University of Illinois. From there he moved to the Center for Spoken Language Understanding at the Oregon Health & Science University. In the fall of 2012 he moved to Google, New York, as a Research Scientist. Sproat has worked in numerous areas relating to language and computational linguistics, including syntax, morphology, computational morphology, articulatory and acoustic phonetics, text processing, text-to-speech synthesis, and text-to-scene conversion. Some of his recent work includes multilingual named entity transliteration, the effects of script layout on readers' phonological awareness, and tools for automated assessment of child language. At Google he works on multilingual text normalization, most recently using neural methods. He also has a long-standing interest in writing systems and symbol systems more generally.

Simon King is a Full Professor of Speech Processing with 23 years of research experience in speech technology. He has supervised 14 PhDs and examined 20 doctorates. He has co-ordinated two FP7 projects and several national projects, participated in many more, and organises the Blizzard Challenge speech synthesis evaluations. He is a Fellow of the IEEE, has served on the IEEE Spoken Language Technical Committee, has served as an editor for IEEE Trans. Audio, Speech & Language Proc., and is on the board of Computer Speech & Language.

Yannis Stylianou is Professor of Speech Processing at the Department of Computer Science, University of Crete (CSD UOC), Greece, and since 2013 he has also been the Group Leader of the Speech Technology Group at Toshiba Cambridge Research Lab, UK. From 1996 until 2001 he was with AT&T Labs Research (Murray Hill and Florham Park, NJ, USA) as a Senior Technical Staff Member. In 2001 he joined Bell-Labs Lucent Technologies in Murray Hill, NJ, USA (now Alcatel-Lucent). He holds an MSc and a PhD in Signal Processing from ENST-Paris, and he studied Electrical Engineering at NTUA, Athens, Greece (1991). He is an IEEE Fellow.

Vassilis Tsiaras received his degree in Mathematics from the University of Thessaloniki in 1990, his M.Sc. in Mathematics from QMW College, University of London, in 1992, and his Ph.D. in Computer Science from the University of Crete in 2009. His research interests include graph algorithms, biomedical signal processing, and statistical speech synthesis.