About me

I was born in Athens, Greece. I received the ECE Diploma from the National Technical University of Athens (NTUA) in 1990. I then spent 12 years on the East Coast of the United States, studying and doing research. I received the M.Sc. and Ph.D. degrees from Harvard University in 1991 and 1995, respectively. My thesis work focused on analyzing and modeling the non-linear interaction between the source and the vocal tract during speech production. The work relied heavily on the AM-FM speech model proposed by my advisor Petros Maragos with colleagues Jim Kaiser and Tom Quatieri.

I then went on to work for Bell Labs and AT&T Shannon Labs. My research there focused on robust speech recognition, in collaboration with my mentor Rick Rose; on children's speech analysis, recognition and interaction, with my long-time collaborator and friend Shri Narayanan; and on spoken dialogue interaction, with the DARPA Communicator team at Bell Labs, most notably Eric Fosler-Lussier, Jeff Kuo and Egbert Ammicht. While at Bell Labs I embarked on a part-time M.B.A. at the Stern School of Business, NYU.

among world-class colleagues including Vas Digalakis. My work there focused on robust speech recognition, multimedia processing, child-computer interaction, spoken dialogue systems and, more recently, lexical semantics. In 2013, I went back to my old stomping grounds at NTUA, now as an academic, working with world-class colleagues and students on child-robot interaction, emotion recognition, language development and deep learning.

In 2016, together with Shri Narayanan and Prem Natarajan, I founded Behavioral Signals, an emotion AI deep-tech startup, where I served as both CEO and CTO. In 2021, I joined Alexa as an Amazon Scholar working on natural multiparty dialogue, joining forces with excellent colleagues and good friends.

Where I work

What I do

My current research interests include speech processing, analysis, synthesis and recognition; dialogue and multimodal systems; lexical semantics; natural language understanding; general artificial intelligence; behavioral informatics; affective analysis, modeling and recognition; machine learning and representation learning; cognitive semantics; nonlinear signal processing; and multimodal child-computer interaction. I am especially interested in how cognitive semantic representations can motivate us to create computational models that are robust, accurate and rapid learners of multimodal information, with application to deep learning.

I have authored or co-authored over 200 papers in professional journals and conferences (citations: 8006, h-index: 46; Google Scholar, Jan. 2023). I was a co-author of the paper "Creating conversational interfaces for children", which received a 2005 IEEE Signal Processing Society Best Paper Award, and the co-editor of the book "Multimodal Processing and Interaction: Audio, Video, Text", Springer, 2008. A list of my patents can be found here. I have been a member of the IEEE Signal Processing Society since 1992 and an IEEE Fellow since 2016. I have served three terms on the IEEE Speech and Language Technical Committee and one term on the IEEE Multimedia Signal Processing Committee.

Contact me