
About Prof. Ming Li, Ph.D.

Ming Li received his B.S. degree in communication engineering from Nanjing University, China, in 2005 and his M.S. degree in signal processing from the Institute of Acoustics, Chinese Academy of Sciences, in 2008. He joined the Signal Analysis and Interpretation Laboratory (SAIL) at the University of Southern California (USC) on a Provost fellowship in 2008 and received his Ph.D. in Electrical Engineering in May 2013. He is currently an associate professor of Electrical and Computer Engineering at Duke Kunshan University and a research scholar in the ECE department at Duke University. His research interests are in the areas of speech processing and multimodal behavior signal analysis, with applications to human-centered behavioral informatics, notably in health, education and security. He has published more than 80 papers and has served as a scientific committee member and reviewer for multiple conferences and journals. He served as the area chair for speaker and language recognition at Interspeech 2016 and Interspeech 2018. Works co-authored with his colleagues have won awards at the Body Computing Slam Contest 2009, IEEE DCOSS 2009, the Interspeech 2011 Speaker State Challenge and the Interspeech 2012 Speaker Trait Challenge, as well as the ISCSLP 2014 best paper award. He received the IBM Faculty Award in 2016 and the ISCA Computer Speech and Language best journal paper award in 2018.

Bio of Prof. Ming Li

Ming Li received his Ph.D. in Electrical Engineering from the University of Southern California in 2013. He is currently a Professor of Electrical and Computer Engineering in the Division of Natural and Applied Sciences and a Principal Research Scientist at the Digital Innovation Research Center at Duke Kunshan University. He is also an Adjunct Professor at the School of Computer Science of Wuhan University. His research interests are in the areas of audio, speech and language processing, as well as multimodal behavior signal analysis and interpretation. He has published more than 200 papers and has served as a member of the IEEE Speech and Language Processing Technical Committee, the CCF Speech, Dialogue and Auditory Processing Technical Committee, the CAAI Affective Intelligence Technical Committee and the APSIPA Speech and Language Processing Technical Committee. He was an area chair at Interspeech 2016, Interspeech 2018, Interspeech 2020, SLT 2022, Interspeech 2024, Interspeech 2025 and ASRU 2025, and a technical program co-chair of Odyssey 2022 and ASRU 2023. He is an editorial board member of IEEE Transactions on Audio, Speech and Language Processing, Computer Speech and Language, and APSIPA Transactions on Signal and Information Processing. Works co-authored with his colleagues have won first prize awards at the Interspeech Computational Paralinguistics Challenges in 2011, 2012 and 2019, the ASRU 2019 MGB-5 ADI Challenge, the Interspeech 2020 and 2021 Fearless Steps Challenges, the VoxSRC 2021, 2022 and 2023 Challenges, the ICASSP 2022 M2MeT Challenge, the IJCAI 2023 ADD Challenge, the ICME 2024 ChatCLR Challenge and the Interspeech 2024 AVSE Challenge. As a co-author, he won the best paper award at DCOSS 2009 and ISCSLP 2014, and was shortlisted for the best paper award at Interspeech 2024. He received the IBM Faculty Award in 2016, the ISCA Computer Speech and Language five-year best journal paper award in 2018 and the youth achievement award for outstanding scientific research achievements in Chinese higher education in 2020. He is a senior member of IEEE.

Welcome to the Speech and Multimodal Intelligent Information Processing (SMIIP) lab at Duke Kunshan University

Our research interests lie in the areas of intelligent speech processing as well as multimodal behavior signal analysis and interpretation.

  1. Intelligent speech processing: speaker verification, speaker diarization, paralinguistic state detection, anti-spoofing countermeasures, speech synthesis, voice conversion, keyword spotting, speech separation, spoken language identification, singing and music signal processing, etc.
  2. Multimodal behavior signal analysis and interpretation: gathering, analyzing, modeling and interpreting multimodal human behavior signals (e.g., speech/language/audio/visual/physiological signal analysis and understanding) for the assisted diagnosis and treatment of autism spectrum disorders, etc.
  3. Pathological speech processing: laryngoscopic audio-visual signal processing, electrolaryngeal voice conversion, etc.