Chung-Ming Chien (簡仲明)

Chicago, Illinois, United States

Santorini, Greece, May 30, 2023

I am a 3rd-year Ph.D. student at Toyota Technological Institute at Chicago (TTIC), where I am fortunate to work with Karen Livescu. My research interests span speech and natural language processing. Here are some topics I have been focusing on recently:

  • Speech Language Models
    How can speech applications leverage the knowledge that pre-trained large language models (LLMs) learn from text? How can LLMs be fine-tuned to take speech as input and output, enabling general-purpose conversational speech AI?
  • Speech Generation
    Control and model non-lexical information in generated speech in a more efficient and intuitive way.
  • Self-Supervised Speech Representations
    Analyze the information encoded in self-supervised speech representations and explore various applications for the learned representations and units.

Prior to joining TTIC, I earned my Master’s degree in Computer Science from National Taiwan University (NTU), where I had the privilege of working with Lin-shan Lee and Hung-yi Lee at the Speech Processing Lab. Outside of school, I also gained valuable experience through summer internships with Amazon Alexa TTS Research, FAIR (AI at Meta), and NVIDIA.

Beyond my academic pursuits, I am a sports enthusiast and amateur athlete. I captained NTU's varsity baseball team during my undergraduate years, and I am also broadly interested in tennis, hiking, scuba diving, swimming, badminton, and training. In 2022, I achieved a personal milestone by completing my first marathon, and I have been working to improve my personal best with the goal of breaking the 3:10 mark!

news

Jun 4, 2024 “Learning Fine‑Grained Controllability on Speech Generation via Efficient Fine‑Tuning” is accepted to Interspeech 2024!
May 16, 2024 “On the Evaluation of Speech Foundation Models for Spoken Language Understanding” is accepted to Findings of ACL 2024!
Apr 16, 2024 I gave a talk at Midwest Speech and Language Days in Ann Arbor, Michigan :microphone:
Apr 9, 2024 I passed TTIC's qualifying exam and will soon become a Ph.D. candidate :mortar_board:
Jan 23, 2024 “What Do Self‑Supervised Speech Models Know about Words” is accepted to TACL 2024!
Jan 18, 2024 I will join the NVIDIA NeMo team for my 2024 summer internship, working on speech language models!
Jan 13, 2024 My open-source FastSpeech 2 project has surpassed 1.5k stars on GitHub :sparkles:
Dec 20, 2023 I share the honor of the ASRU 2023 Best Student Paper Award with Mingjiamei, Ju-Chieh, and Karen. Check out our work “Few-shot SLU via Joint Speech-Text Models” for more details :trophy:
Oct 7, 2023 “Toward Joint Language Modeling for Speech Units and Text” is accepted to Findings of EMNLP 2023!
Sep 22, 2023 Our work “Few-shot SLU via Joint Speech-Text Models” is accepted to ASRU 2023, and I'll definitely go back to Taiwan to present it in person!

selected publications

  1. Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning
    Chung-Ming Chien, Andros Tjandra, Apoorv Vyas, and 3 more authors
    In Interspeech 2024
  2. Few-Shot Spoken Language Understanding via Joint Speech-Text Models
    Chung-Ming Chien, Mingjiamei Zhang, Ju-Chieh Chou, and 1 more author
    Best Student Paper Award
    In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
  3. FragmentVC: Any-To-Any Voice Conversion by End-To-End Extracting and Fusing Fine-Grained Voice Fragments with Attention
    Chung-Ming Chien*, Yist Y. Lin*, Jheng-Hao Lin, and 2 more authors
    *equal contribution
    In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)