News
- [8/13/2024]: CoVoSwitch was selected as a Spotlight Paper at the ACL 2024 Student Research Workshop. I gave a poster presentation on August 11th and an oral presentation on August 12th!
- [7/9/2024]: My first paper was accepted to the Student Research Workshop at ACL 2024. I was also offered a travel grant to attend ACL in person. Beyond excited to be in Bangkok, Thailand!
- [6/26/2024]: Four other team members (from the U.S., China, India, and Vietnam) and I at MBZUAI UGRIP received the best team award from Professor Timothy Baldwin, Provost of MBZUAI. We ranked first out of 9 ML/CV/NLP teams for the research content of our project on multilingual, multitask statement tuning of encoder models.
- [6/6/2024]: I was accepted to Interspeech YFRSW 2024 and offered a scholarship to attend and present my speech processing research at Interspeech 2024 in Kos, Greece.
Publications
2024
- CoVoSwitch: Machine Translation of Synthetic Code-Switched Text Based on Intonation Units
Yeeun Kang.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 2024
Paper and Code
Selected Projects
- Multilingual, Multitask Statement Tuning for Encoder Models
May 2024 - Jun 2024 (@MBZUAI)
Created multilingual NLU datasets and made them available through HuggingFace. Evaluated zero-shot performance of encoder models on various NLU tasks.
Topics: encoder models, statement tuning, NLU
[HuggingFace Org Repo], [Final Presentation Slides], and [Blog Post]
- Code-Switched Text Dataset Creation by Intonation Unit Detection and Replacement
Apr 2024 - Jun 2024 (Independent Project)
Work accepted at ACL-SRW 2024.
Detected intonation unit boundaries of utterances in CoVoST 2 (a speech-to-text translation dataset) with PSST, a pre-trained speech segmentation model based on Whisper (STT), and used those prosodic boundaries to create code-switched text. Evaluated the performance of current SOTA NMT models on 13 languages, including low-resource languages such as Welsh, Mongolian, and Tamil. I named the resulting synthetic dataset CoVoSwitch.
Topics: prosodic speech segmentation, speech recognition (STT), neural machine translation (NMT), code-switching
[Paper], [Code], and [CoVoSwitch on HuggingFace Datasets]
- Undisclosed Project using LLMs
Jan 2024 - Apr 2024 (@NAVER Cloud)
Intern project at NAVER Cloud.
Paper TBD.
- Fine-tuning Whisper for Speech Recognition and Transcription
Aug 2023 - Dec 2023 (@Yale University)
Course final project for Yale’s CPSC 488/588: AI Foundation Models. Took the course alongside MS and PhD students and received full marks.
[Code]
- Refining Custom Voice Metric
Jun 2023 - Aug 2023 (@Samsung Electronics)
Intern project at Samsung Electronics.
Topics: speech synthesis (TTS), mean opinion score (MOS), custom voice on Bixby
[Blog Post]
Teaching
I was involved in different computer science education initiatives as an undergrad at Yale.
I was a TA (Teaching Assistant) for the following courses:
- CPSC 223: Data Structures and Programming Techniques (C, C++) [Jan 2023 - Dec 2023]
- CS50: Introduction to Computing and Programming (C, Python, SQL, JavaScript) [Aug 2022 - Dec 2023]
and a mentor for:
- Code Haven [Sep 2021 - May 2022]
  - Taught middle school students in New Haven, Connecticut how to code in Scratch.
Other Fun Things
I participated in HackMIT 2022. Three teammates I met on site (2 others from the US and 1 from Canada) and I were named finalists at the hackathon, held in Cambridge, Massachusetts.
I was also at the CS50 Hackathon at Harvard in Fall 2022, where I pulled an all-nighter(!!) helping Harvard and Yale students with their creative projects.
Last updated
I last updated this page on Aug 15, 2024.