About me

I'm Feng, a Software Engineer at AI Rudder, a startup in San Mateo, California. At AI Rudder, we combine LLMs (Large Language Models) with Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) technologies to build voice bots for phone calls.

I received my master's degree in Computer Science from the University of Illinois Urbana-Champaign in Winter 2022. Before pursuing my master's degree, I was a Research Intern at Westlake University, advised by Prof. Zhenzhong Lan in the Deep Learning Lab. Prior to that, I received my bachelor's degree in Computer Science from The Ohio State University in 2020.

My research interests are rooted in unraveling the fundamental principles behind human language and communication, and I am driven by the challenge of translating those insights into practical algorithms that solve real-world problems. This pursuit fuels my enthusiasm for life.

Publications

  • Toward the Limitation of Code-Switching in Cross-Lingual Transfer

    Y. Feng, F. Li, P. Koehn. EMNLP 2022.

    Multilingual pretrained models have shown strong cross-lingual transfer ability. Some works use code-switched sentences, which consist of tokens from multiple languages, to further enhance cross-lingual representations, and have shown success on many zero-shot cross-lingual tasks. However, code-switched tokens are likely to cause grammatical incoherence in the substituted sentences and hurt performance on token-sensitive tasks such as Part-of-Speech (POS) tagging and Named Entity Recognition (NER). This paper mitigates that limitation by not only replacing tokens but also considering the similarity between the context and the switched tokens, so that the substituted sentences remain grammatically consistent during both training and inference (a toy sketch of this idea appears after the publication list). We conduct experiments on cross-lingual POS and NER over 30+ languages, and demonstrate the effectiveness of our method by outperforming mBERT by 0.95 and the original code-switching method by 1.67 in F1 score.

    Paper Code

    December 2022

  • Learn To Remember: Transformer With Recurrent Memory For Document-Level Machine Translation

    Y. Feng, F. Li, Z. Song, B. Zheng, P. Koehn. NAACL 2022.

    The Transformer architecture has led to significant gains in machine translation. However, most studies focus on sentence-level translation without considering the context dependency within documents, leading to inadequate document-level coherence. Some recent research tried to mitigate this issue by introducing an additional context encoder, or by translating multiple sentences or even the entire document at once. Such methods may lose information on the target side or incur growing computational cost as documents get longer. To address these problems, we introduce a recurrent memory unit into the vanilla Transformer, which supports information exchange between the current sentence and the preceding context (a minimal sketch follows the publication list). The memory unit is recurrently updated by acquiring information from sentences and passing the aggregated knowledge back to subsequent sentence states. We follow a two-stage training strategy in which the model is first trained at the sentence level and then fine-tuned for document-level translation. We conduct experiments on three popular datasets for document-level machine translation, and our model achieves an average improvement of 0.91 s-BLEU over the sentence-level baseline. We also achieve state-of-the-art results on TED and News, outperforming previous work by 0.36 s-BLEU and 1.49 d-BLEU on average.

    Paper Code

    July 2022
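
To make the similarity-aware code-switching idea from the EMNLP paper concrete, here is a toy, self-contained sketch. It is not the paper's implementation: the bilingual dictionary, the 3-dimensional toy embeddings, the mean-pooled context vector, and the 0.5 threshold are all hypothetical choices for illustration; a real system would use a multilingual embedding space.

```python
# Toy sketch of similarity-aware code-switching (hypothetical, not the paper's code).
import numpy as np

def cosine(a, b):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def code_switch(tokens, bilingual_dict, embed, threshold=0.5):
    """Replace tokens with dictionary translations, keeping a replacement
    only if the switched token stays similar to the sentence context
    (a rough proxy for grammatical coherence)."""
    # Mean of the original token embeddings serves as the context vector.
    context = np.mean([embed[t] for t in tokens], axis=0)
    switched = []
    for tok in tokens:
        candidate = bilingual_dict.get(tok)
        if candidate is not None and cosine(embed[candidate], context) >= threshold:
            switched.append(candidate)  # coherent with context: accept the switch
        else:
            switched.append(tok)        # incoherent or no translation: keep source token
    return switched

# Tiny 3-d embeddings, purely for demonstration.
embed = {
    "the":    np.array([0.9, 0.1, 0.0]),
    "cat":    np.array([0.2, 0.8, 0.1]),
    "sleeps": np.array([0.1, 0.7, 0.3]),
    "gato":   np.array([0.2, 0.7, 0.2]),   # close to the context: accepted
    "duerme": np.array([-0.8, 0.1, 0.5]),  # far from the context: rejected
}
bilingual_dict = {"cat": "gato", "sleeps": "duerme"}
print(code_switch(["the", "cat", "sleeps"], bilingual_dict, embed))
# -> ['the', 'gato', 'sleeps']
```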
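
In the same spirit, here is a minimal sketch of a recurrent memory unit threaded across the sentences of a document, loosely following the description in the NAACL paper above. The module names, memory shape, attention-based read/write, and the GRU-based update are assumptions for illustration, not the paper's exact architecture.

```python
# Minimal sketch of a recurrent memory unit for document-level encoding
# (hypothetical module; shapes and update rule are illustrative assumptions).
import torch
import torch.nn as nn

class RecurrentMemory(nn.Module):
    def __init__(self, d_model=512, n_heads=8, mem_slots=4):
        super().__init__()
        # Learned initial memory of shape (mem_slots, d_model), shared across documents.
        self.init_mem = nn.Parameter(torch.randn(mem_slots, d_model) * 0.02)
        # Read: sentence states pull document context out of the memory.
        self.read = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Write: memory slots gather information from the current sentence ...
        self.write = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # ... and are updated recurrently with a gated cell.
        self.update = nn.GRUCell(d_model, d_model)

    def forward(self, sent_states, memory):
        """sent_states: (batch, seq_len, d_model); memory: (batch, slots, d_model).
        Returns context-enriched sentence states and the updated memory."""
        # Read: enrich each token state with context accumulated in memory.
        ctx, _ = self.read(sent_states, memory, memory)
        enriched = sent_states + ctx
        # Write: each memory slot attends over the sentence it just saw.
        gathered, _ = self.write(memory, sent_states, sent_states)
        b, s, d = memory.shape
        new_mem = self.update(gathered.reshape(b * s, d), memory.reshape(b * s, d))
        return enriched, new_mem.reshape(b, s, d)

# Usage: iterate over a document sentence by sentence, threading the memory so
# each sentence sees the aggregated knowledge of its predecessors.
mem_unit = RecurrentMemory()
memory = mem_unit.init_mem.unsqueeze(0).expand(2, -1, -1)  # batch of 2 documents
for sent in [torch.randn(2, 10, 512), torch.randn(2, 12, 512)]:
    sent, memory = mem_unit(sent, memory)  # per-sentence encoder states
```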

Resume

Education

  1. University of Illinois Urbana-Champaign

    08/2021 – 12/2022

    Master of Computer Science

  2. The Ohio State University

    08/2017 – 08/2020

    Bachelor of Science in Computer Science & Engineering

Research Experience

  1. Research Intern, Deep Learning Lab, Westlake University

    08/2020 – 04/2021

    Participated in developing an AI-powered psychology counseling system based on GPT, with a ranker for responses.

    Enhanced the chatbot's capabilities by incorporating structured counseling rules through the Rasa open-source framework. Statistical models in the dialogue system, together with the response ranker, were used to improve performance.

  2. Research Intern, Department of Linguistics, The Ohio State University

    02/2019 – 04/2019

    Developed a visualization tool for the Humanities Entity Recognizer (HER), an active-learning system that reduces the time and cost of NER annotation.

    By presenting the original HER through a web interface, the white-box HER solution becomes more accessible to linguistics experts without a technical background.

Work Experience

  1. Software Developer, AI Rudder Inc.

    02/2023 – Present

    Developing and maintaining a conversational voice AI system that integrates LLMs, TTS, and ASR.

    Developed an inbound-call monitoring system that lets clients run in-depth performance analysis over a rich set of metrics, review and download call details, and manage the knowledge base and callback API.

  2. Software Developer Intern, AI-Lab, ByteDance

    04/2021 – 07/2021

    Built an analysis tool to measure the performance of a computer vision SDK and guide its optimization.

    Evaluated cutting-edge computer graphics models, and designed and deployed recurring test sets on monitoring servers.

  3. Software Developer Intern, Bank of America Merrill Lynch

    06/2019 – 08/2019

    Maintained the communication center website to deliver instant, reliable service that better connects clients and investment advisors.
