Jie Cao

Assistant Professor
School of Computer Science
University of Oklahoma
110 W. Boyd St., DEH 205, Norman, OK 73019
🎓 Prospective Students: I'm recruiting 1-2 Ph.D. students for Fall 2025!

Jie Cao is an Assistant Professor in the School of Computer Science at the University of Oklahoma. Before joining OU, he spent two years as a post-doctoral researcher at the NSF AI Institute for Student-AI Teaming (iSAT) at the University of Colorado Boulder, where he mainly worked with Dr. James Martin and Dr. Martha Palmer. He obtained his Ph.D. from the Kahlert School of Computing at the University of Utah, where he worked with Dr. Vivek Srikumar. Earlier in his academic journey, he completed his M.S. and B.S. in Computer Science at Huazhong University of Science and Technology (HUST) in China. He has also worked or interned at companies including Alibaba, Baidu, Sohu, WeChat (Palo Alto), and Amazon.

Research Interests

I work on Natural Language Processing and Machine Learning. Current interests include:

  • Multi-party Multi-modal Dialogue and its applications to Mental Health, Education, etc.
  • Efficient Structured Prediction and Symbolic Methods for Controlling and Augmenting Neural Networks
  • Robust Deployment and Evaluation of Trustworthy AI

News

  • 11/2024: Our paper “Enhancing Talk Moves Analysis in Mathematics Tutoring through Classroom Teaching Discourse” got accepted to COLING’2025.
  • 11/2024: Our paper “‘Understanding Robustness Lottery’: A Geometric Visual Comparative Analysis of Neural Network Pruning Approaches” got accepted to TVCG.
  • 09/2024: Talk with students on “History of NLP” at the OU AI/ML Club.
  • 07/2024: Our paper on dialogue classification via LLM finetuning got accepted to L@S’24.
  • 02/2024: Invited Talk on “Modularized Conversational Modeling” at Emory University, Northern Illinois University, Georgia State University, and the University of Oklahoma.
  • 11/2023: In Fall 2023, I taught the NLP class (CSCI-LING 5832) with James Martin. I created new course materials on LLMs, In-Context Learning, Dialogue Generation, etc.
  • 05/2023: Our paper on Question Generation got accepted to BEA’23.
  • 05/2023: A short paper on “Mind the Gap between the Application Track and the Real World” got accepted to ACL’23.
  • 04/2023: Our paper on “A Comparative Analysis of Automatic Speech Recognition Errors in Small Group Classroom Discourse” got accepted to UMAP’23.
  • 03/2023: My research on conversational simulation of small-group discussion was awarded an iSAT Trainee Grant.
  • 02/2023: Our paper on an AI agent for Jigsaw Classrooms got accepted to AIAIC’23.
  • 12/2022: Our paper on Dependency Dialog Act got accepted to IWSDS’23.
  • 12/2022: Invited Talk on Database Workload Characterization work at Microsoft’s Gray Systems Lab. Slides.
  • 08/2022: I joined the NSF AI Institute for Student-AI Teaming (iSAT) as a post-doctoral researcher.
  • 06/2022: New preprint on visual analysis of neural network pruning.

Selected Publications

(See the full list on the Publication Page or Google Scholar)

  • Jie Cao, Abhijit Suresh, Jennifer Jacobs, Charis Clevenger, Amanda Howard, Chelsea Brown, Brent Milne, Tom Fischaber, Tamara Sumner, and James H. Martin. 2025. Enhancing Talk Moves Analysis in Mathematics Tutoring through Classroom Teaching Discourse. In The 31st International Conference on Computational Linguistics (COLING 2025).    BibTeX
  • Baptiste Moreau-Pernet, Yu Tian, Sandra Sawaya, Peter Foltz, Jie Cao, Brent Milne, and Thomas Christie. 2024. Classifying Tutor Discursive Moves at Scale in Mathematics Classrooms with Large Language Models. In Proceedings of the Eleventh ACM Conference on Learning @ Scale, pages 361–365. Association for Computing Machinery.    BibTeX |  PDF  |  URL
  • Zhimin Li, Shusen Liu, Xin Yu, Kailkhura Bhavya, Jie Cao, Diffenderfer James Daniel, Peer-Timo Bremer, and Valerio Pascucci. 2024. “Understanding Robustness Lottery”: A Geometric Visual Comparative Analysis of Neural Network Pruning Approaches. IEEE Transactions on Visualization and Computer Graphics.    BibTeX
  • Jie Cao, Ananya Ganesh, Jon Cai, Rosy Southwell, Margaret Perkoff, Michael Regan, Katharina Kann, James Martin, Martha Palmer, and Sidney D’Mello. 2023. A Comparative Analysis of Automatic Speech Recognition Errors in Small Group Classroom Discourse. Proceedings of the 31st ACM Conference on User Modeling Adaptation and Personalization (ACM UMAP 2023).    BibTeX |  PDF
  • Jon Cai, Brendan D. King, Margaret Perkoff, Shiran Dudy, Jie Cao, Marie Grace, Natalia Wojarnik, Ganesh Ananya, James Martin, Martha Palmer, Marilyn Walker, and Jeffrey Flanigan. 2022. Dependency Dialogue Acts — Annotation Scheme and Case Study. The 13th International Workshop on Spoken Dialogue Systems Technology.    BibTeX |  PDF
  • Jie Cao. 2022. Inductive Biases for Deep Linguistic Structured Prediction with Independent Factorization. Available from ProQuest Dissertations & Theses A&I; ProQuest Dissertations & Theses Global. (2777357718).    BibTeX |  PDF
  • Debjyoti Paul*, Jie Cao*, Feifei Li, and Vivek Srikumar. 2021. Database Workload Characterization with Query Plan Encoders. Proceedings of the VLDB Endowment, 15(4):923–935.    BibTeX |  PDF
  • Jie Cao and Yi Zhang. 2021. A Comparative Study on Schema-Guided Dialogue State Tracking. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 782–796.    BibTeX |  PDF  |  Poster
  • Zhiqiang Liu, Zuohui Fu, Jie Cao, Gerard de Melo, Yik-Cheung Tam, Cheng Niu, and Jie Zhou. 2019. Rhetorically Controlled Encoder-Decoder for Modern Chinese Poetry Generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.    BibTeX |  PDF
  • Jie Cao, Yi Zhang, Adel Youssef, and Vivek Srikumar. 2019. Amazon at MRP 2019: Parsing Meaning Representations with Lexical and Phrasal Anchoring. In Proceedings of the Shared Task on Cross-Framework Meaning Representation Parsing at the Conference on Natural Language Learning (CoNLL), pages 138–148.    BibTeX |  PDF
  • Jie Cao, Michael Tanana, Zac Imel, Eric Poitras, David Atkins, and Vivek Srikumar. 2019. Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.    BibTeX |  PDF  |  Slides

Research Experience

  • [09/2022 - 08/2024] Postdoctoral Research Associate at NSF AI Institute for Student-AI Teaming (iSAT), CU Boulder
  • [08/2015 - 08/2022] Research Assistant at Utah NLP Lab, University of Utah, Salt Lake City
  • [06/2020 - 12/2020] Applied Scientist II Intern at AWS AI, Amazon Lex, Remote
    • Our paper on schema-guided dialogue got accepted to NAACL 2021.
  • [06/2019 - 09/2019] Applied Scientist Intern at AWS AI, Amazon Lex, Seattle
    • In the CoNLL 2019 shared task on MRP, among 16 teams, our cross-framework meaning representation parsing system ranked 1st in the AMR parsing task, 5th in UCCA, and 6th and 7th in the PSD and DM tasks. Spotlight Talk
  • [05/2018 - 08/2018] Research Intern at Tencent, WeChat AI, Palo Alto
    • Our dialogue system based on a Gated Attentive Memory Network ranked in the top 2 in DSTC7 and got accepted to the AAAI 2019 DSTC7 workshop.
  • [09/2008 - 03/2012] Research Assistant at CGCL Lab, Huazhong University of Science and Technology, Wuhan
    • I worked closely with Prof. Xia Xie and Prof. Hai Jin. My research centered on Xen and Xen-ARM virtualization and on distributed computing. We studied equipping the R language with JVM-based large-scale distributed statistical infrastructure, such as Hadoop and Spark.

Academic Service