Boxi Cao 曹博希

I am a Ph.D. candidate (since September 2019) in the Chinese Information Processing Laboratory at the Institute of Software, Chinese Academy of Sciences, under the supervision of Professor Xianpei Han and Professor Le Sun. I received my Bachelor's degree from Beijing University of Posts and Telecommunications in June 2019. My research interests include:

  • Natural Language Processing
  • Knowledge Lifecycle in Big Language Models
  • Information Extraction

Contact: boxi2020 AT iscas dot ac dot cn

Google Scholar    /    Semantic Scholar    /    ACL Anthology   /    GitHub    /    Blog    /    Zhihu   /    Douban

News
10/2023 One first-authored paper got accepted by EMNLP 2023 main conference.
08/2023 We presented a tutorial at CCKS 2023 about the life cycle of knowledge in big language models.
08/2023 Exceeded 100 citations on Google Scholar!
05/2023 One co-authored paper got accepted by ACL 2023 main conference.
12/2022 One first-authored survey paper got accepted by Machine Intelligence Research.
02/2022 One first-authored paper got accepted by ACL 2022 main conference.
02/2022 One co-authored paper got accepted by ACL 2022 main conference.
12/2021 Participated in building the benchmark CUGE as a core member.
12/2021 One paper got recommended by Michael Galkin on Towards Data Science.
05/2021 One first-authored paper got accepted by ACL 2021 main conference.
Publications
2023
The Life Cycle of Knowledge in Big Language Models: A Survey
Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun
Machine Intelligence Research (2023)
Tutorial Slides / Paperlist / Paper / Preprint
Does the Correctness of Factual Knowledge Matter for Factual Knowledge-Enhanced Pre-trained Language Models?
Boxi Cao*, Qiaoyu Tang*, Hongyu Lin, Xianpei Han, Le Sun
The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)
Paper
Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models
Boxi Cao, Qiaoyu Tang, Hongyu Lin, Xianpei Han, Jiawei Chen, Tianshu Wang, Le Sun
The 2024 International Conference on Computational Linguistics (COLING 2024)
Preprint
Learning In-context Learning for Named Entity Recognition
Jiawei Chen, Yaojie Lu, Hongyu Lin, Jie Lou, Wei Jia, Dai Dai, Hua Wu, Boxi Cao, Xianpei Han, Le Sun
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)
Preprint / Paper
ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases
Qiaoyu Tang, Ziliang Deng, Hongyu Lin, Xianpei Han, Qiao Liang, Boxi Cao, Le Sun
arXiv preprint arXiv:2306.05301 (2023)
Preprint
2022
Can Prompt Probe Pretrained Language Models? Understanding the Invisible Risks from a Causal View
Boxi Cao, Hongyu Lin, Xianpei Han, Fangchao Liu, Le Sun
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022)
Paper / Preprint / Code / Slides / Poster
Pre-training to Match for Unified Low-shot Relation Extraction
Fangchao Liu, Hongyu Lin, Xianpei Han, Boxi Cao, Le Sun
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022)
Paper / Preprint / Code
2021
Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases
Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun, Lingyong Yan, Meng Liao, Tong Xue, Jin Xu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL 2021)
Paper / Preprint / Code / Slides / Poster
CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark
Yuan Yao, Qingxiu Dong, Jian Guan, Boxi Cao, Zhengyan Zhang, Chaojun Xiao, Xiaozhi Wang, et al.
arXiv preprint arXiv:2112.13610 (2021)
Website / Preprint / GitHub / News
Academic Services
Reviewer/PC Member ACL 2023
EMNLP 2022, 2023
COLING 2022, 2024
Education
2019-Present Ph.D. in Computer Software and Theory (Candidate)
School of Computer Science and Technology, University of Chinese Academy of Sciences
2015-2019 B.Eng. in Computer Science and Technology
School of Computer Science, Beijing University of Posts and Telecommunications
2012-2015 Nanya Middle School of Changsha
Selected Honors and Awards
2022 Pacemaker to Merit Student, University of Chinese Academy of Sciences (Top 1%)
2021 Merit Student, University of Chinese Academy of Sciences
2019 Outstanding Graduate, Beijing Municipal Commission of Education
2018, 2016 Merit Student, Beijing University of Posts and Telecommunications
2017, 2016 First-class Scholarship, Beijing University of Posts and Telecommunications
2017 Outstanding Student Cadre, Beijing University of Posts and Telecommunications
2016 Merit Student, Beijing Municipal Commission of Education
Selected Competition Awards
2022 Third Prize, Language and Intelligence Challenge - Sentiment Interpretation Task (LIC 2022)
2017, 2016 Bronze Medal, China Collegiate Programming Contest (ACM-CCPC)
2017 Second Prize, Group Programming Ladder Tournament (CCCC-GPLT)
2017 Second Prize, China Collegiate Cloud Computing Application and Innovation Competition