I am a researcher at Guangdong University of Technology, working in the Data Mining & Information Retrieval Laboratory under the supervision of Prof. Ruichu Cai. My research focuses on Text-to-SQL, Large Language Models and Sentiment Analysis. My work aims to bridge the gap between natural language processing and database management, enhancing human-computer interaction.

My research interests include:

  • Text-to-SQL: Developing models that translate natural language queries into SQL statements, facilitating intuitive database interactions.
  • Large Language Models: Exploring the capabilities and applications of large-scale language models in various NLP tasks.
  • Sentiment Analysis: Analyzing and interpreting human emotions and opinions through computational methods.

I am always keen to collaborate with motivated students and industry partners on Text-to-SQL agents, causal LLMs, and aspect-based sentiment analysis (ABSA). To get in touch, email me your CV and a short statement of interest.

🔥 News

  • 2025.01:  🎉🎉 3 papers under my supervision have been accepted by NAACL 2025.
  • 2025.01:  🎉🎉 Chat2DB has been accepted by the ICDE 2025 Demo track.
  • 2024.11:  🎉🎉 2 papers under my supervision have been accepted by COLING 2025.
  • 2024.07:  🎉🎉 I am the problem setter for the 2024 Third International Algorithm Case Competition (IACC), hosted by Pazhou Lab. The challenge I designed focuses on “Generating Database Query Commands Based on Large Language Models”. The competition is currently underway with a total prize pool of 500,000 RMB.
  • 2024.05:  🎉🎉 The paper I supervised, “S^2GSL: Incorporating Segment to Syntactic Enhanced Graph Structure Learning for Aspect-based Sentiment Analysis”, has been accepted by ACL 2024 Main.
  • 2023.04:  🎉🎉 My leader launched Chat2DB, a conversational AI product designed to access private databases or tabular data. With Chat2DB, users don’t need to learn technical principles or use specialized tools. By simply uploading their data or database and describing their requirements in the chatbox, they can receive results within seconds.

📝 Publications

†Corresponding Author, *Equal Contribution

NAACL 2025(Main)

Handling Missing Entities in Zero-Shot Named Entity Recognition: Integrated Recall and Retrieval Augmentation

Ruichu Cai, Junhao Lu, Zhongjie Chen, Boyan Xu†, Zhifeng Hao

  • We introduce IRRA, a novel two-stage framework that first uses recall-augmented entity extraction on perturbed data to boost recall and then applies retrieval-augmented generation for type correction, achieving state-of-the-art zero-shot cross-domain NER performance.
NAACL 2025(Main)

Track-SQL: Enhancing Generative Language Models with Dual-Extractive Modules for Schema and Context Tracking in Multi-turn Text-to-SQL

Bingfeng Chen, Shaobin Shi, Yongqi Luo, Boyan Xu†, Ruichu Cai, Zhifeng Hao

  • We propose Track-SQL, a dual-extractive framework that augments generative LMs with a semantic-enhanced schema extractor and a schema-aware context extractor to track schema and context changes in multi-turn Text-to-SQL, achieving state-of-the-art results on SparC and CoSQL with execution accuracy gains of 7.1% and 9.55%.
NAACL 2025(Findings)

S^2IT: Stepwise Syntax Integration Tuning for Large Language Models in Aspect Sentiment Quad Prediction

Bingfeng Chen, Chenjie Qiu, Yifeng Xie, Boyan Xu†, Ruichu Cai, Zhifeng Hao

  • We propose S^2IT, a novel Stepwise Syntax Integration Tuning framework that progressively incorporates global and local syntactic structure knowledge into LLMs to significantly enhance performance on Aspect Sentiment Quad Prediction.
COLING 2025(Oral)

Dr.ECI: Infusing Large Language Models with Causal Knowledge for Decomposed Reasoning in Event Causality Identification

Ruichu Cai, Shengyin Yu, Jiahao Zhang, Wei Chen, Boyan Xu†, Keli Zhang

  • We propose Dr.ECI, a multi-agent decomposed reasoning framework for Event Causality Identification that employs specialized discovery agents (Causal Explorer, Mediator Detector) and reasoning agents (Direct and Indirect Reasoners) to capture implicit, indirect, and generalized causal structures, achieving state-of-the-art performance.
COLING 2025(Oral)

CACA: Context-Aware Cross-Attention Network for Extractive Aspect Sentiment Quad Prediction

Bingfeng Chen, Haoran Xu, Yongqi Luo, Boyan Xu†, Ruichu Cai, Zhifeng Hao

  • We propose CACA, an extractive ASQP framework that combines a Context-Aware Cross-Attention Network, which alternately updates explicit and implicit representations, with contrastive learning that aligns aspects and opinions for implicit term prediction, achieving superior results on three benchmarks.
ACL 2024(Main)

S^2GSL: Incorporating Segment to Syntactic Enhanced Graph Structure Learning for Aspect-based Sentiment Analysis

Bingfeng Chen, Qihan Ouyang, Yongqi Luo, Boyan Xu†, Ruichu Cai, Zhifeng Hao

  • We introduce S^2GSL, a novel graph-structure learning framework for ABSA that combines segment-aware semantic graph learning with syntax-based latent graph learning and uses a self-adaptive aggregation network to filter irrelevant contexts and fuse diverse structures, achieving superior results on four benchmarks.