Abstract

The natural language to SQL (NL2SQL) task enables non-expert users to query relational databases through natural language interfaces. However, NL2SQL frameworks often rely on Large Language Models (LLMs), which raises concerns about computational overhead, data privacy, and deployment in resource-limited environments. To address these issues, we propose a hybrid, schema-aware agentic system that uses Small Language Models (SLMs) as primary agents with a selective LLM fallback mechanism. The LLM is invoked only when errors are detected in SLM-generated queries, which reduces inference costs. On the BIRD development set, our system achieves an execution accuracy of 53.91% and a valid efficiency score (VES) of 50.46%. Although this accuracy is lower than that of state-of-the-art LLM-only systems such as MAC-SQL (59.59% execution accuracy), the hybrid approach reduces query processing cost by a factor of three relative to LLM-only frameworks. These findings show that the hybrid method offers a compelling trade-off: performance competitive with LLM-driven baselines at substantially lower cost.
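The selective fallback described in the abstract can be illustrated with a minimal sketch. This is not the system's actual implementation: it assumes that "error detection" means the SLM-generated query fails to execute against a SQLite database, and the names `execution_error`, `generate_sql`, `slm`, and `llm` are hypothetical placeholders for the two model calls.

```python
import sqlite3
from typing import Callable, Optional, Tuple

# Hypothetical model interfaces (assumptions, not the paper's API):
#   slm(question, schema) -> SQL string
#   llm(question, schema, failed_sql, error_message) -> SQL string
SmallModel = Callable[[str, str], str]
LargeModel = Callable[[str, str, str, str], str]


def execution_error(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return the database error message if the query fails, else None.

    Assumes conn is safe to run generated SQL against (e.g. a read-only copy).
    """
    try:
        conn.execute(sql).fetchall()
        return None
    except sqlite3.Error as exc:
        return str(exc)


def generate_sql(
    question: str,
    schema: str,
    conn: sqlite3.Connection,
    slm: SmallModel,
    llm: LargeModel,
) -> Tuple[str, str]:
    """Try the small model first; call the large model only on a detected error.

    Returns the final SQL and which model produced it ("slm" or "llm").
    """
    sql = slm(question, schema)
    error = execution_error(conn, sql)
    if error is None:
        return sql, "slm"
    # Fallback path: pass the failed query and its error to the large model.
    return llm(question, schema, sql, error), "llm"
```

Under these assumptions, the large model is paid for only on queries where the small model's output fails, which is the source of the cost savings the abstract reports. An execution check cannot catch queries that run but return the wrong result, so a real system would likely need additional validation.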

Advisor

Naseef Mansoor

Committee Member

John Burke

Committee Member

Rajeev Bukralia

Date of Degree

2025

Language

English

Document Type

APP

Degree

Master of Science (MS)

Program of Study

Data Science

Department

Computer Information Science

College

Science, Engineering and Technology

Included in

Data Science Commons

Rights Statement

In Copyright