About Reliable Text to SQL

We have developed the first Text to SQL engine that generates SQL from natural language queries in a reliable manner. The engine generates an SQL query only when it is confident that the query is accurate; otherwise it abstains from providing an answer. When it abstains, it starts a human-in-the-loop workflow: it identifies the precise aspects of the query it needs help with and asks for that help by formulating precise questions. Once these questions are answered, it resumes generation and either provides the correct answer or asks further clarification questions.
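The workflow described above can be pictured as a confidence-gated loop. The following is only a minimal sketch under stated assumptions, not the engine's actual implementation: the names `Draft`, `answer_reliably`, `generate`, `ask_human`, the numeric confidence score, and the `threshold` and `max_rounds` parameters are all hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Draft:
    """Hypothetical result of one generation attempt."""
    sql: str
    confidence: float               # assumed score in [0, 1]
    question: Optional[str] = None  # clarification to ask when not confident


def answer_reliably(
    request: str,
    generate: Callable[[str, List[str]], Draft],
    ask_human: Callable[[str], str],
    threshold: float = 0.9,   # assumed confidence cut-off
    max_rounds: int = 3,      # assumed limit on clarification rounds
) -> Optional[str]:
    """Return SQL only when confident; otherwise ask for help or abstain (None)."""
    clarifications: List[str] = []
    for round_no in range(max_rounds + 1):
        draft = generate(request, clarifications)
        if draft.confidence >= threshold:
            return draft.sql
        # Not confident: abstain if there is nothing to ask or rounds are used up.
        if draft.question is None or round_no == max_rounds:
            break
        answer = ask_human(draft.question)
        clarifications.append(f"{draft.question} -> {answer}")
    return None


def toy_generate(request: str, clarifications: List[str]) -> Draft:
    """Stand-in generator: unsure until one clarification has been given."""
    if not clarifications:
        return Draft(sql="", confidence=0.4, question="Which table stores revenue?")
    return Draft(sql="SELECT SUM(amount) FROM sales", confidence=0.95)


if __name__ == "__main__":
    print(answer_reliably("total revenue", toy_generate, lambda q: "sales"))
```

In this sketch, abstention is represented by returning `None`, so a caller can tell "no answer" apart from any generated query. How confidence is actually estimated is left open here.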

Although reliability in LLM generation is a difficult problem, we demonstrate that in the context of Text to SQL it is amenable to a robust solution. We believe that reliable Text to SQL is an important first step toward accurate natural language interfaces to data management systems. The current state of the art in Text to SQL always yields an answer, even when that answer is wrong. This makes it error prone and unreliable, especially when Text to SQL moves away from benchmark environments into production. Without reliable generation that provides accuracy guarantees, Text to SQL will be hard to adopt in the real world. We provide a first solution to this important problem.


Research directions include:


  • Reliable natural language to SQL generation
  • Measuring and quantifying accuracy in text to SQL generation with probabilistic guarantees
  • Abstaining at any stage of query generation, including schema linking and SQL generation
  • Translating internal LLM states during answering into meaningful questions that a human can answer to assist with query generation
  • Precisely quantifying the abstention rate in relation to overall query answering accuracy, and developing meaningful metrics
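
As an illustration of the last direction, the sketch below computes a few quantities commonly used when a system is allowed to abstain. It is an assumption-laden example, not the project's own metric: the function name `selective_metrics` and the encoding of outcomes (`True` for a correct answer, `False` for an incorrect one, `None` for an abstention) are hypothetical, and the code gives point estimates only, without any probabilistic guarantee.

```python
from typing import Dict, Optional, Sequence


def selective_metrics(outcomes: Sequence[Optional[bool]]) -> Dict[str, Optional[float]]:
    """Summarize outcomes: True = correct, False = incorrect, None = abstained."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("outcomes must not be empty")
    answered = [o for o in outcomes if o is not None]
    correct = sum(answered)  # True counts as 1, False as 0
    return {
        # Fraction of queries the system declined to answer.
        "abstention_rate": (total - len(answered)) / total,
        # Fraction of queries the system did answer.
        "coverage": len(answered) / total,
        # Accuracy restricted to answered queries (undefined if none answered).
        "selective_accuracy": correct / len(answered) if answered else None,
        # Correct answers over all queries, counting abstentions as not correct.
        "overall_accuracy": correct / total,
    }


if __name__ == "__main__":
    print(selective_metrics([True, True, False, None]))
```

Reporting accuracy on answered queries alongside the abstention rate makes the trade-off visible: a system can raise the former simply by abstaining more often, so neither number is meaningful alone.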

Publications