The “Holy Grail” of Business Intelligence (BI) is to stop writing SQL queries and just ask plain English questions.
“Show me total revenue for Q3 broken down by region.”
“Which users churned in the last 30 days?”
In 2026, Text-to-SQL technology has matured enough to make this real. However, connecting an LLM directly to your production database is dangerous. If you don’t do it right, the AI might hallucinate a query or, worse, try to DROP TABLE.
At The AI Division, we build Secure AI Analytics Dashboards for enterprises. We ensure the AI is smart enough to answer complex questions but restricted enough to never damage your data.
Here is how to build a production-ready Text-to-SQL pipeline using Python and LangChain.
The Architecture: How It Works
We don’t just ask ChatGPT to “write SQL.” We use an Agentic Workflow:
- Schema Inspection: The AI reads your table names and column types.
- Query Generation: It writes the SQL query based on the user’s question.
- Execution: It runs the query against the database.
- Interpretation: It takes the raw data rows and summarizes them into natural language.
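The four steps above can be sketched as a plain-Python loop. This is a minimal sketch using the standard-library sqlite3 module, with a stubbed generate_sql function standing in for the LLM call — the function, the sales table, and the hard-coded query are all illustrative, not part of LangChain:

```python
import sqlite3

def generate_sql(question: str, schema: str) -> str:
    # Stand-in for the LLM: in production this would be a model request
    # that receives the question plus the schema text.
    return "SELECT region, SUM(amount) FROM sales GROUP BY region"

def answer(db: sqlite3.Connection, question: str) -> str:
    # 1. Schema inspection: read table DDL from sqlite_master.
    schema = "\n".join(
        row[0]
        for row in db.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    )
    # 2. Query generation (the LLM step).
    sql = generate_sql(question, schema)
    # 3. Execution against the database.
    rows = db.execute(sql).fetchall()
    # 4. Interpretation: summarize raw rows into natural language.
    lines = [f"{region}: ${total:,.2f}" for region, total in rows]
    return "Revenue by region:\n" + "\n".join(lines)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
db.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 1200.0), ("APAC", 800.0), ("EMEA", 300.0)],
)
print(answer(db, "Show me total revenue broken down by region."))
```

The LangChain agent in Step 2 automates exactly this loop, with the LLM deciding which tables to inspect and what SQL to write.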
Step 1: Safety First (Read-Only Permissions)
STOP. Before you run any code, you must create a specific database user for your AI.
Never give the AI admin or write access. It should only have SELECT permissions.
-- Run this in your Postgres/SQL Console
CREATE USER ai_analyst WITH PASSWORD 'secure_password';
GRANT CONNECT ON DATABASE my_company_db TO ai_analyst;
GRANT USAGE ON SCHEMA public TO ai_analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ai_analyst;
-- The AI can now LOOK, but it cannot TOUCH.

Security Alert: If you are dealing with PII (Personally Identifiable Information), you need row-level security. Contact our Team to audit your database before connecting an AI.
Step 2: Setting up the LangChain SQL Agent
We will use langchain-community to bridge the gap between OpenAI and your database.
from langchain_community.utilities import SQLDatabase
from langchain_community.agent_toolkits import create_sql_agent
from langchain_openai import ChatOpenAI
# 1. Connect to your Database (URI format)
# Example: postgresql://user:password@localhost:5432/dbname
db = SQLDatabase.from_uri("sqlite:///company_data.db")
# 2. Initialize the LLM (Use GPT-4o for best SQL generation)
llm = ChatOpenAI(model="gpt-4o", temperature=0)
# 3. Create the Agent
agent_executor = create_sql_agent(
    llm=llm,
    db=db,
    agent_type="openai-tools",
    verbose=True,
)

Step 3: Running Your First Query
Now, let’s test it with a natural language question.
# The Question
response = agent_executor.invoke(
    "How many customers from Japan ordered products over $500 last month?"
)
print(response["output"])

What happens in the background:
- The Agent looks at tables: customers, orders.
- It writes: SELECT count(*) FROM orders JOIN customers … WHERE country = 'Japan' AND amount > 500.
- It runs the query.
- It returns: “There were 14 customers from Japan who met those criteria.”
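Read-only permissions are the hard guarantee, but it is cheap to add a second check in application code and refuse anything that is not a single SELECT before it ever reaches the database. A deliberately naive sketch (a production version would use a real SQL parser; the function name is invented):

```python
import re

# Keywords the AI's generated queries must never contain.
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|TRUNCATE|GRANT|REVOKE)\b",
    re.IGNORECASE,
)

def is_safe_select(sql: str) -> bool:
    """Allow a single SELECT (or WITH ... SELECT) statement, nothing else."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:  # reject multi-statement payloads
        return False
    if FORBIDDEN.search(stripped):
        return False
    return stripped.upper().startswith(("SELECT", "WITH"))

print(is_safe_select("SELECT count(*) FROM orders WHERE amount > 500"))  # True
print(is_safe_select("DROP TABLE customers"))                            # False
print(is_safe_select("SELECT 1; DELETE FROM users"))                     # False
```

This belongs between the agent's query-generation step and its execution step, as defense in depth alongside the SELECT-only database user.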
Step 4: Improving Accuracy with “Few-Shot Prompting”
Out of the box, the AI doesn’t know your business logic. It doesn’t know that “ARR” means “Annual Recurring Revenue.”
To fix this, we feed it examples.
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
examples = [
    {
        "input": "List all churned users.",
        "query": "SELECT * FROM users WHERE status = 'inactive' AND last_login < DATE('now', '-30 days');",
    },
    {
        "input": "Calculate ARR.",
        "query": "SELECT SUM(monthly_price) * 12 FROM subscriptions WHERE status = 'active';",
    },
]
# We inject these examples into the system prompt.
# This teaches the AI your specific business definitions.
example_prompt = ChatPromptTemplate.from_messages(
    [("human", "{input}"), ("ai", "{query}")]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)

Real World Application: The “CEO Dashboard”
We deployed a similar architecture for one of our customers, a retail chain (Read Case Study). Instead of wrestling with complex Tableau filters, the CEO simply types “Which drivers are currently delayed?” into a Slack bot, and the SQL Agent queries the live tracking database and returns the answer in seconds.
Conclusion: Data Access for Everyone
Text-to-SQL democratizes data. It removes the bottleneck between “People with questions” and “People who know SQL.”
However, it requires strict engineering to prevent hallucinations and security leaks. Do not just connect it and forget it.
Want to Chat with Your Data?
Building a robust Text-to-SQL engine involves handling join complexities, ambiguous schemas, and latency. The AI Division builds custom “AI Analyst” layers for Enterprise Data Warehouses (Snowflake, Databricks, Postgres).
Explore AI Data Analytics Services
Turn your database into a conversational engine.
Frequently Asked Questions (FAQ)
Q: Is Text-to-SQL safe for production databases?
A: Yes, IF you strictly limit the database user permissions to READ ONLY (SELECT). Never allow the AI user to have INSERT, UPDATE, or DELETE privileges.
Q: Which LLM is best for writing SQL?
A: Currently, GPT-4o and Claude 3.5 Sonnet are the leaders in SQL generation. Smaller models often struggle with complex JOINs and window functions.
Q: Can this handle messy database schemas?
A: It struggles with messy column names (e.g., col_1, data_x). You should either rename your columns to be descriptive or use an “Intermediate Layer” where you document your schema for the AI in the system prompt.
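One lightweight way to build that intermediate layer is to keep hand-written column descriptions in a plain data structure and render them into the system prompt. A sketch with standard-library Python only — the table, column names, and helper function are all invented for illustration:

```python
# Hand-written documentation for a messy legacy schema (names invented).
SCHEMA_DOCS = {
    "tbl_cust_v2": {
        "description": "One row per customer.",
        "columns": {
            "col_1": "customer id (primary key)",
            "data_x": "signup date, stored as 'YYYY-MM-DD' text",
        },
    },
}

def schema_prompt(docs: dict) -> str:
    """Render the documentation as a text block for the agent's system prompt."""
    lines = []
    for table, info in docs.items():
        lines.append(f"Table {table}: {info['description']}")
        for col, desc in info["columns"].items():
            lines.append(f"  - {col}: {desc}")
    return "\n".join(lines)

print(schema_prompt(SCHEMA_DOCS))
```

Feeding this rendered text to the model alongside the raw schema lets it translate questions like “when did this customer sign up?” into the cryptic data_x column correctly.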





