Simple SQL Generator using AWS Bedrock

Using AWS Bedrock agents to convert user queries into SQL statements involves leveraging Bedrock's large language models (LLMs) and designing a system that integrates the schema, the user query, and the SQL generation process. Below is a step-by-step guide to achieving this: 1. Define the Problem Your goal is to convert a user query (in natural language) into an SQL statement based on a predefined table schema. For example: User Query: "Find the names of all employees in the Sales department." Table Schema: { "table_name": "employees", "columns": [ {"name": "id", "type": "INTEGER"}, {"name": "name", "type": "VARCHAR"}, {"name": "department", "type": "VARCHAR"}, {"name": "salary", "type": "FLOAT"} ] } Generated SQL: SELECT name FROM employees WHERE department = 'Sales'; 2. Set Up AWS Bedrock AWS Bedrock allows you to access and use foundation models without managing infrastructure. Here's how to proceed: Activate Bedrock in your AWS account. Choose a foundation model from Bedrock's available options (e.g., Anthropic, Cohere, or others). Access the Bedrock SDK via AWS CLI, SDK, or Console. 3. Prepare the Environment Install Required SDKs: Install the AWS SDK for Python (boto3) to interact with AWS services: pip install boto3 Set Up IAM Roles: Ensure your role has permissions for Bedrock: bedrock:InvokeModel bedrock:ListModels bedrock:ListFoundationModels 4. Define the Input Schema Your input schema includes: Natural Language Query: A text prompt provided by the user. Table Schema: JSON describing the database schema. Example input for the LLM: { "query": "Find the names of all employees in the Sales department.", "schema": { "table_name": "employees", "columns": [ {"name": "id", "type": "INTEGER"}, {"name": "name", "type": "VARCHAR"}, {"name": "department", "type": "VARCHAR"}, {"name": "salary", "type": "FLOAT"} ] } } 5. Use Bedrock to Generate SQL Here’s how to send the query and schema to Bedrock: Python Code Example: import boto3 import json # Initialize Bedrock client bedrock_client = boto3.client('bedrock', region_name='us-east-1') # Input schema and user query input_data = { "query": "Find the names of all employees in the Sales department.", "schema": { "table_name": "employees", "columns": [ {"name": "id", "type": "INTEGER"}, {"name": "name", "type": "VARCHAR"}, {"name": "department", "type": "VARCHAR"}, {"name": "salary", "type": "FLOAT"} ] } } # Prepare prompt for the LLM prompt = f""" Generate an SQL query based on the following table schema and user request: Table Schema: {json.dumps(input_data['schema'], indent=2)} User Query: {input_data['query']} SQL Query: """ # Invoke the Bedrock model response = bedrock_client.invoke_model( modelId='your-chosen-model-id', # Replace with the specific model ID from Bedrock body=json.dumps({"prompt": prompt}), contentType='application/json', accept='application/json' ) # Extract the SQL query model_output = json.loads(response['body']) sql_query = model_output['completions'][0]['data']['text'] print("Generated SQL Query:") print(sql_query) 6. Validate and Test Test the Agent: Provide various user queries and validate the generated SQL. Add Validation Rules: Ensure that the generated SQL aligns with your schema and guards against SQL injection. 7. (Optional) Deploy as an API You can deploy the solution as an API using AWS Lambda and API Gateway: Create a Lambda function to handle the input and invoke the Bedrock model. Use API Gateway to expose the function as an HTTP endpoint. 8. Enhance with Few-Shot Learning To improve accuracy: Provide a few examples (prompts with expected SQL outputs) in the Bedrock prompt. Add comments explaining the rules. Example prompt: Generate an SQL query based on the schema and request. Example 1: Table Schema: { "table_name": "employees", "columns": [ {"name": "id", "type": "INTEGER"}, {"name": "name", "type": "VARCHAR"}, {"name": "department", "type": "VARCHAR"}, {"name": "salary", "type": "FLOAT"} ] } User Query: Find employees in the Sales department. SQL Query: SELECT * FROM employees WHERE department = 'Sales'; Example 2: Table Schema: {...} Now generate an SQL query for this schema and request: 9. Monitor and Iterate Logging: Log all queries and generated SQL for debugging and analytics. Fine-Tune: Adjust prompts based on model performance. This process allows you to create a robust and scalable system for converting user queries into SQL statements using AWS Bedrock agents.

Jan 15, 2025 - 16:30
Simple SQL Generator using AWS Bedrock

Using AWS Bedrock agents to convert user queries into SQL statements involves leveraging Bedrock's large language models (LLMs) and designing a system that integrates the schema, the user query, and the SQL generation process. Below is a step-by-step guide to achieving this:

1. Define the Problem

Your goal is to convert a user query (in natural language) into an SQL statement based on a predefined table schema. For example:

  • User Query: "Find the names of all employees in the Sales department."
  • Table Schema:
{
  "table_name": "employees",
  "columns": [
    {"name": "id", "type": "INTEGER"},
    {"name": "name", "type": "VARCHAR"},
    {"name": "department", "type": "VARCHAR"},
    {"name": "salary", "type": "FLOAT"}
  ]
}
  • Generated SQL:
SELECT name FROM employees WHERE department = 'Sales';

2. Set Up AWS Bedrock

AWS Bedrock allows you to access and use foundation models without managing infrastructure. Here's how to proceed:

  1. Activate Bedrock in your AWS account.
  2. Choose a foundation model from Bedrock's available options (e.g., Anthropic, Cohere, or others).
  3. Access the Bedrock SDK via AWS CLI, SDK, or Console.

3. Prepare the Environment

  1. Install Required SDKs: Install the AWS SDK for Python (boto3) to interact with AWS services:
   pip install boto3
  1. Set Up IAM Roles: Ensure your role has permissions for Bedrock:
    • bedrock:InvokeModel
    • bedrock:ListModels
    • bedrock:ListFoundationModels

4. Define the Input Schema

Your input schema includes:

  • Natural Language Query: A text prompt provided by the user.
  • Table Schema: JSON describing the database schema.

Example input for the LLM:

{
  "query": "Find the names of all employees in the Sales department.",
  "schema": {
    "table_name": "employees",
    "columns": [
      {"name": "id", "type": "INTEGER"},
      {"name": "name", "type": "VARCHAR"},
      {"name": "department", "type": "VARCHAR"},
      {"name": "salary", "type": "FLOAT"}
    ]
  }
}

5. Use Bedrock to Generate SQL

Here’s how to send the query and schema to Bedrock:

Python Code Example:

import boto3
import json

# Initialize Bedrock client
bedrock_client = boto3.client('bedrock', region_name='us-east-1')

# Input schema and user query
input_data = {
    "query": "Find the names of all employees in the Sales department.",
    "schema": {
        "table_name": "employees",
        "columns": [
            {"name": "id", "type": "INTEGER"},
            {"name": "name", "type": "VARCHAR"},
            {"name": "department", "type": "VARCHAR"},
            {"name": "salary", "type": "FLOAT"}
        ]
    }
}

# Prepare prompt for the LLM
prompt = f"""
Generate an SQL query based on the following table schema and user request:

Table Schema:
{json.dumps(input_data['schema'], indent=2)}

User Query:
{input_data['query']}

SQL Query:
"""

# Invoke the Bedrock model
response = bedrock_client.invoke_model(
    modelId='your-chosen-model-id',  # Replace with the specific model ID from Bedrock
    body=json.dumps({"prompt": prompt}),
    contentType='application/json',
    accept='application/json'
)

# Extract the SQL query
model_output = json.loads(response['body'])
sql_query = model_output['completions'][0]['data']['text']

print("Generated SQL Query:")
print(sql_query)

6. Validate and Test

  1. Test the Agent:
    • Provide various user queries and validate the generated SQL.
  2. Add Validation Rules:
    • Ensure that the generated SQL aligns with your schema and guards against SQL injection.

7. (Optional) Deploy as an API

You can deploy the solution as an API using AWS Lambda and API Gateway:

  1. Create a Lambda function to handle the input and invoke the Bedrock model.
  2. Use API Gateway to expose the function as an HTTP endpoint.

8. Enhance with Few-Shot Learning

To improve accuracy:

  1. Provide a few examples (prompts with expected SQL outputs) in the Bedrock prompt.
  2. Add comments explaining the rules.

Example prompt:

Generate an SQL query based on the schema and request.

Example 1:
Table Schema:
{
  "table_name": "employees",
  "columns": [
    {"name": "id", "type": "INTEGER"},
    {"name": "name", "type": "VARCHAR"},
    {"name": "department", "type": "VARCHAR"},
    {"name": "salary", "type": "FLOAT"}
  ]
}
User Query: Find employees in the Sales department.
SQL Query: SELECT * FROM employees WHERE department = 'Sales';

Example 2:
Table Schema:
{...}

Now generate an SQL query for this schema and request:

9. Monitor and Iterate

  1. Logging: Log all queries and generated SQL for debugging and analytics.
  2. Fine-Tune: Adjust prompts based on model performance.

This process allows you to create a robust and scalable system for converting user queries into SQL statements using AWS Bedrock agents.