Stay Hungry,Stay Foolish!

text2sql-workshop

text2sql-workshop

https://github.com/fanqingsong/text2sql-workshop

Text2SQL Workshop

Using OpenAI, Langchain and Postgresql to Talk to Your Data


Overview

  • This repo demonstrates the power of Large Language Models and Generative AI for simplifying access to data: instead of querying a database using SQL, why not doing so using Natural Language?
  • text2sql is a basic Python package which ships with Langchain. It contains simple logic for connecting to a local Postgresql instance, and by leveraging Langchain's create_sql_query_chain, it obtains metadata from our local DB instances and creates multiple prompts which are executed against an LLM (in our case, OpenAI ChatGPT).
  • As a result, we are able to convert questions from Natural Language to SQL Queries that are compliant with Postgresql's dialect.

Usage

  • Create a virtual environment with your tool of choice and install the text2sql Python package
  • Once the package is installed, you can create an IPython kernel and use it in Jupyter - checkout the notebooks provided in the sandbox folder.

Example

from text2sql.core import Text2SQL

sql = Text2SQL(model = "gpt-3.5-turbo")
query = sql.query("How much do we have in total sales?")
print(query)
 
> SELECT SUM("Weekly_Sales") AS total_sales FROM sales
 

Prereqs

  • We use Docker to boot up a Postgresql DB. Just run docker-compose up -d and you should be good to go
  • To ingest data into Postgres, run text2sql/ingest.py (for simplification purposes, the package expects you to be running a local instance of Postgresql at port 5432)
  • Make sure that you properly set your OPENAI_API_KEY
from langchain.chat_models import ChatOpenAI
from langchain.chains import create_sql_query_chain
from langchain.utilities.sql_database import SQLDatabase
from sqlalchemy import create_engine
from pydantic import BaseModel
from langchain_core.runnables.base import Runnable
import pydantic

class Text2SQL(BaseModel):

    uri: str = "postgresql://postgres:changeme@localhost:5432"
    # model: str = "gpt-4-1106-preview"
    model: str = "internlm/internlm2_5-7b-chat-gguf/internlm2_5-7b-chat-q2_k.gguf"
    
    temperature: int = 0

    class Config:
        arbitrary_types_allowed = True

    @pydantic.computed_field()
    @property
    def db(self) -> SQLDatabase:
        return SQLDatabase(engine = create_engine(self.uri))
    
    @pydantic.computed_field()
    @property
    def llm(self) -> ChatOpenAI:
        return ChatOpenAI(
            model = self.model,
            temperature = self.temperature,
            api_key="...",  
            base_url="http://192.168.0.108:1234/v1/",
        )
    
    @pydantic.computed_field()
    @property
    def chain(self) -> Runnable:
        return create_sql_query_chain(
            llm = self.llm,
            db = self.db
        )
    
    def query(self, question: str):

        response = self.chain.invoke({"question": question})
        sql_query = response.split("SQLQuery:")[0]

        return sql_query

 

效果

 

 create_sql_query_chain

https://api.python.langchain.com/en/latest/chains/langchain.chains.sql_database.query.create_sql_query_chain.html

 

Create a chain that generates SQL queries.

Security Note: This chain generates SQL queries for the given database.

The SQLDatabase class provides a get_table_info method that can be used to get column information as well as sample data from the table.

To mitigate risk of leaking sensitive data, limit permissions to read and scope to the tables that are needed.

Optionally, use the SQLInputWithTables input type to specify which tables are allowed to be accessed.

Control access to who can submit requests to this chain.

See https://python.langchain.com/docs/security for more information.

Args:

llm: The language model to use. db: The SQLDatabase to generate the query for. prompt: The prompt to use. If none is provided, will choose one

based on dialect. Defaults to None. See Prompt section below for more.

k: The number of results per select statement to return. Defaults to 5.

Returns:

A chain that takes in a question and generates a SQL query that answers that question.

Example:

# pip install -U langchain langchain-community langchain-openai
from langchain_openai import ChatOpenAI
from langchain.chains import create_sql_query_chain
from langchain_community.utilities import SQLDatabase

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
chain = create_sql_query_chain(llm, db)
response = chain.invoke({"question": "How many employees are there"})

 

Examples using create_sql_query_chain

 

参考

https://python.langchain.com/v0.2/docs/integrations/chat/openai/

https://github.com/fanqingsong/ChatDB (websocket)

 

平台

https://github.com/chat2db/chat2db

http://chat2db.ai/en-US

 

posted @ 2024-07-27 22:40  lightsong  阅读(3)  评论(0编辑  收藏  举报
Life Is Short, We Need Ship To Travel