A Brief Look at How Daft Handles SQL Queries Against Databases

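The SQL discussed here is not Daft's SQL for querying DataFrames. It is Daft's handling of queries against a database used as a data source. Some brief notes follow.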

Overview

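Supports 20+ database dialects, built on SQLGlot. Query execution goes through either ConnectorX or SQLAlchemy: ConnectorX is used by default, with a fallback to SQLAlchemy for databases it does not support. You can also explicitly specify which connection to use.
Performs parallel and distributed reads, which is a standard Daft capability.
Supports data filtering through select, limit, and where.
Writing to databases is not supported yet, and ADBC support is still being planned.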
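
To make the filtering point concrete, here is a minimal sketch of how select, where, and limit can be folded into the SQL sent to the database, rather than being applied after the rows have been loaded. The helper build_pushdown_query is hypothetical and is not Daft's actual implementation. The example uses Python's built-in sqlite3 module so it runs without extra dependencies.

```python
import sqlite3
from typing import Optional, Sequence


def build_pushdown_query(
    base_sql: str,
    columns: Optional[Sequence[str]] = None,
    predicate: Optional[str] = None,
    limit: Optional[int] = None,
) -> str:
    """Hypothetical helper: wrap base_sql in a subquery and attach
    the projection (select), filter (where), and row cap (limit)."""
    projection = ", ".join(columns) if columns else "*"
    query = f"SELECT {projection} FROM ({base_sql}) AS subquery"
    if predicate is not None:
        query += f" WHERE {predicate}"
    if limit is not None:
        query += f" LIMIT {limit}"
    return query


# Small in-memory database standing in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "ann", 31), (2, "bob", 17), (3, "cat", 45), (4, "dan", 52)],
)

# The database only returns the columns and rows that were asked for.
pushed_sql = build_pushdown_query(
    "SELECT * FROM users",
    columns=["id", "name"],
    predicate="age > 18",
    limit=2,
)
rows = conn.execute(pushed_sql).fetchall()
print(pushed_sql)
print(rows)
```

Wrapping the user's query in a subquery keeps the original SQL untouched, so the same approach works whether the base query is a plain table scan or something more involved.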

How Daft decides between ConnectorX and SQLAlchemy

In practice this is just a check against the database types supported internally. The code is as follows:

def execute_sql_query(self, sql: str) -> pa.Table:
    if self._should_use_connectorx():
        return self._execute_sql_query_with_connectorx(sql)
    else:
        return self._execute_sql_query_with_sqlalchemy(sql)

How _should_use_connectorx works

def _should_use_connectorx(self) -> bool:
    # Supported DBs extracted from here https://github.com/sfu-db/connector-x/tree/7b3147436b7e20b96691348143d605e2249d6119?tab=readme-ov-file#sources
    connectorx_supported_dbs = {
        "postgres",
        "postgresql",
        "mysql",
        "mssql",
        "oracle",
        "bigquery",
        "sqlite",
        "clickhouse",
        "redshift",
    }

    if isinstance(self.conn, str):
        if self.dialect in connectorx_supported_dbs and self.driver == "":
            return True
    return False
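
The check above can be tried out in isolation with the standalone sketch below. It assumes that the dialect and driver come from the URL scheme, split on "+" (so "postgresql+psycopg2" gives dialect "postgresql" and driver "psycopg2"). That derivation is an assumption made for illustration, not something shown in the excerpt above, and the names split_scheme and pick_backend are hypothetical.

```python
from typing import Callable, Tuple, Union
from urllib.parse import urlparse

# Same set of databases as in the excerpt above.
CONNECTORX_SUPPORTED_DBS = {
    "postgres", "postgresql", "mysql", "mssql", "oracle",
    "bigquery", "sqlite", "clickhouse", "redshift",
}


def split_scheme(url: str) -> Tuple[str, str]:
    """Assumed derivation: 'dialect+driver://...' -> (dialect, driver)."""
    scheme = urlparse(url).scheme.strip().lower()
    if "+" in scheme:
        dialect, driver = scheme.split("+", 1)
        return dialect, driver
    return scheme, ""


def pick_backend(conn: Union[str, Callable[[], object]]) -> str:
    """Mirror the check: ConnectorX only for a URL string with a
    supported dialect and no explicit driver; otherwise SQLAlchemy."""
    if isinstance(conn, str):
        dialect, driver = split_scheme(conn)
        if dialect in CONNECTORX_SUPPORTED_DBS and driver == "":
            return "connectorx"
    return "sqlalchemy"


examples = [
    "postgresql://user:pw@localhost:5432/db",           # supported, no driver
    "postgresql+psycopg2://user:pw@localhost:5432/db",  # explicit driver
    "trino://user@localhost:8080/catalog",              # dialect not in the set
    lambda: object(),                                   # connection factory, not a URL
]
for conn in examples:
    label = conn if isinstance(conn, str) else "<callable>"
    print(label, "->", pick_backend(conn))
```

Two things follow from the condition: naming a driver in the URL is enough to route the query through SQLAlchemy, and so is passing a connection factory instead of a URL string, which matches the note that the connection can be specified explicitly.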