dbt statement macro 简单说明

statement blocks 实际上就是一个标准的jinja2 macro 调用包装, 提供了方便的sql 执行能力,因为需要进行
查询结果的存储,dbt 提供了一个store_result 的macro,内部数据的处理基于了agate 这个方便的python 数据处理包
为了查询使用提供了load_result macro
以下只说明关于statement部分,store_result 以及load_result macro 后边单独说明

statement 处理

statement macro 是在dbt-adapter 包中,实现内在dbt core 中,dbt-adapter 包含了dbt 的global macro (package 为dbt)

  • 参考使用
# 定义查询
{%- call statement('states', fetch_result=True) -%}
 
    select distinct state from {{ ref('users') }}
 
{%- endcall -%}
# 获取数据
{%- set states = load_result('states') -%}
{%- set states_data = states['data'] -%}
{%- set states_status = states['response'] -%}
  • 参考代码
{#--
The macro override naming method (spark__statement) only works for macros which are called with adapter.dispatch. For macros called directly, you can just redefine them.
--#}
{%- macro statement(name=None, fetch_result=False, auto_begin=True, language='sql') -%}
  {%- if execute: -%}
   # 通过caller获取传递的sql 
    {%- set compiled_code = caller() -%}
 
    {%- if name == 'main' -%}
      {{ log('Writing runtime {} for node "{}"'.format(language, model['unique_id'])) }}
      {{ write(compiled_code) }}
    {%- endif -%}
    {%- if language == 'sql'-%}
     # 执行sql 查询,adapter macro 的execute
      {%- set res, table = adapter.execute(compiled_code, auto_begin=auto_begin, fetch=fetch_result) -%}
    {%- elif language == 'python' -%}
       # 执行python 代码处理
      {%- set res = submit_python_job(model, compiled_code) -%}
      {#-- TODO: What should table be for python models? --#}
      {%- set table = None -%}
    {%- else -%}
      {% do exceptions.raise_compiler_error("statement macro didn't get supported language") %}
    {%- endif -%}
 
    {%- if name is not none -%}
      # 存储结果
      {{ store_result(name, response=res, agate_table=table) }}
    {%- endif -%}
 
  {%- endif -%}
{%- endmacro %}
 
 
{% macro noop_statement(name=None, message=None, code=None, rows_affected=None, res=None) -%}
  {%- set sql = caller() -%}
 
  {%- if name == 'main' -%}
    {{ log('Writing runtime SQL for node "{}"'.format(model['unique_id'])) }}
    {{ write(sql) }}
  {%- endif -%}
 
  {%- if name is not none -%}
    {{ store_raw_result(name, message=message, code=code, rows_affected=rows_affected, agate_table=res) }}
  {%- endif -%}
 
{%- endmacro %}
 
# run_query 友好包装
{# a user-friendly interface into statements #}
{% macro run_query(sql) %}
  {% call statement("run_query_statement", fetch_result=true, auto_begin=false) %}
    {{ sql }}
  {% endcall %}
 
  {% do return(load_result("run_query_statement").table) %}
{% endmacro %}

说明

官方推荐通过run_query 对于statement 进行替换,run_query 是statement的包装使用上更加方便

参考资料

dbt/include/global_project/macros/etc/statement.sql
core/dbt/context/providers.py
https://docs.getdbt.com/reference/dbt-jinja-functions/run_query
https://docs.getdbt.com/reference/dbt-jinja-functions/statement-blocks
https://jinja.palletsprojects.com/en/3.0.x/templates/#call
https://agate.readthedocs.io/en/latest/
https://github.com/wireservice/agate

posted on 2024-04-04 08:33  荣锋亮  阅读(18)  评论(0编辑  收藏  举报

导航