dbt adapter expand_target_column_types 简单说明

adapter expand_target_column_types 核心作用是进行关系的展开对比,确定我们的模型是否需要进行提升(实际上就是数据类型变动)
此功能目前在dbt 的快照以及增量物化场景中使用到

参考使用

  • 简单示例
{% set tmp_relation = adapter.get_relation(...) %}
{% set target_relation = adapter.get_relation(...) %}
 
{% do adapter.expand_target_column_types(tmp_relation, target_relation) %}
  • 增量物化处理
{% if existing_relation is none %}
  {% set build_sql = get_create_table_as_sql(False, target_relation, sql) %}
{% elif full_refresh_mode %}
  {% set build_sql = get_create_table_as_sql(False, intermediate_relation, sql) %}
  {% set need_swap = true %}
{% else %}
{% do run_query(get_create_table_as_sql(True, temp_relation, sql)) %}
{% do adapter.expand_target_column_types(
         from_relation=temp_relation,
         to_relation=target_relation) %}

内部处理

def expand_column_types(self, goal, current):
    reference_columns = {c.name: c for c in self.get_columns_in_relation(goal)}
 
    target_columns = {c.name: c for c in self.get_columns_in_relation(current)}
 
    for column_name, reference_column in reference_columns.items():
        target_column = target_columns.get(column_name)
 
        if target_column is not None and target_column.can_expand_to(reference_column):
            col_string_size = reference_column.string_size()
            new_type = self.Column.string_type(col_string_size)
            fire_event(
                ColTypeChange(
                    orig_type=target_column.data_type,
                    new_type=new_type,
                    table=_make_ref_key_dict(current),
                )
            )
           # 判断需要进行提升的会进行ALTER_COLUMN_TYPE_MACRO_NAME macro 的调用进行类型修改
            self.alter_column_type(current, column_name, new_type)
 
def alter_column_type(self, relation, column_name, new_column_type) -> None:
    """
    1. Create a new column (w/ temp name and correct type)
    2. Copy data over to it
    3. Drop the existing column (cascade!)
    4. Rename the new column to existing column
    """
    kwargs = {
        "relation": relation,
        "column_name": column_name,
        "new_column_type": new_column_type,
    }
    self.execute_macro(ALTER_COLUMN_TYPE_MACRO_NAME, kwargs=kwargs)

具体ALTER_COLUMN_TYPE_MACRO_NAME macro 的处理如下
可以看到为了确保变更的安全,使用了多步处理

{% macro default__alter_column_type(relation, column_name, new_column_type) -%}
  {#
    1. Create a new column (w/ temp name and correct type)
    2. Copy data over to it
    3. Drop the existing column (cascade!)
    4. Rename the new column to existing column
  #}
  {%- set tmp_column = column_name + "__dbt_alter" -%}
 
  {% call statement('alter_column_type') %}
    alter table {{ relation }} add column {{ adapter.quote(tmp_column) }} {{ new_column_type }};
    update {{ relation }} set {{ adapter.quote(tmp_column) }} = {{ adapter.quote(column_name) }};
    alter table {{ relation }} drop column {{ adapter.quote(column_name) }} cascade;
    alter table {{ relation }} rename column {{ adapter.quote(tmp_column) }} to {{ adapter.quote(column_name) }}
  {% endcall %}

说明

此功能还是比较重要的,尤其我们可能会进行数据类型的调整场景,就可以帮助我们自动处理一些ddl了

参考资料

dbt/include/global_project/macros/adapters/columns.sql
https://docs.getdbt.com/reference/dbt-jinja-functions/adapter#expand_target_column_types

posted on   荣锋亮  阅读(18)  评论(0编辑  收藏  举报

相关博文:
阅读排行:
· 全程不用写代码,我用AI程序员写了一个飞机大战
· DeepSeek 开源周回顾「GitHub 热点速览」
· 记一次.NET内存居高不下排查解决与启示
· MongoDB 8.0这个新功能碉堡了,比商业数据库还牛
· .NET10 - 预览版1新功能体验(一)
历史上的今天:
2022-05-20 nginx 安全请求头
2022-05-20 cratedb 4.8 新特性
2020-05-20 zabbix docker-compose 运行配置
2020-05-20 dgraph 集群模式 native graphal 试用
2020-05-20 一些不错的graphql 工具
2020-05-20 dgraph native graphal 试用
2018-05-20 Stream Processing 101: From SQL to Streaming SQL in 10 Minutes

导航

< 2025年3月 >
23 24 25 26 27 28 1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31 1 2 3 4 5
点击右上角即可分享
微信分享提示