dbt adapter expand_target_column_types 简单说明
adapter expand_target_column_types 核心作用是进行关系的展开对比,确定我们的模型是否需要进行提升(实际上就是数据类型变动)
此功能目前在dbt 的快照以及增量物化场景中使用到
参考使用
- 简单示例
{% set tmp_relation = adapter.get_relation(
{% set target_relation = adapter.get_relation(
{% do adapter.expand_target_column_types(tmp_relation, target_relation) %}
- 增量物化处理
{% if existing_relation is none %}
{% set build_sql = get_create_table_as_sql(False, target_relation, sql) %}
{% elif full_refresh_mode %}
{% set build_sql = get_create_table_as_sql(False, intermediate_relation, sql) %}
{% set need_swap = true %}
{% else %}
{% do run_query(get_create_table_as_sql(True, temp_relation, sql)) %}
{% do adapter.expand_target_column_types(
from_relation=temp_relation,
to_relation=target_relation) %}
内部处理
def expand_column_types(self, goal, current):
reference_columns = {c.name: c for c in self.get_columns_in_relation(goal)}
target_columns = {c.name: c for c in self.get_columns_in_relation(current)}
for column_name, reference_column in reference_columns.items():
target_column = target_columns.get(column_name)
if target_column is not None and target_column.can_expand_to(reference_column):
col_string_size = reference_column.string_size()
new_type = self.Column.string_type(col_string_size)
fire_event(
ColTypeChange(
orig_type=target_column.data_type,
new_type=new_type,
table=_make_ref_key_dict(current),
)
)
# 判断需要进行提升的会进行ALTER_COLUMN_TYPE_MACRO_NAME macro 的调用进行类型修改
self.alter_column_type(current, column_name, new_type)
def alter_column_type(self, relation, column_name, new_column_type) -> None:
"""
1. Create a new column (w/ temp name and correct type)
2. Copy data over to it
3. Drop the existing column (cascade!)
4. Rename the new column to existing column
"""
kwargs = {
"relation": relation,
"column_name": column_name,
"new_column_type": new_column_type,
}
self.execute_macro(ALTER_COLUMN_TYPE_MACRO_NAME, kwargs=kwargs)
具体ALTER_COLUMN_TYPE_MACRO_NAME macro 的处理如下
可以看到为了确保变更的安全,使用了多步处理
{% macro default__alter_column_type(relation, column_name, new_column_type) -%}
{#
1. Create a new column (w/ temp name and correct type)
2. Copy data over to it
3. Drop the existing column (cascade!)
4. Rename the new column to existing column
#}
{%- set tmp_column = column_name + "__dbt_alter" -%}
{% call statement('alter_column_type') %}
alter table {{ relation }} add column {{ adapter.quote(tmp_column) }} {{ new_column_type }};
update {{ relation }} set {{ adapter.quote(tmp_column) }} = {{ adapter.quote(column_name) }};
alter table {{ relation }} drop column {{ adapter.quote(column_name) }} cascade;
alter table {{ relation }} rename column {{ adapter.quote(tmp_column) }} to {{ adapter.quote(column_name) }}
{% endcall %}
说明
此功能还是比较重要的,尤其我们可能会进行数据类型的调整场景,就可以帮助我们自动处理一些ddl了
参考资料
dbt/include/global_project/macros/adapters/columns.sql
https://docs.getdbt.com/reference/dbt-jinja-functions/adapter#expand_target_column_types
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 全程不用写代码,我用AI程序员写了一个飞机大战
· DeepSeek 开源周回顾「GitHub 热点速览」
· 记一次.NET内存居高不下排查解决与启示
· MongoDB 8.0这个新功能碉堡了,比商业数据库还牛
· .NET10 - 预览版1新功能体验(一)
2022-05-20 nginx 安全请求头
2022-05-20 cratedb 4.8 新特性
2020-05-20 zabbix docker-compose 运行配置
2020-05-20 dgraph 集群模式 native graphal 试用
2020-05-20 一些不错的graphql 工具
2020-05-20 dgraph native graphal 试用
2018-05-20 Stream Processing 101: From SQL to Streaming SQL in 10 Minutes