A brief introduction to the dbt graph context variable
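The dbt graph context variable holds information about the nodes of a dbt project: models, seeds, snapshots, tests and analyses live under graph.nodes, while sources, exposures, metrics, groups, semantic models and saved queries each have their own key. Macros are not part of the graph.
Because dbt processes a project in distinct phases (parsing first, then execution), the graph may be incomplete while parsing. If you want to read node information from the graph context, pay attention to the phase your code runs in; the usual recommendation
is to guard the access with the execute context variable.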
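To see why the guard matters, here is a plain-Python analogy rather than dbt code. It assumes the graph may be empty during the parse phase and populated once execute is true; render_model_names and both sample graphs are hypothetical names used only for illustration.

```python
from typing import Any, Dict, List


def render_model_names(graph: Dict[str, Any], execute: bool) -> List[str]:
    """Mimic a macro that reads graph.nodes only when execute is true."""
    if not execute:
        # Parse phase: the graph may be empty or incomplete, so do nothing.
        return []
    return sorted(
        node["name"]
        for node in graph["nodes"].values()
        if node["resource_type"] == "model"
    )


# Hypothetical snapshots of the graph in each phase (assumption, not dbt output).
parse_phase_graph: Dict[str, Any] = {}
execute_phase_graph: Dict[str, Any] = {
    "nodes": {
        "model.demo.orders": {"name": "orders", "resource_type": "model"},
        "seed.demo.countries": {"name": "countries", "resource_type": "seed"},
    }
}

print(render_model_names(parse_phase_graph, execute=False))   # []
print(render_model_names(execute_phase_graph, execute=True))  # ['orders']
```

Without the guard, the same lookup against the empty parse-phase dictionary would fail with a KeyError, which is the kind of error the execute check avoids.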
Usage examples
- Accessing models
{% if execute %}
  {% for node in graph.nodes.values()
     | selectattr("resource_type", "equalto", "model")
     | selectattr("package_name", "equalto", "snowplow") %}
    {% do log(node.unique_id ~ ", materialized: " ~ node.config.materialized, info=true) %}
  {% endfor %}
{% endif %}
- Getting sources
{#- Overrides the built-in source() so the returned relation also carries the external format. -#}
{%- macro source(source_name, table_name) -%}
  {%- set relation = builtins.source(source_name, table_name) -%}
  {%- if execute -%}
    {#- Look up the matching source entry in the graph. -#}
    {%- set source = graph.sources.values() | selectattr("source_name", "equalto", source_name) | selectattr("name", "equalto", table_name) | list | first -%}
    {%- set format = source.external.format if
          source.external is defined
          and source.external.format is defined
        else none -%}
    {#- format_clause_from_node is a helper macro defined elsewhere (not shown here). -#}
    {%- set format_clause = format_clause_from_node(source.external) if format is not none else none -%}
    {%- set relation2 = api.Relation.create(database=relation.database, schema=relation.schema, identifier=relation.identifier, format=format, format_clause=format_clause) -%}
    {{ return (relation2) }}
  {%- else -%}
    {#- Parse phase: fall back to the plain built-in relation. -#}
    {{ return (relation) }}
  {%- endif -%}
{%- endmacro -%}
- Getting metrics
{% macro get_metric_sql_for(metric_name) %}
  {# Only read the graph at execution time; it may be incomplete while parsing. #}
  {% if execute %}
    {% set metrics = graph.metrics.values() %}
    {% set metric = (metrics | selectattr('name', 'equalto', metric_name) | list).pop() %}
    {# Elsewhere, I've defined a macro, get_metric_timeseries_sql, that will return
       the SQL needed to perform a time-based rollup of this metric's calculation.
       The 'model', 'type' and 'sql' keys follow the older metric spec and may
       not exist in newer dbt versions. #}
    {% set metric_sql = get_metric_timeseries_sql(
        relation = metric['model'],
        type = metric['type'],
        expression = metric['sql'],
    ) %}
    {{ return(metric_sql) }}
  {% endif %}
{% endmacro %}
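The three lookups above can also be tried outside Jinja. The following is a minimal Python sketch that works on a dictionary with top-level nodes, sources and metrics keys, which is also how the parsed target/manifest.json is laid out. The helper names and the sample data are hypothetical; only the keys already used in the macros above (resource_type, package_name, config.materialized, source_name, name, external.format) are relied on.

```python
from typing import Any, Dict, List, Optional

Graph = Dict[str, Dict[str, Any]]


def models_in_package(graph: Graph, package_name: str) -> List[str]:
    """Return 'unique_id, materialized: x' lines for the models of one package."""
    return [
        f"{uid}, materialized: {node['config']['materialized']}"
        for uid, node in graph.get("nodes", {}).items()
        if node["resource_type"] == "model" and node["package_name"] == package_name
    ]


def find_source(graph: Graph, source_name: str, table_name: str) -> Optional[Dict[str, Any]]:
    """Return the first source entry matching both names, or None."""
    for src in graph.get("sources", {}).values():
        if src["source_name"] == source_name and src["name"] == table_name:
            return src
    return None


def source_format(source: Optional[Dict[str, Any]]) -> Optional[str]:
    """Read external.format, tolerating a missing source or a null 'external'."""
    if source is None:
        return None
    external = source.get("external") or {}
    return external.get("format")


def find_metric(graph: Graph, metric_name: str) -> Optional[Dict[str, Any]]:
    """Return the metric entry with the given name, or None."""
    for metric in graph.get("metrics", {}).values():
        if metric["name"] == metric_name:
            return metric
    return None


# Hypothetical sample data shaped like the graph (assumption, not real dbt output).
sample_graph: Graph = {
    "nodes": {
        "model.snowplow.page_views": {
            "resource_type": "model", "package_name": "snowplow",
            "config": {"materialized": "table"},
        },
        "model.demo.orders": {
            "resource_type": "model", "package_name": "demo",
            "config": {"materialized": "view"},
        },
        "test.snowplow.not_null_page_views_id": {
            "resource_type": "test", "package_name": "snowplow",
            "config": {"materialized": "test"},
        },
    },
    "sources": {
        "source.demo.raw.events": {
            "source_name": "raw", "name": "events", "external": {"format": "parquet"},
        },
        "source.demo.raw.users": {
            "source_name": "raw", "name": "users", "external": None,
        },
    },
    "metrics": {"metric.demo.revenue": {"name": "revenue"}},
}

print(models_in_package(sample_graph, "snowplow"))
print(source_format(find_source(sample_graph, "raw", "events")))
print(find_metric(sample_graph, "revenue"))
```

Unlike the pop() call in the metrics macro, these helpers return None when nothing matches, so a misspelled name does not raise an error.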
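Internals
Internally, graph is handled like most dbt context variables: it is a method marked with a decorator. The data behind it is simply the manifest, normalized into plain dictionaries.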
@contextproperty()
def graph(self) -> Dict[str, Any]:
    return self.manifest.flat_graph
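The decorator pattern itself can be illustrated with a small Python sketch. This is a simplified stand-in and not dbt's actual contextproperty implementation: the ContextProperty marker class, FakeManifest, ProviderSketch and its to_dict method are all invented for illustration.

```python
from typing import Any, Callable, Dict


class ContextProperty(property):
    """Marker subclass: a property that should be exported to the templates."""


def contextproperty() -> Callable[[Callable[[Any], Any]], ContextProperty]:
    """Decorator factory, used as @contextproperty() like in the snippet above."""
    def decorate(func: Callable[[Any], Any]) -> ContextProperty:
        return ContextProperty(func)
    return decorate


class FakeManifest:
    """Hypothetical manifest that only carries an already built flat_graph."""

    def __init__(self, flat_graph: Dict[str, Any]) -> None:
        self.flat_graph = flat_graph


class ProviderSketch:
    """Hypothetical context class exposing one marked property."""

    def __init__(self, manifest: FakeManifest) -> None:
        self.manifest = manifest

    @contextproperty()
    def graph(self) -> Dict[str, Any]:
        return self.manifest.flat_graph

    def to_dict(self) -> Dict[str, Any]:
        """Collect every marked property into the dict handed to a template."""
        return {
            name: getattr(self, name)
            for name, attr in vars(type(self)).items()
            if isinstance(attr, ContextProperty)
        }


context = ProviderSketch(FakeManifest({"nodes": {}})).to_dict()
print(context)  # {'graph': {'nodes': {}}}
```

The point of the marker is that the context class never lists its variables by hand; anything decorated shows up in the template namespace under the method's name.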
How flat_graph is built
def build_flat_graph(self):
    """This attribute is used in context.common by each node, so we want to
    only build it once and avoid any concurrency issues around it.
    Make sure you don't call this until you're done with building your
    manifest!
    """
    self.flat_graph = {
        "exposures": {k: v.to_dict(omit_none=False) for k, v in self.exposures.items()},
        "groups": {k: v.to_dict(omit_none=False) for k, v in self.groups.items()},
        "metrics": {k: v.to_dict(omit_none=False) for k, v in self.metrics.items()},
        "nodes": {k: v.to_dict(omit_none=False) for k, v in self.nodes.items()},
        "sources": {k: v.to_dict(omit_none=False) for k, v in self.sources.items()},
        "semantic_models": {
            k: v.to_dict(omit_none=False) for k, v in self.semantic_models.items()
        },
        "saved_queries": {
            k: v.to_dict(omit_none=False) for k, v in self.saved_queries.items()
        },
    }
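What this flattening gives a template can be shown with a short Python sketch. It assumes, as the parameter name suggests, that to_dict(omit_none=False) keeps keys whose value is None; NodeSketch, its fields and the standalone build_flat_graph function are hypothetical stand-ins for dbt's node classes and manifest method.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class NodeSketch:
    """Hypothetical, heavily reduced stand-in for a dbt node."""

    unique_id: str
    resource_type: str
    description: Optional[str] = None

    def to_dict(self, omit_none: bool = True) -> Dict[str, Any]:
        data = asdict(self)
        if omit_none:
            # Drop keys whose value is None.
            data = {k: v for k, v in data.items() if v is not None}
        return data


def build_flat_graph(nodes: Dict[str, NodeSketch]) -> Dict[str, Dict[str, Any]]:
    """Flatten node objects into plain dicts keyed by unique_id."""
    return {"nodes": {k: v.to_dict(omit_none=False) for k, v in nodes.items()}}


nodes = {"model.demo.orders": NodeSketch("model.demo.orders", "model")}
flat = build_flat_graph(nodes)

print(flat["nodes"]["model.demo.orders"])
# {'unique_id': 'model.demo.orders', 'resource_type': 'model', 'description': None}
print(nodes["model.demo.orders"].to_dict(omit_none=True))
# {'unique_id': 'model.demo.orders', 'resource_type': 'model'}
```

Keeping the None values means every entry has the same set of keys, so a template can read a field without first checking whether it exists.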
Summary
graph is used fairly widely in dbt packages; because it exposes the project's nodes to Jinja, it makes extending dbt convenient.