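Using dbt seed and ephemeral base models
dbt seed is a convenient way to load data from CSV files; it suits small, rarely changing datasets as well as test data.
The base models are materialized as ephemeral, which the official best practices also recommend.
The GitLab data this project depends on is described at https://github.com/rongfengliang/graphql-engine-gitlab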
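As a rough sketch of the seed workflow described above: a seed is just a CSV file placed in the directory named by `data-paths`. The file name `project_visibility.csv` and its columns are hypothetical and chosen only for illustration; the commands assume they are run from the root of the dbt project.

```shell
# Hypothetical seed file: the file name and columns are illustrative only.
mkdir -p data
cat > data/project_visibility.csv <<'EOF'
visibility_level,visibility_name
0,private
10,internal
20,public
EOF

# Sanity check: one header line plus three data rows.
wc -l < data/project_visibility.csv

# With dbt installed and a profile configured, the file would then be loaded with:
#   dbt seed --show
```

dbt names the resulting table after the CSV file, so a model could then select from it with `{{ ref('project_visibility') }}`.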
Sample project
- Initialize
dbt init gitlab-data
- Configure the project (dbt_project.yml)
# Name your package! Package names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'gitlab'
version: '1.0'
# This setting configures which "profile" dbt uses for this project. Profiles contain
# database connection information, and should be configured in the ~/.dbt/profiles.yml file
profile: 'default'
# These configurations specify where dbt should look for different types of files.
# The `source-paths` config, for example, states that source models can be found
# in the "models/" directory. You probably won't need to change these!
source-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
data-paths: ["data"] # seed CSV files go here
macro-paths: ["macros"]
target-path: "target" # directory which will store compiled SQL files
clean-targets: # directories to be removed by `dbt clean`
  - "target"
  - "dbt_modules"
# You can define configurations for models in the `source-paths` directory here.
# Using these configurations, you can enable or disable models, change how they
# are materialized, and more!
# In this config, we tell dbt to treat all models in the gitlab/base/ directory
# as ephemeral instead of building them as views (the default)
models:
  gitlab:
    gitlab:
      base:
        materialized: ephemeral # ephemeral is the recommended setting for base models
- Add models
models/gitlab/base/gitlab_projectinfo.sql:
select * from projects
models/gitlab/transform/gitlab_project_counts.sql:
select * from {{ref('gitlab_projectinfo')}}
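The two models can be laid out as sketched below, assuming the `models` directory from `source-paths` so that the paths line up with the `gitlab: gitlab: base:` block in the project config. The compiled SQL shown in the comments is an approximation, not captured output.

```shell
# Create the directory layout that the ephemeral config for base models targets.
mkdir -p models/gitlab/base models/gitlab/transform

# Base model: a thin select over the raw GitLab table.
cat > models/gitlab/base/gitlab_projectinfo.sql <<'EOF'
select * from projects
EOF

# Downstream model: depends on the base model through ref().
cat > models/gitlab/transform/gitlab_project_counts.sql <<'EOF'
select * from {{ ref('gitlab_projectinfo') }}
EOF

# With dbt installed, `dbt compile` writes the rendered SQL under target/.
# Because the base model is ephemeral, no view or table is created for it;
# its query is expected to be inlined into the downstream model as a CTE, roughly:
#   with __dbt__CTE__gitlab_projectinfo as (select * from projects)
#   select * from __dbt__CTE__gitlab_projectinfo
# (the exact CTE name depends on the dbt version)
```

After `dbt run`, only `gitlab_project_counts` should therefore appear in the target schema.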
Profile configuration
~/.dbt/profiles.yml
default:
  target: dev
  outputs:
    dev:
      type: postgres
      host: 127.0.0.1
      user: postgres
      pass: password
      port: 5432
      dbname: gitlabhq_production
      schema: public
      threads: 3
pg:
  target: dev
  outputs:
    dev:
      type: postgres
      host: 127.0.0.1
      user: postgres
      pass: password
      port: 5433
      dbname: gitlabhq_production
      schema: public
      threads: 3
Run, test, and docs
- Run
dbt run && dbt seed --show && dbt docs generate && dbt docs serve
- Result
References
https://github.com/rongfengliang/graphql-engine-gitlab
https://docs.getdbt.com/docs/configuring-models
https://docs.getdbt.com/docs/best-practices
https://docs.getdbt.com/reference#seed