Airflow exploration

An Airflow Python script is essentially a configuration file that specifies the DAG's structure as code. The script's purpose is to define a DAG object.

STEPS

1. Import modules.

2. Set default arguments: a dictionary of arguments that is passed to each task's constructor.

3. Instantiate a DAG.
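The steps above can be sketched as a small DAG file. This is only a minimal sketch, assuming Airflow 2.x import paths (Airflow 1.10 uses airflow.operators.python_operator instead); the DAG id, task id, dates, default argument values and the say_hello function are made up for illustration.

```python
from datetime import datetime, timedelta

# Import modules (Airflow 2.x import paths are an assumption here).
from airflow import DAG
from airflow.operators.python import PythonOperator


def say_hello():
    """Hypothetical task body, used only for illustration."""
    print("hello from Airflow")


# Default arguments: applied to every task created in this DAG.
default_args = {
    "owner": "airflow",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

# Instantiate the DAG. The id, start date and schedule are illustrative.
dag = DAG(
    dag_id="example_exploration",
    default_args=default_args,
    start_date=datetime(2020, 4, 1),
    schedule_interval=None,  # manual trigger only; newer releases call this `schedule`
    catchup=False,
)

# A single task attached to the DAG.
hello_task = PythonOperator(
    task_id="say_hello",
    python_callable=say_hello,
    dag=dag,
)
```

The file only declares structure; the scheduler decides when say_hello actually runs.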

How to pass parameters to the Python callable when using the PythonOperator:

There are actually two ways of passing parameters.

  • First, we can use the op_args parameter, which is a list of positional arguments that gets unpacked when the callable is called.
  • Second, we can use the op_kwargs parameter, which is a dictionary of keyword arguments that gets unpacked when the callable is called.
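To see what "unpacked" means here, the sketch below imitates the call in plain Python, without Airflow. It assumes the operator ends up calling the callable roughly as python_callable(*op_args, **op_kwargs); the greet function and its values are hypothetical. In a DAG file the same two values would be passed as PythonOperator(task_id=..., python_callable=greet, op_args=op_args, op_kwargs=op_kwargs).

```python
def greet(greeting, name, punctuation="!"):
    """Hypothetical task callable: builds a greeting string."""
    return f"{greeting}, {name}{punctuation}"


# Values that would be handed to the operator as op_args and op_kwargs.
op_args = ["Hello"]                                   # positional arguments
op_kwargs = {"name": "Airflow", "punctuation": "?"}   # keyword arguments

# Plain-Python equivalent of the unpacking the operator performs.
result = greet(*op_args, **op_kwargs)
print(result)
```

The list fills the positional parameters in order, and the dictionary keys must match the callable's parameter names.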

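Returning a result from a PythonOperator:

Use XComs in Airflow. The value returned by the callable is stored as an XCom, and a downstream task can pull it.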
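A rough sketch of passing a return value between two tasks through XComs, again assuming Airflow 2.x (in Airflow 1.10 the import path is airflow.operators.python_operator and the consuming task also needs provide_context=True). The DAG id, task ids and function names are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator  # Airflow 2.x path (assumption)


def produce():
    """The return value is pushed to XCom under the default key."""
    return 42


def consume(**context):
    """Pull the value that the upstream task returned."""
    value = context["ti"].xcom_pull(task_ids="produce")
    print(f"pulled value: {value}")


dag = DAG(
    dag_id="example_xcom",
    start_date=datetime(2020, 4, 1),
    schedule_interval=None,  # manual trigger only; illustrative
    catchup=False,
)

produce_task = PythonOperator(task_id="produce", python_callable=produce, dag=dag)
consume_task = PythonOperator(task_id="consume", python_callable=consume, dag=dag)

# consume must run after produce so the XCom exists when it is pulled.
produce_task >> consume_task
```

XComs are meant for small values such as ids or flags, not for large data sets.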

Run your own DAGs:

1. For a DAG file to be visible to the Scheduler (and consequently the Webserver), add it to the dags_folder specified in airflow.cfg (e.g. /home/wenjing/airflow/dags).

2. Restart the scheduler:

airflow scheduler

3. Wait until the current Scheduler process picks up the new DAGs.

Supplementary knowledge:

1. Python arguments

Variable arguments: *args and **kwargs. The special syntax *args in a Python function definition lets the function accept a variable number of arguments; it collects a non-keyworded, variable-length argument list. **kwargs collects a variable-length set of keyword arguments. Also, *args must come before **kwargs, because positional arguments must precede keyword arguments.

The *args and **kwargs syntax can be used not only in function definitions but also in function calls. The difference is that in a function definition it packs the arguments, whereas in a function call it unpacks them.
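A small example of both directions, packing in a definition and unpacking in a call. The function names and values are made up for illustration.

```python
def describe(*args, **kwargs):
    """Packing: extra positional arguments are collected into the tuple
    `args`, extra keyword arguments into the dict `kwargs`."""
    return args, kwargs


def add(a, b, c=0):
    """An ordinary function with fixed parameters."""
    return a + b + c


# Packing happens at the definition side.
packed = describe(1, 2, x=3)
print(packed)  # ((1, 2), {'x': 3})

# Unpacking happens at the call side.
numbers = [1, 2]
options = {"c": 10}
total = add(*numbers, **options)  # same as add(1, 2, c=10)
print(total)  # 13
```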

References:

  1. https://marclamberti.com/blog/airflow-pythonoperator/
  2. https://stackoverflow.com/questions/50149085/python-airflow-return-result-from-pythonoperator
  3. https://kodango.com/variable-arguments-in-python
  4. https://stackoverflow.com/questions/38992997/dag-not-visible-in-web-ui
posted @ 2020-04-15 19:24  keeps_you_warm