Airflow exploration
An Airflow Python script is really just a configuration file that specifies the DAG's structure as code. The script's purpose is to define a DAG object.
STEPS
1. Import modules.
2. Default arguments: define a dictionary of arguments that is passed to each task's constructor.
3. Instantiate a DAG:
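The three steps above might look like the following in a DAG file. This is a minimal sketch assuming Airflow 2.x import paths; the dag_id, the default_args values, and the say_hello task are illustrative names, not taken from a real project.

```python
from datetime import datetime, timedelta

# Step 1: import modules (Airflow 2.x import paths are assumed here).
from airflow import DAG
from airflow.operators.python import PythonOperator

# Step 2: default arguments, passed to the constructor of every task in the DAG.
default_args = {
    "owner": "airflow",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

# Step 3: instantiate the DAG object that the script exists to define.
dag = DAG(
    dag_id="example_dag",  # hypothetical name
    default_args=default_args,
    start_date=datetime(2021, 1, 1),
    catchup=False,
)


def say_hello():
    print("hello")


# A single illustrative task attached to the DAG.
hello_task = PythonOperator(
    task_id="say_hello",
    python_callable=say_hello,
    dag=dag,
)
```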
Passing parameters to the callable function through the PythonOperator:
There are two ways of passing parameters.
- First, we can use the op_args parameter, which is a list of positional arguments that is unpacked when the callable function is called.
- Second, we can use the op_kwargs parameter, which is a dictionary of keyword arguments that is unpacked when the callable function is called.
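The effect of the two parameters can be sketched in plain Python without a running Airflow installation. The greet function and its argument values are hypothetical; the last line imitates the call the operator makes when the task executes.

```python
def greet(greeting, name, punctuation="!"):
    # Hypothetical callable that a PythonOperator would run.
    return f"{greeting}, {name}{punctuation}"


op_args = ["Hello"]                                   # positional arguments
op_kwargs = {"name": "Airflow", "punctuation": "?"}   # keyword arguments

# In a DAG file these would be handed to the operator, for example:
#   PythonOperator(task_id="greet", python_callable=greet,
#                  op_args=op_args, op_kwargs=op_kwargs)
# At run time the operator calls the function roughly like this:
result = greet(*op_args, **op_kwargs)
print(result)
```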
Returning a result from the PythonOperator:
Use XComs in Airflow.
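A minimal sketch of passing a result between tasks with XComs, assuming Airflow 2.x: the value returned by the callable is pushed to XCom under the default key return_value, and a downstream task reads it with xcom_pull. The DAG id, task ids, and payload are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def produce():
    # The return value is pushed to XCom under the key "return_value".
    return {"rows": 42}


def consume(ti):
    # Airflow 2.x passes the task instance as `ti` when the callable asks for it.
    # (Airflow 1.10 would need provide_context=True on the operator.)
    payload = ti.xcom_pull(task_ids="produce")
    print(f"received {payload}")


with DAG(
    dag_id="xcom_example",  # hypothetical name
    start_date=datetime(2021, 1, 1),
    catchup=False,
) as dag:
    produce_task = PythonOperator(task_id="produce", python_callable=produce)
    consume_task = PythonOperator(task_id="consume", python_callable=consume)

    # consume runs after produce so that the XCom value already exists.
    produce_task >> consume_task
```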
Run your own DAGs:
1. For a DAG file to be visible to the Scheduler (and consequently the Webserver), add it to the dags_folder (specified in airflow.cfg, e.g. /home/wenjing/airflow/dags).
2. Restart the scheduler:
airflow scheduler
3. Wait until the current Scheduler process picks up the new DAGs.
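To check which folder the scheduler scans, the dags_folder setting can be read from airflow.cfg, which is an INI-style file with the setting under the [core] section. The sketch below parses a small sample string instead of the real file; the read_dags_folder helper is a hypothetical name, and the sample reuses the example path from step 1.

```python
import configparser


def read_dags_folder(cfg_text):
    # interpolation=None keeps literal "%" characters in the file from
    # being treated as configparser placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.get("core", "dags_folder")


# Sample contents standing in for a real airflow.cfg.
sample_cfg = """
[core]
dags_folder = /home/wenjing/airflow/dags
"""

dags_folder = read_dags_folder(sample_cfg)
print(dags_folder)
```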
Supplementary knowledge:
1. Python arguments
Variable arguments: *args and **kwargs. The special syntax *args in a function definition lets the function accept a variable number of positional (non-keyword) arguments, which are collected into a tuple. **kwargs does the same for keyword arguments, which are collected into a dictionary. *args must come before **kwargs, because positional arguments must come before keyword arguments.
The *args and **kwargs syntax can be used not only in function definitions but also in function calls. The difference is that in a function definition *args and **kwargs pack the arguments, whereas in a function call they unpack them.
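A short sketch of both directions, using made-up function names: describe packs whatever it receives, and the call to area unpacks a list and a dictionary into arguments.

```python
def describe(*args, **kwargs):
    # In a definition, extra positional arguments are packed into a tuple
    # and extra keyword arguments are packed into a dict.
    return args, kwargs


packed_args, packed_kwargs = describe(1, 2, unit="cm")


def area(width, height, unit="m"):
    return f"{width * height} {unit}^2"


dims = [3, 4]
options = {"unit": "cm"}

# In a call, * unpacks the list into positional arguments
# and ** unpacks the dict into keyword arguments.
result = area(*dims, **options)
print(packed_args, packed_kwargs, result)
```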