[Kubeflow] 00 - Introduction to Kubeflow
Ref: Introduction to Kubeflow: Fundamentals [a good course]
Ref: Introduction [official documentation]
Ref: Kubeflow Pipelines standalone [the part I need]
** KubeFlow **
Ref: The Best Task Orchestration Tools: Airflow vs Luigi vs Argo vs MLFlow vs KubeFlow
Argo vs. Kubeflow
Parts of Kubeflow (like Kubeflow Pipelines) are built on top of Argo, but Argo is built to orchestrate any task, while Kubeflow focuses on those specific to machine learning – such as experiment tracking, hyperparameter tuning, and model deployment. Kubeflow Pipelines is a separate component of Kubeflow that focuses on model deployment and CI/CD and can be used independently of Kubeflow’s other features. Both tools rely on Kubernetes and are likely to be more interesting to you if you’ve already adopted that. With Argo, you define your tasks using YAML, while Kubeflow allows you to use a Python interface instead.
- Use Argo if you need to manage a DAG of general tasks running as Kubernetes pods.
- Use Kubeflow if you want a more opinionated tool focused on machine learning solutions.
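To make the "YAML for Argo" point concrete, here is a minimal sketch that builds an Argo-style DAG Workflow manifest as a plain Python dictionary and serializes it. The workflow name, task names, and images are hypothetical, and this is only an illustration of the manifest shape, not the Kubeflow Pipelines SDK. The output is JSON, which a YAML parser should also accept.

```python
import json


def container_template(name, image, command):
    """Build one container template (the definition of a single task)."""
    return {"name": name, "container": {"image": image, "command": command}}


def dag_task(name, dependencies=()):
    """Build one DAG task entry that points at the template of the same name."""
    task = {"name": name, "template": name}
    if dependencies:
        task["dependencies"] = list(dependencies)
    return task


def build_workflow():
    """Assemble a three-step manifest: preprocess -> train -> evaluate."""
    image = "python:3.11"  # hypothetical image, chosen only for illustration
    names = ["preprocess", "train", "evaluate"]
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "train-demo-"},  # hypothetical name
        "spec": {
            "entrypoint": "main",
            "templates": [
                {
                    "name": "main",
                    "dag": {
                        "tasks": [
                            dag_task("preprocess"),
                            dag_task("train", ["preprocess"]),
                            dag_task("evaluate", ["train"]),
                        ]
                    },
                },
            ]
            + [
                container_template(n, image, ["python", "-c", f"print('{n}')"])
                for n in names
            ],
        },
    }


workflow = build_workflow()
manifest = json.dumps(workflow, indent=2)
print(manifest)
```

The DAG is expressed purely through the `dependencies` lists, which is the part a Python interface such as the one Kubeflow offers would normally generate for you.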
Ref: Argo: Kubernetes Native Workflows and Pipelines | Canva
Canva's choice
At Canva, we leverage Argo to schedule and run all model trainers on our Kubernetes clusters.
The map-reduce approach requires two kinds of application containers with differing responsibilities:
- Optimizer: A single container generating the next batch of hyperparameters to explore based on all previous hyperparameters and model evaluation results.
- Model Trainers: A batch of model trainer containers that accepts hyperparameter values and returns pre-defined evaluation metrics.
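The optimizer/trainer split above can be sketched in plain Python. This is a toy under stated assumptions, not Canva's implementation: the "model" is a made-up quadratic loss with its minimum at `lr = 0.1`, the optimizer is a simple "sample around the best point so far" heuristic, and threads stand in for the trainer containers.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def train_model(params):
    """Model trainer: accept hyperparameters, return a pre-defined metric."""
    # Toy loss (an assumption for illustration): minimum at lr = 0.1.
    return {"loss": (params["lr"] - 0.1) ** 2}


def propose_batch(history, batch_size, rng):
    """Optimizer: propose the next batch from all previous (params, metrics)."""
    if not history:
        # No results yet: sample the whole search range.
        return [{"lr": rng.uniform(0.0, 1.0)} for _ in range(batch_size)]
    best_params, _ = min(history, key=lambda item: item[1]["loss"])
    # Explore around the best point seen so far, clamped to [0, 1].
    return [
        {"lr": min(1.0, max(0.0, best_params["lr"] + rng.gauss(0.0, 0.05)))}
        for _ in range(batch_size)
    ]


def search(rounds=5, batch_size=4, seed=0):
    """Alternate between one optimizer call and one parallel batch of trainers."""
    rng = random.Random(seed)
    history = []
    best_per_round = []
    for _ in range(rounds):
        batch = propose_batch(history, batch_size, rng)
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            metrics = list(pool.map(train_model, batch))  # "map": run trainers
        history.extend(zip(batch, metrics))               # "reduce": gather results
        best_per_round.append(min(m["loss"] for _, m in history))
    return history, best_per_round


history, best_per_round = search()
print(best_per_round)
```

In a workflow engine, each `train_model` call would be its own pod and `propose_batch` a single pod that runs between batches.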
Ref: A Quick Analysis of How KubeFlow Pipelines and Argo Work Internally
Based on Argo
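The article describes how, in Argo, "the container of the next step obtains the result of the previous step's container".
Deploying Argo is simple: you only need to start one K8s controller. Deploying a Kubeflow Pipelines system is far more complex, with 8 components in total. So where does Argo fall short, such that a new system (KFP) had to be developed and made this complicated? The main reason is that Argo follows the K8s cloud-native approach, that is, it runs with etcd acting as its "database", which imposes fairly heavy constraints.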
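The idea of one step's result being handed to the next step can be simulated locally. The sketch below is not Argo's implementation: each "step" is a plain function that writes its result to a file in its own directory, a tiny controller loop reads that file back, and a simplified placeholder syntax (`{{steps.<name>.output}}`, a stand-in invented for this example) is substituted into the next step's arguments.

```python
import tempfile
from pathlib import Path


def run_step(func, args, workdir, output_file="result.txt"):
    """Run one step in its own directory and capture its output file."""
    step_dir = Path(tempfile.mkdtemp(dir=workdir))
    func(step_dir / output_file, *args)
    return (step_dir / output_file).read_text()


def generate(out_path):
    """First step: produce a value by writing it to the output file."""
    out_path.write_text("hello")


def consume(out_path, message):
    """Second step: receive the previous result as an ordinary argument."""
    out_path.write_text(message.upper())


def render(arg, outputs):
    """Replace '{{steps.<name>.output}}' placeholders with captured outputs."""
    for name, value in outputs.items():
        arg = arg.replace("{{steps.%s.output}}" % name, value)
    return arg


def run_pipeline():
    """Controller loop: run steps in order, threading outputs into later inputs."""
    steps = [
        ("generate", generate, []),
        ("consume", consume, ["{{steps.generate.output}}"]),
    ]
    outputs = {}
    with tempfile.TemporaryDirectory() as workdir:
        for name, func, raw_args in steps:
            args = [render(a, outputs) for a in raw_args]
            outputs[name] = run_step(func, args, workdir)
    return outputs


outputs = run_pipeline()
print(outputs)
```

The steps never talk to each other directly; the controller is the only party that sees both the captured output and the next step's argument list.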
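Why are there so many "workflow engines" in the world?
- A big-data step says: "The SQL statement to execute in this step is xxx",
- while a K8s task step says: "The Docker image this step needs to run is yyy".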
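- AWS: CloudFormation orchestration, Batch service, SageMaker ML Pipeline, Data Pipeline
- Azure: Pipeline service, ML Pipeline, Data Factory
- Aliyun: function Pipeline service, ROS resource orchestration, Batch service, PAI-Studio
- Big data: Oozie, Airflow
- Software deployment: Puppet, Chef, Ansible
- Genomic analysis: DNAnexus, Nextflow, Cromwell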
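- Layer 1: user interaction layer. For example: template syntax rules, the console UI, etc.
- Layer 2: API persistence layer. For example: template records, execution history records, etc.
- Layer 3: engine instance layer. For example: whether it can scale horizontally, whether workflows have priorities, etc.
- Layer 4: driver layer. For example: what kind of work a single step can do, such as running a container or running a Spark job.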
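The driver layer is what makes a SQL-oriented engine differ from a container-oriented one, and a small sketch can show that. Everything here is hypothetical (the `Step`, `SqlDriver`, `ContainerDriver`, and `Engine` names are invented for illustration): the drivers only return a description of what they would do, and an in-memory list stands in for the persistence layer.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    kind: str   # selects the driver, e.g. "sql" or "container"
    spec: dict  # driver-specific payload


class SqlDriver:
    """Driver for big-data style steps: the step is described by a SQL statement."""

    def run(self, step):
        return f"would execute SQL: {step.spec['statement']}"


class ContainerDriver:
    """Driver for K8s style steps: the step is described by a Docker image."""

    def run(self, step):
        return f"would start image: {step.spec['image']}"


@dataclass
class Engine:
    drivers: dict                                 # driver layer: kind -> driver
    history: list = field(default_factory=list)   # stand-in for persisted run records

    def submit(self, steps):
        """Engine layer: run steps in order, dispatching each to its driver."""
        for step in steps:
            driver = self.drivers.get(step.kind)
            if driver is None:
                raise ValueError(f"no driver registered for kind {step.kind!r}")
            self.history.append((step.name, driver.run(step)))
        return self.history


engine = Engine(drivers={"sql": SqlDriver(), "container": ContainerDriver()})
records = engine.submit([
    Step("extract", "sql", {"statement": "xxx"}),
    Step("train", "container", {"image": "yyy"}),
])
print(records)
```

Swapping or adding a driver changes what a step can do without touching the engine loop, which is one way to read why each domain ends up with its own engine.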
** Kubernetes **
Ref: Kubeflow Fundamentals - How To Build ML/AI Pipelines
Ref: [K8S] 00 - Kubernetes Arch
Ref: Kubeflow Series, Part 1: Kubeflow Overview and Feature Introduction