Overall Architecture of TVM Modules


 

  

 

 Deploy Deep Learning Everywhere

 

 

 Existing Deep Learning Frameworks

 

 

 Limitations of Existing Approach

 

 

 Learning-based Learning System

 

 

 Problem Setting

 

 

 Example Instance in a Search Space

 

 

 

 

  Optimization Choices in a Search Space

 

 

 Problem Formalization

 

 

 Black-box Optimization
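
As a rough illustration of the black-box setting named in this slide title, the sketch below samples configurations from a small search space, "measures" each one, and keeps the best. Everything in it is an assumption made for illustration: the tile-size search space, the `measure` function (a synthetic cost, not a hardware timing), and the names `search_space` and `random_search` are hypothetical and do not come from the slides or from TVM's API.

```python
import itertools
import random

# Hypothetical search space: three tile sizes for a tiled matmul schedule.
TILE_CHOICES = [1, 2, 4, 8, 16, 32]


def search_space():
    """Enumerate every (tile_y, tile_x, tile_k) configuration."""
    return list(itertools.product(TILE_CHOICES, repeat=3))


def measure(config):
    """Stand-in for compiling and timing a program on real hardware.

    This is a synthetic cost (an assumption): it grows as the tiles move
    away from a made-up sweet spot of 8 and has a floor of 1.0.
    """
    return sum((t - 8) ** 2 for t in config) + 1.0


def random_search(n_trials, seed=0):
    """Black-box search: sample configs, measure each, keep the cheapest."""
    rng = random.Random(seed)
    space = search_space()
    best_config, best_cost = None, float("inf")
    for config in rng.sample(space, n_trials):
        cost = measure(config)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost


if __name__ == "__main__":
    config, cost = random_search(n_trials=20)
    print("best config:", config, "cost:", cost)
```

The point of the sketch is that the search only ever sees the measured cost, so every candidate it considers costs one full measurement.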

 

 

 Cost-model Driven Approach
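
A minimal sketch of a cost-model driven loop, under stated assumptions: a model fitted to past measurements ranks the unmeasured candidates, and only the top-ranked batch is actually measured. The 1-nearest-neighbour predictor here is a deliberately simple stand-in chosen for this example, not the model used in the slides; the search space, the synthetic `measure` function, and the names `predict` and `model_guided_search` are likewise hypothetical.

```python
import itertools
import random

# Hypothetical search space: three tile sizes for a tiled matmul schedule.
TILE_CHOICES = [1, 2, 4, 8, 16, 32]
SPACE = list(itertools.product(TILE_CHOICES, repeat=3))


def measure(config):
    """Synthetic stand-in for a hardware measurement (an assumption)."""
    return sum((t - 8) ** 2 for t in config) + 1.0


def predict(config, history):
    """1-nearest-neighbour cost model: reuse the cost of the closest
    already-measured configuration as the prediction."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = min(history, key=lambda measured: dist(measured, config))
    return history[nearest]


def model_guided_search(n_rounds, batch, seed=0):
    """Alternate between ranking candidates with the model and measuring
    the most promising batch; return the best config, its cost, and the
    number of measurements spent."""
    rng = random.Random(seed)
    # Bootstrap the model with one random batch of real measurements.
    history = {c: measure(c) for c in rng.sample(SPACE, batch)}
    for _ in range(n_rounds):
        candidates = [c for c in SPACE if c not in history]
        if not candidates:
            break
        # Cheap step: rank by predicted cost instead of measuring everything.
        candidates.sort(key=lambda c: predict(c, history))
        # Expensive step: measure only the top-ranked batch.
        for c in candidates[:batch]:
            history[c] = measure(c)
    best = min(history, key=history.get)
    return best, history[best], len(history)


if __name__ == "__main__":
    best, cost, n_measured = model_guided_search(n_rounds=3, batch=8)
    print("best config:", best, "cost:", cost, "measurements:", n_measured)
```

Compared with pure black-box search, the measurements are spent where the model predicts low cost, and each new measurement also becomes training data for the next round.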

 

 

 Statistical Cost Model

 

 

 Unique Problem Characteristics

 

 

 Vanilla Cost Modeling

 

 

 Program-aware Modeling: Tree-based Approach

 

 

 Program-aware Modeling: Neural Approach

 

 

 Comparisons of Models

 

 

 Unique Problem Characteristics

 

 

 Transferable Cost Model

 

 

 Impact of Transfer Learning

 

 

 Learning to Optimize Tensor Programs

 

 

 Device Fleet: Distributed Test Bed for AutoTVM

 

 

 TVM: End to End Deep Learning Compiler

 

 

 Tensor Expression and Optimization Search Space

 

 

 Search Space for CPUs

 

 

 Hardware-aware Search Space

 

 

 Search Space for GPUs

 

 

 Search Space for TPU-like Specialized Accelerators

 

 

 Tensorization Challenge

 

 


 

 

 Search Space for TPU-like Specialized Accelerators

 

 

 Software Support for Latency Hiding

 

 

 

 

 Summary: Hardware-aware Search Space

 

 

 VTA: Open & Flexible Deep Learning Accelerator

 

 

 TVM: End to End Deep Learning Compiler

 

 

 Need for More Dynamism

 

 

 Relay Virtual Machine

 

 

 uTVM: TVM on bare-metal Devices

 

 

 Core Infrastructure

 

 

 TSIM: Support for Future Hardware

 

 

 Unified Runtime For Heterogeneous Devices

 

 

 Unified Runtime Benefit

 

 

 Effectiveness of ML based Model

 

 

 Comparisons of Models

 

 

 Device Fleet in Action

 

 

 End to End Inference Performance (Nvidia Titan X)

 

 

 Portable Performance Across Hardware Platforms

 

 

posted @ 吴建明wujianming