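Summary: Compute Efficiency in Autonomous Driving. The Efficiency Puzzle of Tesla Hardware 3.0. In its Hardware 3.0 autonomous driving platform, Tesla replaced the Nvidia Drive PX2 with a self-developed chip. Theoretical compute rose 12-fold, and when evaluated with the MAPS method, real AI performance rose 21-fold. Specifically, Hard…
posted @ 2021-05-30 12:57 吴建明wujianming Views (505) Comments (0) Recommendations (0)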
Summary: Ascend AI Full-Stack Software and Hardware Platform
posted @ 2021-05-30 11:12 吴建明wujianming Views (134) Comments (0) Recommendations (0)
Summary: TVM Performance Evaluation and Analysis (Part 7). Figure 1. Performance Improvement. Figure 2. Depthwise convolution. Figure 3. Data Fusion. Figure 4. Data Fusion (2). Figure 5. Shared memory…
posted @ 2021-05-30 08:52 吴建明wujianming Views (149) Comments (0) Recommendations (0)
Summary: TVM Performance Evaluation and Analysis (Part 6). Figure 1. The development workflow: on the PC, compile, deploy to the device, test, then modify the code again to see whether it runs faster.
posted @ 2021-05-30 07:55 吴建明wujianming Views (94) Comments (0) Recommendations (0)
Summary: TVM Performance Evaluation and Analysis (Part 5). Figure 3. A further speedup with operator fusion. Table 1. Performance issue of cuBLAS’ batch matmul. Table 2. Finding the best combination…
posted @ 2021-05-30 07:29 吴建明wujianming Views (160) Comments (0) Recommendations (0)
Summary: TVM Performance Evaluation and Analysis (Part 4). Figure 1. Efficient Privacy-Preserving ML Using TVM. Figure 2. Motivation: Privacy-Preserving ML. Figure 3. Backend. Figure 4. Differential…
posted @ 2021-05-30 07:05 吴建明wujianming Views (102) Comments (0) Recommendations (0)
Summary: TVM Performance Evaluation and Analysis (Part 3). Figure 1. TVM’s WebGPU backend gets close to native GPU performance when deploying models to the web. Figure 2. WebGPU is to write shaders for…
posted @ 2021-05-30 06:27 吴建明wujianming Views (90) Comments (0) Recommendations (0)
Summary: TVM Performance Evaluation and Analysis (Part 2). Figure 1. A bird’s eye view of the µTVM + AutoTVM infrastructure. Figure 2. A standard µTVM setup, where the host communicates with the de…
posted @ 2021-05-30 06:00 吴建明wujianming Views (117) Comments (0) Recommendations (0)
Summary: TVM Performance Evaluation and Analysis (Part 1). System Overview. AutoTVM vs Auto-scheduler. Table 1. Workflow Comparison. Figure 1. Search Process Overview. Figure 2. Code Performance Comp…
posted @ 2021-05-30 05:41 吴建明wujianming Views (113) Comments (0) Recommendations (0)