CUDA Programming Figures

Figure 7. Matrix Multiplication without Shared Memory

Figure 8. Matrix Multiplication with Shared Memory

Figure 20. Examples of Global Memory Accesses by a Warp, 4-Byte Word per Thread, and Associated Memory Transactions for Compute Capabilities 3.x and Beyond
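
Since the figure itself is not reproduced here, the following host-side C++ sketch may help illustrate the idea behind it: how many aligned memory segments a warp touches when each of its 32 threads reads one 4-byte word. The function name `segments_touched` and the default 32-byte segment size are assumptions made for illustration; the actual transaction granularity depends on the device and its cache configuration.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical helper (not part of any CUDA API): counts the distinct aligned
// segments of `segment_bytes` touched when thread t of a 32-thread warp reads
// the 4-byte word at byte address base_offset + t * stride_words * 4.
// The 32-byte default is an assumed granularity, not a device query.
std::size_t segments_touched(std::uint64_t base_offset,
                             std::uint64_t stride_words,
                             std::uint64_t segment_bytes = 32) {
    constexpr std::uint64_t kWarpSize = 32;
    constexpr std::uint64_t kWordBytes = 4;
    std::set<std::uint64_t> segments;
    for (std::uint64_t tid = 0; tid < kWarpSize; ++tid) {
        const std::uint64_t first = base_offset + tid * stride_words * kWordBytes;
        const std::uint64_t last = first + kWordBytes - 1;
        // A word may straddle two segments if base_offset is not word-aligned.
        segments.insert(first / segment_bytes);
        segments.insert(last / segment_bytes);
    }
    return segments.size();
}
```

Under these assumptions, an aligned sequential access (`segments_touched(0, 1)`) covers 128 contiguous bytes and touches 4 segments, while shifting the same access by one word (`segments_touched(4, 1)`) touches 5. Doubling the stride spreads the same 32 words over 8 segments.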

Figure 21. Strided Shared Memory Accesses. Examples for devices of compute capability 3.x (in 32-bit mode) or compute capability 5.x and 6.x
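
As a rough stand-in for the missing figure, the host-side C++ sketch below models strided shared memory access under the usual assumption for these devices: 32 banks, with successive 32-bit words mapped to successive banks. The function name `conflict_degree` is hypothetical and only serves this illustration. It returns the largest number of distinct words that land in the same bank when thread t of a warp accesses word index t * stride.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <set>

// Hypothetical model (not a CUDA API): assumes 32 banks of 32-bit words and a
// 32-thread warp. Thread t accesses word index t * stride_words.
// Returns the worst-case number of distinct words mapped to a single bank,
// i.e. 1 means conflict-free and n means an n-way bank conflict.
std::size_t conflict_degree(std::size_t stride_words) {
    constexpr std::size_t kWarpSize = 32;
    constexpr std::size_t kNumBanks = 32;
    std::array<std::set<std::size_t>, kNumBanks> words_per_bank;
    for (std::size_t tid = 0; tid < kWarpSize; ++tid) {
        const std::size_t word = tid * stride_words;
        // Threads reading the same word share one entry (modelled as a broadcast).
        words_per_bank[word % kNumBanks].insert(word);
    }
    std::size_t worst = 0;
    for (const auto& words : words_per_bank) {
        worst = std::max(worst, words.size());
    }
    return worst;
}
```

In this model a stride of one or three words is conflict-free (`conflict_degree(1)` and `conflict_degree(3)` both return 1), while a stride of two words gives a two-way conflict. More generally, odd strides avoid conflicts because they share no common factor with the 32 banks.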

Figure 22. Irregular Shared Memory Accesses. Examples for devices of compute capability 3.x, 5.x, or 6.x.

Reference link:

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions__throughput-native-arithmetic-instructions

Posted 2021-12-07 06:12 by 吴建明wujianming
