Summary: autocast op reference. Op Eligibility: Ops that run in float64 or non-floating-point dtypes are not eligible, and will run in these types whether or not …
posted @ 2022-05-18 17:21 xuyv
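The eligibility rule quoted in the summary above can be checked with a small sketch. It assumes a PyTorch build that provides `torch.autocast` with `device_type="cpu"` and `torch.bfloat16`; the tensor names and the choice of `torch.mm` as the test op are illustrative assumptions, not taken from the post.

```python
import torch

# Illustrative inputs (names are hypothetical, not from the original post).
a64 = torch.randn(4, 4, dtype=torch.float64)
b64 = torch.randn(4, 4, dtype=torch.float64)
a32 = torch.randn(4, 4, dtype=torch.float32)
b32 = torch.randn(4, 4, dtype=torch.float32)
ai = torch.ones(4, 4, dtype=torch.int64)
bi = torch.ones(4, 4, dtype=torch.int64)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out64 = torch.mm(a64, b64)   # float64 inputs: not eligible for autocast
    out_int = torch.mm(ai, bi)   # integer inputs: not eligible for autocast
    out32 = torch.mm(a32, b32)   # float32 inputs: eligible, mm runs in lower precision

print(out64.dtype, out_int.dtype, out32.dtype)
```

The float64 and integer results keep their input dtypes, while the float32 matrix multiply is the only one that autocast is expected to lower.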
Summary: CTA: Cooperative Thread Array, i.e. a CUDA block. https://github.com/NVIDIA/cuda-samples/blob/2e41896e1b2c7e2699b7b7f6689c107900c233bb/Samples/3_CUDA_Feature …
posted @ 2022-05-18 16:18 xuyv
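Since a CTA is simply a CUDA thread block, the way threads are grouped into CTAs can be sketched on the host side. The following is a plain-Python simulation of a 1-D launch, not CUDA code and not taken from the linked sample; the function name `global_thread_ids` and its parameters are hypothetical, and the only CUDA fact it relies on is the standard index formula `blockIdx.x * blockDim.x + threadIdx.x`.

```python
def global_thread_ids(grid_dim, block_dim):
    """Enumerate (block_idx, thread_idx, global_id) for a simulated 1-D launch.

    grid_dim  : number of CTAs (thread blocks) in the grid
    block_dim : number of threads per CTA
    """
    return [
        (block_idx, thread_idx, block_idx * block_dim + thread_idx)
        for block_idx in range(grid_dim)
        for thread_idx in range(block_dim)
    ]

# Two CTAs of four threads each: eight threads in total.
ids = global_thread_ids(grid_dim=2, block_dim=4)
for block_idx, thread_idx, global_id in ids:
    print(f"CTA {block_idx}, thread {thread_idx} -> global id {global_id}")
```

Each CTA owns a contiguous range of global ids here, which is the usual layout when a 1-D array is split across blocks.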