AMP

Autocast Op Reference

Op Eligibility

Ops that run in float64 or non-floating-point dtypes are not eligible, and will run in these types whether or not autocast is enabled.

Only out-of-place ops and Tensor methods are eligible. In-place variants and calls that explicitly supply an out=... Tensor are allowed in autocast-enabled regions, but won’t go through autocasting. For example, in an autocast-enabled region a.addmm(b, c) can autocast, but a.addmm_(b, c) and a.addmm(b, c, out=d) cannot. For best performance and stability, prefer out-of-place ops in autocast-enabled regions.

Ops called with an explicit dtype=... argument are not eligible, and will produce output that respects the dtype argument.
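These rules can be checked directly from output dtypes. Below is a minimal sketch, assuming PyTorch 1.10+ (for torch.autocast) and a CUDA device; the tensor names a, b, c, d mirror the addmm example above.

```python
import torch

# All tensors start as float32 on the GPU; names mirror the example above.
a = torch.randn(4, 4, device="cuda")
b = torch.randn(4, 4, device="cuda")
c = torch.randn(4, 4, device="cuda")
d = torch.empty(4, 4, device="cuda")

with torch.autocast(device_type="cuda"):
    out = a.addmm(b, c)             # out-of-place method: eligible, autocasts
    print(out.dtype)                # torch.float16

    a.addmm_(b, c)                  # in-place variant: allowed, but no autocasting
    print(a.dtype)                  # torch.float32

    torch.addmm(a, b, c, out=d)     # explicit out=: allowed, but no autocasting
    print(d.dtype)                  # torch.float32

    s = a.sum(dtype=torch.float64)  # explicit dtype=: output respects the dtype arg
    print(s.dtype)                  # torch.float64
```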

CUDA Ops that can autocast to float16

__matmul__, addbmm, addmm, addmv, addr, baddbmm, bmm, chain_matmul, multi_dot, conv1d, conv2d, conv3d, conv_transpose1d, conv_transpose2d, conv_transpose3d, GRUCell, linear, LSTMCell, matmul, mm, mv, prelu, RNNCell
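As a quick sanity check (a sketch assuming a CUDA device), float32 inputs to an op from this list come out as float16 under autocast:

```python
import torch

x = torch.randn(8, 8, device="cuda")     # torch.float32
y = torch.randn(8, 8, device="cuda")     # torch.float32

with torch.autocast(device_type="cuda"):
    z = torch.mm(x, y)                   # mm is on the float16 list
print(z.dtype)                           # torch.float16
```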

CUDA Ops that can autocast to float32

__pow__, __rdiv__, __rpow__, __rtruediv__, acos, asin, binary_cross_entropy_with_logits, cosh, cosine_embedding_loss, cdist, cosine_similarity, cross_entropy, cumprod, cumsum, dist, erfinv, exp, expm1, group_norm, hinge_embedding_loss, kl_div, l1_loss, layer_norm, log, log_softmax, log10, log1p, log2, margin_ranking_loss, mse_loss, multilabel_margin_loss, multi_margin_loss, nll_loss, norm, normalize, pdist, poisson_nll_loss, pow, prod, reciprocal, rsqrt, sinh, smooth_l1_loss, soft_margin_loss, softmax, softmin, softplus, sum, renorm, tan, triplet_margin_loss
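Conversely, ops on this list are promoted to float32 for numerical stability even when given float16 inputs; again a sketch assuming a CUDA device:

```python
import torch

h = torch.randn(8, 8, device="cuda", dtype=torch.float16)

with torch.autocast(device_type="cuda"):
    p = torch.softmax(h, dim=-1)         # softmax is on the float32 list
print(p.dtype)                           # torch.float32
```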

References:

https://pytorch.org/docs/master/amp.html#autocast-op-reference

https://on-demand.gputechconf.com/gtc-taiwan/2018/pdf/5-1_Internal%20Speaker_Michael%20Carilli_PDF%20For%20Sharing.pdf

https://nvlabs.github.io/eccv2020-mixed-precision-tutorial/

https://zhuanlan.zhihu.com/p/79887894
