[AWS GPU] Performance and pricing

Ref: Choosing the right GPU for deep learning on AWS

P series: suited to training.

G series: suited to inference.

Amazon EC2 G4 instance product details

Ref: https://aws.amazon.com/ec2/instance-types/g4/

Ref: How can I get 65Tflops performance with NVIDIA T4

Speed calculation:
If YOLO runs at 160 fps with 100% GPU utilization on a GPU that provides 18.7 TFLOPS,
then on a GPU that provides 65 TFLOPS YOLO should run at roughly 555 fps with 100% GPU utilization (assuming no bottlenecks).
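
The estimate above is a simple proportional scaling of frame rate by peak TFLOPS. A minimal sketch of that arithmetic follows; the function name `scale_fps` is made up for illustration, and the linear-scaling assumption ignores memory bandwidth, I/O, and other bottlenecks, so the result is an upper bound rather than a measurement.

```python
def scale_fps(known_fps: float, known_tflops: float, target_tflops: float) -> float:
    """Linear estimate: assumes throughput is proportional to peak TFLOPS."""
    return known_fps * target_tflops / known_tflops


# Figures quoted in the text: 160 fps at 18.7 TFLOPS, target GPU at 65 TFLOPS.
estimate = scale_fps(known_fps=160, known_tflops=18.7, target_tflops=65)
print(f"Estimated YOLO throughput: {estimate:.0f} fps")
```

The exact quotient is slightly above 556, which agrees with the rounded figure quoted in the text.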

That is enough to handle at least 6 video streams, or 6 models. Detecting 200 categories becomes possible!

Precision	Peak performance	Estimated YOLO speed
Single Precision Floating Point Performance	8.1 TFLOPS (GPU Boost Clock)	60 fps
Mixed Precision (FP16 / FP32)	65 TFLOPS	550 fps
INT8-Precision	130 TOPS
INT4-Precision	260 TOPS

	Instance Size	GPU	vCPUs	Memory (GB)	Storage (GB)	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)	On-Demand Price/hr*	1-yr Reserved Instance Effective Hourly* (Linux)	3-yr Reserved Instance Effective Hourly* (Linux)
G4dn
Single GPU VMs	g4dn.xlarge	1	4	16	125	Up to 25	Up to 3.5	$0.526	$0.316	$0.210
	g4dn.2xlarge	1	8	32	225	Up to 25	Up to 3.5	$0.752	$0.452	$0.300
	g4dn.4xlarge	1	16	64	225	Up to 25	4.75	$1.204	$0.722	$0.482
	g4dn.8xlarge	1	32	128	1x900	50	9.5	$2.176	$1.306	$0.870
	g4dn.16xlarge	1	64	256	1x900	50	9.5	$4.352	$2.612	$1.740

Multi GPU VMs	g4dn.12xlarge	4	48	192	1x900	50	9.5	$3.912	$2.348	$1.564
Multi GPU VMs	g4dn.metal	8	96	384	2x900	100	19	$7.824	$4.694	$3.130
G4ad
Single GPU VMs	g4ad.xlarge	1	4	16	150	Up to 10	Up to 3	$0.379	$0.227	$0.178
	g4ad.2xlarge	1	8	32	300	Up to 10	Up to 3	$0.541	$0.325	$0.254
	g4ad.4xlarge	1	16	64	600	Up to 10	Up to 3	$0.867	$0.520	$0.405

Multi GPU VMs	g4ad.8xlarge	2	32	128	1200	15	3	$1.734	$1.040	$0.810
Multi GPU VMs	g4ad.16xlarge	4	64	256	2400	25	6	$3.468	$2.081	$1.619
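
To compare the pricing tiers in the table, the sketch below computes how much the reserved rates save relative to on-demand. The prices are copied from three rows of the table above; the dictionary `G4_PRICES` and the helper `savings_pct` are illustrative names, not part of any AWS tooling.

```python
# USD per hour: (on-demand, 1-yr reserved, 3-yr reserved), copied from the table above.
G4_PRICES = {
    "g4dn.xlarge": (0.526, 0.316, 0.210),
    "g4dn.12xlarge": (3.912, 2.348, 1.564),
    "g4ad.xlarge": (0.379, 0.227, 0.178),
}


def savings_pct(on_demand: float, reserved: float) -> float:
    """Percentage saved by paying the reserved rate instead of on-demand."""
    return (1 - reserved / on_demand) * 100


for name, (on_demand, one_year, three_year) in G4_PRICES.items():
    print(
        f"{name}: 1-yr saves {savings_pct(on_demand, one_year):.0f}%, "
        f"3-yr saves {savings_pct(on_demand, three_year):.0f}%"
    )
```

For the G4dn rows sampled here, the 1-yr rate works out to roughly 40% below on-demand and the 3-yr rate to roughly 60% below.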

Amazon EC2 P3 instance product details

Ref: https://aws.amazon.com/ec2/instance-types/p3/

Ref: V100服务器和T4服务器的性能指标 (Performance metrics of V100 servers and T4 servers)

Instance Size	GPUs - Tesla V100	GPU Peer to Peer	GPU Memory (GB)	vCPUs	Memory (GB)	Network Bandwidth	EBS Bandwidth	On-Demand Price/hr*	1-yr Reserved Instance Effective Hourly*	3-yr Reserved Instance Effective Hourly*
p3.2xlarge	1	N/A	16	8	61	Up to 10 Gbps	1.5 Gbps	$3.06	$1.99	$1.05
p3.8xlarge	4	NVLink	64	32	244	10 Gbps	7 Gbps	$12.24	$7.96	$4.19
p3.16xlarge	8	NVLink	128	64	488	25 Gbps	14 Gbps	$24.48	$15.91	$8.39
p3dn.24xlarge	8	NVLink	256	96	768	100 Gbps	19 Gbps	$31.218	$18.30	$9.64
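
One way to read the P3 table is to normalize the on-demand price by GPU count. The sketch below does that with the figures from the table; `P3_ON_DEMAND` and `price_per_gpu_hour` are illustrative names.

```python
# instance name -> (number of V100 GPUs, on-demand USD per hour), from the table above.
P3_ON_DEMAND = {
    "p3.2xlarge": (1, 3.06),
    "p3.8xlarge": (4, 12.24),
    "p3.16xlarge": (8, 24.48),
    "p3dn.24xlarge": (8, 31.218),
}


def price_per_gpu_hour(gpus: int, hourly_price: float) -> float:
    """On-demand price divided evenly across the instance's GPUs."""
    return hourly_price / gpus


per_gpu = {
    name: price_per_gpu_hour(gpus, price)
    for name, (gpus, price) in P3_ON_DEMAND.items()
}

for name, cost in per_gpu.items():
    print(f"{name}: ${cost:.2f} per GPU-hour")
```

On these numbers the three p3 sizes all cost $3.06 per GPU-hour, while p3dn.24xlarge is about $3.90 per GPU-hour, which also buys more GPU memory, vCPUs, and network bandwidth.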

V100 vs 2080 Ti

How can the 2080 Ti be 80% as fast as the Tesla V100, but only 1/8th of the price?

Ref: hackmd.io 2080Ti and V100 Benchmarks (very good)

2080 Ti

Running warm up
Done warm up
Step   Img/sec                                     total_loss
1      images/sec: 115.1 +/- 0.0 (jitter = 0.0)    9.865
10     images/sec: 113.0 +/- 0.6 (jitter = 1.2)    9.741
20     images/sec: 112.8 +/- 0.4 (jitter = 1.5)    10.067
30     images/sec: 112.9 +/- 0.3 (jitter = 1.2)    9.834
40     images/sec: 112.9 +/- 0.2 (jitter = 1.1)    10.052
50     images/sec: 113.0 +/- 0.2 (jitter = 0.9)    9.889
60     images/sec: 113.0 +/- 0.2 (jitter = 1.0)    9.771
70     images/sec: 112.8 +/- 0.2 (jitter = 1.2)    9.697
80     images/sec: 112.6 +/- 0.2 (jitter = 1.3)    9.946
90     images/sec: 112.5 +/- 0.1 (jitter = 1.3)    9.611
100    images/sec: 112.3 +/- 0.1 (jitter = 1.6)    9.870
----------------------------------------------------------------
total images/sec: 112.24
----------------------------------------------------------------

V100

Running warm up
Done warm up
Step   Img/sec                                     total_loss
1      images/sec: 122.4 +/- 0.0 (jitter = 0.0)    9.924
10     images/sec: 123.3 +/- 0.6 (jitter = 2.2)    9.732
20     images/sec: 124.4 +/- 0.4 (jitter = 1.6)    10.058
30     images/sec: 124.6 +/- 0.3 (jitter = 1.0)    9.818
40     images/sec: 124.9 +/- 0.2 (jitter = 0.9)    10.044
50     images/sec: 125.1 +/- 0.2 (jitter = 1.0)    9.893
60     images/sec: 125.0 +/- 0.2 (jitter = 1.1)    9.798
70     images/sec: 125.1 +/- 0.2 (jitter = 1.1)    9.733
80     images/sec: 125.1 +/- 0.2 (jitter = 1.1)    9.947
90     images/sec: 125.1 +/- 0.1 (jitter = 1.1)    9.631
100    images/sec: 125.1 +/- 0.1 (jitter = 1.2)    9.861
----------------------------------------------------------------
total images/sec: 125.05
----------------------------------------------------------------

Two 2080 Ti cards; convergence looks slightly faster.

Running warm up
Done warm up
Step   Img/sec                                     total_loss
1      images/sec: 205.7 +/- 0.0 (jitter = 0.0)    9.789
10     images/sec: 206.1 +/- 0.3 (jitter = 1.3)    9.812
20     images/sec: 205.9 +/- 0.4 (jitter = 1.5)    9.996
30     images/sec: 205.9 +/- 0.3 (jitter = 1.5)    9.851
40     images/sec: 205.6 +/- 0.3 (jitter = 1.3)    10.102
50     images/sec: 205.5 +/- 0.2 (jitter = 1.3)    9.877
60     images/sec: 205.3 +/- 0.2 (jitter = 1.5)    9.866
70     images/sec: 205.2 +/- 0.2 (jitter = 1.4)    9.916
80     images/sec: 205.1 +/- 0.2 (jitter = 1.5)    9.897
90     images/sec: 205.1 +/- 0.2 (jitter = 1.5)    9.799
100    images/sec: 205.0 +/- 0.2 (jitter = 1.5)    9.787
----------------------------------------------------------------
total images/sec: 204.94
----------------------------------------------------------------
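
The three totals above can be turned into ratios directly. The sketch below uses only the "total images/sec" figures from these logs; the variable names are illustrative, and the ratios describe this one benchmark run, not the GPUs in general.

```python
# "total images/sec" values copied from the three benchmark logs above.
THROUGHPUT = {
    "2080 Ti": 112.24,
    "V100": 125.05,
    "2x 2080 Ti": 204.94,
}

# Single 2080 Ti throughput relative to a single V100.
relative_to_v100 = THROUGHPUT["2080 Ti"] / THROUGHPUT["V100"]

# How close two 2080 Ti cards come to doubling single-card throughput.
scaling_efficiency = THROUGHPUT["2x 2080 Ti"] / (2 * THROUGHPUT["2080 Ti"])

print(f"2080 Ti vs V100: {relative_to_v100:.1%}")
print(f"2x 2080 Ti scaling efficiency: {scaling_efficiency:.1%}")
```

In this run the single 2080 Ti reaches roughly 90% of the V100's throughput, and the two-card setup reaches roughly 91% of ideal linear scaling.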

Summary

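Training for 10 hours costs $30.60, consistent with the p3.2xlarge on-demand rate of $3.06/hr; ten such training runs cost $306 (about AUD 430 or CNY 1,960).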
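
The cost figures in this summary follow from hours × hourly rate × number of runs. A minimal sketch, assuming the p3.2xlarge rates from the P3 table ($3.06 on-demand, $1.99 1-yr reserved, $1.05 3-yr reserved); `training_cost` is an illustrative helper, and currency conversion is left out because exchange rates change.

```python
def training_cost(hours: float, hourly_rate: float, runs: int = 1) -> float:
    """Total cost in USD for `runs` training runs of `hours` each."""
    return hours * hourly_rate * runs


# p3.2xlarge hourly rates in USD, copied from the P3 table.
RATES = {"on-demand": 3.06, "1-yr reserved": 1.99, "3-yr reserved": 1.05}

single_run = training_cost(hours=10, hourly_rate=RATES["on-demand"])
ten_runs = training_cost(hours=10, hourly_rate=RATES["on-demand"], runs=10)

print(f"One 10-hour run, on-demand: ${single_run:.2f}")
print(f"Ten 10-hour runs, on-demand: ${ten_runs:.2f}")
for tier, rate in RATES.items():
    print(f"Ten runs at the {tier} rate: ${training_cost(10, rate, runs=10):.2f}")
```

Reserved rates only apply with a 1-yr or 3-yr commitment, so the lower totals are relevant only if the instance would be kept busy over that term.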

posted @ 2021-11-29 15:42 郝壹贰叁
