[SAA] 32. Data Engineering
AWS Batch Overview
- Run batch jobs as Docker images
- Dynamically provisions instances (EC2 On-Demand & Spot Instances) inside your VPC
- Picks the optimal quantity and type of instances based on job volume and requirements
- No need to manage clusters, fully managed
- You only pay for the underlying EC2 instances
- Example: batch process of images, running thousands of concurrent jobs
- Schedule Batch Jobs using CloudWatch Events (now Amazon EventBridge)
- Orchestrate Batch Jobs using AWS Step Functions
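As a rough sketch of the "thousands of concurrent jobs" example above, one way to fan out work is a single Batch array job. The helper below only builds the keyword arguments for the SubmitJob API; the queue name, job definition name, and command are hypothetical placeholders, not values from these notes.

```python
from typing import Any, Dict, List


def build_array_job_request(
    job_name: str,
    job_queue: str,
    job_definition: str,
    image_count: int,
    command: List[str],
) -> Dict[str, Any]:
    """Build keyword arguments for Batch SubmitJob as one array job."""
    # Batch array jobs accept a size between 2 and 10,000 child jobs.
    if not 2 <= image_count <= 10_000:
        raise ValueError("array job size must be between 2 and 10,000")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,            # hypothetical queue name
        "jobDefinition": job_definition,  # hypothetical job definition
        "arrayProperties": {"size": image_count},
        # Each child job can read AWS_BATCH_JOB_ARRAY_INDEX to pick its image.
        "containerOverrides": {"command": command},
    }


request = build_array_job_request(
    job_name="resize-images",
    job_queue="image-queue",
    job_definition="image-resizer",
    image_count=5000,
    command=["python", "resize.py"],
)
```

With credentials configured, the dictionary could be passed on as `boto3.client("batch").submit_job(**request)`.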
Lambda vs Batch
Lambda
- Time limit: 15 mins
- Limited runtimes
- Limited temporary disk space
- Serverless
Batch
- No time limit
- Any runtime, as long as it's packaged as a Docker image
- Relies on EBS / instance store for disk space
- Relies on EC2 (can be managed by AWS)
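The comparison above can be read as a simple decision rule. The sketch below is only an illustration: the function name and inputs are made up for this note, and it checks only the time and disk limits (Lambda's 15-minute timeout and its /tmp storage, which can be configured up to 10,240 MB).

```python
LAMBDA_MAX_SECONDS = 15 * 60   # Lambda time limit: 15 minutes
LAMBDA_MAX_TMP_MB = 10_240     # upper bound of Lambda's configurable /tmp space


def pick_service(expected_seconds: int, scratch_disk_mb: int) -> str:
    """Return "lambda" if the job fits Lambda's limits, otherwise "batch"."""
    if expected_seconds > LAMBDA_MAX_SECONDS:
        return "batch"   # Batch has no time limit
    if scratch_disk_mb > LAMBDA_MAX_TMP_MB:
        return "batch"   # Batch relies on EBS / instance store for disk
    return "lambda"


short_job = pick_service(expected_seconds=120, scratch_disk_mb=256)
long_job = pick_service(expected_seconds=3600, scratch_disk_mb=256)
disk_heavy_job = pick_service(expected_seconds=60, scratch_disk_mb=50_000)
```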
Compute Environments
Managed Compute Environment
- AWS Batch manages the capacity and instance types within the environment
- You can choose On-Demand or Spot Instances
- You can set a maximum price for Spot Instances
- Launched within your own VPC
- If you launch within your own private subnet, make sure it has access to the ECS service
- Either using a NAT Gateway / instance or using VPC Endpoint for ECS
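A managed Spot environment like the one described above might be expressed as the following request for the CreateComputeEnvironment API. This is a sketch under assumptions: the environment name, subnet, security group, and instance role are placeholders, and the maximum Spot price is given through `bidPercentage` (a percentage of the On-Demand price).

```python
from typing import Any, Dict, List


def build_managed_spot_environment(
    name: str,
    subnets: List[str],
    security_group_ids: List[str],
    instance_role: str,
    max_vcpus: int,
    bid_percentage: int,
) -> Dict[str, Any]:
    """Build keyword arguments for a MANAGED compute environment on Spot."""
    if not 1 <= bid_percentage <= 100:
        raise ValueError("bid_percentage must be between 1 and 100")
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",                 # Batch manages capacity and instance types
        "state": "ENABLED",
        "computeResources": {
            "type": "SPOT",                # use "EC2" for On-Demand
            "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
            "minvCpus": 0,                 # scale to zero when the queue is empty
            "maxvCpus": max_vcpus,
            "instanceTypes": ["optimal"],  # let Batch choose instance types
            "subnets": subnets,            # your own VPC subnets
            "securityGroupIds": security_group_ids,
            "instanceRole": instance_role,
            "bidPercentage": bid_percentage,
        },
    }


env_request = build_managed_spot_environment(
    name="spot-env",                          # placeholder values throughout
    subnets=["subnet-0123456789abcdef0"],
    security_group_ids=["sg-0123456789abcdef0"],
    instance_role="ecsInstanceRole",
    max_vcpus=256,
    bid_percentage=60,
)
```

With credentials configured, this could be sent as `boto3.client("batch").create_compute_environment(**env_request)`.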
Unmanaged Compute Environment
- You control and manage instance configuration, provisioning and scaling
Kinesis
CloudWatch Logs does not send to Kinesis by default; delivery to Kinesis Data Streams or Kinesis Data Firehose requires a subscription filter on the log group
Kinesis Data Firehose is near real-time
The Kinesis agent can be configured to send data directly to Kinesis Data Firehose
Firehose can deliver to S3
Use Lambda to send data to Elasticsearch (now Amazon OpenSearch Service)
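For a producer that writes to Firehose itself rather than through the Kinesis agent, records are usually sent in batches. The sketch below assumes JSON events and only enforces the 500-records-per-call limit of PutRecordBatch; the per-call size limit is not checked, and the event shape is made up for illustration.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

MAX_RECORDS_PER_CALL = 500  # PutRecordBatch accepts at most 500 records per call


def to_firehose_batches(
    events: Iterable[Dict[str, Any]],
) -> Iterator[List[Dict[str, bytes]]]:
    """Group events into lists shaped like the Records argument of PutRecordBatch."""
    batch: List[Dict[str, bytes]] = []
    for event in events:
        # Newline-delimited JSON keeps the objects separable once they land in S3.
        line = json.dumps(event, separators=(",", ":")) + "\n"
        batch.append({"Data": line.encode("utf-8")})
        if len(batch) == MAX_RECORDS_PER_CALL:
            yield batch
            batch = []
    if batch:
        yield batch


batches = list(to_firehose_batches({"id": i} for i in range(1200)))
```

Each batch could then be sent with `boto3.client("firehose").put_record_batch(DeliveryStreamName=..., Records=batch)`, where the delivery stream name is whatever stream you have created.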
Athena
- QuickSight for visualization dashboards
- CloudTrail can stream logs to CloudWatch Logs
- EMR can use Spot Instances (via instance fleets) to control cost
- Athena: the data must stay in S3 and is queried in place
- Redshift Spectrum runs queries from Redshift directly against data in S3, without loading it
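Since Athena reads the data where it sits in S3 and also writes its results to S3, a query request mostly consists of SQL plus S3 locations. The helper below is a minimal sketch that builds the arguments for the StartQueryExecution API; the database, table, and bucket names are hypothetical.

```python
from typing import Any, Dict


def build_athena_query(sql: str, database: str, output_location: str) -> Dict[str, Any]:
    """Build keyword arguments for Athena StartQueryExecution."""
    # Athena writes query results to S3, so the output must be an S3 URI.
    if not output_location.startswith("s3://"):
        raise ValueError("output_location must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


query_request = build_athena_query(
    sql="SELECT status, COUNT(*) FROM access_logs GROUP BY status",  # hypothetical table
    database="logs_db",                                             # hypothetical database
    output_location="s3://example-athena-results/queries/",         # hypothetical bucket
)
```

With credentials configured, this could be submitted as `boto3.client("athena").start_query_execution(**query_request)`.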