[AWS] Deploy Container Image for AWS Lambda
Lambda Container Web Scraper
Ref: Build and Deploy a Web Scraper using Docker and AWS Lambda
The key point of the article is how to build an image for Lambda.
1. Scraping images and uploading them to S3
Ref: https://github.com/rchauhan9/image-scraper-lambda-container/blob/master/app/app.py
import scraper
import aws_s3 as s3s


def handler(event, context):
    scr = scraper.ImageScraper()
    urls = scr.get_image_urls(query=event['query'], max_urls=event['count'], sleep_between_interactions=1)

    files = []
    for url in urls:
        img_obj, img_hash = scr.get_in_memory_image(url, 'jpeg')
        files.append(img_hash)
        s3s.upload_object(img_obj, event['bucket'], event['folder_path']+img_hash, 'jpeg')
    scr.close_connection()

    return "Successfully loaded {} images to bucket {}. Folder path {} and file names {}.".format(event['count'], event['bucket'], event['folder_path'], files)
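The handler reads four keys straight from `event` (`query`, `count`, `bucket`, `folder_path`), so a missing key would only surface as a `KeyError` after the scraper has already started. A minimal sketch of validating the event up front follows. The names `validate_event` and `REQUIRED_KEYS` are hypothetical and not part of the referenced repository.

```python
# Hypothetical helper: not part of the referenced repository.
REQUIRED_KEYS = {"query": str, "count": int, "bucket": str, "folder_path": str}


def validate_event(event):
    """Return a copy holding only the expected keys, or raise ValueError."""
    missing = [key for key in REQUIRED_KEYS if key not in event]
    if missing:
        raise ValueError("missing event keys: " + ", ".join(sorted(missing)))
    for key, expected in REQUIRED_KEYS.items():
        value = event[key]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"event[{key!r}] must be of type {expected.__name__}")
    if event["count"] < 1:
        raise ValueError("event['count'] must be at least 1")
    return {key: event[key] for key in REQUIRED_KEYS}
```

Calling `validate_event(event)` as the first line of `handler` would fail fast on a malformed test event before any browser session is opened.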
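The crux is understanding the trigger mechanism, that is, how the container ends up calling the handler:
As the excerpt below shows, three Dockerfile instructions matter here: WORKDIR, ENTRYPOINT, and CMD.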
#
# Stage 4 - final runtime image
#
# Grab a fresh copy of the Python image
FROM python-alpine
ARG FUNCTION_DIR
WORKDIR ${FUNCTION_DIR}
# Copy in the built dependencies
COPY --from=build-image2 ${FUNCTION_DIR} ${FUNCTION_DIR}
RUN apk add jpeg-dev zlib-dev libjpeg-turbo-dev \
    && apk add chromium chromium-chromedriver
# (Optional) Add Lambda Runtime Interface Emulator and use a script in the ENTRYPOINT for simpler local runs
ADD https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie /usr/bin/aws-lambda-rie
RUN chmod 755 /usr/bin/aws-lambda-rie
# Copy handler function
COPY app/* ${FUNCTION_DIR}
COPY entry.sh /
ENTRYPOINT [ "/entry.sh" ]
CMD [ "app.handler" ]
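To make the roles of the three instructions concrete: ENTRYPOINT starts the runtime, CMD hands it the string `app.handler`, and WORKDIR is the directory in which `app.py` can be found. The sketch below illustrates how a `module.function` string can be turned into a callable. It is a simplified illustration of the idea, not the actual awslambdaric source. `resolve_handler` is a hypothetical name, and the stand-in `app` module is created in memory only so the example runs without the real `app.py`.

```python
import importlib
import sys
import types


def resolve_handler(handler_string):
    """Split 'module.function' on the last dot and return the callable."""
    module_name, _, function_name = handler_string.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {handler_string!r}")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Stand-in for app.py: in the container, "app" is importable because
# app.py sits in WORKDIR, which is the process's working directory.
demo_app = types.ModuleType("app")
demo_app.handler = lambda event, context: "handled " + event["query"]
sys.modules["app"] = demo_app

handler = resolve_handler("app.handler")
print(handler({"query": "beagle puppy"}, None))
```

Changing CMD to a different `module.function` string would therefore select a different entry point without rebuilding any of the earlier stages.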
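The complete Dockerfile below has four stages. Keep in mind that every stage must re-declare (with ARG) each global variable it uses.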
# Define global args
ARG FUNCTION_DIR="/home/app/"
ARG RUNTIME_VERSION="3.9"
ARG DISTRO_VERSION="3.12"

#
# Stage 1 - bundle base image + runtime
#
# Grab a fresh copy of the image and install GCC
# A minimal Docker image based on Alpine Linux with a complete package index and only 5 MB in size!
FROM python:${RUNTIME_VERSION}-alpine${DISTRO_VERSION} AS python-alpine
# Install GCC (Alpine uses musl but we compile and link dependencies with GCC)
RUN apk add --no-cache \
    libstdc++

#
# Stage 2 - build function and dependencies
#
FROM python-alpine AS build-image
# Install aws-lambda-cpp build dependencies
RUN apk add --no-cache \
    build-base \
    libtool \
    autoconf \
    automake \
    libexecinfo-dev \
    make \
    cmake \
    libcurl
# Include global args in this stage of the build
ARG FUNCTION_DIR
ARG RUNTIME_VERSION
# Create function directory
RUN mkdir -p ${FUNCTION_DIR}
# *** Install Lambda Runtime Interface Client for Python
RUN python${RUNTIME_VERSION} -m pip install awslambdaric --target ${FUNCTION_DIR}

#
# Stage 3 - Add app related dependencies
# (Stages 1 and 2 are boilerplate; this stage adds whatever dependencies the app needs.)
#
FROM python-alpine AS build-image2
ARG FUNCTION_DIR
# RUNTIME_VERSION is used by the pip command below, so it is re-declared in this stage too
ARG RUNTIME_VERSION
WORKDIR ${FUNCTION_DIR}
# Copy in the built dependencies
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}
# Copy over and install requirements
RUN apk update \
    && apk add gcc python3-dev musl-dev \
    && apk add jpeg-dev zlib-dev libjpeg-turbo-dev
COPY requirements.txt .
RUN python${RUNTIME_VERSION} -m pip install -r requirements.txt --target ${FUNCTION_DIR}

#
# Stage 4 - final runtime image (mainly copies in the application code)
#
# Grab a fresh copy of the Python image
FROM python-alpine
ARG FUNCTION_DIR
WORKDIR ${FUNCTION_DIR}
# Copy in the built dependencies
COPY --from=build-image2 ${FUNCTION_DIR} ${FUNCTION_DIR}
RUN apk add jpeg-dev zlib-dev libjpeg-turbo-dev \
    && apk add chromium chromium-chromedriver
# (Optional) Add Lambda Runtime Interface Emulator and use a script in the ENTRYPOINT for simpler local runs
ADD https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie /usr/bin/aws-lambda-rie
RUN chmod 755 /usr/bin/aws-lambda-rie
# Copy handler function
COPY app/* ${FUNCTION_DIR}
COPY entry.sh /
ENTRYPOINT [ "/entry.sh" ]
CMD [ "app.handler" ]
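The contents of entry.sh are not shown in this post. A common pattern for such a script is to check the `AWS_LAMBDA_RUNTIME_API` environment variable: when it is unset the container is assumed to be running locally and the runtime interface client is wrapped in the emulator, otherwise the client is started directly. The sketch below expresses that decision in Python purely for illustration. The real file is a shell script, `build_command` is a hypothetical name, and the interpreter path is an assumption taken from the local-run log.

```python
import os

RIE_PATH = "/usr/bin/aws-lambda-rie"   # where the Dockerfile ADDs the emulator
PYTHON_PATH = "/usr/local/bin/python"  # assumed from the local-run log


def build_command(handler, environ=None):
    """Return the argv an entry script would exec for the given handler string."""
    environ = os.environ if environ is None else environ
    runtime_client = [PYTHON_PATH, "-m", "awslambdaric", handler]
    if environ.get("AWS_LAMBDA_RUNTIME_API"):
        # Running inside Lambda: the service provides the runtime API.
        return runtime_client
    # Running locally: let the emulator provide the runtime API.
    return [RIE_PATH] + runtime_client


print(build_command("app.handler", environ={}))
```

The handler string passed in is whatever CMD supplies, which is why the same image works both locally and in Lambda.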
2. Local testing
The local AWS credentials are shared with the container (the -v flag mounts ~/.aws into it).
image-scraper-lambda-container$ docker run -p 9000:8080 -v ~/.aws/:/root/.aws/ lambda/image-scraper:1.0
time="2021-01-21T13:03:12.648" level=info msg="exec '/usr/local/bin/python' (cwd=/home/app, handler=app.handler)"
With that, the container can access S3 and other AWS services.
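Once the container is running, the function is invoked by POSTing a JSON event to the emulator's invocation endpoint on the mapped port 9000. The curl command that follows does this from the shell. A rough Python equivalent using only the standard library is sketched here, where `build_invoke_request` and `invoke_local` are hypothetical helper names.

```python
import json
import urllib.request

RIE_URL = "http://localhost:9000/2015-03-31/functions/function/invocations"


def build_invoke_request(payload, url=RIE_URL):
    """Build the POST request carrying the event as a JSON body."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def invoke_local(payload, url=RIE_URL, timeout=300):
    """Send the event to the locally running container and decode the reply."""
    request = build_invoke_request(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires the container from the docker run command to be up):
# invoke_local({"query": "beagle puppy", "count": 3,
#               "bucket": "my-dogs-youtube", "folder_path": "local/"})
```

The generous timeout is a guess: scraping with a headless browser can take a while, so a short default would likely cut the call off early.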
image-scraper-lambda-container$ curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"query":"beagle puppy", "count":3, "bucket":"my-dogs-youtube", "folder_path":"local/"}'
"Successfully loaded 3 images to bucket my-dogs-youtube. Folder path local/ and file names ['a154ddd833.jpeg', '952693f107.jpeg', 'ecd5070ed4.jpeg']."
3. Testing online
Create the Lambda function from the container image.
Then test it with the following event:
{ "query": "beagle puppy", "count": 3, "bucket": "tmp-my-dogs-youtube", "folder_path": "local/" }
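The same test event can also be sent to the deployed function programmatically. The sketch below assumes boto3 is installed and credentials are configured. The function name passed in is whatever was chosen when creating the Lambda, which this post does not state. `encode_payload` and `invoke_deployed` are hypothetical helper names.

```python
import json


def encode_payload(event):
    """Serialize the test event to the bytes expected by the Invoke API."""
    return json.dumps(event).encode("utf-8")


def invoke_deployed(function_name, event, region_name=None):
    """Synchronously invoke the deployed function and return its decoded result."""
    import boto3  # imported lazily so encode_payload works without boto3

    client = boto3.client("lambda", region_name=region_name)
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=encode_payload(event),
    )
    return json.loads(response["Payload"].read())


# Example (the function name here is a placeholder, not taken from the post):
# invoke_deployed("my-image-scraper", {"query": "beagle puppy", "count": 3,
#                 "bucket": "tmp-my-dogs-youtube", "folder_path": "local/"})
```

Unlike the local run, no credentials are mounted here. The function's execution role has to grant write access to the target bucket.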
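Lambda supports container images up to 10 GB
Ref: New for AWS Lambda – Container Image Support [hands-on in progress; supports up to 10 GB]
This is no different from the previous example; the main addition is API Gateway.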
/* implement */
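One practical difference when API Gateway sits in front of the function: with a proxy integration, the request body arrives as a JSON string under `event["body"]` rather than as top-level keys, and the function is expected to return a dict with `statusCode` and `body`. A minimal sketch of adapting to both invocation styles follows, assuming a proxy integration. `extract_params` and `to_proxy_response` are hypothetical helper names.

```python
import base64
import json


def extract_params(event):
    """Accept either a direct invocation event or an API Gateway proxy event."""
    body = event.get("body")
    if body is None:
        return event  # direct invocation: parameters are top-level keys
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    return json.loads(body)


def to_proxy_response(message, status_code=200):
    """Wrap a result message in the response shape a proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": message}),
    }


print(extract_params({"body": '{"query": "beagle puppy", "count": 3}'}))
```

The existing handler could call `extract_params(event)` first and wrap its return string with `to_proxy_response`, leaving the scraping logic unchanged.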
End.