[AWS] Deploy Container Image for AWS Lambda

Lambda Container Web Scraper

Ref: Build and Deploy a Web Scraper using Docker and AWS Lambda

The key point of the article is how to build a container image for Lambda.

 

1. Scrape images and upload them to S3

Ref: https://github.com/rchauhan9/image-scraper-lambda-container/blob/master/app/app.py

import scraper
import aws_s3 as s3s

def handler(event, context):
    scr  = scraper.ImageScraper()
    urls = scr.get_image_urls(query=event['query'], max_urls=event['count'], sleep_between_interactions=1)
    files = []
    for url in urls:
        img_obj, img_hash = scr.get_in_memory_image(url, 'jpeg')
        files.append(img_hash)
        s3s.upload_object(img_obj, event['bucket'], event['folder_path']+img_hash, 'jpeg')
    scr.close_connection()
    return "Successfully loaded {} images to bucket {}. Folder path {} and file names {}.".format(event['count'], event['bucket'], event['folder_path'], files)
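The handler reads four keys from the event (query, count, bucket, folder_path) and fails with a KeyError if any is missing. A minimal sketch of checking the event up front is shown here; `validate_event` and `REQUIRED_FIELDS` are hypothetical names for illustration and are not part of the referenced repository.

```python
# Hypothetical pre-check for the event consumed by handler(event, context).
# The field names come from the handler above; the types are an assumption.
REQUIRED_FIELDS = {"query": str, "count": int, "bucket": str, "folder_path": str}


def validate_event(event):
    """Return a list of problems; an empty list means the event looks usable."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in event:
            problems.append("missing field: {}".format(name))
        elif not isinstance(event[name], expected_type):
            problems.append("field {} should be {}".format(name, expected_type.__name__))
    # count is passed to the scraper as max_urls, so zero or negative makes no sense
    if isinstance(event.get("count"), int) and event["count"] < 1:
        problems.append("count must be at least 1")
    return problems


good_event = {"query": "beagle puppy", "count": 3,
              "bucket": "my-dogs-youtube", "folder_path": "local/"}
print(validate_event(good_event))
print(validate_event({"query": "beagle puppy"}))
```

Calling something like this at the top of the handler would turn a bare KeyError into a readable message before Chromium is even started.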

  

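The key question is how the handler actually gets triggered:

As the final stage of the Dockerfile shows, three container instructions matter here: WORKDIR, ENTRYPOINT, and CMD.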

#
# Stage 4 - final runtime image
#
# Grab a fresh copy of the Python image
FROM python-alpine
ARG FUNCTION_DIR
WORKDIR ${FUNCTION_DIR}
# Copy in the built dependencies
COPY --from=build-image2 ${FUNCTION_DIR} ${FUNCTION_DIR}
RUN apk add jpeg-dev zlib-dev libjpeg-turbo-dev \
    && apk add chromium chromium-chromedriver
# (Optional) Add Lambda Runtime Interface Emulator and use a script in the ENTRYPOINT for simpler local runs
ADD https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie /usr/bin/aws-lambda-rie
RUN chmod 755 /usr/bin/aws-lambda-rie
# Copy handler function
COPY app/* ${FUNCTION_DIR}
COPY entry.sh /
ENTRYPOINT [ "/entry.sh" ]
CMD [ "app.handler" ]
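To make the trigger mechanism concrete: CMD passes the string "app.handler" to the entry script, which hands it to the runtime interface client (awslambdaric). The client runs in WORKDIR, imports the module `app`, and looks up the function `handler` in it. The sketch below only illustrates that "module.function" lookup; `resolve_handler` is a hypothetical helper and is not the actual awslambdaric code, and a stand-in module replaces the real app.py.

```python
import importlib
import sys
import types


def resolve_handler(handler_spec):
    """Split 'module.function' and return the callable, e.g. 'app.handler'."""
    module_name, _, function_name = handler_spec.rpartition(".")
    if not module_name or not function_name:
        raise ValueError("expected 'module.function', got {!r}".format(handler_spec))
    module = importlib.import_module(module_name)  # found via the working directory / sys.path
    return getattr(module, function_name)


# Stand-in for app/app.py so the sketch runs without the scraper code.
demo_module = types.ModuleType("app")
demo_module.handler = lambda event, context: "got {}".format(event["query"])
sys.modules["app"] = demo_module

handler = resolve_handler("app.handler")
print(handler({"query": "beagle puppy"}, None))
```

This is why WORKDIR must point at the directory that holds app.py: the import is resolved relative to it.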

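The full Dockerfile below has four stages. Global ARGs are only visible inside a stage after they are re-declared there, so remember to re-declare them in every stage that uses them.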

# Define global args
ARG FUNCTION_DIR="/home/app/"
ARG RUNTIME_VERSION="3.9"
ARG DISTRO_VERSION="3.12"

#
# Stage 1 - bundle base image + runtime
#
# Grab a fresh copy of the image and install GCC
# A minimal Docker image based on Alpine Linux with a complete package index and only 5 MB in size!
FROM python:${RUNTIME_VERSION}-alpine${DISTRO_VERSION} AS python-alpine
# Install GCC (Alpine uses musl but we compile and link dependencies with GCC)
RUN apk add --no-cache \
    libstdc++

#
# Stage 2 - build function and dependencies
#
FROM python-alpine AS build-image
# Install aws-lambda-cpp build dependencies
RUN apk add --no-cache \
    build-base \
    libtool \
    autoconf \
    automake \
    libexecinfo-dev \
    make \
    cmake \
    libcurl
# Include global args in this stage of the build
ARG FUNCTION_DIR
ARG RUNTIME_VERSION
# Create function directory
RUN mkdir -p ${FUNCTION_DIR}
# *** Install Lambda Runtime Interface Client for Python
RUN python${RUNTIME_VERSION} -m pip install awslambdaric --target ${FUNCTION_DIR}

#
# Stage 3 - Add app related dependencies (the first two stages are boilerplate; add project-specific dependencies here)
#
FROM python-alpine as build-image2
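# Include global args in this stage of the build (RUNTIME_VERSION is used by the pip command below)
ARG FUNCTION_DIR
ARG RUNTIME_VERSION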
WORKDIR ${FUNCTION_DIR}
# Copy in the built dependencies
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}
# Copy over and install requirements
RUN apk update \
    && apk add gcc python3-dev musl-dev \
    && apk add jpeg-dev zlib-dev libjpeg-turbo-dev
COPY requirements.txt .
RUN python${RUNTIME_VERSION} -m pip install -r requirements.txt --target ${FUNCTION_DIR}

#
# Stage 4 - final runtime image (mainly copies in the application code)
#
# Grab a fresh copy of the Python image
FROM python-alpine
ARG FUNCTION_DIR
WORKDIR ${FUNCTION_DIR}
# Copy in the built dependencies
COPY --from=build-image2 ${FUNCTION_DIR} ${FUNCTION_DIR}
RUN apk add jpeg-dev zlib-dev libjpeg-turbo-dev \
    && apk add chromium chromium-chromedriver
# (Optional) Add Lambda Runtime Interface Emulator and use a script in the ENTRYPOINT for simpler local runs
ADD https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie /usr/bin/aws-lambda-rie
RUN chmod 755 /usr/bin/aws-lambda-rie
# Copy handler function
COPY app/* ${FUNCTION_DIR}
COPY entry.sh /
ENTRYPOINT [ "/entry.sh" ]
CMD [ "app.handler" ]
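The Dockerfile copies entry.sh, but the script itself is not shown in this post. In the pattern AWS documents for custom images, the script checks the AWS_LAMBDA_RUNTIME_API environment variable: when it is unset (a local run) the handler is started under the Runtime Interface Emulator, otherwise awslambdaric is started directly. The sketch below expresses that branching in Python as an illustration only; `build_command` is a hypothetical name, the paths are taken from the Dockerfile and the local-run log, and the real entry.sh may differ.

```python
import os

RIE_PATH = "/usr/bin/aws-lambda-rie"    # where the Dockerfile ADDs the emulator
PYTHON_PATH = "/usr/local/bin/python"   # interpreter path seen in the local-run log


def build_command(handler, env=None):
    """Return the argv an entry script would exec for the given handler string."""
    env = os.environ if env is None else env
    runtime_client = [PYTHON_PATH, "-m", "awslambdaric", handler]
    if env.get("AWS_LAMBDA_RUNTIME_API"):
        # Running inside Lambda: the service provides the runtime API endpoint.
        return runtime_client
    # Running locally: wrap the runtime client in the emulator.
    return [RIE_PATH] + runtime_client


print(build_command("app.handler", env={}))
print(build_command("app.handler", env={"AWS_LAMBDA_RUNTIME_API": "127.0.0.1:9001"}))
```

The same image therefore works both locally and on Lambda without rebuilding.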

 

2. Local testing

The host's AWS credentials are shared with the container by mounting ~/.aws into it.

image-scraper-lambda-container$ docker run -p 9000:8080 -v ~/.aws/:/root/.aws/ lambda/image-scraper:1.0 

time="2021-01-21T13:03:12.648" level=info msg="exec '/usr/local/bin/python' (cwd=/home/app, handler=app.handler)" 

With the credentials mounted, the function can access S3 and other AWS services.
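The emulator accepts an invocation as an HTTP POST with the event as the JSON body. For scripted tests, the request can also be built in Python; this is a minimal sketch using only the standard library, with the same URL and payload as the curl call in this section. `build_invocation` and `invoke` are hypothetical helper names.

```python
import json
import urllib.request

# Port 9000 on the host is mapped to 8080 in the container by `docker run -p 9000:8080`.
LOCAL_URL = "http://localhost:9000/2015-03-31/functions/function/invocations"


def build_invocation(event, url=LOCAL_URL):
    """Build the POST request carrying the event as a JSON body."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def invoke(event, url=LOCAL_URL, timeout=300):
    """Send the event to the running container and decode the JSON response."""
    with urllib.request.urlopen(build_invocation(event, url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


event = {"query": "beagle puppy", "count": 3,
         "bucket": "my-dogs-youtube", "folder_path": "local/"}
request = build_invocation(event)
print(request.get_method(), request.full_url)
# invoke(event) would send it; that requires the container to be running.
```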

image-scraper-lambda-container$ curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"query":"beagle puppy", "count":3, "bucket":"my-dogs-youtube", "folder_path":"local/"}'
"Successfully loaded 3 images to bucket my-dogs-youtube. Folder path local/ and file names ['a154ddd833.jpeg', '952693f107.jpeg', 'ecd5070ed4.jpeg']."

 

3. Testing on AWS

Create the Lambda function from the container image.

Then test it with the following event:

{
  "query": "beagle puppy",
  "count": 3,
  "bucket": "tmp-my-dogs-youtube",
  "folder_path": "local/"
}
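The same test event can be sent from a script instead of the console. The sketch below assumes boto3 is installed and AWS credentials are configured; the function name passed to `invoke_deployed` is whatever name was chosen when creating the function, and both helper names are hypothetical.

```python
import json


def encode_event(query, count, bucket, folder_path):
    """Serialize the test event into the bytes expected as the invoke payload."""
    event = {"query": query, "count": count,
             "bucket": bucket, "folder_path": folder_path}
    return json.dumps(event).encode("utf-8")


def invoke_deployed(function_name, payload):
    """Invoke the deployed function synchronously and decode its JSON result."""
    import boto3  # imported lazily so encode_event works without boto3 installed

    client = boto3.client("lambda")
    response = client.invoke(FunctionName=function_name, Payload=payload)
    return json.loads(response["Payload"].read())


payload = encode_event("beagle puppy", 3, "tmp-my-dogs-youtube", "local/")
print(payload.decode("utf-8"))
# invoke_deployed("<your-function-name>", payload) would call the deployed function.
```

Scraping with Chromium is slow, so the function's timeout and memory settings may need to be raised above the defaults for the invocation to finish.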

 

 

 

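Lambda supports container images up to 10 GB

Ref: New for AWS Lambda – Container Image Support (hands-on in progress; images up to 10 GB are supported)

This is no different from the previous example; the main addition is API Gateway.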
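What changes with API Gateway in front is the shape of the event. Assuming a Lambda proxy integration, the request body arrives as a JSON string under "body", and the function must return a dict with "statusCode" and a string "body". The sketch below shows that wrapping; `api_handler` and `scrape` are hypothetical names, and `scrape` is only a stand-in for the real scraper handler.

```python
import json


def scrape(event):
    """Stand-in for the scraper handler; the real one drives Chromium and S3."""
    return "Would load {} images for {!r}".format(event["count"], event["query"])


def _response(status, payload):
    """Build a proxy-integration style response."""
    return {"statusCode": status,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(payload)}


def api_handler(event, context):
    """Unwrap the API Gateway event, call the scraper, and wrap the result."""
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "body is not valid JSON"})
    if not isinstance(payload, dict):
        return _response(400, {"error": "body must be a JSON object"})
    missing = [key for key in ("query", "count") if key not in payload]
    if missing:
        return _response(400, {"error": "missing fields: {}".format(", ".join(missing))})
    return _response(200, {"message": scrape(payload)})


print(api_handler({"body": '{"query": "beagle puppy", "count": 3}'}, None))
```

With this wrapper the image's CMD would point at `app.api_handler` instead of `app.handler`; everything else in the Dockerfile stays the same.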

 

/* implement */

  

End.

posted @ 2021-01-16 13:27  郝壹贰叁