Docker Tutorial
Basic Docker Commands
pull an image
docker pull {image name}:{image version}
list all docker images
docker image ls -a
create and run a docker container
docker run \
--name {container name} \
-p {host port}:{container port} \
-p {host ip}:{host port}:{container port} \
-it \
{image name}:{image version} \
{command going to run in the container}
Specifically, we can assign the container's network mode, such as host mode
docker run --net=host {image name}:{image version}
list all docker containers
docker container ls -a
# OR
docker ps -a
start a container
docker start {container name}
run a command in a running container
docker exec \
-it \
{container name} \
/bin/bash
stop a running container
docker stop {container name}
remove a container
# docker rm [OPTIONS] CONTAINER
docker rm {container name}
Docker in action
Build and Run Elasticsearch with the Official Image
Pull an elasticsearch 7.12.1 image
docker pull elasticsearch:7.12.1
Since we are running in development mode, create a user-defined network
docker network create dev
Run Elasticsearch
docker run -d \
--name elasticsearch \
--net dev \
--net-alias es \
-p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
elasticsearch:7.12.1
To verify that it is running:
curl 127.0.0.1:9200
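If Docker Compose is available, the same container can also be sketched as a docker-compose.yml; this is an untested equivalent of the docker run command above, with the service name and network name mirroring it:

```yaml
# docker-compose.yml -- sketch equivalent of the docker run command above
services:
  elasticsearch:
    image: elasticsearch:7.12.1
    container_name: elasticsearch
    networks:
      dev:
        aliases:
          - es          # same as --net-alias es
    ports:
      - "9200:9200"
      - "9300:9300"
    environment:
      - discovery.type=single-node

networks:
  dev:
    name: dev           # same as: docker network create dev
```

Then `docker compose up -d` starts it and `docker compose down` tears it down.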
Build and Run an SSH Server with a Dockerfile
Firstly, we have this Dockerfile
FROM ubuntu:20.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update \
&& apt install -y openssh-server
RUN echo "PermitRootLogin yes" >> /etc/ssh/sshd_config \
&& echo 'root:123' | chpasswd
ENTRYPOINT service ssh start && /bin/bash
EXPOSE 22
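Note that a shell-form ENTRYPOINT like the one above ignores any command appended to docker run. If you want the trailing /bin/bash in the run command below to actually take effect, one common (untested here) pattern is an exec-form ENTRYPOINT that starts sshd and then hands off to CMD:

```dockerfile
# sketch: exec-form ENTRYPOINT; CMD becomes "$@" and replaces the shell via exec
ENTRYPOINT ["/bin/sh", "-c", "service ssh start && exec \"$@\"", "--"]
CMD ["/bin/bash"]
```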
Secondly, we build this Dockerfile into an image
cd /path/to/Dockerfile
docker build -t myssh .
Thirdly, we run the image as a container
docker run \
-d \
-t \
-p 8022:22 \
--name myssh \
myssh \
/bin/bash
Finally, we can try it with
ssh -p 8022 root@127.0.0.1
Build PySpark Image with Dockerfile
First of all, here is a Dockerfile that installs both PySpark and openssh-server
FROM ubuntu:20.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update \
&& apt install -y openjdk-8-jdk scala python3-pip wget openssh-server \
&& pip3 install py4j pyspark -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com \
&& cd /opt \
&& wget https://dlcdn.apache.org/spark/spark-3.2.1/spark-3.2.1-bin-hadoop2.7.tgz \
&& tar zxvf spark-3.2.1-bin-hadoop2.7.tgz \
&& echo 'export SPARK_HOME=/opt/spark-3.2.1-bin-hadoop2.7' >> /root/.bashrc \
&& echo 'export PATH=$SPARK_HOME/bin:$PATH' >> /root/.bashrc \
&& echo "PermitRootLogin yes" >> /etc/ssh/sshd_config \
&& echo 'root:123' | chpasswd
ENTRYPOINT service ssh start \
&& python3 /opt/spark-3.2.1-bin-hadoop2.7/examples/src/main/python/pi.py \
&& /bin/bash
EXPOSE 22 4040 4041 4042
Alternatively, there is a pure PySpark Dockerfile; you can choose either of them.
FROM ubuntu:latest
ARG DEBIAN_FRONTEND=noninteractive
RUN apt update \
&& apt install -y openjdk-8-jdk scala python3-pip wget \
&& pip3 install py4j pyspark -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
RUN cd /opt \
&& wget https://dlcdn.apache.org/spark/spark-3.2.1/spark-3.2.1-bin-hadoop2.7.tgz \
&& tar zxvf spark-3.2.1-bin-hadoop2.7.tgz \
&& echo 'export SPARK_HOME=/opt/spark-3.2.1-bin-hadoop2.7' >> /root/.bashrc \
&& echo 'export PATH=$SPARK_HOME/bin:$PATH' >> /root/.bashrc
ENTRYPOINT python3 /opt/spark-3.2.1-bin-hadoop2.7/examples/src/main/python/pi.py \
&& /bin/bash
EXPOSE 4040 4041 4042
Secondly, we build this Dockerfile into an image
cd /path/to/Dockerfile
docker build -t pyspark .
Finally, you can create and run this container.
docker run \
-d \
-t \
-p 8022:22 \
-p 4040:4040 \
-p 4041:4041 \
-p 4042:4042 \
-v /home/fyb:/data \
--name pyspark \
pyspark \
/bin/bash
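As with the Elasticsearch example, the run command above can be sketched as a docker-compose.yml if Docker Compose is available; this untested fragment mirrors the same ports, volume, and names:

```yaml
# docker-compose.yml -- sketch equivalent of the docker run command above
services:
  pyspark:
    image: pyspark
    container_name: pyspark
    tty: true                 # same as -t
    ports:
      - "8022:22"
      - "4040:4040"
      - "4041:4041"
      - "4042:4042"
    volumes:
      - /home/fyb:/data       # same as -v /home/fyb:/data
```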