Running the official MNIST example with TensorFlowOnSpark

To set up TensorFlowOnSpark, see: https://www.cnblogs.com/yangyuxia/p/15634030.html

Step 1: Download the MNIST dataset

cd /home/jianyuan
curl -O "http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz"
zip -r mnist.zip *
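
Before zipping, it can help to confirm that all four downloads actually arrived and are valid gzip files, since a failed download may leave an empty file or an HTML error page behind. A minimal sketch; the function name check_mnist_archives is hypothetical and not part of TensorFlowOnSpark:

```shell
# Hypothetical helper: verify the four MNIST archives in a directory.
# Returns non-zero if any file is missing, empty, or not valid gzip.
check_mnist_archives() {
  local dir="${1:-.}" status=0 f
  for f in train-images-idx3-ubyte.gz train-labels-idx1-ubyte.gz \
           t10k-images-idx3-ubyte.gz t10k-labels-idx1-ubyte.gz; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $f"; status=1
    elif ! gzip -t "$dir/$f" 2>/dev/null; then
      echo "not a valid gzip file: $f"; status=1
    else
      echo "ok: $f"
    fi
  done
  return "$status"
}
```

Run it as check_mnist_archives /home/jianyuan. Note that zip -r mnist.zip * packs everything in the current directory, so running the download in a directory that holds only the four .gz files keeps the archive small. The spark-submit command in the next step reads the archive from hdfs:///user/mnist.zip, so it presumably has to be uploaded first, for example with hdfs dfs -put -f mnist.zip /user/mnist.zip.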

Step 2: Convert the MNIST zip file into HDFS files

# Python installation path
export PYTHON_ROOT=/opt/module/python3
export LD_LIBRARY_PATH=${PATH}
export PYSPARK_PYTHON=${PYTHON_ROOT}/bin/python3
export SPARK_YARN_USER_ENV="PYSPARK_PYTHON=/opt/module/python3/bin/python3"
export PATH=${PYTHON_ROOT}/bin/:$PATH
export QUEUE=default
# CDH installation path
export LIB_HDFS=/opt/cloudera/parcels/CDH/lib64
# path to libjvm.so
export LIB_JVM=$JAVA_HOME/jre/lib/amd64/server
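
A wrong path in these variables usually only shows up later as a failed executor, so a quick local check can save a round trip through YARN. A minimal sketch; the function name check_tfos_env is hypothetical, and it assumes the HDFS native library is named libhdfs.so on your distribution:

```shell
# Hypothetical pre-flight check for the variables exported above.
# Assumption: the native HDFS library file is called libhdfs.so.
check_tfos_env() {
  local status=0
  if [ ! -f "${PYSPARK_PYTHON:-}" ] || [ ! -x "${PYSPARK_PYTHON:-}" ]; then
    echo "PYSPARK_PYTHON is not an executable file: ${PYSPARK_PYTHON:-<unset>}"
    status=1
  fi
  if [ ! -e "${LIB_JVM:-}/libjvm.so" ]; then
    echo "libjvm.so not found under LIB_JVM=${LIB_JVM:-<unset>}"
    status=1
  fi
  if [ ! -e "${LIB_HDFS:-}/libhdfs.so" ]; then
    echo "libhdfs.so not found under LIB_HDFS=${LIB_HDFS:-<unset>}"
    status=1
  fi
  return "$status"
}
```

Calling check_tfos_env after the exports prints one line per problem and returns non-zero if anything is missing. It only checks the machine you submit from; the executors need the same paths.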
 
 
/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--queue ${QUEUE} \
--num-executors 3 \
--executor-memory 3G \
--archives hdfs:///user/mnist.zip \
--jars hdfs:///user/root/tensorflow-hadoop-1.0-SNAPSHOT.jar \
/home/jianyuan/TensorFlowOnSpark-2.2.4/examples/mnist/mnist_data_setup.py \
--output mnist

# Note: you must pass --jars pointing to tensorflow-hadoop-1.0-SNAPSHOT.jar; otherwise the job fails with java.lang.ClassNotFoundException: org.tensorflow.hadoop.io.TFRecordFileOutputFormat
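
Once the job finishes, it is worth confirming that the directories used by the later steps exist before submitting training. A minimal sketch; check_hdfs_dirs and the HDFS_CMD override are hypothetical names introduced here, while hdfs dfs -test -d is the standard directory test:

```shell
# Hypothetical helper: report which of the given HDFS directories exist.
# HDFS_CMD can be overridden (for example with a stub) and defaults to "hdfs".
check_hdfs_dirs() {
  local hdfs_cmd="${HDFS_CMD:-hdfs}" missing=0 dir
  for dir in "$@"; do
    if "$hdfs_cmd" dfs -test -d "$dir"; then
      echo "ok: $dir"
    else
      echo "missing: $dir"
      missing=1
    fi
  done
  return "$missing"
}
```

For the paths used in this post: check_hdfs_dirs hdfs:///user/root/mnist/csv/train hdfs:///user/root/mnist/tfr/test. This assumes the job ran as root, so that the relative --output mnist resolved to /user/root/mnist.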

Step 3: Run distributed MNIST training (using InputMode.SPARK)

/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--queue default \
--num-executors 3 \
--executor-memory 2G \
--py-files /home/jianyuan/TensorFlowOnSpark-2.2.4/tfspark.zip \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.yarn.maxAppAttempts=1 \
--archives hdfs:///user/root/TensorFlowOnSpark-2.2.4.tar.gz#tensorflowonspark \
--conf spark.executorEnv.LD_LIBRARY_PATH=$LIB_JVM:$LIB_HDFS \
--conf spark.executorEnv.CLASSPATH=$(hadoop classpath --glob) \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/usr/bin/python3 \
/home/jianyuan/TensorFlowOnSpark-2.2.4/examples/mnist/keras/mnist_spark.py \
--images_labels hdfs:///user/root/mnist/csv/train \
--model_dir hdfs:///user/root/mnist_model \
--export_dir hdfs:///user/root/mnist_export 
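
Training writes the exported model into a timestamped subdirectory of --export_dir (1638364658 in this run), and inference needs that full path. A minimal sketch for picking the newest one from a list of paths; latest_export is a hypothetical helper name:

```shell
# Hypothetical helper: read paths on stdin (one per line) and print the one
# whose last component is the largest all-digit name, i.e. the newest export.
latest_export() {
  awk -F/ '$NF ~ /^[0-9]+$/ { print $NF "\t" $0 }' \
    | sort -n \
    | tail -n 1 \
    | cut -f2-
}
```

A possible use, assuming your Hadoop version supports the -C (paths only) option of ls: hdfs dfs -ls -C hdfs:///user/root/mnist_export | latest_export. Entries whose names are not purely numeric are ignored.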

Step 4: Run distributed MNIST inference (using InputMode.SPARK)

/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--queue default \
--num-executors 3 \
--executor-memory 2G \
--py-files /root/TensorFlowOnSpark-2.2.1/tfspark.zip \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.yarn.maxAppAttempts=1 \
--archives hdfs:///user/root/TensorFlowOnSpark-2.2.1.tar.gz#tensorflowonspark \
--conf spark.executorEnv.LD_LIBRARY_PATH=$LIB_JVM:$LIB_HDFS \
--conf spark.executorEnv.CLASSPATH=$(hadoop classpath --glob) \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/usr/bin/python3 \
/home/jianyuan/TensorFlowOnSpark-2.2.4/examples/mnist/keras/mnist_inference.py \
--images_labels hdfs:///user/root/mnist/tfr/test \
--export_dir hdfs:///user/root/mnist_export/1638364658 \
--output hdfs:///user/root/predictions
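
To get a rough idea of how the model did, the prediction files can be summarized from the shell. A minimal sketch, assuming each output line holds the true label and the predicted label separated by whitespace; that format is an assumption about mnist_inference.py, so check a few lines of your own output first. prediction_accuracy is a hypothetical helper name:

```shell
# Hypothetical helper: read "label prediction" lines on stdin and print accuracy.
# Assumption: column 1 is the true label, column 2 is the predicted label.
prediction_accuracy() {
  awk 'NF >= 2 { total++; if ($1 == $2) correct++ }
       END {
         if (total == 0) { print "no predictions"; exit 1 }
         printf "%d/%d correct (%.2f%%)\n", correct, total, 100 * correct / total
       }'
}
```

For example: hdfs dfs -cat 'hdfs:///user/root/predictions/part-*' | prediction_accuracy.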