Running the official MNIST example with TensorFlowOnSpark
To set up TensorFlowOnSpark, see: https://www.cnblogs.com/yangyuxia/p/15634030.html
Step 1: Download the MNIST dataset
cd /home/jianyuan
curl -O "http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz"
zip -r mnist.zip *
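Before zipping, it can help to confirm that all four downloads are complete, since a failed curl can leave an empty file or an HTML error page behind. The following is a minimal sketch, not part of the original walkthrough: the function name check_mnist_files is hypothetical, and it assumes gzip is installed on the machine where the files were downloaded.

```shell
# Hypothetical helper: verify that the four MNIST archives exist in a
# directory (default: current directory) and are valid gzip files.
check_mnist_files() {
  local dir="${1:-.}"
  local status=0
  local f
  for f in train-images-idx3-ubyte.gz train-labels-idx1-ubyte.gz \
           t10k-images-idx3-ubyte.gz t10k-labels-idx1-ubyte.gz; do
    if [ ! -s "$dir/$f" ]; then
      # -s is false for files that are missing or zero bytes long
      echo "missing or empty: $f" >&2
      status=1
    elif ! gzip -t "$dir/$f" 2>/dev/null; then
      # gzip -t tests archive integrity without extracting anything
      echo "not a valid gzip file: $f" >&2
      status=1
    fi
  done
  return "$status"
}
```

For example, running check_mnist_files /home/jianyuan && zip -r mnist.zip * would only build the archive when every file passes the check.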
Step 2: Convert the MNIST zip file into HDFS files
# Python installation path
export PYTHON_ROOT=/opt/module/python3
export LD_LIBRARY_PATH=${PATH}
export PYSPARK_PYTHON=${PYTHON_ROOT}/bin/python3
export SPARK_YARN_USER_ENV="PYSPARK_PYTHON=/opt/module/python3/bin/python3"
export PATH=${PYTHON_ROOT}/bin/:$PATH
export QUEUE=default
# CDH installation path
export LIB_HDFS=/opt/cloudera/parcels/CDH/lib64
# path to libjvm.so
export LIB_JVM=$JAVA_HOME/jre/lib/amd64/server
/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--queue ${QUEUE} \
--num-executors 3 \
--executor-memory 3G \
--archives hdfs:///user/mnist.zip \
--jars hdfs:///user/root/tensorflow-hadoop-1.0-SNAPSHOT.jar \
/home/jianyuan/TensorFlowOnSpark-2.2.4/examples/mnist/mnist_data_setup.py \
--output mnist
### Note: you must pass --jars pointing to tensorflow-hadoop-1.0-SNAPSHOT.jar; otherwise the job fails with java.lang.ClassNotFoundException: org.tensorflow.hadoop.io.TFRecordFileOutputFormat
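The LIB_JVM and LIB_HDFS variables set in this step are passed to the executors as LD_LIBRARY_PATH in the next steps, so a wrong directory typically only shows up as a failure inside the YARN job. A quick local check can be sketched as follows. The helper name has_lib is hypothetical, and the assumption that libhdfs.so lives in the CDH lib64 directory should be verified on your own cluster. The check also only covers the host where it runs, not the worker nodes.

```shell
# Hypothetical helper: succeed if the named library file exists in any
# directory of a colon-separated path list (same layout as LD_LIBRARY_PATH).
has_lib() {
  local lib="$1"
  local dirs="$2"
  local d
  local IFS=:            # split the path list on colons inside this function only
  for d in $dirs; do
    [ -n "$d" ] && [ -e "$d/$lib" ] && return 0
  done
  return 1
}

# Usage with the variables from this step (empty defaults keep it safe if unset)
has_lib libjvm.so  "${LIB_JVM:-}:${LIB_HDFS:-}" || echo "libjvm.so not found" >&2
has_lib libhdfs.so "${LIB_JVM:-}:${LIB_HDFS:-}" || echo "libhdfs.so not found" >&2
```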
Step 3: Run distributed MNIST training (using InputMode.SPARK)
/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--queue default \
--num-executors 3 \
--executor-memory 2G \
--py-files /home/jianyuan/TensorFlowOnSpark-2.2.4/tfspark.zip \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.yarn.maxAppAttempts=1 \
--archives hdfs:///user/root/TensorFlowOnSpark-2.2.4.tar.gz#tensorflowonspark \
--conf spark.executorEnv.LD_LIBRARY_PATH=$LIB_JVM:$LIB_HDFS \
--conf spark.executorEnv.CLASSPATH=$(hadoop classpath --glob) \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/usr/bin/python3 \
/home/jianyuan/TensorFlowOnSpark-2.2.4/examples/mnist/keras/mnist_spark.py \
--images_labels hdfs:///user/root/mnist/csv/train \
--model_dir hdfs:///user/root/mnist_model \
--export_dir hdfs:///user/root/mnist_export
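The exported model appears to land in a numerically named subdirectory of --export_dir (the inference command uses mnist_export/1638364658), so the exact path differs from run to run. The sketch below picks the entry with the largest all-digit name from a list of paths read on standard input. The function name latest_export is hypothetical, and feeding it from hdfs dfs -ls -C (which prints paths only) is an assumption about the Hadoop shell available on your cluster.

```shell
# Hypothetical helper: read one path per line on stdin and print the path
# whose last component is the largest all-digit name (e.g. a timestamp).
# Exits with status 1 when no such path is found.
latest_export() {
  awk -F/ '
    { name = $NF }                                   # last path component
    name ~ /^[0-9]+$/ && (name + 0) > best {         # numeric names only
      best = name + 0
      path = $0
    }
    END { if (path != "") print path; else exit 1 }
  '
}

# Usage (assumption: run on a host with the hdfs client configured):
#   EXPORT_DIR=$(hdfs dfs -ls -C hdfs:///user/root/mnist_export | latest_export)
```

The resulting value could then be passed as --export_dir to the inference job instead of copying the directory name by hand.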
Step 4: Run distributed MNIST inference (using InputMode.SPARK)
/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--queue default \
--num-executors 3 \
--executor-memory 2G \
--py-files /root/TensorFlowOnSpark-2.2.1/tfspark.zip \
--conf spark.dynamicAllocation.enabled=false \
--conf spark.yarn.maxAppAttempts=1 \
--archives hdfs:///user/root/TensorFlowOnSpark-2.2.1.tar.gz#tensorflowonspark \
--conf spark.executorEnv.LD_LIBRARY_PATH=$LIB_JVM:$LIB_HDFS \
--conf spark.executorEnv.CLASSPATH=$(hadoop classpath --glob) \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/usr/bin/python3 \
/home/jianyuan/TensorFlowOnSpark-2.2.4/examples/mnist/keras/mnist_inference.py \
--images_labels hdfs:///user/root/mnist/tfr/test \
--export_dir hdfs:///user/root/mnist_export/1638364658 \
--output hdfs:///user/root/predictions