An existing project written in Python, with a GUI: https://github.com/mustafamerttunali/deep-learning-training-gui?tab=readme-ov-file. It is inspired by Nvidia DIGITS.


Installing the project

ENV:

Win11

Anaconda

 

Mainly following https://www.tensorflow.org/install/pip

 

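1. Install Python 3.9: create a new Python 3.9 environment in Anaconda

2. Install the NVIDIA GPU driver

3. Install CUDA Toolkit 11.8 (note that the conda command in step 4 installs its own CUDA 11.2 runtime and cuDNN 8.1.0 into the environment)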

https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_network

4. Run the following commands inside the Python 3.9 environment

conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
# Anything above 2.10 is not supported on the GPU on Windows Native
python -m pip install "tensorflow<2.11"
# Verify the installation:
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
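The version pin above can also be checked from Python. The following is a minimal sketch: `supports_native_windows_gpu` is a hypothetical helper name, not part of TensorFlow, and it only compares the major and minor version numbers against 2.11.

```python
def supports_native_windows_gpu(version: str) -> bool:
    """Return True if this TensorFlow version is below 2.11.

    Hypothetical helper for illustration; it is not a TensorFlow API.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < (2, 11)


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        print("TensorFlow", tf.__version__)
        print("Below 2.11:", supports_native_windows_gpu(tf.__version__))
        # An empty list here means TensorFlow does not see a GPU.
        print(tf.config.list_physical_devices("GPU"))
```

Running it inside the conda environment prints the installed version and the same GPU list as the one-line check.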

 

Problems encountered

1. After packaging with PyInstaller, running app.exe fails with this:

Traceback (most recent call last):
  File "app.py", line 139, in <module>
    app.run(debug=True)
  File "flask\app.py", line 920, in run
  File "werkzeug\serving.py", line 1071, in run_simple
  File "werkzeug\serving.py", line 852, in make_server
  File "werkzeug\serving.py", line 718, in __init__
  File "socket.py", line 544, in fromfd
OSError: [WinError 10038] An operation was attempted on something that is not a socket
[27024] Failed to execute script 'app' due to unhandled exception!

 

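Solution:

Add freeze_support

from multiprocessing import freeze_support

if __name__ == "__main__":
    # Needed when the script is frozen into a Windows executable
    freeze_support()
    app.run(debug=True)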
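The traceback above goes through werkzeug's server start-up with `debug=True`, which also turns on Flask's auto-reloader. As an alternative sketch, assuming the reloader is what misbehaves inside the frozen executable (an assumption, not something verified in these notes), the arguments passed to `app.run` can depend on whether the program runs from a PyInstaller bundle. `run_options` is a hypothetical helper name; `sys.frozen` is the attribute PyInstaller sets on bundled apps.

```python
import sys


def run_options(frozen=None):
    """Pick keyword arguments for Flask's app.run().

    Hypothetical helper: turns debug mode and the reloader off when the
    program runs from a PyInstaller bundle.
    """
    if frozen is None:
        # PyInstaller sets sys.frozen on the bundled executable.
        frozen = getattr(sys, "frozen", False)
    if frozen:
        return {"debug": False, "use_reloader": False}
    return {"debug": True}
```

It would be used as `app.run(**run_options())`, so the development workflow keeps debug mode while the packaged app.exe starts a plain server.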

 

Ref:

https://docs.python.org/3/library/multiprocessing.html#multiprocessing.freeze_support

 

2. After fixing problem #1, running it again hit another error:

204/317 [==================>...........] - ETA: 5s - loss: 0.2859 - accuracy: 0.8863Your TensorFlow version is up to date! 2.10.1
206/317 [==================>...........] - ETA: 5s - loss: 0.2836 - accuracy: 0.8874'tensorboard' is not recognized as an internal or external command,
operable program or batch file.

 

solution:
?
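The message in the log is the Windows command interpreter reporting that it cannot find a `tensorboard` command on `PATH`, so one way to narrow this down is to check for the executable from the same process. This is a minimal diagnostic sketch using only the standard library, not a confirmed fix; `find_command` is a hypothetical helper, and the idea that the project launches TensorBoard through a shell command is inferred from the error message rather than checked against its source.

```python
import shutil


def find_command(name, path=None):
    """Return the full path of an executable, or None if it is not on PATH."""
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_command("tensorboard")
    if location is None:
        print("tensorboard is not on PATH for this process")
    else:
        print("tensorboard found at", location)
```

Comparing the result inside the activated conda environment with the result from the packaged app.exe would show whether the executable only resolves in the former.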

 

 

The steps below are deprecated

1. Install Python 3.7: create a new Python 3.7 environment in Anaconda

2. Install VC++ Build Tools 14.0 or later. The latest version I got from the link below was 17.6.4

https://visualstudio.microsoft.com/visual-cpp-build-tools/

Otherwise you will run into an error.

 

3. Edit requirements.txt to resolve a version conflict

tensorboard==2.1.0

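Otherwise you will run into a dependency conflict during installation.

4. Clone the repository and install the requirements: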

git clone https://github.com/mustafamerttunali/deep-learning-training-gui.git

cd Deep-Learning-Training-GUI

On your conda terminal: pip install -r requirements.txt

 

5. Installation succeeds

6. Running python app.py fails with the following error

(AI_On_ARM_MCU) E:\projects\202312_ARM_MCU\code\deep-learning-training-gui>python app.py
Traceback (most recent call last):
  File "app.py", line 13, in <module>
    from flask import Flask, request, jsonify, render_template
  File "D:\Users\shuai\anaconda3\envs\AI_On_ARM_MCU\lib\site-packages\flask\__init__.py", line 14, in <module>
    from jinja2 import escape
ImportError: cannot import name 'escape' from 'jinja2' (D:\Users\shuai\anaconda3\envs\AI_On_ARM_MCU\lib\site-packages\jinja2\__init__.py)

The cause is that Flask 1.x imports escape from jinja2, but recent jinja2 releases (3.1 and later) no longer provide it (https://stackoverflow.com/questions/71718167/importerror-cannot-import-name-escape-from-jinja2). The fix is to upgrade Flask to 2.x:

Flask==2.0
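The incompatibility described above can be expressed as a small version check. This is a sketch under stated assumptions: `hits_escape_import_error` and `major_minor` are hypothetical helper names, and the check encodes only this one pairing (Flask 1.x together with Jinja2 3.1 or newer), not Flask's full dependency rules.

```python
from importlib import metadata


def major_minor(version):
    """Parse '2.0' or '3.1.2' into a (major, minor) tuple of ints."""
    parts = version.split(".") + ["0"]
    return int(parts[0]), int(parts[1])


def hits_escape_import_error(flask_version, jinja2_version):
    """True if Flask 1.x is paired with Jinja2 3.1 or newer."""
    flask_major = major_minor(flask_version)[0]
    return flask_major < 2 and major_minor(jinja2_version) >= (3, 1)


if __name__ == "__main__":
    try:
        flask_v = metadata.version("flask")
        jinja_v = metadata.version("jinja2")
    except metadata.PackageNotFoundError as exc:
        print("Package not installed:", exc)
    else:
        print("Flask", flask_v, "/ Jinja2", jinja_v)
        print("Expect the escape ImportError:",
              hits_escape_import_error(flask_v, jinja_v))
```

Run inside the conda environment, it reports the installed versions before `python app.py` gets a chance to fail on the import.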
 
 
 
NOTE: TF 2.10 is the last version that supports the GPU on native Windows. Later TF versions support the GPU on Windows only through WSL.
 
posted @ 2024-03-16 13:15  mashuai_191