tdmpc2 Failed to make environment

Problem description

https://github.com/nicklashansen/tdmpc2

The example in the README is python train.py task=dog-run steps=7000000. I then wanted to run the assembly-v2 task from Meta-World and got the errors below.

$ python train.py task=assembly
ValueError: Failed to make environment "assembly": please verify that dependencies are installed and that the task exists.
$ python train.py task=assembly-v2-goal-observable
ValueError: Failed to make environment "assembly-v2-goal-observable": please verify that dependencies are installed and that the task exists.
$ python train.py task=assembly-v2
ValueError: Failed to make environment "assembly-v2": please verify that dependencies are installed and that the task exists.

The assembly-v2 task does exist in Meta-World:

https://meta-world.github.io/
https://metaworld.farama.org

import metaworld
print(metaworld.ML1.ENV_NAMES)  # 50 independent robot-arm manipulation tasks

The code above prints: ['assembly-v2', 'basketball-v2', 'bin-picking-v2', 'box-close-v2', 'button-press-topdown-v2', 'button-press-topdown-wall-v2', 'button-press-v2', 'button-press-wall-v2', 'coffee-button-v2', 'coffee-pull-v2', 'coffee-push-v2', 'dial-turn-v2', 'disassemble-v2', 'door-close-v2', 'door-lock-v2', 'door-open-v2', 'door-unlock-v2', 'hand-insert-v2', 'drawer-close-v2', 'drawer-open-v2', 'faucet-open-v2', 'faucet-close-v2', 'hammer-v2', 'handle-press-side-v2', 'handle-press-v2', 'handle-pull-side-v2', 'handle-pull-v2', 'lever-pull-v2', 'peg-insert-side-v2', 'pick-place-wall-v2', 'pick-out-of-hole-v2', 'reach-v2', 'push-back-v2', 'push-v2', 'pick-place-v2', 'plate-slide-v2', 'plate-slide-side-v2', 'plate-slide-back-v2', 'plate-slide-back-side-v2', 'peg-unplug-side-v2', 'soccer-v2', 'stick-push-v2', 'stick-pull-v2', 'push-wall-v2', 'reach-wall-v2', 'shelf-place-v2', 'sweep-into-v2', 'sweep-v2', 'window-open-v2', 'window-close-v2']

assembly-v2 is one of these tasks.

Below is a short script that runs an environment and prints the dimensions of its spaces.

# https://blog.csdn.net/qq_37051669/article/details/126607105
import metaworld
import random
ml1 = metaworld.ML1('pick-place-v2')
env = ml1.train_classes['pick-place-v2']()
task = random.choice(ml1.train_tasks)
env.set_task(task)

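print("obs space: ", env.observation_space)  # 39-dimensional
print("action space: ", env.action_space)  # 4-dimensional: the first three dimensions move the gripper left/right, forward/backward, and up/down; the fourth controls the gripper (> 0 closes it, < 0 opens it)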

env.reset()
for i in range(50):
    a = env.action_space.sample()  # Sample an action [ 0.73054402 -0.36440682  0.50068905 -0.57026342]
    obs, reward, done, info = env.step(a)  # 4-value return of the older API; Gymnasium-based Meta-World releases return 5 values instead
    print(i,obs, reward, done, info)
env.close()

# List the goal-observable variants of the environments
from metaworld.envs import ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE
print(ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE.keys())

The code above prints: odict_keys(['assembly-v2-goal-observable', 'basketball-v2-goal-observable', 'bin-picking-v2-goal-observable', 'box-close-v2-goal-observable', 'button-press-topdown-v2-goal-observable', 'button-press-topdown-wall-v2-goal-observable', 'button-press-v2-goal-observable', 'button-press-wall-v2-goal-observable', 'coffee-button-v2-goal-observable', 'coffee-pull-v2-goal-observable', 'coffee-push-v2-goal-observable', 'dial-turn-v2-goal-observable', 'disassemble-v2-goal-observable', 'door-close-v2-goal-observable', 'door-lock-v2-goal-observable', 'door-open-v2-goal-observable', 'door-unlock-v2-goal-observable', 'hand-insert-v2-goal-observable', 'drawer-close-v2-goal-observable', 'drawer-open-v2-goal-observable', 'faucet-open-v2-goal-observable', 'faucet-close-v2-goal-observable', 'hammer-v2-goal-observable', 'handle-press-side-v2-goal-observable', 'handle-press-v2-goal-observable', 'handle-pull-side-v2-goal-observable', 'handle-pull-v2-goal-observable', 'lever-pull-v2-goal-observable', 'peg-insert-side-v2-goal-observable', 'pick-place-wall-v2-goal-observable', 'pick-out-of-hole-v2-goal-observable', 'reach-v2-goal-observable', 'push-back-v2-goal-observable', 'push-v2-goal-observable', 'pick-place-v2-goal-observable', 'plate-slide-v2-goal-observable', 'plate-slide-side-v2-goal-observable', 'plate-slide-back-v2-goal-observable', 'plate-slide-back-side-v2-goal-observable', 'peg-unplug-side-v2-goal-observable', 'soccer-v2-goal-observable', 'stick-push-v2-goal-observable', 'stick-pull-v2-goal-observable', 'push-wall-v2-goal-observable', 'reach-wall-v2-goal-observable', 'shelf-place-v2-goal-observable', 'sweep-into-v2-goal-observable', 'sweep-v2-goal-observable', 'window-open-v2-goal-observable', 'window-close-v2-goal-observable'])

In short, the assembly task clearly exists. The check that rejects it lives in TD-MPC2's Meta-World wrapper:

# tdmpc2/tdmpc2/envs/metaworld.py
def make_env(cfg):
	"""
	Make Meta-World environment.
	"""
	env_id = cfg.task.split("-", 1)[-1] + "-v2-goal-observable"
	if not cfg.task.startswith('mw-') or env_id not in ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE:
		raise ValueError('Unknown task:', cfg.task)

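Debugging shows that for task=assembly the computed env_id (assembly-v2-goal-observable) really is in ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE, yet the error is still raised. My first thought was that with an or, one passing check should be enough. It is the other way around: the exception is raised as soon as either operand is true, and not cfg.task.startswith('mw-') is already true for any task name without the prefix. So the task argument must start with mw-, and env_id is built from whatever follows the first hyphen. The documentation simply does not make this obvious.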
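The check can be replayed without installing Meta-World. The sketch below copies the two lines of name handling from make_env and runs them against the task strings tried earlier. KNOWN_ENVS is a two-entry stand-in for ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE, and check_task and try_task are hypothetical helper names used only for illustration.

```python
# Stand-in for the keys of metaworld's ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE
# (assumption: only two of the 50 entries are listed here).
KNOWN_ENVS = {"assembly-v2-goal-observable", "pick-place-v2-goal-observable"}


def check_task(task: str) -> str:
    """Mirror the name check from make_env; return env_id or raise ValueError."""
    # Everything after the first hyphen becomes the base name.
    env_id = task.split("-", 1)[-1] + "-v2-goal-observable"
    # Raises if EITHER the prefix is missing OR the env_id is unknown.
    if not task.startswith("mw-") or env_id not in KNOWN_ENVS:
        raise ValueError("Unknown task:", task)
    return env_id


def try_task(task: str):
    """Return the resolved env_id, or None if the task is rejected."""
    try:
        return check_task(task)
    except ValueError:
        return None


tasks = ["assembly", "assembly-v2", "assembly-v2-goal-observable", "mw-assembly"]
results = {task: try_task(task) for task in tasks}
for task, env_id in results.items():
    print(f"{task!r:35} -> {env_id}")
```

Only mw-assembly resolves. Plain assembly yields a valid env_id but fails the prefix test, while the other two names are split at their first hyphen and produce an env_id that does not exist.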

Conclusion

For a Meta-World task, prefix the task name with mw- and drop the -v2 suffix, as shown below:

python train.py task=mw-assembly
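To avoid doing the renaming by hand, a small helper can turn a Meta-World environment name into the form that the check in make_env accepts. This is only a sketch: to_tdmpc2_task is a hypothetical name, not part of TD-MPC2 or Meta-World, and it only rewrites the string. It does not verify that TD-MPC2 actually supports the resulting task.

```python
def to_tdmpc2_task(metaworld_name: str) -> str:
    """Convert e.g. 'assembly-v2' or 'assembly-v2-goal-observable' to 'mw-assembly'."""
    name = metaworld_name
    # Strip the suffixes that make_env appends again on its own.
    for suffix in ("-goal-observable", "-v2"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return "mw-" + name


for env_name in ["assembly-v2", "assembly-v2-goal-observable", "button-press-topdown-v2"]:
    print(f"python train.py task={to_tdmpc2_task(env_name)}")
```

Multi-word names keep their inner hyphens, which is fine because make_env splits only at the first hyphen, the one right after mw.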

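PS: Only after reading the issues of the original TD-MPC repository did I learn that it targets continuous action spaces. For discrete action spaces, the author suggests using MuZero/EfficientZero.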

posted @ 2024-05-24 15:30  沙滩炒花蛤