飞机游戏六空战强化学习环境6.1DBML 6.1.3 DBML+强化学习算法使用
飞机游戏六空战强化学习环境6.1DBML 6.1.3 DBML+强化学习算法使用
项目文件结构
doc 是指导文档,tutorial.ipynb也是指导项目的配置与安装的文件
src目录下是主要的环境和模型所在位置,
其中environment是作者根据以jsbsim和dogfight2作为仿真软件,编写的符合gym框架下的环境,在使用的时候需要将环境复制注册到代码所运行环境的gym包中,如下图所示
DogfightEnv
我们来看一下 是怎样将dogfight的仿真与gym框架结合起来的
整体的函数
交互的实现
#调用dogfight中的通信端口,使用其中的函数进行获取数据与发送数据
from gym.envs.dogfightEnv.dogfight_sandbox_hg2.network_client_example import dogfight_client as df
def get_missile_state(missile_id):
socket_lib.send_message(str.encode(json.dumps({"command": "GET_MISSILE_STATE", "args": {"missile_id": missile_id}})))
state = json.loads((socket_lib.get_answer()).decode())
return state
def set_missile_target(missile_id, target_id):
socket_lib.send_message(str.encode(json.dumps({"command": "SET_MISSILE_TARGET", "args": {"missile_id": missile_id, "target_id": target_id}})))
具体实现是使用socket进行通信,
首先,对要进行发送的数据使用json进行打包,然后通过socket端口进行发送,然后再对接受到的信息进行解析,返回给调用函数
dogfight2提供的API
dogfight-sandbox-hg2/documentation_network.md at main · harfang3d/dogfight-sandbox-hg2 · GitHub
发送到的命令进行处理的函数都在dogfight2的source文件夹下的network_server.py文件,可以找到发送命令后都调用了哪些函数
示例 :例如,
首先在network_client_example文件夹下打开dogfight_client.py ctrl+F进行搜索
然后再去source文件夹下的network_server.py文件夹下进行搜索,就得到了函数,在函数体内就有其具体内容
可视化的实现
状态空间的定义与实现
observation_space = Box(
low=np.array([ # simple normalized
# Plane
-300, # x / 100
-300, # y / 100
-1, # z / 50
0, # heading
-360, # pitch_attitude * 4
-360, # roll_attitude * 4
# Missile
-300, # x / 100
-300, # y / 100
-1, # z / 50
-315, # heading * 100
-315, # pitch_attitude * 100
-315, # roll_attitude * 100
]),
high=np.array([
300,
300,
200,
360,
360,
360,
300,
300,
200,
315,
315,
315,
])
)
状态空间分为飞机和导弹
飞机
- 位置 x :[-300,300] y:[-300,300] z[-1,200]
- 姿态: 前进方向heading [0,360] , 俯仰角,滚转角 [-360,360] 这里的正负可能代表顺时针和逆时针
导弹:
- 位置 x :[-300,300] y:[-300,300] z[-1,200]
- 姿态:前进方向heading [-315,315] , 俯仰角,滚转角 [-315,315] 这里的正负可能代表顺时针和逆时针
第3个变量 Pitch_attitude的图示 也就是pitch的姿态的角度
动作空间的定义与实现
action_space = Box(
low=np.array([
0, # Flaps
-1, # Pitch
-1, # Roll
-1, # Yaw
]),
high=np.array([
1,
1,
1,
1,
]),
)
动作空间有四个,分别是襟翼的状态,控制区间为[0,1] ; 俯仰舵,滚转,偏航,控制区间为[-1,1]
代码阅读
import os
import random
import re
import sys
import time
import warnings
import gym
import numpy as np
from gym import Env
from gym.spaces import Box, Discrete
#调用dogfight中的通信端口,使用其中的函数进行获取数据与发送数据
from gym.envs.dogfightEnv.dogfight_sandbox_hg2.network_client_example import dogfight_client as df
# sys.path.append('./src/')
# sys.path.append('./src/environments/dogfightEnv/')
# sys.path.append('./src/environments/dogfightEnv/dogfight_sandbox_hg2/network_client_example/')
sys.path.append('gym.envs.dogfightEnv.dogfight_sandbox_hg2.network_client_example/')
# try:
print("gym dogfightEnv")
print(os.getcwd())
print("DBRL")
time.sleep(1)
# except:
# from dogfightEnv.dogfight_sandbox_hg2.network_client_example import dogfight_client as df
# print("DBRL")
# time.sleep(1)
class DogfightEnv(Env):
def __init__(
self,
host='10.168.62.25',
port='50888',
plane_slot=1,
enemy_slot=1,
missile_slot=0,
rendering=True,
) -> None:
"""
构造函数,进行初始化
:param host: 通信地址
:param port: 端口
:param plane_slot: 我方飞机数量
:param enemy_slot: 敌方飞机数量
:param missile_slot: 导弹数量
:param rendering: 是否进行渲染,即可视化
"""
self.host = host
self.port = port
self.nof = 0
self.rendering = rendering
self.plane_slot = plane_slot
self.enemy_slot = enemy_slot
self.missile_slot = missile_slot
#飞机的动作空间
self.action_space = Box(
low=np.array([
0, # Flaps 襟翼
-1, # Pitch 俯仰角
-1, # Roll 翻滚角
-1, # Yaw 偏航角
]),
high=np.array([
1,
1,
1,
1,
]),
)
# 飞机的观测空间
self.observation_space = Box(
low=np.array([ # simple normalized
-300, # x / 100
-300, # y / 100
-1, # z / 50
0, # heading
-360, # pitch_attitude * 4
-360, # roll_attitude * 4
-300, # x / 100
-300, # y / 100
-1, # z / 50
-315, # heading * 100
-315, # pitch_attitude * 100
-315, # roll_attitude * 100
]),
high=np.array([
300,
300,
200,
360,
360,
360,
300,
300,
200,
315,
315,
315,
]),
dtype=np.float64
)
df.set_renderless_mode(True) # 是否进行渲染,默认不进行渲染
#可视化设置,是否进行渲染,当参数为True,即进行渲染,则进行渲染,否则默认不进行渲染
if self.rendering:
df.set_renderless_mode(False)
#Set client update mode ON: the scene update must be done by client network, calling "update_scene()"
df.set_client_update_mode(True)
#飞机进行初始化,按着敌方和我方飞机队列获取飞机列表
print('环境初始化')
try:
df.get_planes_list()
except:
print('Run for the first time')
df.connect(host, int(port))
time.sleep(2)
planes = df.get_planes_list() #从仿真环境获取飞机列表
print(f'初始化 飞机planes:{planes}')
df.disable_log()
self.planeID = planes[self.plane_slot]
self.enemyID = planes[self.enemy_slot]
print(f'初始化 飞机planeID:{self.planeID}')
print(f'初始化 飞机enemyID:{self.enemyID}')
#初始化飞机
for i in planes:
df.reset_machine(i)
# 完成起飞动作
# 设置最大推力
df.set_plane_thrust(self.enemyID, 1)
df.set_plane_thrust(self.planeID, 1)
#Wait until plane thrust = 1
t = 0
while t < 1:
plane_state = df.get_plane_state(self.enemyID)
df.update_scene()
t = plane_state["thrust_level"]
# 激活后续燃料
df.activate_post_combustion(self.enemyID)
df.activate_post_combustion(self.planeID)
df.set_plane_pitch(self.enemyID, -0.5)
df.set_plane_pitch(self.planeID, -0.5)
p = 0
while p < 15:
plane_state = df.get_plane_state(self.enemyID)
df.update_scene()
p = plane_state["pitch_attitude"]
# 飞起来之后的stabilize_plane()函数与就是三个舵偏都设为0
# if p: self.set_pitch_level(0)if y: self.set_yaw_level(0)if r: self.set_roll_level(0)
df.stabilize_plane(self.enemyID)
df.stabilize_plane(self.planeID)
#收起起落架
df.retract_gear(self.enemyID)
df.retract_gear(self.planeID)
s = 0
while s < 1000:
plane_state = df.get_plane_state(self.enemyID)
df.update_scene()
s = plane_state["altitude"]
df.set_plane_yaw(self.planeID, 1)
#设置导弹
#获取导弹列表
missiles = df.get_machine_missiles_list(self.enemyID)
self.missileID = missiles[self.missile_slot]
df.fire_missile(self.enemyID, self.missile_slot)
df.set_missile_target(self.missileID, self.planeID) #设置导弹目标, 第一个参数是导弹id,第二个参数是目标id ,这里仅仅是一个属性设置
df.set_missile_life_delay(self.missileID, 30) #设置导弹存活时长 ,当过去30s后,自动摧毁导弹
def getProperty(
self,
prop
):
if prop == 'position':
return [
df.get_plane_state(self.planeID)['position'][0],
df.get_plane_state(self.planeID)['position'][2],
df.get_plane_state(self.planeID)['position'][1],
]
elif prop == 'positionEci':
warnings.warn('Dogfight simulation environments have no global data!')
return [
df.get_plane_state(self.planeID)['position'][0],
df.get_plane_state(self.planeID)['position'][2],
df.get_plane_state(self.planeID)['position'][1],
]
elif prop == 'positionEcef':
warnings.warn('Dogfight simulation environments have no global data!')
return [
df.get_plane_state(self.planeID)['position'][0],
df.get_plane_state(self.planeID)['position'][2],
df.get_plane_state(self.planeID)['position'][1],
]
elif prop == 'attitudeRad':
return [
df.get_plane_state(self.planeID)['heading'] / 180 * np.pi,
df.get_plane_state(self.planeID)['pitch_attitude'] / 180 * np.pi,
df.get_plane_state(self.planeID)['roll_attitude'] / 180 * np.pi,
]
elif prop == 'attitudeDeg':
return [
df.get_plane_state(self.planeID)['heading'],
df.get_plane_state(self.planeID)['pitch_attitude'],
df.get_plane_state(self.planeID)['roll_attitude'],
]
elif prop == 'pose':
return [
df.get_plane_state(self.planeID)['position'][0],
df.get_plane_state(self.planeID)['position'][2],
df.get_plane_state(self.planeID)['position'][1],
df.get_plane_state(self.planeID)['heading'],
df.get_plane_state(self.planeID)['pitch_attitude'],
df.get_plane_state(self.planeID)['roll_attitude'],
]
elif prop == 'velocity':
warnings.warn('三个值为速度在欧拉角上的分量, 与JSBSim中的速度不同')
return [
df.get_plane_state(self.planeID)['horizontal_speed'],
df.get_plane_state(self.planeID)['linear_speed'],
-df.get_plane_state(self.planeID)['vertical_speed'],
]
elif prop == 'poseMissile':
return [
df.get_missile_state(self.missileID)['position'][0],
df.get_missile_state(self.missileID)['position'][1],
df.get_missile_state(self.missileID)['position'][2],
df.get_missile_state(self.missileID)['Euler_angles'][0],
df.get_missile_state(self.missileID)['Euler_angles'][1],
df.get_missile_state(self.missileID)['Euler_angles'][2],
]
else:
raise Exception("Property {} doesn't exist!".format(prop))
def getDistance(self):
return ((df.get_plane_state(self.planeID)['position'][0] - df.get_missile_state(self.missileID)['position'][0]) ** 2 +\
(df.get_plane_state(self.planeID)['position'][1] - df.get_missile_state(self.missileID)['position'][1]) ** 2 +\
(df.get_plane_state(self.planeID)['position'][2] - df.get_missile_state(self.missileID)['position'][2]) ** 2) ** .5
def getHP(self):
return df.get_health(self.planeID)['health_level']
def terminate(self):
if not df.get_missile_state(self.missileID)['active']:
if self.getHP() >= .9:
return 1
else:
return -1
else:
return 0
def setProperty(
self,
prop,
value,
):
if prop == 'plane':
self.plane_slot = value
elif prop == 'enemy':
self.enemy_slot = value
elif prop == 'missile':
self.missile_slot = value
else:
raise Exception("Property {} doesn't exist!".format(prop))
def sendAction(
self,
action,
actionType=None,
): #设定动作
if actionType == None:
df.set_plane_flaps(self.planeID, float(action[0]))
df.set_plane_pitch(self.planeID, float(action[1]))
df.set_plane_roll(self.planeID, float(action[2]))
df.set_plane_yaw(self.planeID, float(action[3]))
elif actionType == 'Flaps' or actionType == 'flaps':
df.set_plane_flaps(self.planeID, action)
elif actionType == 'Pitch' or actionType == 'pitch':
df.set_plane_pitch(self.planeID, action)
elif actionType == 'Roll' or actionType == 'roll':
df.set_plane_roll(self.planeID, action)
elif actionType == 'Yaw' or actionType == 'yaw':
df.set_plane_yaw(self.planeID, action)
def step(self, action):
self.sendAction(action) #发送动作
df.update_scene() #场景更新
self.nof += 1
terminate = self.terminate()
if terminate == 1: #获取奖赏
reward = 50
elif terminate == -1:
reward = -50
else:
reward = .1
if self.getHP() <= .1:
reward = -1
terminate = True if terminate else False #判断是否终止
ob = np.array([ # normalized 对状态向量进行标准化
df.get_plane_state(self.planeID)['position'][0] / 100,
df.get_plane_state(self.planeID)['position'][2] / 100,
df.get_plane_state(self.planeID)['position'][1] / 50,
df.get_plane_state(self.planeID)['heading'],
df.get_plane_state(self.planeID)['pitch_attitude'] * 4,
df.get_plane_state(self.planeID)['roll_attitude'] * 4,
df.get_missile_state(self.missileID)['position'][0] / 100,
df.get_missile_state(self.missileID)['position'][2] / 100,
df.get_missile_state(self.missileID)['position'][1] / 50,
df.get_missile_state(self.missileID)['Euler_angles'][0] * 100,
df.get_missile_state(self.missileID)['Euler_angles'][1] * 100,
df.get_missile_state(self.missileID)['Euler_angles'][2] * 100,
])
return ob, reward, terminate, {}
def render(self, id=0):
# 关闭无渲染模式,即进行渲染
df.set_renderless_mode(False)
# 环境重新进行初始化
def reset(
self,
):
self.__init__(
host=self.host,
port=self.port,
plane_slot=self.plane_slot,
enemy_slot=self.enemy_slot,
missile_slot=self.missile_slot,
rendering=self.rendering,
)
ob = np.array([ # normalized
df.get_plane_state(self.planeID)['position'][0] / 100,
df.get_plane_state(self.planeID)['position'][2] / 100,
df.get_plane_state(self.planeID)['position'][1] / 50,
df.get_plane_state(self.planeID)['heading'],
df.get_plane_state(self.planeID)['pitch_attitude'] * 4,
df.get_plane_state(self.planeID)['roll_attitude'] * 4,
df.get_missile_state(self.missileID)['position'][0] / 100,
df.get_missile_state(self.missileID)['position'][2] / 100,
df.get_missile_state(self.missileID)['position'][1] / 50,
df.get_missile_state(self.missileID)['Euler_angles'][0] * 100,
df.get_missile_state(self.missileID)['Euler_angles'][1] * 100,
df.get_missile_state(self.missileID)['Euler_angles'][2] * 100,
])
return ob
思考与实践
问题1 初始状态的设置 ?飞机起飞
能不能直接设置飞机的初始状态,也就是根据初始场景
实践发现没法直接设置飞机的位置状态,只能让它先进行起飞
起飞过程也很简单,就是把推力设置为最大,然后设置pitch角,它就飞起来了
# -*- coding: utf-8 -*-
"""
@author : zuti
@software : PyCharm
@file : client_control_test.py
@time : 2022/12/11 16:47
@desc :
"""
import dogfight_client as df
import time
# Print fps function, to check the network client frequency.
t0 = 0
t1 = 0
def print_fps():
global t, t0, t1
t1 = time.time()
dt = t1 - t0
t0 = t1
if dt > 0:
print(str(1 / dt))
# Enter the IP and port displayed in top-left corner of DogFight screen
df.connect("10.168.62.25", 50888)
time.sleep(2)
# Get the whole planes list in DogFight scene
# returns a list that contains planes id
planes = df.get_planes_list()
print(str(planes))
df.disable_log()
# Get the id of the plane you want to control
plane_id = planes[0]
enemyID = planes[1]
# Reset the plane at its start state
df.reset_machine(plane_id)
df.reset_machine(enemyID)
# Set plane thrust level (0 to 1)
df.set_plane_thrust(plane_id, 1)
df.set_plane_thrust(enemyID, 1)
# Set client update mode ON: the scene update must be done by client network, calling "update_scene()"
df.set_client_update_mode(True)
# Wait until plane thrust = 1
t= 0
while t < 1:
plane_state = df.get_plane_state(plane_id)
# Display text & vector - !!! Must be called before update_scene() !!!
# !!! Display text & vector only works in Client Update Mode !!!
df.display_2DText([0.25, 0.75], "Plane speed: " + str(plane_state["linear_speed"]), 0.04, [1, 0.5, 0, 1])
df.display_vector(plane_state["position"], plane_state["move_vector"], "Linear speed: " + str(plane_state["linear_speed"]), [0, 0.02], [0, 1, 0, 1], 0.02)
# Update frame:
df.update_scene()
print_fps()
t = plane_state["thrust_level"]
#到这是 设置最大推力,没有其他操作会平飞然后落入水中
# Activate the post-combustion (increases thrust power)
df.activate_post_combustion(plane_id)
df.activate_post_combustion(enemyID)
# Set broomstick pitch level (<0 : aircraft pitch up, >0 : aircraft pitch down)
df.set_plane_pitch(plane_id, -0.5)
df.set_plane_pitch(enemyID, -0.5)
# Wait until plane pitch attitude >= 15
p = 0
while p < 15:
# Timer to 1/20 s.
# As Client update mode is ON, the renderer runs to 20 FPS until pitch >= 15°
time.sleep(1/20)
plane_state = df.get_plane_state(plane_id)
df.display_2DText([0.25, 0.75], "Plane speed: " + str(plane_state["linear_speed"]), 0.04, [1, 0.5, 0, 1])
df.display_vector(plane_state["position"], plane_state["move_vector"], "Linear speed: " + str(plane_state["linear_speed"]), [0, 0.02], [0, 1, 0, 1], 0.02)
df.update_scene()
p = plane_state["pitch_attitude"]
#设置操纵量为0
df.stabilize_plane(plane_id)
df.stabilize_plane(enemyID)
#设置起落架
df.retract_gear(plane_id)
df.retract_gear(enemyID)
s = 0
while s < 500: #让另一架飞机飞到 1000米空中
plane_state = df.get_plane_state(enemyID)
df.update_scene()
s = plane_state["altitude"]
#执行后,飞机会锁定命令,一直执行下去,所以应该根据飞机状态进行调整,而不是设置一次就完事了
df.set_plane_yaw(plane_id,1)
df.set_plane_roll(enemyID,-1)
问题2 动作控制指令的效果
飞机的控制对飞机飞机的影响 ?
发现还可以设置飞机的推力,这其实也可以作为飞机的一个控制量
-
如果只设置 df.set_plane_pitch(plane_id, -0.5)
有一个问题,这是直接设置舵偏吗,他能直接改变飞机姿态不
解答:这是个动作指令,即操纵指令,并不是直接设置状态量
经过探索发现 pitch roll yaw 三个控制变量直接调整飞机的三个角速度
问题3 飞机姿态的直观理解与含义
控制命令会导致飞机姿态的改变,但是这个环境比较简单,一般不会坠机
发现在network_client_example文件夹下给出了几个示例,设置飞机飞行,可以从这边来进行展示和观看
本文来自博客园,作者:{珇逖},转载请注明原文链接:https://www.cnblogs.com/zuti666/p/16974598.html