kimbring2/minecraft_ai

Open-source project name (OpenSource Name):

kimbring2/minecraft_ai

Open-source project URL (OpenSource Url):

https://github.com/kimbring2/minecraft_ai

Open-source language (OpenSource Language):

Python 70.4%

Open-source introduction (OpenSource Introduction):

Introduction

Code for playing Minecraft using Deep Learning.

Normal Dependencies

  1. Ubuntu
  2. Tmux

Python Dependencies

  1. MineRL 0.3.7
  2. TensorFlow 2.4.1
  3. TensorFlow Probability 0.11.0
  4. ZeroMQ
  5. Gym
  6. OpenCV
  7. Matplotlib
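
The dependencies above can be installed with pip; the package names and version pins below are assumptions based on how these libraries are published on PyPI, so adjust them to your environment.

$ pip install minerl==0.3.7 tensorflow==2.4.1 tensorflow-probability==0.11.0 pyzmq gym opencv-python matplotlib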

Reference

  1. Hierarchical Deep Q-Network from Imperfect Demonstrations in Minecraft, A. Skrynnik, 2019
  2. Sample Efficient Reinforcement Learning through Learning From Demonstrations in Minecraft, C. Scheller, 2020

Action, Observation of Minecraft

Model Architecture

Learning-Based Model Architecture

Rule-Based Model Architecture

Loss for Training

Training Method

Run Supervised Learning

For Minecraft, the agent cannot learn every behaviour needed for high-level play using Reinforcement Learning alone because of the complexity of the task. In such cases, the agent must first learn from human expert data. Try training the network for MineRLTreechop-v0 first using the command below.

$ python run_supervised_learning.py --workspace_path [your path]/minecraft_ai/ --data_path [your path]/minerl_data/ --gpu_use True
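
As a rough sketch of what this supervised stage builds on (assuming the standard MineRL 0.3.x data API; the actual internals of run_supervised_learning.py may differ), the demonstration data can be streamed like this:

import minerl

# Assumes the MineRLTreechop-v0 demonstrations were already downloaded into
# the directory passed via --data_path.
data = minerl.data.make('MineRLTreechop-v0', data_dir='[your path]/minerl_data/')

# Each batch is (state, action, reward, next_state, done); behaviour cloning
# trains the policy network to reproduce the recorded actions from the
# 'pov' image frames.
for state, action, reward, next_state, done in data.batch_iter(batch_size=32, seq_len=1, num_epochs=1):
    pov_frames = state['pov']        # image observations
    demo_camera = action['camera']   # demonstrated camera movement
    # ... feed pov_frames into the network and minimize the imitation loss ...
    break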

The loss should fall to near 0, as shown in the graph below. The model is saved under a folder named model in the workspace path.

You can download the weights of the trained SL model from Google Drive. Use the 'tree_supervised_model_15800' file.

After training finishes, you can test the trained model using the command below.

$ python run_evaluation.py --workspace_path [your path]/minecraft_ai/ --model_name [trained model name] --gpu_use True
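
For reference, the evaluation loop is conceptually a standard gym rollout (a minimal sketch, assuming the gym-style MineRL interface; run_evaluation.py additionally restores the trained TensorFlow weights and replaces the no-op action below with the network's output):

import gym
import minerl  # registers the MineRL environments with gym

env = gym.make('MineRLTreechop-v0')
obs = env.reset()
done = False
total_reward = 0.0

while not done:
    # Placeholder action; the real script queries the trained policy here.
    action = env.action_space.noop()
    obs, reward, done, info = env.step(action)
    total_reward += reward

print('episode reward:', total_reward)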

Run Reinforcement Learning

Because of the long gameplay time, the standard A2C method cannot be used, since it would have to consume a whole episode at once. Therefore, an off-policy actor-critic such as IMPALA is needed: like DQN, it can restore trajectory data from a buffer for training.
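
To illustrate the off-policy idea (a simplified sketch, not the actual learner.py implementation), the learner can keep finished actor trajectories in a buffer and sample them later, correcting for policy lag with V-trace as in IMPALA:

import collections
import random

# One fixed-length slice of experience from an actor, stored together with the
# behaviour policy's logits so the learner can apply importance corrections.
Trajectory = collections.namedtuple(
    'Trajectory', ['observations', 'actions', 'rewards', 'behaviour_logits'])

class TrajectoryBuffer:
    def __init__(self, capacity=512):
        self.storage = collections.deque(maxlen=capacity)

    def add(self, trajectory):
        self.storage.append(trajectory)

    def sample(self, batch_size):
        # Sampled trajectories may come from older policy versions,
        # which is what makes the method off-policy.
        return random.sample(self.storage, batch_size)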

You can run IMPALA initialized with the supervised model for MineRL using the command below.

$ ./run_reinforcement_learning.sh [number of envs] [gpu use] [pretrained model]
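
For example, with four parallel environments, GPU enabled, and the supervised weights from the previous section (the argument values here are only illustrative):

$ ./run_reinforcement_learning.sh 4 True tree_supervised_model_15800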

You can ignore the error below from the learner.py part. It does not affect the training process.

Traceback (most recent call last):
  File "C:/minerl/learner.py", line 392, in
    coord.join(thread_data)
  File "C:\Users\sund0\anaconda3\envs\minerl_env\lib\site-packages\tensorflow\python\training\coordinator.py", line 357, in join
    threads = self._registered_threads.union(set(threads))

where lines 391 and 392 are:

  for thread_data in thread_data_list:
      coord.join(thread_data)

After some training, the agent starts to collect trees and earn rewards, as shown in the graph below.

You can download the weights of the trained RL model from Google Drive. Use the 'tree_reinforcement_model_128000' file.

The video below shows the evaluation result of the trained agent.

Demo MineRL TreeChop

Detailed information

  1. Prepare Model: https://medium.com/@dohyeongkim/deep-q-learning-from-demonstrations-dqfd-for-minecraft-tutorial-1-4b462a18de5a
  2. Training Model: https://dohyeongkim.medium.com/how-to-build-the-deep-learning-agent-for-minecraft-with-code-tutorial-2-e5ddbf80eca1


