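NetHack Learning Environment (NLE)

NLE has moved: please find it at its new home, github.com/heiner/nle.

The NetHack Learning Environment (NLE) is a reinforcement learning environment presented at NeurIPS 2020.

NetHack is one of the oldest and arguably most influential video games in history, and one of the hardest roguelikes still played by humans. It is procedurally generated and rich in entities and dynamics, which makes it an extremely challenging environment for current state-of-the-art RL agents, while being much cheaper to run than other challenging testbeds. Through NLE, we hope to establish NetHack as one of the next challenges for research in decision making and machine learning.

You can read more about NLE in the NeurIPS 2020 paper, and about NetHack in its original README, at nethack.org, and on the NetHack wiki.

Figure: example of an agent running on NLE.

NLE Language Wrapper

We thank ngoodger for implementing the NLE Language Wrapper, which translates the non-language observations from NetHack tasks into similar language representations. Actions can optionally be provided in text form as well; these are converted to the discrete actions of NLE.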
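The text-to-action direction can be pictured as a lookup from command strings to the integer indices that `env.step` expects. The sketch below only illustrates that idea and is not the wrapper's actual API: the class `TextActionMapper` and its vocabulary are hypothetical, and the only index taken from this README is `1` for moving north in `NetHackScore-v0`.

```python
class TextActionMapper:
    """Hypothetical helper: maps free-text commands to discrete action indices."""

    def __init__(self, vocabulary):
        # Normalise keys once so lookups ignore case and extra whitespace.
        self._vocabulary = {self._normalise(k): v for k, v in vocabulary.items()}

    @staticmethod
    def _normalise(text):
        return " ".join(text.lower().split())

    def to_discrete(self, text):
        """Return the action index for `text`, or raise ValueError if unknown."""
        key = self._normalise(text)
        if key not in self._vocabulary:
            raise ValueError(f"unknown text action: {text!r}")
        return self._vocabulary[key]


# Only "north" -> 1 comes from the example in this README; any further
# entries would have to be checked against the environment's action list.
mapper = TextActionMapper({"north": 1})
action = mapper.to_discrete("  North ")
```

The resulting `action` could then be passed to `env.step`. Rejecting unknown strings with an exception, rather than silently mapping them to a default, keeps a typo in a text command from turning into an unintended move.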

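NetHack Learning Dataset

The NetHack Learning Dataset (NLD) code now ships with NLE. It lets users load the large-scale datasets featured in "Dungeons and Data: A Large-Scale NetHack Dataset", and also generate and load their own datasets.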

import nle.dataset as nld

if not nld.db.exists():
    nld.db.create()
    # NB: Different methods are used for data based on NLE and data from NAO.
    nld.add_nledata_directory("/path/to/nld-aa", "nld-aa-v0")
    nld.add_altorg_directory("/path/to/nld-nao", "nld-nao-v0")

dataset = nld.TtyrecDataset("nld-aa-v0", batch_size=128, ...)
for i, mb in enumerate(dataset):
    foo(mb)  # etc...

For information on how to download NLD-AA and NLD-NAO, see the dataset doc here.

Otherwise, check out the tutorial Colab notebook here.
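Before writing a training loop it can help to inspect what the first few minibatches contain. The helper below is a sketch that assumes each minibatch is a mapping from field names to array-like objects exposing a `.shape` attribute; that structure, and the names `describe_minibatch` and `peek`, are assumptions made for illustration and are not confirmed by the text above.

```python
from itertools import islice


def describe_minibatch(mb):
    """Return {field name: shape} for a mapping of array-like values.

    Assumption: `mb` behaves like a dict whose values expose `.shape`;
    values without that attribute are reported as None.
    """
    return {key: getattr(value, "shape", None) for key, value in mb.items()}


def peek(dataset, n=2):
    """Describe the first `n` minibatches of any iterable dataset."""
    return [describe_minibatch(mb) for mb in islice(dataset, n)]
```

With a `dataset` built as in the snippet above, `peek(dataset)` would list the fields and shapes of the first two minibatches without iterating over the whole dataset.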

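Papers using the NetHack Learning Environment

Izumiya and Simo-Serra. Inventory Management with Attention-Based Meta Actions (Waseda University, CoG 2021).
Samvelyan et al. MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research (FAIR, UCL, Oxford, NeurIPS 2021).
Zhang et al. BeBold: Exploration Beyond the Boundary of Explored Regions (Berkeley, FAIR, Dec 2020).
Küttler et al. The NetHack Learning Environment (FAIR, Oxford, NYU, Imperial, UCL, NeurIPS 2020).

Open a pull request to add papers.

Getting started

Starting with NLE environments is straightforward, provided you are familiar with other gym / RL environments.

Installation

NLE requires python>=3.5 and cmake>=3.15 to be installed and available both when building the package and at runtime.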
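The two version requirements above can be checked before attempting a build. The script below is a minimal sketch using only the standard library; the function names are illustrative, and it assumes that `cmake --version` prints a line of the form `cmake version X.Y.Z`.

```python
import re
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 5)
MIN_CMAKE = (3, 15)


def parse_cmake_version(output):
    """Extract (major, minor, patch) from `cmake --version` output, or None."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    return tuple(int(part or 0) for part in match.groups())


def installed_cmake_version():
    """Return the version of the cmake found on PATH, or None if absent."""
    if shutil.which("cmake") is None:
        return None
    result = subprocess.run(
        ["cmake", "--version"], stdout=subprocess.PIPE, universal_newlines=True
    )
    return parse_cmake_version(result.stdout)


def requirements_met(python_version, cmake_version):
    """True when both versions satisfy the minimums stated above."""
    python_ok = tuple(python_version[:2]) >= MIN_PYTHON
    cmake_ok = cmake_version is not None and tuple(cmake_version[:2]) >= MIN_CMAKE
    return python_ok and cmake_ok


if __name__ == "__main__":
    print(requirements_met(sys.version_info, installed_cmake_version()))
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings, where `"3.9"` would sort after `"3.15"`.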

On macOS, you can install cmake with Homebrew:

$ brew install cmake

On a plain Ubuntu 18.04 distribution, cmake and the other dependencies can be installed as follows:

# Python and most build deps
$ sudo apt-get install -y build-essential autoconf libtool pkg-config \
    python3-dev python3-pip python3-numpy git flex bison libbz2-dev

# recent cmake version
$ wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | sudo apt-key add -
$ sudo apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main'
$ sudo apt-get update && sudo apt-get --allow-unauthenticated install -y \
    cmake \
    kitware-archive-keyring

After that, it is a matter of setting up your environment. We recommend using a conda environment for this:

$ conda create -y -n nle python=3.8
$ conda activate nle
$ pip install nle

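NOTE: If you want to extend or develop NLE, install the package as follows:

$ git clone https://github.com/facebookresearch/nle --recursive
$ cd nle
$ pip install -e ".[dev]"
$ pre-commit install

Docker

We provide some docker images. Please see the relevant README.

Trying it out

After installation, you can try out any of the provided tasks as follows: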

>>> import gym
>>> import nle
>>> env = gym.make("NetHackScore-v0")
>>> env.reset()  # each reset generates a new dungeon
>>> env.step(1)  # move agent '@' north
>>> env.render()
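The interactive session above extends naturally to a full episode loop. The sketch below assumes the classic gym API, in which `reset()` returns an observation and `step()` returns a 4-tuple `(observation, reward, done, info)`; newer gym or gymnasium releases use a different signature. The helper names `run_episode` and `random_rollout` are illustrative and not part of NLE.

```python
def run_episode(env, policy, max_steps=1000):
    """Roll out one episode and return (total reward, steps taken).

    Assumes the classic gym API:
    reset() -> observation, step(a) -> (observation, reward, done, info).
    """
    observation = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        observation, reward, done, _info = env.step(policy(observation))
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


def random_rollout(env_name="NetHackScore-v0"):
    """Run one random-agent episode; requires gym and nle to be installed."""
    import gym
    import nle  # noqa: F401  (imported before gym.make, as in the session above)

    env = gym.make(env_name)
    # Random policy: ignore the observation and sample a valid action index.
    return run_episode(env, lambda _obs: env.action_space.sample())
```

Passing the policy as a callable keeps the loop independent of how actions are chosen, so the same function works for a random agent, a scripted agent, or a trained model.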

NLE also comes with a few scripts for generating environment rollouts and experimenting with the action space:

# Play NetHackStaircase-v0 as a human
$ python -m nle.scripts.play

# Use a random agent
$ python -m nle.scripts.play --mode random

# Play the full game using directly the NetHack internal interface
# (Useful for debugging outside of the gym environment)
$ python -m nle.scripts.play --env NetHackScore-v0 # works with random agent too

# See all the options
$ python -m nle.scripts.play --help

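Note that nle.scripts.play can also be run as nle-play if the package has been installed properly.

Additionally, a TorchBeast agent is bundled in nle.agent together with a simple model, to provide a starting point for experiments:

$ pip install "nle[agent]"
$ python -m nle.agent.agent --num_actors 80 --batch_size 32 --unroll_length 80 --learning_rate 0.0001 --entropy_cost 0.0001 --use_lstm --total_steps 1000000000

Plot the mean return over the last 100 episodes: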

$ python -m nle.scripts.plot

                              averaged episode return

  140 +---------------------------------------------------------------------+
      |             +             +            ++-+ ++++++++++++++++++++++++|
      |             :             :          ++++++++||||||||||||||||||||||||
  120 |-+...........:.............:...+-+.++++|||||||||||||||||||||||||||||||
      |             :        +++++++++++++++||||||||||AAAAAAAAAAAAAAAAAAAAAA|
      |            +++++++++++++||||||||||||||AAAAAAAAAAAA|||||||||||||||||||
  100 |-+......+++++|+|||||||||||||||||||||||AA||||||||||||||||||||||||||||||
      |       +++|||||||||||||||AAAAAAAAAAAAAA|||||||||||+++++++++++++++++++|
      |    ++++|||||AAAAAAAAAAAAAA||||||||||||++++++++++++++-+:             |
   80 |-++++|||||AAAAAA|||||||||||||||||||||+++++-+...........:...........+-|
      | ++|||||AAA|||||||||||||||++++++++++++-+ :             :             |
   60 |++||AAAAA|||||+++++++++++++-+............:.............:...........+-|
      |++|AA||||++++++-|-+        :             :             :             |
      |+|AA|||+++-+ :             :             :             :             |
   40 |+|A+++++-+...:.............:.............:.............:...........+-|
      |+AA+-+       :             :             :             :             |
      |AA-+         :             :             :             :             |
   20 |AA-+.........:.............:.............:.............:...........+-|
      |++-+         :             :             :             :             |
      |+-+          :             :             :             :             |
    0 |-+...........:.............:.............:.............:...........+-|
      |+            :             :             :             :             |
      |+            +             +             +             +             |
  -20 +---------------------------------------------------------------------+
      0           2e+08         4e+08         6e+08         8e+08         1e+09
                                       steps
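The quantity plotted above, a mean return over the most recent 100 episodes, can be computed with a sliding window. The function below is a sketch of one way to do that and is not taken from `nle.scripts.plot`, whose actual smoothing method is not shown here; the name `rolling_mean` is illustrative.

```python
from collections import deque


def rolling_mean(returns, window=100):
    """Mean of the last `window` episode returns, one value per episode.

    Early values average over fewer than `window` episodes.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running_sum = 0.0
    means = []
    for value in returns:
        if len(recent) == window:
            running_sum -= recent[0]  # oldest value is about to be evicted
        recent.append(value)
        running_sum += value
        means.append(running_sum / len(recent))
    return means
```

Keeping a running sum makes each update constant time, which matters when a training run logs a very large number of episodes.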

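Contributing

We welcome contributions to NLE. If you are interested in contributing, please see this document.

Architecture

NLE is a direct fork of NetHack and therefore contains code that operates on many different levels of abstraction. This ranges from low-level game logic, to the higher-level administration of repeated NetHack games, and finally to the binding of these games to a Python gym environment.

If you want to learn more about the architecture of nle and how it works under the hood, check out the architecture document. It may be a useful starting point for anyone looking to contribute to the lower-level elements of NLE.

Interview about the environment with Weights & Biases

Facebook AI Research's Tim and Heiner on democratizing reinforcement learning research.

Citation

If you use NLE in any of your work, please cite: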

@inproceedings{kuettler2020nethack,
  author    = {Heinrich K{\"{u}}ttler and
               Nantas Nardelli and
               Alexander H. Miller and
               Roberta Raileanu and
               Marco Selvatici and
               Edward Grefenstette and
               Tim Rockt{\"{a}}schel},
  title     = {{The NetHack Learning Environment}},
  booktitle = {Proceedings of the Conference on Neural Information Processing Systems (NeurIPS)},
  year      = {2020},
}

If you use NLD or the datasets in any of your work, please cite:

@article{hambro2022dungeons,
  title={Dungeons and Data: A Large-Scale NetHack Dataset},
  author={Hambro, Eric and Raileanu, Roberta and Rothermel, Danielle and Mella, Vegard and Rockt{\"a}schel, Tim and K{\"u}ttler, Heinrich and Murray, Naila},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={24864--24878},
  year={2022}
}
