gym · PyPI

The OpenAI Gym: A toolkit for developing and comparing your reinforcement learning agents.

Project description

OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. This is the gym open-source library, which gives you access to an ever-growing variety of environments.

gym makes no assumptions about the structure of your agent, and is compatible with any numerical computation library, such as TensorFlow or Theano. You can use it from Python code, and soon from other languages.

If you’re not sure where to start, we recommend beginning with the docs on our site.

A whitepaper for OpenAI Gym is available at http://arxiv.org/abs/1606.01540, and here’s a BibTeX entry that you can use to cite it in a publication:

@misc{1606.01540,
        Author = {Greg Brockman and Vicki Cheung and Ludwig Pettersson and Jonas Schneider and John Schulman and Jie Tang and Wojciech Zaremba},
        Title = {OpenAI Gym},
        Year = {2016},
        Eprint = {arXiv:1606.01540},
}

Basics

There are two basic concepts in reinforcement learning: the environment (namely, the outside world) and the agent (namely, the algorithm you are writing). The agent sends actions to the environment, and the environment replies with observations and rewards (that is, a score).

The core gym interface is Env, which is the unified environment interface. There is no interface for agents; that part is left to you. The following are the Env methods you should know:

reset(self): Reset the environment’s state. Returns observation.
step(self, action): Step the environment by one timestep. Returns observation, reward, done, info.
render(self, mode='human', close=False): Render one frame of the environment. The default mode will do something human friendly, such as pop up a window. Passing the close flag signals the renderer to close any such windows.
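
The way these methods fit together can be sketched as a simple loop. The example below is only an illustration: CoinFlipEnv is a hypothetical toy class that mimics the reset and step signatures listed above. It is not part of gym and does not import it, and the "agent" is just a function that picks random actions.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment mimicking the Env method signatures.

    The agent guesses the outcome of a coin flip (0 or 1) and receives a
    reward of 1.0 for a correct guess. An episode lasts `length` steps.
    """

    def __init__(self, length=10, seed=None):
        self.length = length
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.t = 0
        return self.t

    def step(self, action):
        # Advance one timestep; return observation, reward, done, info.
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.t += 1
        done = self.t >= self.length
        return self.t, reward, done, {"coin": coin}


def run_episode(env, policy):
    """Run one episode and return (total reward, number of steps)."""
    observation = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


if __name__ == "__main__":
    agent_rng = random.Random(0)
    total, steps = run_episode(CoinFlipEnv(length=10, seed=1),
                               lambda obs: agent_rng.randint(0, 1))
    print("steps:", steps, "total reward:", total)
```

The same run_episode loop should work with any object that exposes reset and step with these return values, which is the point of having a unified environment interface.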

Installation

You can perform a minimal install of gym with:

git clone https://github.com/openai/gym.git
cd gym
pip install -e .

If you prefer, you can do a minimal install of the packaged version directly from PyPI:

pip install gym

You’ll be able to run a few environments right away:

algorithmic
toy_text
classic_control (you’ll need pyglet to render though)

We recommend playing with those environments at first, and then later installing the dependencies for the remaining environments.

Installing everything

To install the full set of environments, you’ll need to have some system packages installed. We’ll build out the list here over time; please let us know what you end up installing on your platform.

On OSX:

brew install cmake boost boost-python sdl2 swig wget

On Ubuntu 14.04:

apt-get install -y python-numpy python-dev cmake zlib1g-dev libjpeg-dev xvfb libav-tools xorg-dev python-opengl libboost-all-dev libsdl2-dev swig

MuJoCo has a proprietary dependency we can’t set up for you. Follow the instructions in the mujoco-py package for help.

Once you’re ready to install everything, run pip install -e '.[all]' (or pip install 'gym[all]').

Supported systems

We currently support Linux and OS X running Python 2.7 or 3.5. Some users on OS X with Python 3 may need to run:

brew install boost-python --with-python3

If you want to access Gym from languages other than Python, we have limited support for non-Python frameworks, such as Lua/Torch, using the OpenAI Gym HTTP API.

Pip version

To run pip install -e '.[all]', you’ll need a semi-recent pip. Please make sure your pip is at least at version 1.5.0. You can upgrade using the following: pip install --ignore-installed pip. Alternatively, you can open setup.py and install the dependencies by hand.
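
If you want to check the pip version from Python rather than by eye, something like the following could work. This is a rough sketch, not part of gym: parse_version and is_at_least are hypothetical helpers, and the parsing only reads the leading digits of each dotted component, so it handles unusual version strings only approximately.

```python
from itertools import takewhile


def parse_version(text):
    """Turn a dotted version string such as '1.5.0' into a tuple of ints.

    Only the leading digits of each component are used, so '0rc1' counts as 0.
    """
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def is_at_least(installed, minimum="1.5.0"):
    """Return True if `installed` is the same as or newer than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '1.5' compares equal to '1.5.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    try:
        import pip
        print("pip", pip.__version__, "is recent enough:",
              is_at_least(pip.__version__))
    except ImportError:
        print("pip is not importable from this interpreter")
```

Comparing tuples of integers rather than raw strings matters here: as strings, '10.0' would sort before '1.5.0'.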

Rendering on a server

If you’re trying to render video on a server, you’ll need to connect a fake display. The easiest way to do this is by running under xvfb-run (on Ubuntu, install the xvfb package):

xvfb-run -s "-screen 0 1400x900x24" bash

Installing dependencies for specific environments

If you’d like to install the dependencies for only specific environments, see setup.py. We maintain the lists of dependencies on a per-environment group basis.

Environments

The code for each environment group is housed in its own subdirectory of gym/envs. The specification of each task is in gym/envs/__init__.py. It’s worth browsing through both.
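
To give a feel for what a central task specification buys you, here is a small sketch of an ID-to-constructor registry. It is an illustration only: the register and make functions below are hypothetical stand-ins written for this example, GridWalk is an invented class, and none of this is gym's actual implementation.

```python
# Hypothetical registry: maps an environment ID to a constructor plus its
# keyword arguments, so that one class can back several task variants.
_REGISTRY = {}


def register(env_id, constructor, **kwargs):
    """Record how to build the environment known as `env_id`."""
    if env_id in _REGISTRY:
        raise ValueError("Cannot re-register id: {}".format(env_id))
    _REGISTRY[env_id] = (constructor, kwargs)


def make(env_id):
    """Build a fresh environment instance from its registered specification."""
    if env_id not in _REGISTRY:
        raise KeyError("No registered env with id: {}".format(env_id))
    constructor, kwargs = _REGISTRY[env_id]
    return constructor(**kwargs)


class GridWalk:
    """Invented placeholder environment; only stores its grid size."""

    def __init__(self, size=4):
        self.size = size


# Two task variants that share a single implementation.
register("GridWalk-v0", GridWalk, size=4)
register("GridWalk8x8-v0", GridWalk, size=8)

if __name__ == "__main__":
    env = make("GridWalk8x8-v0")
    print(type(env).__name__, env.size)
```

Keeping the specifications in one place means user code only needs the ID string, as in the gym.make calls shown in the sections below.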

Algorithmic

These are a variety of algorithmic tasks, such as learning to copy a sequence.

import gym
env = gym.make('Copy-v0')
env.reset()
env.render()

Atari

The Atari environments are a variety of Atari video games. If you didn’t do the full install, you can install dependencies via pip install -e '.[atari]' (you’ll need cmake installed) and then get started as follows:

import gym
env = gym.make('SpaceInvaders-v0')
env.reset()
env.render()

The pip install step will install atari-py, which automatically compiles the Arcade Learning Environment. This can take quite a while (a few minutes on a decent laptop), so just be prepared.

Board games

The board game environments are a variety of board games. If you didn’t do the full install, you can install dependencies via pip install -e '.[board_game]' (you’ll need cmake installed) and then get started as follows:

import gym
env = gym.make('Go9x9-v0')
env.reset()
env.render()

Box2d

Box2d is a 2D physics engine. You can install it via pip install -e '.[box2d]' and then get started as follows:

import gym
env = gym.make('LunarLander-v2')
env.reset()
env.render()

Classic control

These are a variety of classic control tasks, which would appear in a typical reinforcement learning textbook. If you didn’t do the full install, you will need to run pip install -e '.[classic_control]' to enable rendering. You can get started with them via:

import gym
env = gym.make('CartPole-v0')
env.reset()
env.render()

MuJoCo

MuJoCo is a physics engine which can do very detailed, efficient simulations with contacts. It’s not open-source, so you’ll have to follow the instructions in mujoco-py to set it up. You’ll also have to run pip install -e '.[mujoco]' if you didn’t do the full install.

import gym
env = gym.make('Humanoid-v0')
env.reset()
env.render()

Toy text

These are toy environments which are text-based. There’s no extra dependency to install, so to get started, you can just do:

import gym
env = gym.make('FrozenLake-v0')
env.reset()
env.render()

Examples

See the examples directory.

Run examples/agents/random_agent.py to run a simple random agent and upload the results to the scoreboard.
Run examples/agents/cem.py to run an actual learning agent (using the cross-entropy method) and upload the results to the scoreboard.
Run examples/scripts/list_envs to generate a list of all environments. (You can also just browse the list on our site.)
Run examples/scripts/upload to upload the recorded output from random_agent.py or cem.py. Make sure to obtain an API key.

Testing

We are using nose2 for tests. You can run them via:

nose2

You can also run tests in a specific directory by using the -s option, or by passing in the specific name of the test. See the nose2 docs for more details.

What’s new

2016-12-27: BACKWARDS INCOMPATIBILITY: The gym monitor is now a wrapper. Rather than starting monitoring as env.monitor.start(directory), envs are now wrapped as follows: env = wrappers.Monitor(env, directory). This change is on master and will be released with 0.7.0.
2016-11-01: Several experimental changes to how a running monitor interacts with environments. The monitor will now raise an error if reset() is called when the env has not returned done=True. The monitor will only record complete episodes where done=True. Finally, the monitor no longer calls seed() on the underlying env, nor does it record or upload seed information.
2016-10-31: We’re experimentally expanding the environment ID format to include an optional username.
2016-09-21: Switch the Gym automated logger setup to configure the root logger rather than just the ‘gym’ logger.
2016-08-17: Calling close on an env will also close the monitor and any rendering windows.
2016-08-17: The monitor will no longer write manifest files in real-time, unless write_upon_reset=True is passed.
2016-05-28: For controlled reproducibility, envs now support seeding (cf #91 and #135). The monitor records which seeds are used. We will soon add seed information to the display on the scoreboard.
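
The wrapper style introduced in the 2016-12-27 entry, together with the reset() rule from the 2016-11-01 entry, can be illustrated with a small sketch. Everything here is hypothetical: CountdownEnv and EpisodeRecorder are invented for this example and are not gym's Monitor. They only show the general idea of wrapping an object that has reset and step, recording complete episodes, and refusing a premature reset.

```python
class CountdownEnv:
    """Hypothetical environment: the episode ends after `length` steps."""

    def __init__(self, length=3):
        self.length = length
        self.remaining = length

    def reset(self):
        self.remaining = self.length
        return self.remaining

    def step(self, action):
        self.remaining -= 1
        done = self.remaining <= 0
        return self.remaining, 1.0, done, {}


class EpisodeRecorder:
    """Hypothetical wrapper exposing the same reset/step interface.

    It records the total reward of each complete episode and raises an
    error if reset() is called while an episode is still in progress.
    """

    def __init__(self, env):
        self.env = env
        self.episode_returns = []
        self._current = None  # None means no episode is in progress

    def reset(self):
        if self._current is not None:
            raise RuntimeError(
                "reset() called before the episode returned done=True")
        self._current = 0.0
        return self.env.reset()

    def step(self, action):
        if self._current is None:
            raise RuntimeError("step() called before reset()")
        observation, reward, done, info = self.env.step(action)
        self._current += reward
        if done:
            # Only complete episodes are recorded.
            self.episode_returns.append(self._current)
            self._current = None
        return observation, reward, done, info


if __name__ == "__main__":
    env = EpisodeRecorder(CountdownEnv(length=3))
    env.reset()
    done = False
    while not done:
        _, _, done, _ = env.step(0)
    print("recorded returns:", env.episode_returns)
```

Because the wrapper has the same interface as the object it wraps, code written against reset and step does not need to know whether recording is switched on.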

Project details

Release history

0.26.2

Oct 4, 2022

0.26.1

Sep 19, 2022

0.26.0

Sep 7, 2022

0.25.2

Aug 19, 2022

0.25.1

Jul 28, 2022

0.25.0

Jul 14, 2022

0.24.1

Jun 7, 2022

0.24.0

May 27, 2022

0.23.1

Mar 14, 2022

0.23.0

Mar 7, 2022

0.22.0

Feb 18, 2022

0.21.0

Oct 6, 2021

0.20.0

Sep 15, 2021

0.19.0

Aug 16, 2021

0.18.3

May 18, 2021

0.18.0

Dec 19, 2020

0.17.3

Sep 30, 2020

0.17.2

May 8, 2020

0.17.1

Mar 5, 2020

0.17.0

Feb 29, 2020

0.16.0

Feb 10, 2020

0.15.7

Feb 14, 2020

0.15.6

Feb 3, 2020

0.15.4

Nov 8, 2019

0.15.3

Oct 9, 2019

0.14.0

Jul 26, 2019

0.13.1

Jul 8, 2019

0.13.0

Jun 22, 2019

0.12.6

Jun 21, 2019

0.12.5

May 29, 2019

0.12.4

May 25, 2019

0.12.1

Mar 25, 2019

0.12.0

Feb 27, 2019

0.11.0

Feb 6, 2019

0.10.11

Jan 30, 2019

0.10.9

Nov 6, 2018

0.10.8

Oct 2, 2018

0.10.5

Apr 5, 2018

0.10.4

Mar 19, 2018

0.10.3

Feb 27, 2018

0.10.2

Feb 26, 2018

0.10.1

Feb 26, 2018

0.10.0

Feb 26, 2018

0.9.7

Feb 9, 2018

0.9.6

Jan 29, 2018

0.9.5

Jan 24, 2018

0.9.4

Oct 11, 2017

0.9.3

Sep 5, 2017

0.9.2

Jun 22, 2017

0.9.1

May 14, 2017

0.9.0

May 14, 2017

0.8.2

May 8, 2017

0.8.1

Mar 22, 2017

0.8.0

Mar 6, 2017

0.8.0.dev0 pre-release

Mar 6, 2017

0.7.4

Mar 5, 2017

This version

0.7.3

Feb 1, 2017

0.7.2

Jan 13, 2017

0.7.1

Jan 8, 2017

0.7.0

Dec 28, 2016

0.6.0

Dec 24, 2016

0.5.7

Dec 17, 2016

0.5.6

Nov 24, 2016

0.5.5

Nov 24, 2016

0.5.4

Nov 14, 2016

0.5.3

Nov 12, 2016

0.5.2

Nov 7, 2016

0.5.1

Nov 2, 2016

0.5.0

Nov 1, 2016

0.4.10

Oct 31, 2016

0.4.9

Oct 24, 2016

0.4.8

Oct 23, 2016

0.4.6

Oct 20, 2016

0.4.5

Oct 18, 2016

0.4.4

Oct 17, 2016

0.4.3

Oct 15, 2016

0.4.2

Oct 2, 2016

0.4.1

Oct 2, 2016

0.4.0

Sep 30, 2016

0.3.0

Sep 21, 2016

0.2.12

Sep 21, 2016

0.2.11

Sep 5, 2016

0.2.10

Sep 5, 2016

0.2.9

Sep 4, 2016

0.2.8

Sep 4, 2016

0.2.7

Sep 3, 2016

0.2.6

Aug 25, 2016

0.2.5

Aug 25, 2016

0.2.4

Aug 24, 2016

0.2.3

Aug 19, 2016

0.2.2

Aug 17, 2016

0.2.1

Aug 17, 2016

0.2.0

Aug 17, 2016

0.1.7

Aug 14, 2016

0.1.6

Aug 11, 2016

0.1.5

Aug 10, 2016

0.1.4

Jul 7, 2016

0.1.3

Jun 2, 2016

0.1.2

May 31, 2016

0.1.1

May 16, 2016

0.1.0

May 2, 2016

0.0.7

May 1, 2016

0.0.6

Apr 29, 2016

0.0.5

Apr 29, 2016

0.0.4

Apr 27, 2016

0.0.3

Apr 27, 2016

0.0.2

Apr 27, 2016

0.0.1

Apr 23, 2016

Download files

Source Distribution

gym-0.7.3.tar.gz (150.1 kB)

Uploaded Feb 1, 2017 Source

Hashes for gym-0.7.3.tar.gz

Algorithm	Hash digest
SHA256	`be0435f2ac58fd124c107be200174bc1c2aaf49af19207199f107c2afa6bae05`
MD5	`78ff252f15ebd742551095fcbc4f6f5d`
BLAKE2b-256	`8ddfbd2b2d012a2620ec4a8315eb907d6a927793cafcb6a3b213636a77bdaf77`