General environment interface

Initialization and termination

static Environment.create(environment=None, max_episode_timesteps=None, remote=None, blocking=False, host=None, port=None, **kwargs)

Creates an environment from a specification. In “socket-server” remote mode, runs the environment in a server communication loop until closed.

Parameters:
  • environment (specification | Environment class/object) – JSON file, specification key, configuration dictionary, library module, Environment class/object, or gym.Env (required, invalid for “socket-client” remote mode).
  • max_episode_timesteps (int > 0) – Maximum number of timesteps per episode, overwrites the environment default if defined (default: environment default, invalid for “socket-client” remote mode).
  • remote ("multiprocessing" | "socket-client" | "socket-server") – Communication mode for remote or parallelized environment execution. “socket-client” mode requires a corresponding “socket-server” to be running, and “socket-server” mode runs the environment in a server communication loop until closed (default: local execution).
  • blocking (bool) – Whether remote environment calls should be blocking (default: not blocking, valid only for “multiprocessing” or “socket-client” remote mode).
  • host (str) – Socket server hostname or IP address (required only for “socket-client” remote mode).
  • port (int) – Socket server port (required only for “socket-client” or “socket-server” remote mode).
  • kwargs – Additional arguments.
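
The remote-mode constraints in the parameter list above can be summarized as a small check. This is a minimal sketch only: `check_create_args` is a hypothetical helper written for illustration, not part of the library, and it encodes just the rules stated in this section.

```python
def check_create_args(environment=None, max_episode_timesteps=None,
                      remote=None, blocking=False, host=None, port=None):
    """Hypothetical helper: return a list of rule violations (empty if none)."""
    problems = []
    if remote not in (None, "multiprocessing", "socket-client", "socket-server"):
        return ["unknown remote mode"]

    if remote == "socket-client":
        # The client only connects to a running server, so it must not
        # describe the environment itself.
        if environment is not None:
            problems.append("environment is invalid for socket-client")
        if max_episode_timesteps is not None:
            problems.append("max_episode_timesteps is invalid for socket-client")
        if host is None:
            problems.append("host is required for socket-client")
    elif environment is None:
        problems.append("environment is required")

    if remote in ("socket-client", "socket-server") and port is None:
        problems.append("port is required for socket modes")

    if blocking and remote not in ("multiprocessing", "socket-client"):
        problems.append("blocking is only valid for multiprocessing or socket-client")

    if max_episode_timesteps is not None and max_episode_timesteps <= 0:
        problems.append("max_episode_timesteps must be > 0")
    return problems
```

For example, `check_create_args(remote="socket-client", host="localhost", port=65432)` reports no problems, while omitting `host` in the same call reports the missing hostname. The host and port values here are placeholders.
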
Environment.close()

Closes the environment.

Properties

Environment.states()

Returns the state space specification.

Returns: Arbitrarily nested dictionary of state descriptions with the following attributes:
  • type ("bool" | "int" | "float") – state data type (default: "float").
  • shape (int | iter[int]) – state shape (required).
  • num_states (int > 0) – number of discrete state values (required for type "int").
  • min_value/max_value (float) – minimum/maximum state value (optional for type "float").
Return type: specification
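
As an illustration of the nested layout described above, the sketch below writes a state specification as a plain Python dictionary and walks it to list the leaf descriptions. The state names (`position`, `sensors`, `camera`, `lidar`) and the helper `iter_state_specs` are invented for this example. Treating any dictionary that has a `shape` key as a leaf description is an assumption of this sketch, not documented behavior.

```python
def iter_state_specs(spec, prefix=""):
    """Yield (name, description) pairs for every leaf of a nested state spec.

    Assumption of this sketch: a dictionary with a "shape" key (a required
    attribute) is a leaf description; anything else is a nested level.
    """
    if "shape" in spec:
        leaf = dict(spec)
        leaf.setdefault("type", "float")  # documented default data type
        if leaf["type"] == "int" and "num_states" not in leaf:
            name = prefix or "<root>"
            raise ValueError(f"{name}: num_states is required for type 'int'")
        yield prefix, leaf
    else:
        for name, sub_spec in spec.items():
            path = f"{prefix}/{name}" if prefix else name
            yield from iter_state_specs(sub_spec, path)


# Hypothetical state space: one discrete state and two nested float states.
states = {
    "position": {"type": "int", "shape": (), "num_states": 10},
    "sensors": {
        "camera": {"shape": (4, 4), "min_value": 0.0, "max_value": 1.0},
        "lidar": {"type": "float", "shape": 8},
    },
}

flat = dict(iter_state_specs(states))
```

Here `flat` maps the paths `position`, `sensors/camera` and `sensors/lidar` to their descriptions, with the `camera` entry picking up the default type `"float"`.
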
Environment.actions()

Returns the action space specification.

Returns: Arbitrarily nested dictionary of action descriptions with the following attributes:
  • type ("bool" | "int" | "float") – action data type (required).
  • shape (int > 0 | iter[int > 0]) – action shape (default: scalar).
  • num_actions (int > 0) – number of discrete action values (required for type "int").
  • min_value/max_value (float) – minimum/maximum action value (optional for type "float").
Return type: specification
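
To show how the action attributes above fit together, the following sketch draws a random value for each entry of a flat action specification. The helper `sample_action` and the action names are hypothetical. The sketch handles scalar actions only and makes two assumptions: "int" actions are numbered from 0 to num_actions - 1, and unbounded "float" actions fall back to the range [-1, 1].

```python
import random


def sample_action(spec, rng=random):
    """Draw one random scalar value matching a single action description."""
    kind = spec["type"]  # "type" is required for actions
    if kind == "bool":
        return rng.random() < 0.5
    if kind == "int":
        # Assumption: discrete actions are the integers 0 .. num_actions - 1.
        return rng.randrange(spec["num_actions"])
    if kind == "float":
        # min_value/max_value are optional; the fallback bounds are arbitrary.
        low = spec.get("min_value", -1.0)
        high = spec.get("max_value", 1.0)
        return rng.uniform(low, high)
    raise ValueError(f"unsupported action type: {kind!r}")


# Hypothetical action space with one discrete and one continuous action.
actions = {
    "move": {"type": "int", "num_actions": 3},
    "throttle": {"type": "float", "min_value": 0.0, "max_value": 1.0},
}

sampled = {name: sample_action(spec) for name, spec in actions.items()}
```
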
Environment.max_episode_timesteps()

Returns the maximum number of timesteps per episode.

Returns: Maximum number of timesteps per episode.
Return type: int

Interaction functions

Environment.reset()

Resets the environment to start a new episode.

Returns: Dictionary containing initial state(s) and auxiliary information.
Return type: dict[state]
Environment.execute(actions)

Executes the given action(s) and advances the environment by one step.

Parameters: actions (dict[action]) – Dictionary containing action(s) to be executed (required).
Returns: Dictionary containing next state(s), terminal indicator (whether a terminal state is reached, or 2 if the episode was aborted), and observed reward.
Return type: dict[state], bool | 0 | 1 | 2, float
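
The reset/execute cycle described above can be sketched with a toy environment. `CountdownEnvironment` and `run_episode` are hypothetical names; the class is a plain Python stand-in that mirrors the method names of this interface and does not use the library itself. Returning 2 when the timestep limit is hit is an assumption about what "aborted" means here.

```python
class CountdownEnvironment:
    """Toy stand-in: count down from `start` to 0 by choosing step 0 or 1."""

    def __init__(self, start=3, max_episode_timesteps=10):
        self._start = start
        self._max_timesteps = max_episode_timesteps
        self._value = start
        self._timestep = 0

    def states(self):
        return {"counter": {"type": "int", "shape": (), "num_states": self._start + 1}}

    def actions(self):
        return {"step": {"type": "int", "shape": (), "num_actions": 2}}

    def max_episode_timesteps(self):
        return self._max_timesteps

    def reset(self):
        self._value = self._start
        self._timestep = 0
        return {"counter": self._value}

    def execute(self, actions):
        self._timestep += 1
        self._value -= actions["step"]  # 1 decrements the counter, 0 waits
        states = {"counter": self._value}
        if self._value == 0:
            return states, 1, 1.0  # terminal state reached
        if self._timestep >= self._max_timesteps:
            return states, 2, 0.0  # assumption: limit reached counts as aborted
        return states, 0, 0.0


def run_episode(env, policy):
    """Run one episode; return (terminal, total reward, number of steps)."""
    states = env.reset()
    terminal, total_reward, steps = 0, 0.0, 0
    while not terminal:  # False or 0 means the episode continues
        actions = policy(states)
        states, terminal, reward = env.execute(actions=actions)
        total_reward += reward
        steps += 1
    return int(terminal), total_reward, steps
```

A policy that always returns `{"step": 1}` ends the episode with terminal value 1, whereas one that always returns `{"step": 0}` runs until the limit and ends with 2.
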