tensorforce.environments package

Submodules

tensorforce.environments.environment module
class tensorforce.environments.environment.Environment
Bases: object

Base environment class.

actions
Return the action space. Might include subdicts if multiple actions are available simultaneously.
Returns: dict of action properties (continuous, number of actions).

close()
Close environment. No other method calls are possible afterwards.

execute(actions)
Executes action, observes next state(s) and reward.
Parameters: actions – Actions to execute.
Returns: (Dict of) next state(s), boolean indicating terminal, and reward signal.

reset()
Reset environment and set up a new episode.
Returns: initial state of the reset environment.

seed(seed)
Sets the random seed of the environment to the given value (current time, if seed=None). Naturally deterministic environments (e.g. ALE or some gym envs) don't have to implement this method.
Parameters: seed (int) – The seed to use for initializing the pseudo-random number generator (default: epoch time in seconds).
Returns: The actual seed (int) used, or None if the environment did not override this method (no seeding supported).

states
Return the state space. Might include subdicts if multiple states are available simultaneously.
Returns: dict of state properties (shape and type).
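The contract above (states/actions properties, reset(), execute(), close(), seed()) can be illustrated with a toy implementation. The following is a minimal standalone sketch that does not import Tensorforce; CountdownEnvironment and its spec-dict keys are illustrative names, not part of the library.

```python
import random

class CountdownEnvironment:
    """Toy environment following the interface described above.
    Action 1 decrements a counter; the episode terminates when
    the counter reaches zero, yielding a reward of 1.0."""

    def __init__(self, start=3):
        self.start = start
        self.counter = start

    @property
    def states(self):
        # dict of state properties (shape and type)
        return dict(shape=(1,), type='int')

    @property
    def actions(self):
        # dict of action properties (continuous, number of actions)
        return dict(continuous=False, num_actions=2)

    def reset(self):
        # Reset environment and set up a new episode.
        self.counter = self.start
        return (self.counter,)

    def execute(self, actions):
        # Execute action; return next state, terminal flag, and reward.
        if actions == 1:
            self.counter -= 1
        terminal = self.counter == 0
        reward = 1.0 if terminal else 0.0
        return (self.counter,), terminal, reward

    def seed(self, seed=None):
        # Seed the pseudo-random number generator; return the seed used.
        random.seed(seed)
        return seed

    def close(self):
        # No other method calls possible afterwards.
        pass

# Typical interaction loop against this interface:
env = CountdownEnvironment(start=2)
state = env.reset()
terminal = False
total_reward = 0.0
while not terminal:
    state, terminal, reward = env.execute(actions=1)
    total_reward += reward
env.close()
```

The loop mirrors how an agent driver would use any Environment subclass: reset once per episode, then call execute until the terminal flag is returned.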
tensorforce.environments.minimal_test module

class tensorforce.environments.minimal_test.MinimalTest(specification)
Bases: tensorforce.environments.environment.Environment

actions

close()

execute(actions)

reset()

states

tensorforce.environments.minimal_test.random() → x in the interval [0, 1).
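The random() entry above matches the docstring of Python's standard-library random.random, which the minimal_test module appears to re-export rather than define itself; its contract can be checked with the stdlib alone:

```python
import random

# random() returns a float x uniformly distributed in [0, 1).
sample = random.random()
assert 0.0 <= sample < 1.0
```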
Module contents

class tensorforce.environments.Environment
Bases: object

Base environment class.

actions
Return the action space. Might include subdicts if multiple actions are available simultaneously.
Returns: dict of action properties (continuous, number of actions).

close()
Close environment. No other method calls are possible afterwards.

execute(actions)
Executes action, observes next state(s) and reward.
Parameters: actions – Actions to execute.
Returns: (Dict of) next state(s), boolean indicating terminal, and reward signal.

reset()
Reset environment and set up a new episode.
Returns: initial state of the reset environment.

seed(seed)
Sets the random seed of the environment to the given value (current time, if seed=None). Naturally deterministic environments (e.g. ALE or some gym envs) don't have to implement this method.
Parameters: seed (int) – The seed to use for initializing the pseudo-random number generator (default: epoch time in seconds).
Returns: The actual seed (int) used, or None if the environment did not override this method (no seeding supported).

states
Return the state space. Might include subdicts if multiple states are available simultaneously.
Returns: dict of state properties (shape and type).
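Since states and actions may return subdicts when multiple state or action components exist simultaneously, a spec might look like the following sketch. All key names here (camera, position, steer, gear) and the exact property spellings are hypothetical illustrations of the subdict convention, not names defined by Tensorforce.

```python
# Hypothetical spec dicts for an environment with two simultaneous
# state components and two simultaneous actions.
states = dict(
    camera=dict(shape=(64, 64, 3), type='float'),   # image observation
    position=dict(shape=(2,), type='float'),        # 2-D coordinates
)
actions = dict(
    steer=dict(continuous=True),                    # continuous action
    gear=dict(continuous=False, num_actions=3),     # discrete, 3 choices
)

# Each leaf subdict carries the per-component properties described above.
assert states['camera']['shape'] == (64, 64, 3)
assert actions['gear']['num_actions'] == 3
```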