tensorforce.environments package

Submodules

tensorforce.environments.environment module

class tensorforce.environments.environment.Environment

Bases: object

Base environment class.

actions

Return the action space. Might include subdicts if multiple actions are available simultaneously.

Returns: dict of action properties (continuous, number of actions)
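
A minimal sketch of the kind of dict this property might return for a single discrete action; the keys shown (continuous, num_actions) follow the description above but the exact field names and values are assumptions, not part of the documented API:

    # Illustrative only: a discrete action spec of the form described above.
    action_spec = {'continuous': False, 'num_actions': 4}
    if not action_spec.get('continuous', False):
        print('discrete action with', action_spec['num_actions'], 'choices')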

close()

Close environment. No other method calls possible afterwards.

execute(actions)

Executes the given action(s) and observes the next state(s) and reward.

Parameters: actions – Actions to execute.

Returns: (Dict of) next state(s), a boolean indicating whether the episode terminated, and the reward signal.

reset()

Reset environment and set up for a new episode.

Returns: initial state of the reset environment.
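
Taken together, reset() and execute() support a basic interaction loop. The sketch below uses only the methods documented in this section; MyEnvironment and choose_action are hypothetical placeholders for a concrete Environment subclass and a policy:

    # Sketch of one episode, using only the documented interface:
    # reset() -> initial state, execute(actions) -> (state, terminal, reward).
    env = MyEnvironment()                      # hypothetical Environment subclass
    state = env.reset()
    terminal = False
    episode_reward = 0.0
    while not terminal:
        action = choose_action(state)          # hypothetical policy
        state, terminal, reward = env.execute(action)
        episode_reward += reward
    env.close()
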
seed(seed)

Sets the random seed of the environment to the given value (or to the current time if seed=None). Naturally deterministic environments (e.g. ALE or some gym Envs) do not have to implement this method.

Parameters: seed (int) – The seed to use for initializing the pseudo-random number generator (default: epoch time in seconds).

Returns: The actual seed (int) used, or None if the environment did not override this method (seeding not supported).
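
A short usage sketch of the contract described above; env stands for a hypothetical Environment instance:

    # Seed for reproducibility; per the contract above, the method returns
    # the seed actually used, or None when seeding is not supported.
    used_seed = env.seed(42)
    if used_seed is None:
        print('environment does not support seeding')
    else:
        print('running with seed', used_seed)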

states

Return the state space. Might include subdicts if multiple states are available simultaneously.

Returns: dict of state properties (shape and type).
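
Analogous to actions, a sketch of the kind of dict this property might return; the keys (shape, type) follow the description above, while the concrete values are assumptions:

    # Illustrative only: a state spec with shape and type information.
    state_spec = {'shape': (8,), 'type': 'float'}
    print('state shape:', state_spec['shape'], 'dtype:', state_spec['type'])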

tensorforce.environments.minimal_test module

class tensorforce.environments.minimal_test.MinimalTest(specification)

Bases: tensorforce.environments.environment.Environment

actions
close()
execute(actions)
reset()
states
tensorforce.environments.minimal_test.random() → x in the interval [0, 1).
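
MinimalTest is simply an Environment subclass used for testing. The sketch below shows what a comparable minimal subclass could look like, assuming the interface documented above; the spec dicts, reward, and termination logic are illustrative placeholders, not the actual MinimalTest implementation:

    import random

    from tensorforce.environments import Environment

    class CountdownEnvironment(Environment):
        """Toy environment sketch: episode ends after a fixed number of steps."""

        def __init__(self, horizon=10):
            self.horizon = horizon
            self.steps = 0

        @property
        def states(self):
            # Shape/type keys as described in the states property above.
            return dict(shape=(1,), type='float')

        @property
        def actions(self):
            # Keys as described in the actions property above (assumed names).
            return dict(continuous=False, num_actions=2)

        def reset(self):
            self.steps = 0
            return [0.0]

        def execute(self, actions):
            self.steps += 1
            terminal = self.steps >= self.horizon
            reward = 1.0 if actions == 1 else 0.0
            return [random.random()], terminal, reward

        def close(self):
            pass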

Module contents

class tensorforce.environments.Environment

Bases: object

Base environment class.

actions

Return the action space. Might include subdicts if multiple actions are available simultaneously.

Returns: dict of action properties (continuous, number of actions)

close()

Close environment. No other method calls possible afterwards.

execute(actions)

Executes the given action(s) and observes the next state(s) and reward.

Parameters: actions – Actions to execute.

Returns: (Dict of) next state(s), a boolean indicating whether the episode terminated, and the reward signal.

reset()

Reset environment and set up for a new episode.

Returns: initial state of the reset environment.

seed(seed)

Sets the random seed of the environment to the given value (or to the current time if seed=None). Naturally deterministic environments (e.g. ALE or some gym Envs) do not have to implement this method.

Parameters: seed (int) – The seed to use for initializing the pseudo-random number generator (default: epoch time in seconds).

Returns: The actual seed (int) used, or None if the environment did not override this method (seeding not supported).

states

Return the state space. Might include subdicts if multiple states are available simultaneously.

Returns: dict of state properties (shape and type).
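
Putting the package-level pieces together, a hedged end-to-end sketch; CountdownEnvironment is the illustrative subclass from the sketch above, and nothing here is prescribed by the API beyond the documented method signatures:

    # Run a few episodes against the toy environment defined earlier and
    # report the average reward; uses only seed/reset/execute/close.
    env = CountdownEnvironment(horizon=5)
    env.seed(7)
    rewards = []
    for _ in range(3):
        state = env.reset()
        terminal = False
        total = 0.0
        while not terminal:
            state, terminal, reward = env.execute(1)   # always pick action 1
            total += reward
        rewards.append(total)
    env.close()
    print('average episode reward:', sum(rewards) / len(rewards))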