tensorforce.contrib package

Submodules
tensorforce.contrib.ale module

Arcade Learning Environment (ALE): https://github.com/mgbellemare/Arcade-Learning-Environment

class tensorforce.contrib.ale.ALE(rom, frame_skip=1, repeat_action_probability=0.0, loss_of_life_termination=False, loss_of_life_reward=0, display_screen=False, seed=<mtrand.RandomState object>)
    Bases: tensorforce.environments.environment.Environment

- action_names
- actions
- close()
- current_state
- execute(actions)
- is_terminal
- reset()
- states
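All of the contrib classes in this module implement TensorForce's common Environment interface (reset(), execute(), states, actions). The following is a hypothetical sketch of the interaction loop, using a dummy stand-in instead of ALE itself (which requires a ROM file and the native ALE library), and assuming an execute() that returns a (next state, terminal flag, reward) triple; the exact return convention may differ per environment.

```python
# Minimal sketch of the Environment interaction loop.
# `DummyEnv` is a hypothetical stand-in for ALE; it only mimics the
# reset()/execute() contract assumed in the lead-in above.
import random

class DummyEnv:
    """Hypothetical environment: fixed-length episodes, reward 1.0 per step."""
    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.step_count = 0

    def reset(self):
        # Start a new episode and return the initial state.
        self.step_count = 0
        return 0

    def execute(self, actions):
        # Advance one step; terminal once the episode length is reached.
        self.step_count += 1
        state = self.step_count
        terminal = self.step_count >= self.episode_length
        reward = 1.0
        return state, terminal, reward

env = DummyEnv()
state = env.reset()
total_reward = 0.0
terminal = False
while not terminal:
    action = random.choice([0, 1])  # e.g. an index into the action set
    state, terminal, reward = env.execute(action)
    total_reward += reward
print(total_reward)  # 5.0
```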
tensorforce.contrib.deepmind_lab module

class tensorforce.contrib.deepmind_lab.DeepMindLab(level_id, repeat_action=1, state_attribute='RGB_INTERLACED', settings={'width': '320', 'appendCommand': '', 'fps': '60', 'height': '240'})
    Bases: tensorforce.environments.environment.Environment

DeepMind Lab integration: https://arxiv.org/abs/1612.03801, https://github.com/deepmind/lab

Since DeepMind Lab is only available as source code, a manual install via Bazel is required. Furthermore, due to the way Bazel handles external dependencies, cloning TensorForce into lab is the most convenient way to run it using the Bazel BUILD file we provide. To use lab, first download and install it according to the instructions at https://github.com/deepmind/lab/blob/master/docs/build.md:

git clone https://github.com/deepmind/lab.git

Add to the lab main BUILD file:

Clone TensorForce into the lab directory, then run the TensorForce Bazel runner:

bazel run //tensorforce:lab_runner

Note that using any specific configuration file currently requires changing the TensorForce BUILD file to adjust environment parameters.

Please note that we have not yet tried to reproduce any lab results; these instructions merely explain how to establish connectivity in case someone wants to get started there.
- actions
- close()
  Closes the environment and releases the underlying Quake III Arena instance. No other method calls are possible afterwards.
- execute(actions)
  Passes the given actions to the lab environment and returns the reward, the next state, the terminal flag, and additional info.
  Parameters: actions – the action to execute, as a numpy array; it should have dtype np.intc and adhere to the specification given by DeepMindLabEnvironment.action_spec(level_id).
  Returns: dict containing the next state, the reward, and a boolean indicating whether the next state is a terminal state.
- fps
  An advisory metric that correlates discrete environment steps ("frames") with real (wall-clock) time: the number of frames per (real) second.
- num_steps
  Number of frames since the last reset() call.
- reset()
  Resets the environment to its initial state. This method must be called to start a new episode after the last episode has ended.
  Returns: initial state
- states
tensorforce.contrib.maze_explorer module¶
-
class
tensorforce.contrib.maze_explorer.
MazeExplorer
(mode_id=0, visible=True)¶ Bases:
tensorforce.environments.environment.Environment
MazeExplorer Integration: https://github.com/mryellow/maze_explorer.
-
actions
¶
-
close
()¶
-
execute
(actions)¶
-
reset
()¶
-
states
¶
-
tensorforce.contrib.openai_gym module

OpenAI Gym integration: https://gym.openai.com/
tensorforce.contrib.openai_universe module

class tensorforce.contrib.openai_universe.OpenAIUniverse(env_id)
    Bases: tensorforce.environments.environment.Environment

OpenAI Universe integration: https://universe.openai.com/. Contains OpenAI Gym: https://gym.openai.com/.

- actions
- close()
- configure(*args, **kwargs)
- execute(actions)
- render(*args, **kwargs)
- reset()
- states
tensorforce.contrib.remote_environment module

class tensorforce.contrib.remote_environment.MsgPackNumpyProtocol(max_msg_len=8192)
    Bases: object

A simple protocol for communicating over TCP sockets, which can be used by RemoteEnvironment implementations. The protocol is based on msgpack-numpy encoding and decoding.

Each message has a simple 8-byte header, which encodes the length of the subsequent msgpack-numpy encoded byte string. All messages received need to have the 'status' field set to 'ok'. If 'status' is set to 'error', the field 'message' should be populated with some error information.

Examples:
  client sends: [8-byte header] + msgpack-encoded({"cmd": "seed", "value": 200})
  server responds: [8-byte header] + msgpack-encoded({"status": "ok", "value": 200})
  client sends: [8-byte header] + msgpack-encoded({"cmd": "reset"})
  server responds: [8-byte header] + msgpack-encoded({"status": "ok"})
  client sends: [8-byte header] + msgpack-encoded({"cmd": "step", "action": 5})
  server responds: [8-byte header] + msgpack-encoded({"status": "ok", "obs_dict": {... some observations}, "reward": -10.0, "is_terminal": False})
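The length-prefixed framing described above can be sketched as follows. This is not TensorForce's actual implementation: it substitutes json for msgpack-numpy (which may not be installed), and the assumed header layout (8-byte unsigned big-endian length) is an illustration, not a statement about the library's wire format.

```python
import json
import struct

HEADER_FMT = ">Q"  # assumed: 8-byte unsigned big-endian length prefix

def frame(message: dict) -> bytes:
    """Encode a message dict and prepend the 8-byte length header."""
    payload = json.dumps(message).encode("utf-8")  # stand-in for msgpack-numpy
    return struct.pack(HEADER_FMT, len(payload)) + payload

def unframe(data: bytes) -> dict:
    """Read the 8-byte header, then decode exactly that many payload bytes."""
    (length,) = struct.unpack(HEADER_FMT, data[:8])
    payload = data[8:8 + length]
    return json.loads(payload.decode("utf-8"))

# Round trip: the decoded message equals the original.
msg = {"cmd": "seed", "value": 200}
assert unframe(frame(msg)) == msg
```

In a real client, frame() output would be written to the socket with sendall(), and unframe() would first read exactly 8 bytes, then exactly `length` more.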
- recv(socket_)
  Receives a message as a msgpack-numpy encoded byte string from the given socket object. Blocks until something is received.
  Parameters: socket_ – the Python socket object to use.
  Returns: the received message, decoded as a dict.
- send(message, socket_)
  Sends a message (dict) to the socket. The message is encoded via msgpack-numpy.
  Parameters: message – the message dict (e.g. {"cmd": "reset"}); socket_ – the Python socket object to use.
class tensorforce.contrib.remote_environment.RemoteEnvironment(host='localhost', port=6025)
    Bases: tensorforce.environments.environment.Environment

- close()
  Same as the disconnect method.
- connect()
  Starts the TCP connection to the server on the given host:port.
- current_state
- disconnect()
  Ends the TCP connection to the server.
tensorforce.contrib.state_settable_environment module

class tensorforce.contrib.state_settable_environment.StateSettableEnvironment
    Bases: tensorforce.environments.environment.Environment

An Environment that implements the set_state method to set the current state to some new state using setter instructions.

- set_state(**kwargs)
  Manually sets the current state of the environment to some other state and returns a new observation.
  Parameters: **kwargs – the set instruction(s) to be executed by the environment. A single set instruction usually sets a single property of the state/observation vector to some new value.
  Returns: the observation dict of the Environment after(!) setting it to the new state.
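As an illustration of the set_state contract, a minimal hypothetical subclass might hold its state in a plain dict and apply each keyword argument as one setter instruction. The class and its state properties below are invented for the example; they are not part of TensorForce.

```python
class DictStateEnv:
    """Hypothetical state-settable environment: the state is a plain dict,
    and each kwarg to set_state overwrites exactly one state property."""
    def __init__(self):
        self.state = {"x": 0.0, "y": 0.0, "health": 100}

    def set_state(self, **kwargs):
        # Each set instruction sets a single property of the state vector.
        for key, value in kwargs.items():
            if key not in self.state:
                raise KeyError("unknown state property: {}".format(key))
            self.state[key] = value
        # Return the observation *after* applying all setters.
        return dict(self.state)

env = DictStateEnv()
obs = env.set_state(x=3.0, health=50)
print(obs)  # {'x': 3.0, 'y': 0.0, 'health': 50}
```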
tensorforce.contrib.unreal_engine module

class tensorforce.contrib.unreal_engine.UE4Environment(host='localhost', port=6025, connect=True, discretize_actions=False, delta_time=0, num_ticks=4)
    Bases: tensorforce.contrib.remote_environment.RemoteEnvironment, tensorforce.contrib.state_settable_environment.StateSettableEnvironment

A special RemoteEnvironment for UE4 game connections. Communicates with the remote end to receive information on the definitions of the action and observation spaces. Sends UE4 action and axis mappings as RL actions, and receives observations back as defined by ducandu plugin Observer objects placed in the game (these could be camera pixels or other observations, e.g. the x/y/z position of some game actor).
- actions()
- connect()
- discretize_action_space_desc()
  Creates a list of discrete action(-combinations) for the case where we want to learn with a discrete set of actions but only have action combinations (possibly even continuous ones) available from the environment. E.g., the UE4 game has the following action/axis mappings:

  {
      'Fire': {'type': 'action', 'keys': ('SpaceBar',)},
      'MoveRight': {'type': 'axis', 'keys': (('Right', 1.0), ('Left', -1.0), ('A', -1.0), ('D', 1.0))},
  }

  This method will discretize them into the following 6 discrete actions:

  [
      [('Right', 0.0), ('SpaceBar', False)],
      [('Right', 0.0), ('SpaceBar', True)],
      [('Right', -1.0), ('SpaceBar', False)],
      [('Right', -1.0), ('SpaceBar', True)],
      [('Right', 1.0), ('SpaceBar', False)],
      [('Right', 1.0), ('SpaceBar', True)],
  ]
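The combinatorics above (three axis values for MoveRight times two boolean states for Fire gives 6 discrete actions) can be reproduced with a sketch using itertools.product. The discretize function below is a hypothetical stand-in for discretize_action_space_desc, not the library's code; it may order the combinations differently than the listing above.

```python
import itertools

def discretize(action_space_desc):
    """Sketch: build all discrete combinations of axis values and
    boolean action-mappings from a UE4-style action/axis description."""
    per_mapping = []
    for name, desc in sorted(action_space_desc.items()):
        if desc["type"] == "axis":
            # Unique axis values, plus 0.0 for "no input on this axis".
            values = sorted({value for _key, value in desc["keys"]} | {0.0})
            # Represent the axis by its first key name (e.g. 'Right').
            key = desc["keys"][0][0]
            per_mapping.append([(key, v) for v in values])
        else:
            # Boolean action-mapping, e.g. 'Fire': pressed or not pressed.
            key = desc["keys"][0]
            per_mapping.append([(key, False), (key, True)])
    return [list(combo) for combo in itertools.product(*per_mapping)]

desc = {
    "Fire": {"type": "action", "keys": ("SpaceBar",)},
    "MoveRight": {"type": "axis",
                  "keys": (("Right", 1.0), ("Left", -1.0),
                           ("A", -1.0), ("D", 1.0))},
}
combos = discretize(desc)
print(len(combos))  # 6
```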
- execute(actions)
  Executes a single step in the UE4 game. This step may comprise one or more actual game ticks, for all of which the same given action and axis inputs (or action number, in the case of discretized actions) are repeated. UE4 distinguishes between action mappings, which are boolean actions (e.g. jump or don't jump), and axis mappings, which are continuous actions like MoveForward with values between -1.0 (run backwards) and 1.0 (run forwards); 0.0 means stop.
- static extract_observation(message)
- reset()
  Same as step (no kwargs to pass), but blocks until the observation dict is returned; stores the received observation in self.last_observation.
- seed(seed=None)
- set_state(setters, **kwargs)
- states()
- translate_abstract_actions_to_keys(abstract)
  Translates a list of tuples ([pretty mapping], [value]) into a list of tuples ([some key], [translated value]). Each single item in abstract undergoes the following translation:
  Example 1: we want "MoveRight": 5.0; possible keys for the action are ("Right", 1.0) and ("Left", -1.0); result: "Right": 5.0 * 1.0 = 5.0.
  Example 2: we want "MoveRight": -0.5; possible keys for the action are ("Left", -1.0) and ("Right", 1.0); result: "Left": -0.5 * -1.0 = 0.5 (same as "Right": -0.5).
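A sketch of the translation rule implied by those two examples: for each desired (mapping, value) pair, pick a key whose scale factor has the same sign as the value, and multiply. The translate function below is a hypothetical stand-in for translate_abstract_actions_to_keys, not the library's implementation.

```python
def translate(action_key_map, wanted):
    """Sketch of translating abstract axis values to concrete key inputs.

    action_key_map: mapping name -> tuple of (key, scale) pairs.
    wanted: list of (mapping name, desired value) tuples.
    """
    result = []
    for name, value in wanted:
        for key, scale in action_key_map[name]:
            # Choose the first key whose scale sign matches the value's sign,
            # so the translated value comes out non-negative.
            if value * scale >= 0:
                result.append((key, value * scale))
                break
    return result

keys = {"MoveRight": (("Right", 1.0), ("Left", -1.0))}
print(translate(keys, [("MoveRight", 5.0)]))   # [('Right', 5.0)]
print(translate(keys, [("MoveRight", -0.5)]))  # [('Left', 0.5)]
```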