tensorforce.execution package

Submodules

tensorforce.execution.runner module

tensorforce.execution.runner.DistributedTFRunner

alias of Runner

class tensorforce.execution.runner.Runner(agent, environment, repeat_actions=1, history=None, id_=0)

Bases: tensorforce.execution.base_runner.BaseRunner

Simple runner for non-realtime single-process execution.

__init__(agent, environment, repeat_actions=1, history=None, id_=0)

Initialize a single Runner object (one Agent/one Environment).

Parameters:id_ (int) – The ID of this Runner (for distributed TF runs).

close()
episode

Deprecated property episode -> global_episode.

episode_timestep
reset(history=None)

Resets the Runner’s internal stats counters. If history is empty or not given, the counters are reset to default values (via history.get()).

Parameters:history (dict) – A dictionary containing an already run experiment’s results. Keys should be: episode_rewards (list of rewards), episode_timesteps (lengths of episodes), episode_times (run-times)
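For illustration, a history dict with the keys described above can be built and summarized in plain Python (the values are made up):

```python
# Illustrative history dict using the documented keys:
# episode_rewards, episode_timesteps, episode_times.
history = {
    "episode_rewards": [1.0, 0.5, 2.0],    # total reward per episode
    "episode_timesteps": [200, 180, 210],  # length of each episode
    "episode_times": [3.2, 2.9, 3.4],      # wall-clock run-time (seconds)
}

# Simple summary over the prior run:
mean_reward = sum(history["episode_rewards"]) / len(history["episode_rewards"])
total_steps = sum(history["episode_timesteps"])
```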
run(num_timesteps=None, num_episodes=None, max_episode_timesteps=None, deterministic=False, episode_finished=None, summary_report=None, summary_interval=None, timesteps=None, episodes=None)
Parameters:
  • timesteps (int) – Deprecated; see num_timesteps.
  • episodes (int) – Deprecated; see num_episodes.
timestep

Deprecated property timestep -> global_timestep.

tensorforce.execution.runner.SingleRunner

alias of Runner
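A minimal sketch of the episode_finished callback contract used by Runner.run; the reporting interval is illustrative, while global_episode and episode_rewards are the Runner stats counters documented on this page:

```python
# Sketch: an episode_finished callback receives the Runner (and, for
# threaded runs, a worker ID) and returns True to continue the run or
# False to stop it.
def episode_finished(r, worker_id=0):
    if r.global_episode % 10 == 0:
        recent = r.episode_rewards[-10:]
        print("episode {}: mean reward over last {} = {:.2f}".format(
            r.global_episode, len(recent), sum(recent) / len(recent)))
    return True  # keep running

# Typical wiring (requires tensorforce; shown commented for illustration):
# from tensorforce.execution import Runner
# runner = Runner(agent=agent, environment=environment)
# runner.run(num_episodes=100, episode_finished=episode_finished)
```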

tensorforce.execution.threaded_runner module

class tensorforce.execution.threaded_runner.ThreadedRunner(agent, environment, repeat_actions=1, save_path=None, save_episodes=None, save_frequency=None, save_frequency_unit=None, agents=None, environments=None)

Bases: tensorforce.execution.base_runner.BaseRunner

Runner for non-realtime threaded execution of multiple agents.

__init__(agent, environment, repeat_actions=1, save_path=None, save_episodes=None, save_frequency=None, save_frequency_unit=None, agents=None, environments=None)

Initialize a ThreadedRunner object.

Parameters:
  • save_path (str) – Path where to save the shared model.
  • save_episodes (int) – Deprecated: Save the shared model every this many (global) episodes.
  • save_frequency (int) – How often to save the model, in the unit given by save_frequency_unit (seconds, timesteps, or episodes).
  • save_frequency_unit (str) – “s” (sec), “t” (timesteps), “e” (episodes)
  • agents (List[Agent]) – Deprecated: List of Agent objects. Use agent instead.
  • environments (List[Environment]) – Deprecated: List of Environment objects. Use environment instead.
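To make the frequency/unit pair concrete, here is a hypothetical helper (not a tensorforce API) showing one way the three units could drive a save decision:

```python
# Hypothetical helper (not part of tensorforce): decide whether a save
# is due for the configured save_frequency and save_frequency_unit.
def save_due(save_frequency, save_frequency_unit,
             seconds_since_save=0.0, global_timestep=0, global_episode=0):
    if save_frequency_unit == "s":        # seconds since the last save
        return seconds_since_save >= save_frequency
    counter = {"t": global_timestep,      # global timesteps so far
               "e": global_episode}[save_frequency_unit]
    return counter > 0 and counter % save_frequency == 0
```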
agents
close()
environments
episode

Deprecated property episode -> global_episode.

episode_lengths
global_step
reset(history=None)

Resets the Runner’s internal stats counters. If history is empty or not given, the counters are reset to default values (via history.get()).

Parameters:history (dict) – A dictionary containing an already run experiment’s results. Keys should be: episode_rewards (list of rewards), episode_timesteps (lengths of episodes), episode_times (run-times)
run(num_episodes=-1, max_episode_timesteps=-1, episode_finished=None, summary_report=None, summary_interval=0, num_timesteps=None, deterministic=False, episodes=None, max_timesteps=None)

Executes this runner by starting all Agents in parallel (each one in one thread).

Parameters:
  • episodes (int) – Deprecated; see num_episodes.
  • max_timesteps (int) – Deprecated; see max_episode_timesteps.
timestep

Deprecated property timestep -> global_timestep.

tensorforce.execution.threaded_runner.WorkerAgentGenerator(agent_class)

Worker Agent generator: receives an Agent class and returns a Worker Agent class that inherits from it.

tensorforce.execution.threaded_runner.clone_worker_agent(agent, factor, environment, network, agent_config)

Clones a given Agent (factor times) and returns a list of the cloned Agents with the original Agent in the first slot.

Parameters:
  • agent (Agent) – The Agent object to clone.
  • factor (int) – The length of the final list.
  • environment (Environment) – The Environment to use for all cloned agents.
  • network (LayeredNetwork) – The Network to use (or None) for an Agent’s Model.
  • agent_config (dict) – A dict of Agent specifications passed into the Agent’s constructor as kwargs.
Returns:

The list with factor cloned agents (including the original one).
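The documented contract (a list of length factor with the original agent in the first slot) together with typical ThreadedRunner wiring can be sketched as follows; the tensorforce calls are commented out since they need a full install, and the stand-in clone is purely illustrative:

```python
# Hedged wiring sketch (requires tensorforce; names follow this page):
# from tensorforce.execution import ThreadedRunner
# from tensorforce.execution.threaded_runner import clone_worker_agent
#
# agents = clone_worker_agent(agent, factor=4, environment=environment,
#                             network=network, agent_config=agent_config)
# runner = ThreadedRunner(agent=agents, environment=environments)
# runner.run(num_episodes=1000, episode_finished=episode_finished)

# Stand-in illustrating only the return contract of clone_worker_agent:
# a list of length `factor`, original agent in slot 0.
def _clone_contract(original, factor):
    return [original] + ["clone-{}".format(i) for i in range(1, factor)]

agents = _clone_contract("original-agent", 4)
```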

Module contents

class tensorforce.execution.BaseRunner(agent, environment, repeat_actions=1, history=None)

Bases: object

Base class for all runner classes. Implements the run method.

__init__(agent, environment, repeat_actions=1, history=None)
Parameters:
  • agent (Agent) – Agent object (or list of Agent objects) to use for the run.
  • environment (Environment) – Environment object (or list of Environment objects) to use for the run.
  • repeat_actions (int) – How many times the same given action will be repeated in subsequent calls to Environment’s execute method. Rewards collected in these calls are accumulated and reported as a sum in the following call to Agent’s observe method.
  • history (dict) – A dictionary containing an already run experiment’s results. Keys should be: episode_rewards (list of rewards), episode_timesteps (lengths of episodes), episode_times (run-times)
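The repeat_actions behaviour described above can be sketched in plain Python; the execute return signature (state, terminal, reward) follows the tensorforce 0.x Environment API, and the early break on terminal is an assumption:

```python
# Sketch of the repeat_actions loop: the same action is executed
# repeat_actions times, rewards are summed, and the sum is what would
# be passed on to agent.observe().
def step_with_repeats(environment, action, repeat_actions=1):
    accumulated = 0.0
    state, terminal = None, False
    for _ in range(repeat_actions):
        state, terminal, reward = environment.execute(action)
        accumulated += reward
        if terminal:
            break  # assumption: stop repeating once the episode ends
    return state, terminal, accumulated
```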
close()

Should perform clean up operations on Runner’s Agent(s) and Environment(s).

episode

Deprecated property episode -> global_episode.

reset(history=None)

Resets the Runner’s internal stats counters. If history is empty or not given, the counters are reset to default values (via history.get()).

Parameters:history (dict) – A dictionary containing an already run experiment’s results. Keys should be: episode_rewards (list of rewards), episode_timesteps (lengths of episodes), episode_times (run-times)
run(num_episodes, num_timesteps, max_episode_timesteps, deterministic, episode_finished, summary_report, summary_interval)

Executes this runner by starting to act (via the Agent(s)) in the given Environment(s). Stops execution when certain conditions are met (e.g. a maximum number of episodes). Calls callback functions after each episode and/or after some summary criteria are met.

Parameters:
  • num_episodes (int) – Max. number of episodes to run globally in total (across all threads/workers).
  • num_timesteps (int) – Max. number of time steps to run globally in total (across all threads/workers)
  • max_episode_timesteps (int) – Max. number of timesteps per episode.
  • deterministic (bool) – Whether to select actions deterministically, without exploration.
  • episode_finished (callable) – A function to be called once an episode has finished. Should take a BaseRunner object and some worker ID (e.g. thread-ID or task-ID). It can decide for itself how often to report and what to report.
  • summary_report (callable) – Deprecated; Function that could produce a summary over the training progress so far.
  • summary_interval (int) – Deprecated; The number of time steps to execute (globally) before summary_report is called.
timestep

Deprecated property timestep -> global_timestep.

tensorforce.execution.SingleRunner

alias of Runner

tensorforce.execution.DistributedTFRunner

alias of Runner

class tensorforce.execution.Runner(agent, environment, repeat_actions=1, history=None, id_=0)

Bases: tensorforce.execution.base_runner.BaseRunner

Simple runner for non-realtime single-process execution.

__init__(agent, environment, repeat_actions=1, history=None, id_=0)

Initialize a single Runner object (one Agent/one Environment).

Parameters:id_ (int) – The ID of this Runner (for distributed TF runs).

close()
episode

Deprecated property episode -> global_episode.

episode_timestep
reset(history=None)

Resets the Runner’s internal stats counters. If history is empty or not given, the counters are reset to default values (via history.get()).

Parameters:history (dict) – A dictionary containing an already run experiment’s results. Keys should be: episode_rewards (list of rewards), episode_timesteps (lengths of episodes), episode_times (run-times)
run(num_timesteps=None, num_episodes=None, max_episode_timesteps=None, deterministic=False, episode_finished=None, summary_report=None, summary_interval=None, timesteps=None, episodes=None)
Parameters:
  • timesteps (int) – Deprecated; see num_timesteps.
  • episodes (int) – Deprecated; see num_episodes.
timestep

Deprecated property timestep -> global_timestep.

class tensorforce.execution.ThreadedRunner(agent, environment, repeat_actions=1, save_path=None, save_episodes=None, save_frequency=None, save_frequency_unit=None, agents=None, environments=None)

Bases: tensorforce.execution.base_runner.BaseRunner

Runner for non-realtime threaded execution of multiple agents.

__init__(agent, environment, repeat_actions=1, save_path=None, save_episodes=None, save_frequency=None, save_frequency_unit=None, agents=None, environments=None)

Initialize a ThreadedRunner object.

Parameters:
  • save_path (str) – Path where to save the shared model.
  • save_episodes (int) – Deprecated: Save the shared model every this many (global) episodes.
  • save_frequency (int) – How often to save the model, in the unit given by save_frequency_unit (seconds, timesteps, or episodes).
  • save_frequency_unit (str) – “s” (sec), “t” (timesteps), “e” (episodes)
  • agents (List[Agent]) – Deprecated: List of Agent objects. Use agent instead.
  • environments (List[Environment]) – Deprecated: List of Environment objects. Use environment instead.
agents
close()
environments
episode

Deprecated property episode -> global_episode.

episode_lengths
global_step
reset(history=None)

Resets the Runner’s internal stats counters. If history is empty or not given, the counters are reset to default values (via history.get()).

Parameters:history (dict) – A dictionary containing an already run experiment’s results. Keys should be: episode_rewards (list of rewards), episode_timesteps (lengths of episodes), episode_times (run-times)
run(num_episodes=-1, max_episode_timesteps=-1, episode_finished=None, summary_report=None, summary_interval=0, num_timesteps=None, deterministic=False, episodes=None, max_timesteps=None)

Executes this runner by starting all Agents in parallel (each one in one thread).

Parameters:
  • episodes (int) – Deprecated; see num_episodes.
  • max_timesteps (int) – Deprecated; see max_episode_timesteps.
timestep

Deprecated property timestep -> global_timestep.

tensorforce.execution.WorkerAgentGenerator(agent_class)

Worker Agent generator: receives an Agent class and returns a Worker Agent class that inherits from it.