Agent interface

class tensorforce.agents.Agent(states, actions, max_episode_timesteps=None, parallel_interactions=1, buffer_observe=True, seed=None, recorder=None)[source]

Tensorforce agent interface.

act(states, parallel=0, deterministic=False, independent=False, evaluation=False, query=None, **kwargs)[source]

Returns action(s) for the given state(s). Each call must be followed by observe(...) unless independent is true.

Parameters:
  • states (dict[state]) – Dictionary containing state(s) to be acted on (required).
  • parallel (int) – Parallel execution index (default: 0).
  • deterministic (bool) – Whether to act deterministically, that is, without exploration and sampling (default: false).
  • independent (bool) – Whether the action is independent of training: it is not remembered, and the call is therefore not followed by observe (default: false).
  • evaluation (bool) – Whether the agent is currently being evaluated; implies and overrides deterministic and independent (default: false).
  • query (list[str]) – Names of tensors to retrieve (default: none).
  • kwargs – Additional input values, for instance, for dynamic hyperparameters.
Returns:

Dictionary containing action(s), plus queried tensor values if requested.

Return type:

(dict[action], plus optional list[str])
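The pairing rule described above can be sketched with a small stand-in class. ContractCheckingAgent below is hypothetical and is not part of Tensorforce; it only mimics the act/observe signatures to show which calls must be paired. It assumes, following the parameter descriptions, that evaluation=True behaves like deterministic=True combined with independent=True.

```python
class ContractCheckingAgent:
    """Hypothetical stand-in (not a Tensorforce class) that mirrors the
    act/observe pairing rules described in this section."""

    def __init__(self, parallel_interactions=1):
        # One "awaiting observe" flag per parallel execution index.
        self._pending = [False] * parallel_interactions

    def act(self, states, parallel=0, deterministic=False,
            independent=False, evaluation=False):
        if evaluation:
            # Assumption from the docs: evaluation overrides both flags.
            deterministic, independent = True, True
        if not independent:
            if self._pending[parallel]:
                raise RuntimeError("act() must be followed by observe()")
            self._pending[parallel] = True
        # Placeholder action; a real agent would query its policy here.
        return {"action": 0}

    def observe(self, reward, terminal=False, parallel=0):
        if not self._pending[parallel]:
            raise RuntimeError("observe() must be preceded by act()")
        self._pending[parallel] = False
        # Stands in for "whether an update was performed".
        return False


agent = ContractCheckingAgent()
agent.act(states={"observation": [0.0]})   # remembered, so observe() must follow
agent.observe(reward=1.0)                  # closes the act/observe pair
agent.act(states={"observation": [0.0]}, independent=True)  # no observe() needed
agent.act(states={"observation": [0.0]}, evaluation=True)   # likewise
```

Keeping one pending flag per parallel index reflects that each parallel execution index has its own act/observe sequence.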

close()[source]

Closes the agent.

static create(agent='tensorforce', environment=None, **kwargs)[source]

Creates an agent from a specification.

Parameters:
  • agent (specification | Agent object) – JSON file, specification key, configuration dictionary, library module, or Agent object (default: Policy agent).
  • environment (Environment object) – Environment on which the agent is to be trained; environment-related arguments such as state/action space specifications and maximum episode length are extracted from it if given (recommended).
  • kwargs – Additional arguments.

get_available_summaries()[source]

Returns the summary labels provided by the agent.

Returns: Available summary labels.
Return type: list[str]

get_output_tensors(function)[source]

Returns the names of output tensors for the given function.

Parameters: function (str) – Function name (required).
Returns: Names of output tensors.
Return type: list[str]

get_query_tensors(function)[source]

Returns the names of queryable tensors for the given function.

Parameters: function (str) – Function name (required).
Returns: Names of queryable tensors.
Return type: list[str]

initialize()[source]

Initializes the agent.

static load(directory, filename=None, environment=None, **kwargs)[source]

Restores an agent from a specification directory/file.

Parameters:
  • directory (str) – Agent directory (required).
  • filename (str) – Agent filename (default: “agent”).
  • environment (Environment object) – Environment on which the agent is to be trained; environment-related arguments such as state/action space specifications and maximum episode length are extracted from it if given (recommended).
  • kwargs – Additional arguments.

observe(reward, terminal=False, parallel=0, query=None, **kwargs)[source]

Observes the reward and whether a terminal state has been reached. Each call must be preceded by act(...).

Parameters:
  • reward (float) – Reward (required).
  • terminal (bool | 0 | 1 | 2) – Whether a terminal state has been reached (false/0 or true/1), or 2 if the episode was aborted (default: false).
  • parallel (int) – Parallel execution index (default: 0).
  • query (list[str]) – Names of tensors to retrieve (default: none).
  • kwargs – Additional input values, for instance, for dynamic hyperparameters.
Returns:

Whether an update was performed, plus queried tensor values if requested.

Return type:

(bool, optional list[str])
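As a rough illustration of the terminal encoding, the sketch below runs one episode and passes 0, 1, or 2 to observe. The names terminal_flag, run_episode, and step_fn are hypothetical helpers introduced here, not Tensorforce functions, and the sketch assumes that "aborted" means the loop reached its own timestep limit before the environment signalled termination. Any object exposing act(states=...) and observe(reward=..., terminal=...) as documented above can be passed as agent.

```python
def terminal_flag(done, aborted):
    """Map environment signals to the 0 / 1 / 2 values accepted by observe()."""
    if aborted:
        return 2  # episode cut short (assumed here: timestep limit reached)
    return 1 if done else 0


def run_episode(agent, step_fn, initial_states, max_timesteps):
    """Run one episode and return (total_reward, last_terminal_value).

    step_fn is a hypothetical callable that takes the action dictionary
    and returns (next_states, reward, done).
    """
    states = initial_states
    total_reward = 0.0
    terminal = 0
    for t in range(max_timesteps):
        actions = agent.act(states=states)
        states, reward, done = step_fn(actions)
        # Abort only if the environment did not terminate on the last step.
        aborted = (not done) and (t == max_timesteps - 1)
        terminal = terminal_flag(done, aborted)
        # Every act() is paired with exactly one observe().
        agent.observe(reward=reward, terminal=terminal)
        total_reward += reward
        if terminal:
            break
    return total_reward, terminal
```

Passing 2 rather than 1 on a timeout lets the agent distinguish a cut-off episode from one that genuinely ended.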

reset()[source]

Resets the agent to start a new episode.

restore(directory=None, filename=None)[source]

Restores the agent.

Parameters:
  • directory (str) – Agent directory (default: directory specified for TensorFlow saver).
  • filename (str) – Agent filename (default: latest checkpoint in directory).

save(directory=None, filename=None, append_timestep=True)[source]

Saves the current state of the agent.

Parameters:
  • directory (str) – Agent directory (default: directory specified for TensorFlow saver).
  • filename (str) – Agent filename (default: filename specified for TensorFlow saver, or “agent”).
  • append_timestep (bool) – Whether to append the current timestep to the checkpoint filename (default: true).
Returns:

Checkpoint path.

Return type:

str