Agent interface

class tensorforce.agents.Agent(states, actions, max_episode_timesteps=None, parallel_interactions=1, buffer_observe=True, seed=None, recorder=None)

Tensorforce agent interface.

act(states, parallel=0, deterministic=False, independent=False, evaluation=False, query=None, **kwargs)

Returns action(s) for the given state(s). Each call needs to be followed by observe(...) unless independent is true.

Parameters:

states (dict[state]) – Dictionary containing state(s) to be acted on (required).
parallel (int) – Parallel execution index (default: 0).
deterministic (bool) – Whether to act deterministically, that is, without exploration and sampling (default: false).
independent (bool) – Whether the action is not remembered, so that this call is not followed by observe (default: false).
evaluation (bool) – Whether the agent is currently being evaluated; implies and overrides deterministic and independent (default: false).
query (list[str]) – Names of tensors to retrieve (default: none).
kwargs – Additional input values, for instance, for dynamic hyperparameters.

Returns:	Dictionary containing action(s), plus queried tensor values if requested.
Return type:	(dict[action], plus optional list[str])
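
The interplay of the three boolean flags can be summarised in a short sketch. The helper below is hypothetical and not part of Tensorforce; it only restates the rule from the parameter description, namely that evaluation implies and overrides deterministic and independent, and that only a non-independent call has to be followed by observe(...).

```python
def effective_act_flags(deterministic=False, independent=False, evaluation=False):
    """Hypothetical helper (not a Tensorforce function) restating the documented flag rule."""
    if evaluation:
        # Evaluation implies and overrides both other flags.
        deterministic = True
        independent = True
    return {
        "deterministic": deterministic,
        "independent": independent,
        # Only non-independent calls have to be followed by observe(...).
        "needs_observe": not independent,
    }


# An evaluation call behaves like deterministic=True, independent=True.
evaluation_flags = effective_act_flags(evaluation=True)
# A default training call keeps sampling on and expects a matching observe(...).
training_flags = effective_act_flags()
```

Under this reading, passing evaluation=True makes any explicit value of deterministic or independent irrelevant.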

close()

Closes the agent.

static create(agent='tensorforce', environment=None, **kwargs)

Creates an agent from a specification.

Parameters:

agent (specification | Agent object) – JSON file, specification key, configuration dictionary, library module, or Agent object (default: Policy agent).
environment (Environment object) – Environment on which the agent is to be trained; environment-related arguments such as state/action space specifications and maximum episode length are extracted if given (recommended).
kwargs – Additional arguments.

get_available_summaries()

Returns the summary labels provided by the agent.

Returns:	Available summary labels.
Return type:	list[str]

get_output_tensors(function)

Returns the names of output tensors for the given function.

Parameters:	function (str) – Function name (required).
Returns:	Names of output tensors.
Return type:	list[str]

get_query_tensors(function)

Returns the names of queryable tensors for the given function.

Parameters:	function (str) – Function name (required).
Returns:	Names of queryable tensors.
Return type:	list[str]

initialize()

Initializes the agent.

static load(directory, filename=None, environment=None, **kwargs)

Restores an agent from a specification directory/file.

Parameters:

directory (str) – Agent directory (required).
filename (str) – Agent filename (default: “agent”).
environment (Environment object) – Environment on which the agent is to be trained; environment-related arguments such as state/action space specifications and maximum episode length are extracted if given (recommended).
kwargs – Additional arguments.

observe(reward, terminal=False, parallel=0, query=None, **kwargs)

Observes the reward and whether a terminal state is reached. Each call needs to be preceded by act(...).

Parameters:

reward (float) – Reward (required).
terminal (bool | 0 | 1 | 2) – Whether a terminal state is reached, or 2 if the episode was aborted (default: false).
parallel (int) – Parallel execution index (default: 0).
query (list[str]) – Names of tensors to retrieve (default: none).
kwargs – Additional input values, for instance, for dynamic hyperparameters.

Returns:	Whether an update was performed, plus queried tensor values if requested.
Return type:	(bool, optional list[str])
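
The act/observe pairing can be illustrated with a minimal interaction loop. StubAgent and run_episode below are hypothetical stand-ins written for this sketch, not Tensorforce classes; the stub only mirrors the documented signatures, enforces that observe(...) is preceded by act(...), and never performs an update. The terminal values follow the description above: 0 for a non-terminal step, 1 for a terminal state, 2 for an aborted episode.

```python
class StubAgent:
    """Hypothetical stand-in mirroring the documented act/observe signatures."""

    def __init__(self):
        self._awaiting_observe = set()  # parallel indices with a pending act()
        self.timesteps = 0

    def act(self, states, parallel=0, deterministic=False, independent=False, evaluation=False):
        if evaluation:
            independent = True  # evaluation implies independent
        if not independent:
            if parallel in self._awaiting_observe:
                raise RuntimeError("act() called twice without observe()")
            self._awaiting_observe.add(parallel)
        return {"action": 0}  # dictionary containing the action(s)

    def observe(self, reward, terminal=False, parallel=0):
        if parallel not in self._awaiting_observe:
            raise RuntimeError("observe() must be preceded by act()")
        self._awaiting_observe.discard(parallel)
        self.timesteps += 1
        return False  # this stub never performs an update


def run_episode(agent, rewards, max_timesteps):
    """Feed a fixed reward sequence to the agent and return the final terminal value."""
    states = {"state": 0.0}  # placeholder state dictionary
    for t, reward in enumerate(rewards, start=1):
        agent.act(states=states)
        if t == len(rewards):
            terminal = 1  # regular terminal state
        elif t == max_timesteps:
            terminal = 2  # episode aborted at the timestep limit
        else:
            terminal = 0
        agent.observe(reward=reward, terminal=terminal)
        if terminal:
            return terminal
    return 0


agent = StubAgent()
final_terminal = run_episode(agent, rewards=[0.0, 0.0, 1.0], max_timesteps=10)
```

Any object exposing act and observe with these keyword arguments could be passed to run_episode in place of the stub.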

reset()

Resets the agent to start a new episode.

restore(directory=None, filename=None)

Restores the agent.

Parameters:

directory (str) – Agent directory (default: directory specified for TensorFlow saver).
filename (str) – Agent filename (default: latest checkpoint in directory).

save(directory=None, filename=None, append_timestep=True)

Saves the current state of the agent.

Parameters:

directory (str) – Agent directory (default: directory specified for TensorFlow saver).
filename (str) – Agent filename (default: filename specified for TensorFlow saver, or “agent”).
append_timestep (bool) – Whether to append the current timestep to the checkpoint file (default: true).

Returns:	Checkpoint path.
Return type:	str
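
The effect of append_timestep, and of restoring the latest checkpoint by default, can be sketched as follows. Both helpers are hypothetical and do not reproduce Tensorforce internals. The "-<timestep>" suffix is an assumption based on how the TensorFlow saver names checkpoints when given a global step; the actual file names written by save(...) may differ.

```python
import os
import re


def checkpoint_path(directory, filename="agent", timestep=0, append_timestep=True):
    """Hypothetical: build a checkpoint path, assuming a '-<timestep>' suffix."""
    name = f"{filename}-{timestep}" if append_timestep else filename
    return os.path.join(directory, name)


def latest_checkpoint(paths, filename="agent"):
    """Hypothetical: pick the path with the highest timestep suffix, or None."""
    pattern = re.compile(re.escape(filename) + r"-(\d+)$")
    best = None  # (timestep, path) of the newest checkpoint seen so far
    for path in paths:
        match = pattern.search(os.path.basename(path))
        if match and (best is None or int(match.group(1)) > best[0]):
            best = (int(match.group(1)), path)
    return None if best is None else best[1]


with_step = checkpoint_path("checkpoints", timestep=42)
without_step = checkpoint_path("checkpoints", timestep=42, append_timestep=False)
newest = latest_checkpoint(["ckpt/agent-5", "ckpt/agent-40", "ckpt/agent-12"])
```

With append_timestep disabled, repeated saves under the same filename would target the same path, which is why the timestep suffix is useful for keeping several checkpoints side by side.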