Constant Agent
class tensorforce.agents.ConstantAgent(states, actions, max_episode_timesteps=None, action_values=None, config=None, summarizer=None, recorder=None)
Agent returning constant action values (specification key: constant).
Parameters:
- states (specification) – States specification (required, but better specified implicitly via the environment argument of Agent.create(...)), arbitrarily nested dictionary of state descriptions (usually taken from Environment.states()) with the following attributes:
- type ("bool" | "int" | "float") – state data type (default: "float").
- shape (int | iter[int]) – state shape (required).
- num_values (int > 0) – number of discrete state values (required for type "int").
- min_value/max_value (float) – minimum/maximum state value (optional for type "float").
- actions (specification) – Actions specification (required, but better specified implicitly via the environment argument of Agent.create(...)), arbitrarily nested dictionary of action descriptions (usually taken from Environment.actions()) with the following attributes:
- type ("bool" | "int" | "float") – action data type (required).
- shape (int > 0 | iter[int > 0]) – action shape (default: scalar).
- num_values (int > 0) – number of discrete action values (required for type "int").
- min_value/max_value (float) – minimum/maximum action value (optional for type "float").
- max_episode_timesteps (int > 0) – Upper bound for the number of timesteps per episode (default: not given, better specified implicitly via the environment argument of Agent.create(...)).
- action_values (dict[value]) – Constant value per action (default: false for binary boolean actions, 0 for discrete integer actions, 0.0 for continuous actions).
- config (specification) – Additional configuration options:
- name (string) – Agent name, used e.g. for TensorFlow scopes (default: "agent").
- device (string) – Device name (default: TensorFlow default).
- seed (int) – Random seed to set for Python, NumPy (both set globally!) and TensorFlow, environment seed may have to be set separately for fully deterministic execution (default: none).
- buffer_observe (false | "episode" | int > 0) – Number of timesteps within an episode to buffer before calling the internal observe function, to reduce calls to TensorFlow for improved performance (default: configuration-specific maximum number which can be buffered without affecting performance).
- always_apply_exploration (bool) – Whether to always apply exploration, also for independent `act()
- summarizer (specification) – TensorBoard summarizer configuration with the following attributes (default: no summarizer):
- directory (path) – summarizer directory (required).
- frequency (int > 0) – how frequently in timesteps to record summaries (default: always).
- flush (int > 0) – how frequently in seconds to flush the summary writer (default: 10).
- max-summaries (int > 0) – maximum number of summaries to keep (default: 5).
- custom (dict[spec]) – custom summaries which are recorded via agent.summarize(...), specification with either type "scalar", type "histogram" with optional "buckets", type "image" with optional "max_outputs" (default: 3), or type "audio" (default: no custom summaries).
- labels ("all" | iter[string]) – all or list of summaries to record, from the following labels (default: only "graph"):
- "graph": graph summary
- "parameters": parameter scalars
- recorder (specification) – Experience traces recorder configuration, currently not including internal states, with the following attributes (default: no recorder):
- directory (path) – recorder directory (required).
- frequency (int > 0) – how frequently in episodes to record traces (default: every episode).
- start (int >= 0) – how many episodes to skip before starting to record traces (default: 0).
- max-traces (int > 0) – maximum number of traces to keep (default: all).
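
The defaults described for actions and action_values above can be illustrated with a small, library-free sketch. It is not Tensorforce's implementation: the helper names (constant_actions, normalize_shape, fill) and the example action names are hypothetical. It handles only a flat dictionary of actions rather than the arbitrarily nested form the agent accepts, and it assumes that a single constant is repeated across a non-scalar action shape.

```python
# Hypothetical helpers, not part of Tensorforce. They only mirror the
# defaults listed in the parameter description above.
DEFAULTS = {"bool": False, "int": 0, "float": 0.0}


def normalize_shape(shape=()):
    """Return the shape as a tuple; an int n becomes (n,), () means scalar."""
    if isinstance(shape, int):
        return (shape,)
    return tuple(shape)


def fill(shape, value):
    """Build a nested list of the given shape with every entry set to value."""
    if not shape:
        return value
    return [fill(shape[1:], value) for _ in range(shape[0])]


def constant_actions(actions, action_values=None):
    """Return one constant value per named action.

    actions: dict mapping action name -> spec with "type" (required),
             optional "shape" (default: scalar) and "num_values"
             (required for type "int").
    action_values: optional dict overriding the per-type default.
    """
    action_values = action_values or {}
    result = {}
    for name, spec in actions.items():
        if "type" not in spec:
            raise ValueError(f"action {name!r}: 'type' is required")
        if spec["type"] not in DEFAULTS:
            raise ValueError(f"action {name!r}: unknown type {spec['type']!r}")
        if spec["type"] == "int" and "num_values" not in spec:
            raise ValueError(f"action {name!r}: 'num_values' is required for int")
        # Fall back to the per-type default when no constant is given.
        value = action_values.get(name, DEFAULTS[spec["type"]])
        # Assumption: one constant is repeated over the whole action shape.
        result[name] = fill(normalize_shape(spec.get("shape", ())), value)
    return result


# Example action specification (names are illustrative only).
actions = {
    "jump": {"type": "bool"},
    "direction": {"type": "int", "num_values": 4},
    "throttle": {"type": "float", "shape": 2, "min_value": 0.0, "max_value": 1.0},
}

defaults = constant_actions(actions)
custom = constant_actions(actions, action_values={"direction": 3, "throttle": 0.5})
print(defaults)
print(custom)
```

With Tensorforce installed, the same actions and action_values dictionaries would presumably be passed to Agent.create(...) together with the specification key constant; whether a scalar constant is broadcast over a shaped action in exactly this way should be checked against the library.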