tensorforce.core.memories package

Submodules

tensorforce.core.memories.memory module

class tensorforce.core.memories.memory.Memory(states, internals, actions, include_next_states, scope='memory', summary_labels=None)

Bases: object

Base class for memories.

__init__(states, internals, actions, include_next_states, scope='memory', summary_labels=None)

Memory.

Parameters:
  • states – States specification.
  • internals – Internal states specification.
  • actions – Actions specification.
  • include_next_states – Include subsequent state if true.
static from_spec(spec, kwargs=None)

Creates a memory from a specification dict.
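
For example, a minimal sketch of this pattern (the 'replay' type string, the spec keys, and the state/action specifications below are illustrative assumptions, not verified against a particular version):

    from tensorforce.core.memories import Memory

    # Hypothetical state/action specifications, as the surrounding model might provide them.
    states = dict(state=dict(shape=(4,), type='float'))
    actions = dict(action=dict(type='int', num_actions=2))

    # The spec dict selects and configures the memory type; the remaining constructor
    # arguments are passed through kwargs.
    memory = Memory.from_spec(
        spec=dict(type='replay', capacity=10000),
        kwargs=dict(
            states=states,
            internals=list(),
            actions=actions,
            include_next_states=True
        )
    )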

get_summaries()

Returns the TensorFlow summaries reported by the memory.

Returns:List of summaries.
get_variables()

Returns the TensorFlow variables used by the memory.

Returns:List of variables.
tf_initialize()

Initializes memory.

tf_retrieve_episodes(n)

Retrieves a given number of episodes from the stored experiences.

Parameters:n – Number of episodes to retrieve.
Returns:Dicts containing the retrieved experiences.
tf_retrieve_sequences(n, sequence_length)

Retrieves a given number of temporally consistent timestep sequences from the stored experiences.

Parameters:
  • n – Number of sequences to retrieve.
  • sequence_length – Length of timestep sequences.
Returns:

Dicts containing the retrieved experiences.

tf_retrieve_timesteps(n)

Retrieves a given number of timesteps from the stored experiences.

Parameters:n – Number of timesteps to retrieve.
Returns:Dicts containing the retrieved experiences.
tf_store(states, internals, actions, terminal, reward)

Stores experiences, i.e. a batch of timesteps.

Parameters:
  • states – Dict of state tensors.
  • internals – List of prior internal state tensors.
  • actions – Dict of action tensors.
  • terminal – Terminal boolean tensor.
  • reward – Reward tensor.
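
As an illustration of the expected data layout only (inside the graph these are TensorFlow tensors, and the state/action names below are placeholders), a batch of two timesteps might look like this:

    import numpy as np

    batch_size = 2
    # One entry per named state; each tensor has a leading batch dimension.
    states = dict(state=np.zeros((batch_size, 4), dtype=np.float32))
    # Prior internal states, e.g. empty for a feed-forward network.
    internals = list()
    # One entry per named action.
    actions = dict(action=np.zeros((batch_size,), dtype=np.int64))
    # One terminal flag and one scalar reward per timestep.
    terminal = np.asarray([False, True])
    reward = np.asarray([0.0, 1.0], dtype=np.float32)
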
tf_update_batch(loss_per_instance)

Updates the internal information of the latest batch instances based on their loss.

Parameters:loss_per_instance – Loss per instance tensor.

tensorforce.core.memories.naive_prioritized_replay module

tensorforce.core.memories.prioritized_replay module

class tensorforce.core.memories.prioritized_replay.PrioritizedReplay(states, internals, actions, include_next_states, capacity, prioritization_weight=1.0, buffer_size=100, scope='queue', summary_labels=None)

Bases: tensorforce.core.memories.memory.Memory

Memory organized as a priority queue, which randomly retrieves experiences sampled according to their priority values.

__init__(states, internals, actions, include_next_states, capacity, prioritization_weight=1.0, buffer_size=100, scope='queue', summary_labels=None)

Prioritized experience replay.

Parameters:
  • states – States specification.
  • internals – Internal states specification.
  • actions – Actions specification.
  • include_next_states – Include subsequent state if true.
  • capacity – Memory capacity.
  • prioritization_weight – Prioritization weight.
  • buffer_size – Buffer size. The buffer holds newly inserted experiences whose priorities have not yet been computed via an update.
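
As a sketch, these arguments might be assembled into a specification dict for from_spec as follows (the 'prioritized_replay' type string is an assumption; the remaining keys mirror the constructor parameters above, while states, internals, actions and include_next_states would typically be supplied separately, e.g. via kwargs):

    memory_spec = dict(
        type='prioritized_replay',  # assumed registered name for this class
        capacity=10000,
        prioritization_weight=0.5,
        buffer_size=100
    )
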
from_spec(spec, kwargs=None)

Creates a memory from a specification dict.

get_summaries()

Returns the TensorFlow summaries reported by the memory.

Returns:List of summaries.
get_variables()

Returns the TensorFlow variables used by the memory.

Returns:List of variables.
tf_initialize()
tf_retrieve_episodes(n)
tf_retrieve_indices(buffer_elements, priority_indices)

Fetches experiences for the given indices by combining entries from the buffer, which do not yet have priorities, with entries from the priority memory.

Parameters:
  • buffer_elements – Number of buffer elements to retrieve.
  • priority_indices – Index tensor for the priority memory.

Returns: Batch of experiences.

tf_retrieve_sequences(n, sequence_length)
tf_retrieve_timesteps(n)
tf_store(states, internals, actions, terminal, reward)
tf_update_batch(loss_per_instance)

Updates priority memory by performing the following steps:

  1. Use saved indices from prior retrieval to reconstruct the batch elements which will have their priorities updated.
  2. Compute priorities for these elements.
  3. Insert buffer elements into memory, potentially overwriting existing elements.
  4. Update priorities of existing memory elements.
  5. Resort memory.
  6. Update buffer insertion index.

Note that this implementation could be made more efficient by maintaining a sorted version via sum trees.

Parameters:loss_per_instance – Losses from the most recent batch, used to perform the priority update.
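
The note above mentions sum trees as a more efficient alternative to resorting the memory. For reference, a minimal self-contained sum-tree sketch (plain Python, not TensorForce code) that supports priority updates and sampling proportional to priority in O(log n):

    import random


    class SumTree(object):
        """Array-backed binary tree: leaves hold priorities, inner nodes hold subtree sums."""

        def __init__(self, capacity):
            self.capacity = capacity
            # Node 1 is the root; leaves occupy positions [capacity, 2 * capacity).
            self.nodes = [0.0] * (2 * capacity)

        def update(self, index, priority):
            # Set the priority of leaf `index` and propagate the change up to the root.
            node = index + self.capacity
            self.nodes[node] = priority
            node //= 2
            while node >= 1:
                self.nodes[node] = self.nodes[2 * node] + self.nodes[2 * node + 1]
                node //= 2

        def sample(self, value):
            # Descend from the root towards the leaf whose cumulative priority range contains `value`.
            node = 1
            while node < self.capacity:
                left = 2 * node
                if value <= self.nodes[left]:
                    node = left
                else:
                    value -= self.nodes[left]
                    node = left + 1
            return node - self.capacity

    tree = SumTree(capacity=8)
    for index, priority in enumerate([0.1, 0.5, 0.2, 1.0]):
        tree.update(index, priority)
    sampled_index = tree.sample(random.uniform(0.0, tree.nodes[1]))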

tensorforce.core.memories.replay module

class tensorforce.core.memories.replay.Replay(states, internals, actions, include_next_states, capacity, scope='replay', summary_labels=None)

Bases: tensorforce.core.memories.queue.Queue

Memory which randomly retrieves experiences.

__init__(states, internals, actions, include_next_states, capacity, scope='replay', summary_labels=None)

Replay memory.

Parameters:
  • states – States specification.
  • internals – Internal states specification.
  • actions – Actions specification.
  • include_next_states – Include subsequent state if true.
  • capacity – Memory capacity.
from_spec(spec, kwargs=None)

Creates a memory from a specification dict.

get_summaries()

Returns the TensorFlow summaries reported by the memory.

Returns:List of summaries.
get_variables()

Returns the TensorFlow variables used by the memory.

Returns:List of variables.
tf_initialize()
tf_retrieve_episodes(n)
tf_retrieve_indices(indices)

Fetches experiences for the given indices.

Parameters:indices – Index tensor.

Returns: Batch of experiences.

tf_retrieve_sequences(n, sequence_length)
tf_retrieve_timesteps(n)
tf_store(states, internals, actions, terminal, reward)
tf_update_batch(loss_per_instance)

Updates the internal information of the latest batch instances based on their loss.

Parameters:loss_per_instance – Loss per instance tensor.

Module contents

class tensorforce.core.memories.Memory(states, internals, actions, include_next_states, scope='memory', summary_labels=None)

Bases: object

Base class for memories.

__init__(states, internals, actions, include_next_states, scope='memory', summary_labels=None)

Memory.

Parameters:
  • states – States specification.
  • internals – Internal states specification.
  • actions – Actions specification.
  • include_next_states – Include subsequent state if true.
static from_spec(spec, kwargs=None)

Creates a memory from a specification dict.

get_summaries()

Returns the TensorFlow summaries reported by the memory.

Returns:List of summaries.
get_variables()

Returns the TensorFlow variables used by the memory.

Returns:List of variables.
tf_initialize()

Initializes memory.

tf_retrieve_episodes(n)

Retrieves a given number of episodes from the stored experiences.

Parameters:n – Number of episodes to retrieve.
Returns:Dicts containing the retrieved experiences.
tf_retrieve_sequences(n, sequence_length)

Retrieves a given number of temporally consistent timestep sequences from the stored experiences.

Parameters:
  • n – Number of sequences to retrieve.
  • sequence_length – Length of timestep sequences.
Returns:

Dicts containing the retrieved experiences.

tf_retrieve_timesteps(n)

Retrieves a given number of timesteps from the stored experiences.

Parameters:n – Number of timesteps to retrieve.
Returns:Dicts containing the retrieved experiences.
tf_store(states, internals, actions, terminal, reward)

Stores experiences, i.e. a batch of timesteps.

Parameters:
  • states – Dict of state tensors.
  • internals – List of prior internal state tensors.
  • actions – Dict of action tensors.
  • terminal – Terminal boolean tensor.
  • reward – Reward tensor.
tf_update_batch(loss_per_instance)

Updates the internal information of the latest batch instances based on their loss.

Parameters:loss_per_instance – Loss per instance tensor.
class tensorforce.core.memories.Queue(states, internals, actions, include_next_states, capacity, scope='queue', summary_labels=None)

Bases: tensorforce.core.memories.memory.Memory

Base class for memories organized as a queue (FIFO).

__init__(states, internals, actions, include_next_states, capacity, scope='queue', summary_labels=None)

Queue memory.

Parameters:
  • states – States specification.
  • internals – Internal states specification.
  • actions – Actions specification.
  • include_next_states – Include subsequent state if true.
  • capacity – Memory capacity.
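
Conceptually, a fixed-capacity FIFO of this kind behaves like a circular buffer: once full, new experiences overwrite the oldest ones. A minimal illustration in plain Python (not TensorForce's implementation):

    class CircularBuffer(object):
        """Fixed-capacity FIFO: once full, new entries overwrite the oldest ones."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.data = [None] * capacity
            self.index = 0  # next insertion position
            self.size = 0   # number of stored entries

        def store(self, entry):
            self.data[self.index] = entry
            self.index = (self.index + 1) % self.capacity
            self.size = min(self.size + 1, self.capacity)

    buffer = CircularBuffer(capacity=4)
    for step in range(6):
        buffer.store(step)
    # buffer.data is now [4, 5, 2, 3]: entries 0 and 1 have been overwritten.
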
from_spec(spec, kwargs=None)

Creates a memory from a specification dict.

get_summaries()

Returns the TensorFlow summaries reported by the memory.

Returns:List of summaries.
get_variables()

Returns the TensorFlow variables used by the memory.

Returns:List of variables.
tf_initialize()
tf_retrieve_episodes(n)

Retrieves a given number of episodes from the stored experiences.

Parameters:n – Number of episodes to retrieve.
Returns:Dicts containing the retrieved experiences.
tf_retrieve_indices(indices)

Fetches experiences for the given indices.

Parameters:indices – Index tensor.

Returns: Batch of experiences.

tf_retrieve_sequences(n, sequence_length)

Retrieves a given number of temporally consistent timestep sequences from the stored experiences.

Parameters:
  • n – Number of sequences to retrieve.
  • sequence_length – Length of timestep sequences.
Returns:

Dicts containing the retrieved experiences.

tf_retrieve_timesteps(n)

Retrieves a given number of timesteps from the stored experiences.

Parameters:n – Number of timesteps to retrieve.
Returns:Dicts containing the retrieved experiences.
tf_store(states, internals, actions, terminal, reward)
tf_update_batch(loss_per_instance)

Updates the internal information of the latest batch instances based on their loss.

Parameters:loss_per_instance – Loss per instance tensor.
class tensorforce.core.memories.Latest(states, internals, actions, include_next_states, capacity, scope='latest', summary_labels=None)

Bases: tensorforce.core.memories.queue.Queue

Memory which always retrieves most recent experiences.

__init__(states, internals, actions, include_next_states, capacity, scope='latest', summary_labels=None)

Latest memory.

Parameters:
  • states – States specification.
  • internals – Internal states specification.
  • actions – Actions specification.
  • include_next_states – Include subsequent state if true.
  • capacity – Memory capacity.
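
For intuition, retrieving the n most recent timesteps from a full fixed-capacity buffer amounts to reading backwards from the insertion position; a small illustration in plain Python (not TensorForce code):

    capacity = 8
    buffer = list(range(capacity))  # a full buffer, identified here by position
    index = 3                       # next insertion position, so positions 0-2 hold the newest entries
    n = 4

    # The n most recently written entries, oldest first, wrapping around the insertion point.
    latest = [buffer[(index - n + k) % capacity] for k in range(n)]
    # -> entries at positions 7, 0, 1, 2
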
from_spec(spec, kwargs=None)

Creates a memory from a specification dict.

get_summaries()

Returns the TensorFlow summaries reported by the memory.

Returns:List of summaries.
get_variables()

Returns the TensorFlow variables used by the memory.

Returns:List of variables.
tf_initialize()
tf_retrieve_episodes(n)
tf_retrieve_indices(indices)

Fetches experiences for the given indices.

Parameters:indices – Index tensor.

Returns: Batch of experiences.

tf_retrieve_sequences(n, sequence_length)
tf_retrieve_timesteps(n)
tf_store(states, internals, actions, terminal, reward)
tf_update_batch(loss_per_instance)

Updates the internal information of the latest batch instances based on their loss.

Parameters:loss_per_instance – Loss per instance tensor.
class tensorforce.core.memories.Replay(states, internals, actions, include_next_states, capacity, scope='replay', summary_labels=None)

Bases: tensorforce.core.memories.queue.Queue

Memory which randomly retrieves experiences.

__init__(states, internals, actions, include_next_states, capacity, scope='replay', summary_labels=None)

Replay memory.

Parameters:
  • states – States specification.
  • internals – Internal states specification.
  • actions – Actions specification.
  • include_next_states – Include subsequent state if true.
  • capacity – Memory capacity.
from_spec(spec, kwargs=None)

Creates a memory from a specification dict.

get_summaries()

Returns the TensorFlow summaries reported by the memory.

Returns:List of summaries.
get_variables()

Returns the TensorFlow variables used by the memory.

Returns:List of variables.
tf_initialize()
tf_retrieve_episodes(n)
tf_retrieve_indices(indices)

Fetches experiences for the given indices.

Parameters:indices – Index tensor.

Returns: Batch of experiences.

tf_retrieve_sequences(n, sequence_length)
tf_retrieve_timesteps(n)
tf_store(states, internals, actions, terminal, reward)
tf_update_batch(loss_per_instance)

Updates the internal information of the latest batch instances based on their loss.

Parameters:loss_per_instance – Loss per instance tensor.
class tensorforce.core.memories.PrioritizedReplay(states, internals, actions, include_next_states, capacity, prioritization_weight=1.0, buffer_size=100, scope='queue', summary_labels=None)

Bases: tensorforce.core.memories.memory.Memory

Memory organized as a priority queue, which randomly retrieves experiences sampled according to their priority values.

__init__(states, internals, actions, include_next_states, capacity, prioritization_weight=1.0, buffer_size=100, scope='queue', summary_labels=None)

Prioritized experience replay.

Parameters:
  • states – States specification.
  • internals – Internal states specification.
  • actions – Actions specification.
  • include_next_states – Include subsequent state if true.
  • capacity – Memory capacity.
  • prioritization_weight – Prioritization weight.
  • buffer_size – Buffer size. The buffer holds newly inserted experiences whose priorities have not yet been computed via an update.
from_spec(spec, kwargs=None)

Creates a memory from a specification dict.

get_summaries()

Returns the TensorFlow summaries reported by the memory.

Returns:List of summaries.
get_variables()

Returns the TensorFlow variables used by the memory.

Returns:List of variables.
tf_initialize()
tf_retrieve_episodes(n)
tf_retrieve_indices(buffer_elements, priority_indices)

Fetches experiences for the given indices by combining entries from the buffer, which do not yet have priorities, with entries from the priority memory.

Parameters:
  • buffer_elements – Number of buffer elements to retrieve.
  • priority_indices – Index tensor for the priority memory.

Returns: Batch of experiences.

tf_retrieve_sequences(n, sequence_length)
tf_retrieve_timesteps(n)
tf_store(states, internals, actions, terminal, reward)
tf_update_batch(loss_per_instance)

Updates priority memory by performing the following steps:

  1. Use saved indices from prior retrieval to reconstruct the batch elements which will have their priorities updated.
  2. Compute priorities for these elements.
  3. Insert buffer elements into memory, potentially overwriting existing elements.
  4. Update priorities of existing memory elements.
  5. Resort memory.
  6. Update buffer insertion index.

Note that this implementation could be made more efficient by maintaining a sorted version via sum trees.

Parameters:loss_per_instance – Losses from the most recent batch, used to perform the priority update.