tensorforce.core.memories package¶
Submodules¶
tensorforce.core.memories.memory module¶
-
class
tensorforce.core.memories.memory.
Memory
(states, internals, actions, include_next_states, scope='memory', summary_labels=None)¶ Bases:
object
Base class for memories.
-
__init__
(states, internals, actions, include_next_states, scope='memory', summary_labels=None)¶ Memory.
Parameters: - states – States specification.
- internals – Internal states specification.
- actions – Actions specification.
- include_next_states – Include subsequent state if true.
-
static
from_spec
(spec, kwargs=None)¶ Creates a memory from a specification dict.
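As a rough illustration of building a memory from a specification dict, the sketch below dispatches on a `type` key and passes the remaining entries to the constructor. It does not import Tensorforce: `ToyMemory`, `ToyReplay`, `ToyLatest`, `REGISTRY`, `from_spec_sketch` and the `type` key itself are hypothetical names chosen for this example, and the way `kwargs` is merged is an assumption.

```python
class ToyMemory:
    """Hypothetical stand-in for a memory class; not the Tensorforce implementation."""

    def __init__(self, capacity, include_next_states=False):
        self.capacity = capacity
        self.include_next_states = include_next_states


class ToyReplay(ToyMemory):
    pass


class ToyLatest(ToyMemory):
    pass


# Hypothetical registry mapping a 'type' string to a class.
REGISTRY = {"replay": ToyReplay, "latest": ToyLatest}


def from_spec_sketch(spec, kwargs=None):
    """Build an object from a spec dict; extra kwargs are merged into the spec."""
    params = dict(spec)          # copy so the caller's dict is not mutated
    params.update(kwargs or {})  # assumption: kwargs override spec entries
    cls = REGISTRY[params.pop("type")]
    return cls(**params)


memory = from_spec_sketch({"type": "replay", "capacity": 1000},
                          kwargs={"include_next_states": True})
print(type(memory).__name__, memory.capacity, memory.include_next_states)
```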
-
get_summaries
()¶ Returns the TensorFlow summaries reported by the memory.
Returns: List of summaries.
-
get_variables
()¶ Returns the TensorFlow variables used by the memory.
Returns: List of variables.
-
tf_initialize
()¶ Initializes memory.
-
tf_retrieve_episodes
(n)¶ Retrieves a given number of episodes from the stored experiences.
Parameters: n – Number of episodes to retrieve.
Returns: Dicts containing the retrieved experiences.
-
tf_retrieve_sequences
(n, sequence_length)¶ Retrieves a given number of temporally consistent timestep sequences from the stored experiences.
Parameters: - n – Number of sequences to retrieve.
- sequence_length – Length of timestep sequences.
Returns: Dicts containing the retrieved experiences.
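One plausible reading of "temporally consistent" is that each retrieved sequence consists of consecutive timesteps from a single episode. The sketch below illustrates that reading on a plain list of terminal flags. The function name `retrieve_sequences_sketch` and the rule that a terminal flag may only appear on the last step of a sequence are assumptions for this example, not taken from the library.

```python
import random


def retrieve_sequences_sketch(terminal, n, sequence_length, rng=random):
    """Return n lists of consecutive indices that do not cross an episode end.

    `terminal` is a list of booleans, one per stored timestep.
    """
    valid_starts = []
    for start in range(len(terminal) - sequence_length + 1):
        window = terminal[start:start + sequence_length]
        # Assumption: a terminal flag is allowed only on the last step of the window.
        if not any(window[:-1]):
            valid_starts.append(start)
    if not valid_starts:
        raise ValueError("no valid sequence of the requested length is stored")
    starts = [rng.choice(valid_starts) for _ in range(n)]
    return [list(range(s, s + sequence_length)) for s in starts]


terminal = [False, False, True, False, False, False, True]
sequences = retrieve_sequences_sketch(terminal, n=3, sequence_length=3)
print(sequences)
```

With these flags, only the start indices 0, 3 and 4 yield sequences that stay inside one episode.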
-
tf_retrieve_timesteps
(n)¶ Retrieves a given number of timesteps from the stored experiences.
Parameters: n – Number of timesteps to retrieve.
Returns: Dicts containing the retrieved experiences.
-
tf_store
(states, internals, actions, terminal, reward)¶ Stores experiences, i.e. a batch of timesteps.
Parameters: - states – Dict of state tensors.
- internals – List of prior internal state tensors.
- actions – Dict of action tensors.
- terminal – Terminal boolean tensor.
- reward – Reward tensor.
-
tf_update_batch
(loss_per_instance)¶ Updates the internal information of the latest batch instances based on their loss.
Parameters: loss_per_instance – Loss per instance tensor.
-
tensorforce.core.memories.naive_prioritized_replay module¶
tensorforce.core.memories.prioritized_replay module¶
-
class
tensorforce.core.memories.prioritized_replay.
PrioritizedReplay
(states, internals, actions, include_next_states, capacity, prioritization_weight=1.0, buffer_size=100, scope='queue', summary_labels=None)¶ Bases:
tensorforce.core.memories.memory.Memory
Memory organized as a priority queue, which randomly retrieves experiences sampled according to their priority values.
-
__init__
(states, internals, actions, include_next_states, capacity, prioritization_weight=1.0, buffer_size=100, scope='queue', summary_labels=None)¶ Prioritized experience replay.
Parameters: - states – States specification.
- internals – Internal states specification.
- actions – Actions specification.
- include_next_states – Include subsequent state if true.
- capacity – Memory capacity.
- prioritization_weight – Prioritization weight.
- buffer_size – Buffer size. The buffer holds newly inserted experiences until their priorities have been computed via updates.
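To make "sampled according to their priority values" concrete, the sketch below draws indices with probability proportional to the priority raised to a weight. Treating `prioritization_weight` as that exponent follows the common prioritized experience replay scheme and is an assumption here; the source does not state the exact formula. The function name `sample_by_priority` is hypothetical.

```python
import random


def sample_by_priority(priorities, n, prioritization_weight=1.0, rng=random):
    """Sample n indices with probability proportional to priority ** weight."""
    weights = [p ** prioritization_weight for p in priorities]
    return rng.choices(range(len(priorities)), weights=weights, k=n)


priorities = [0.1, 0.0, 2.0, 0.5]
indices = sample_by_priority(priorities, n=1000, prioritization_weight=1.0,
                             rng=random.Random(0))
counts = [indices.count(i) for i in range(len(priorities))]
print(counts)
```

An element with priority zero is never drawn in this sketch, which is one reason new experiences without a computed priority need separate handling such as a buffer.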
-
from_spec
(spec, kwargs=None)¶ Creates a memory from a specification dict.
-
get_summaries
()¶ Returns the TensorFlow summaries reported by the memory.
Returns: List of summaries.
-
get_variables
()¶ Returns the TensorFlow variables used by the memory.
Returns: List of variables.
-
tf_initialize
()¶
-
tf_retrieve_episodes
(n)¶
-
tf_retrieve_indices
(buffer_elements, priority_indices)¶ Fetches experiences for given indices by combining entries from the buffer, which have no priorities yet, with entries from the priority memory.
Parameters: - buffer_elements – Number of buffer elements to retrieve.
- priority_indices – Index tensor for the priority memory.
Returns: Batch of experiences.
-
tf_retrieve_sequences
(n, sequence_length)¶
-
tf_retrieve_timesteps
(n)¶
-
tf_store
(states, internals, actions, terminal, reward)¶
-
tf_update_batch
(loss_per_instance)¶ Updates priority memory by performing the following steps:
- Use saved indices from prior retrieval to reconstruct the batch elements which will have their priorities updated.
- Compute priorities for these elements.
- Insert buffer elements to memory, potentially overwriting existing elements.
- Update priorities of existing memory elements.
- Re-sort memory.
- Update buffer insertion index.
Note that this implementation could be made more efficient by maintaining a sorted version via sum trees.
Parameters: loss_per_instance – Losses from the most recent batch, used to perform the priority update.
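The update steps listed above can be sketched on plain Python lists. This is a toy model, not the TensorFlow implementation: the function name `update_batch_sketch`, the `("buffer", i)` / `("memory", i)` index tags, the use of the absolute loss as priority, and dropping the lowest-priority entries when capacity is exceeded are all assumptions made for illustration. Existing priorities are updated before buffer elements are inserted so that the saved memory indices remain valid.

```python
def update_batch_sketch(memory, buffer, batch, losses, capacity):
    """Toy priority update.

    memory: list of (priority, experience) pairs, sorted by descending priority.
    buffer: list of experiences that have no priority yet.
    batch:  saved retrieval indices, each ("buffer", i) or ("memory", i).
    losses: one loss value per batch element.
    """
    # Reconstruct the batch and compute one priority per element
    # (assumption: priority = absolute loss).
    priorities = [abs(loss) for loss in losses]

    memory = list(memory)  # work on a copy
    moved = []
    for (source, index), priority in zip(batch, priorities):
        if source == "memory":
            # Update the priority of an existing memory element.
            memory[index] = (priority, memory[index][1])
        else:
            # Buffer element receives its first priority.
            moved.append((priority, buffer[index]))

    # Insert buffer elements into memory.
    memory.extend(moved)

    # Re-sort by descending priority; entries beyond capacity are dropped
    # (assumption: this stands in for overwriting existing elements).
    memory.sort(key=lambda item: item[0], reverse=True)
    memory = memory[:capacity]

    # Update the buffer and its insertion index.
    moved_ids = {index for source, index in batch if source == "buffer"}
    remaining = [e for i, e in enumerate(buffer) if i not in moved_ids]
    return memory, remaining, len(remaining)


memory, buffer, insert_index = update_batch_sketch(
    memory=[(0.9, "a"), (0.4, "b")],
    buffer=["c", "d"],
    batch=[("memory", 1), ("buffer", 0)],
    losses=[1.5, 0.2],
    capacity=3,
)
print(memory, buffer, insert_index)
```

Sorting the whole list on every update costs O(n log n), which is why a sum tree is the usual choice when the memory is large.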
-
tensorforce.core.memories.replay module¶
-
class
tensorforce.core.memories.replay.
Replay
(states, internals, actions, include_next_states, capacity, scope='replay', summary_labels=None)¶ Bases:
tensorforce.core.memories.queue.Queue
Memory which randomly retrieves experiences.
-
__init__
(states, internals, actions, include_next_states, capacity, scope='replay', summary_labels=None)¶ Replay memory.
Parameters: - states – States specification.
- internals – Internal states specification.
- actions – Actions specification.
- include_next_states – Include subsequent state if true.
- capacity – Memory capacity.
-
from_spec
(spec, kwargs=None)¶ Creates a memory from a specification dict.
-
get_summaries
()¶ Returns the TensorFlow summaries reported by the memory.
Returns: List of summaries.
-
get_variables
()¶ Returns the TensorFlow variables used by the memory.
Returns: List of variables.
-
tf_initialize
()¶
-
tf_retrieve_episodes
(n)¶
-
tf_retrieve_indices
(indices)¶ Fetches experiences for given indices.
Parameters: indices – Index tensor.
Returns: Batch of experiences.
-
tf_retrieve_sequences
(n, sequence_length)¶
-
tf_retrieve_timesteps
(n)¶
-
tf_store
(states, internals, actions, terminal, reward)¶
-
tf_update_batch
(loss_per_instance)¶ Updates the internal information of the latest batch instances based on their loss.
Parameters: loss_per_instance – Loss per instance tensor.
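Random retrieval from a replay memory can be pictured as drawing indices uniformly from the stored timesteps. The sketch below uses a plain list in place of TensorFlow variables. The name `retrieve_timesteps_sketch` and the choice to sample with replacement are assumptions for this example.

```python
import random


def retrieve_timesteps_sketch(experiences, n, rng=random):
    """Uniformly sample n stored timesteps (assumption: with replacement)."""
    indices = [rng.randrange(len(experiences)) for _ in range(n)]
    return [experiences[i] for i in indices]


stored = ["t0", "t1", "t2", "t3", "t4"]
batch = retrieve_timesteps_sketch(stored, n=8)
print(batch)
```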
-
Module contents¶
-
class
tensorforce.core.memories.
Memory
(states, internals, actions, include_next_states, scope='memory', summary_labels=None)¶ Bases:
object
Base class for memories.
-
__init__
(states, internals, actions, include_next_states, scope='memory', summary_labels=None)¶ Memory.
Parameters: - states – States specification.
- internals – Internal states specification.
- actions – Actions specification.
- include_next_states – Include subsequent state if true.
-
static
from_spec
(spec, kwargs=None)¶ Creates a memory from a specification dict.
-
get_summaries
()¶ Returns the TensorFlow summaries reported by the memory.
Returns: List of summaries.
-
get_variables
()¶ Returns the TensorFlow variables used by the memory.
Returns: List of variables.
-
tf_initialize
()¶ Initializes memory.
-
tf_retrieve_episodes
(n)¶ Retrieves a given number of episodes from the stored experiences.
Parameters: n – Number of episodes to retrieve.
Returns: Dicts containing the retrieved experiences.
-
tf_retrieve_sequences
(n, sequence_length)¶ Retrieves a given number of temporally consistent timestep sequences from the stored experiences.
Parameters: - n – Number of sequences to retrieve.
- sequence_length – Length of timestep sequences.
Returns: Dicts containing the retrieved experiences.
-
tf_retrieve_timesteps
(n)¶ Retrieves a given number of timesteps from the stored experiences.
Parameters: n – Number of timesteps to retrieve.
Returns: Dicts containing the retrieved experiences.
-
tf_store
(states, internals, actions, terminal, reward)¶ Stores experiences, i.e. a batch of timesteps.
Parameters: - states – Dict of state tensors.
- internals – List of prior internal state tensors.
- actions – Dict of action tensors.
- terminal – Terminal boolean tensor.
- reward – Reward tensor.
-
tf_update_batch
(loss_per_instance)¶ Updates the internal information of the latest batch instances based on their loss.
Parameters: loss_per_instance – Loss per instance tensor.
-
-
class
tensorforce.core.memories.
Queue
(states, internals, actions, include_next_states, capacity, scope='queue', summary_labels=None)¶ Bases:
tensorforce.core.memories.memory.Memory
Base class for memories organized as a queue (FIFO).
-
__init__
(states, internals, actions, include_next_states, capacity, scope='queue', summary_labels=None)¶ Queue memory.
Parameters: - states – States specification.
- internals – Internal states specification.
- actions – Actions specification.
- include_next_states – Include subsequent state if true.
- capacity – Memory capacity.
-
from_spec
(spec, kwargs=None)¶ Creates a memory from a specification dict.
-
get_summaries
()¶ Returns the TensorFlow summaries reported by the memory.
Returns: List of summaries.
-
get_variables
()¶ Returns the TensorFlow variables used by the memory.
Returns: List of variables.
-
tf_initialize
()¶
-
tf_retrieve_episodes
(n)¶ Retrieves a given number of episodes from the stored experiences.
Parameters: n – Number of episodes to retrieve.
Returns: Dicts containing the retrieved experiences.
-
tf_retrieve_indices
(indices)¶ Fetches experiences for given indices.
Parameters: indices – Index tensor.
Returns: Batch of experiences.
-
tf_retrieve_sequences
(n, sequence_length)¶ Retrieves a given number of temporally consistent timestep sequences from the stored experiences.
Parameters: - n – Number of sequences to retrieve.
- sequence_length – Length of timestep sequences.
Returns: Dicts containing the retrieved experiences.
-
tf_retrieve_timesteps
(n)¶ Retrieves a given number of timesteps from the stored experiences.
Parameters: n – Number of timesteps to retrieve.
Returns: Dicts containing the retrieved experiences.
-
tf_store
(states, internals, actions, terminal, reward)¶
-
tf_update_batch
(loss_per_instance)¶ Updates the internal information of the latest batch instances based on their loss.
Parameters: loss_per_instance – Loss per instance tensor.
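A FIFO memory with fixed capacity is commonly realised as a circular buffer: a write index advances with each stored experience and wraps around, so the oldest entry is overwritten first. The class below is a minimal pure-Python sketch of that idea. `FifoQueueSketch` and its methods are hypothetical names, and nothing here is claimed to match the internal layout of the actual class.

```python
class FifoQueueSketch:
    """Fixed-capacity FIFO storage backed by a circular buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.index = 0  # next slot to write
        self.size = 0   # number of filled slots

    def store(self, experience):
        # Overwrites the oldest entry once the buffer is full.
        self.slots[self.index] = experience
        self.index = (self.index + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def contents(self):
        """Stored experiences, oldest first."""
        if self.size < self.capacity:
            return self.slots[:self.size]
        return self.slots[self.index:] + self.slots[:self.index]


queue = FifoQueueSketch(capacity=3)
for step in range(5):
    queue.store(step)
print(queue.contents())
```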
-
-
class
tensorforce.core.memories.
Latest
(states, internals, actions, include_next_states, capacity, scope='latest', summary_labels=None)¶ Bases:
tensorforce.core.memories.queue.Queue
Memory which always retrieves most recent experiences.
-
__init__
(states, internals, actions, include_next_states, capacity, scope='latest', summary_labels=None)¶ Latest memory.
Parameters: - states – States specification.
- internals – Internal states specification.
- actions – Actions specification.
- include_next_states – Include subsequent state if true.
- capacity – Memory capacity.
-
from_spec
(spec, kwargs=None)¶ Creates a memory from a specification dict.
-
get_summaries
()¶ Returns the TensorFlow summaries reported by the memory.
Returns: List of summaries.
-
get_variables
()¶ Returns the TensorFlow variables used by the memory.
Returns: List of variables.
-
tf_initialize
()¶
-
tf_retrieve_episodes
(n)¶
-
tf_retrieve_indices
(indices)¶ Fetches experiences for given indices.
Parameters: indices – Index tensor.
Returns: Batch of experiences.
-
tf_retrieve_sequences
(n, sequence_length)¶
-
tf_retrieve_timesteps
(n)¶
-
tf_store
(states, internals, actions, terminal, reward)¶
-
tf_update_batch
(loss_per_instance)¶ Updates the internal information of the latest batch instances based on their loss.
Parameters: loss_per_instance – Loss per instance tensor.
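Retrieving the most recent experiences amounts to taking the tail of the stored sequence. The sketch below shows this on a plain list kept in insertion order. `retrieve_latest_sketch` is a hypothetical helper, and returning the items oldest-first is an assumption.

```python
def retrieve_latest_sketch(experiences, n):
    """Return the n most recently stored experiences, oldest of them first."""
    if n <= 0:
        return []
    # Slicing with a negative start returns everything if n exceeds the length.
    return experiences[-n:]


stored = ["t0", "t1", "t2", "t3", "t4"]
print(retrieve_latest_sketch(stored, 2))
```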
-
-
class
tensorforce.core.memories.
Replay
(states, internals, actions, include_next_states, capacity, scope='replay', summary_labels=None)¶ Bases:
tensorforce.core.memories.queue.Queue
Memory which randomly retrieves experiences.
-
__init__
(states, internals, actions, include_next_states, capacity, scope='replay', summary_labels=None)¶ Replay memory.
Parameters: - states – States specification.
- internals – Internal states specification.
- actions – Actions specification.
- include_next_states – Include subsequent state if true.
- capacity – Memory capacity.
-
from_spec
(spec, kwargs=None)¶ Creates a memory from a specification dict.
-
get_summaries
()¶ Returns the TensorFlow summaries reported by the memory.
Returns: List of summaries.
-
get_variables
()¶ Returns the TensorFlow variables used by the memory.
Returns: List of variables.
-
tf_initialize
()¶
-
tf_retrieve_episodes
(n)¶
-
tf_retrieve_indices
(indices)¶ Fetches experiences for given indices.
Parameters: indices – Index tensor.
Returns: Batch of experiences.
-
tf_retrieve_sequences
(n, sequence_length)¶
-
tf_retrieve_timesteps
(n)¶
-
tf_store
(states, internals, actions, terminal, reward)¶
-
tf_update_batch
(loss_per_instance)¶ Updates the internal information of the latest batch instances based on their loss.
Parameters: loss_per_instance – Loss per instance tensor.
-
-
class
tensorforce.core.memories.
PrioritizedReplay
(states, internals, actions, include_next_states, capacity, prioritization_weight=1.0, buffer_size=100, scope='queue', summary_labels=None)¶ Bases:
tensorforce.core.memories.memory.Memory
Memory organized as a priority queue, which randomly retrieves experiences sampled according to their priority values.
-
__init__
(states, internals, actions, include_next_states, capacity, prioritization_weight=1.0, buffer_size=100, scope='queue', summary_labels=None)¶ Prioritized experience replay.
Parameters: - states – States specification.
- internals – Internal states specification.
- actions – Actions specification.
- include_next_states – Include subsequent state if true.
- capacity – Memory capacity.
- prioritization_weight – Prioritization weight.
- buffer_size – Buffer size. The buffer holds newly inserted experiences until their priorities have been computed via updates.
-
from_spec
(spec, kwargs=None)¶ Creates a memory from a specification dict.
-
get_summaries
()¶ Returns the TensorFlow summaries reported by the memory.
Returns: List of summaries.
-
get_variables
()¶ Returns the TensorFlow variables used by the memory.
Returns: List of variables.
-
tf_initialize
()¶
-
tf_retrieve_episodes
(n)¶
-
tf_retrieve_indices
(buffer_elements, priority_indices)¶ Fetches experiences for given indices by combining entries from the buffer, which have no priorities yet, with entries from the priority memory.
Parameters: - buffer_elements – Number of buffer elements to retrieve.
- priority_indices – Index tensor for the priority memory.
Returns: Batch of experiences.
-
tf_retrieve_sequences
(n, sequence_length)¶
-
tf_retrieve_timesteps
(n)¶
-
tf_store
(states, internals, actions, terminal, reward)¶
-
tf_update_batch
(loss_per_instance)¶ Updates priority memory by performing the following steps:
- Use saved indices from prior retrieval to reconstruct the batch elements which will have their priorities updated.
- Compute priorities for these elements.
- Insert buffer elements to memory, potentially overwriting existing elements.
- Update priorities of existing memory elements.
- Re-sort memory.
- Update buffer insertion index.
Note that this implementation could be made more efficient by maintaining a sorted version via sum trees.
Parameters: loss_per_instance – Losses from the most recent batch, used to perform the priority update.
-