tensorforce.core.distributions package

Submodules

tensorforce.core.distributions.bernoulli module

class tensorforce.core.distributions.bernoulli.Bernoulli(shape, probability=0.5, scope='bernoulli', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Bernoulli distribution, for binary (boolean) actions.

__init__(shape, probability=0.5, scope='bernoulli', summary_labels=())

Bernoulli distribution.

Parameters:
  • shape – Action shape.
  • probability – Optional distribution bias.
from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()
get_variables(include_nontrainable=False)
state_action_value(distr_params, action=None)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)
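
To illustrate the quantities that tf_log_probability, tf_entropy and tf_kl_divergence compute for a Bernoulli distribution, here is a minimal pure-Python sketch. It is not the library's TensorFlow implementation: the function names and the scalar probability argument p are assumptions made for this example.

```python
import math

def bernoulli_log_probability(p, action):
    """Log probability of a boolean action under Bernoulli(p), 0 < p < 1."""
    return math.log(p) if action else math.log(1.0 - p)

def bernoulli_entropy(p):
    """Entropy in nats of Bernoulli(p), 0 < p < 1."""
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def bernoulli_kl_divergence(p1, p2):
    """KL(Bernoulli(p1) || Bernoulli(p2)), both strictly inside (0, 1)."""
    return (p1 * math.log(p1 / p2)
            + (1.0 - p1) * math.log((1.0 - p1) / (1.0 - p2)))
```

The entropy is largest at p = 0.5, and the KL divergence of a distribution from itself is zero.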

tensorforce.core.distributions.beta module

class tensorforce.core.distributions.beta.Beta(shape, min_value, max_value, alpha=0.0, beta=0.0, scope='beta', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Beta distribution, for bounded continuous actions.

__init__(shape, min_value, max_value, alpha=0.0, beta=0.0, scope='beta', summary_labels=())

Beta distribution.

Parameters:
  • shape – Action shape.
  • min_value – Minimum value of continuous actions.
  • max_value – Maximum value of continuous actions.
  • alpha – Optional distribution bias for the alpha value.
  • beta – Optional distribution bias for the beta value.
from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()
get_variables(include_nontrainable=False)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)
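
One common way to obtain bounded continuous actions from a Beta distribution is to draw a value in [0, 1] and rescale it affinely to [min_value, max_value]. The sketch below shows that idea in pure Python. Whether the library rescales in exactly this way is an assumption, the helper names are hypothetical, and alpha and beta here are the positive shape parameters of the distribution, not the bias arguments of the constructor.

```python
import random

def scale_to_bounds(unit_sample, min_value, max_value):
    """Affinely map a value in [0, 1] to [min_value, max_value]."""
    return min_value + (max_value - min_value) * unit_sample

def sample_bounded_action(alpha, beta, min_value, max_value, rng=random):
    """Draw from Beta(alpha, beta), alpha > 0 and beta > 0, then rescale."""
    return scale_to_bounds(rng.betavariate(alpha, beta), min_value, max_value)
```

Passing a seeded random.Random instance as rng makes the draw reproducible.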

tensorforce.core.distributions.categorical module

class tensorforce.core.distributions.categorical.Categorical(shape, num_actions, probabilities=None, scope='categorical', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Categorical distribution, for discrete actions.

__init__(shape, num_actions, probabilities=None, scope='categorical', summary_labels=())

Categorical distribution.

Parameters:
  • shape – Action shape.
  • num_actions – Number of discrete action alternatives.
  • probabilities – Optional distribution bias.
from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()
get_variables(include_nontrainable=False)
state_action_value(distr_params, action=None)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)
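
For a categorical distribution over num_actions alternatives, the log probability and entropy can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the library's code: the helper names are hypothetical, and the use of a softmax over unnormalized logits to obtain probabilities is assumed.

```python
import math

def softmax(logits):
    """Convert unnormalized logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_log_probability(probabilities, action):
    """Log probability of the discrete action index `action`."""
    return math.log(probabilities[action])

def categorical_entropy(probabilities):
    """Entropy in nats; zero-probability alternatives contribute nothing."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)
```

With equal logits the distribution is uniform, so the entropy equals the logarithm of the number of alternatives.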

tensorforce.core.distributions.distribution module

class tensorforce.core.distributions.distribution.Distribution(shape, scope='distribution', summary_labels=None)

Bases: object

Base class for policy distributions.

__init__(shape, scope='distribution', summary_labels=None)

Distribution.

Parameters: shape – Action shape.
static from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.
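
The general pattern of building an object from a specification dict can be sketched as below. The layout of the dict is not described here, so everything in this example is an assumption: the 'type' key, the REGISTRY mapping, and the stand-in Gaussian class are hypothetical and only illustrate the idea of merging a spec with extra keyword arguments.

```python
class Gaussian:
    """Stand-in class for this example, not the library class."""

    def __init__(self, shape, mean=0.0, log_stddev=0.0):
        self.shape = shape
        self.mean = mean
        self.log_stddev = log_stddev

# Hypothetical registry mapping a type name to a class.
REGISTRY = {'gaussian': Gaussian}

def from_spec(spec, kwargs=None):
    """Build an object from a specification dict plus optional extra kwargs."""
    params = dict(spec)                  # copy so the caller's dict is untouched
    cls = REGISTRY[params.pop('type')]   # look up the class by its type name
    params.update(kwargs or {})          # extra kwargs supplement or override
    return cls(**params)
```

In this sketch, from_spec({'type': 'gaussian', 'mean': 1.0}, kwargs={'shape': (2,)}) returns a stand-in Gaussian with shape (2,) and mean 1.0.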

get_summaries()

Returns the TensorFlow summaries reported by the distribution.

Returns: List of summaries.
get_variables(include_nontrainable=False)

Returns the TensorFlow variables used by the distribution.

Returns: List of variables.
tf_entropy(distr_params)

Creates the TensorFlow operations for calculating the entropy of a distribution.

Parameters: distr_params – Tuple of distribution parameter tensors.
Returns: Entropy tensor.
tf_kl_divergence(distr_params1, distr_params2)

Creates the TensorFlow operations for calculating the KL divergence between two distributions.

Parameters:
  • distr_params1 – Tuple of parameter tensors for first distribution.
  • distr_params2 – Tuple of parameter tensors for second distribution.
Returns:

KL divergence tensor.

tf_log_probability(distr_params, action)

Creates the TensorFlow operations for calculating the log probability of an action for a distribution.

Parameters:
  • distr_params – Tuple of distribution parameter tensors.
  • action – Action tensor.
Returns:

Log probability tensor.

tf_parameterize(x)

Creates the TensorFlow operations for parameterizing a distribution conditioned on the given input.

Parameters: x – Input tensor which the distribution is conditioned on.
Returns: Tuple of distribution parameter tensors.
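
As a rough illustration of parameterizing a distribution from an input, the sketch below maps an input vector to a Bernoulli probability through a linear layer followed by a sigmoid. The linear-plus-sigmoid form, the helper names, and the choice to initialize the bias to the logit of the desired probability are all assumptions made for this example, not a description of the library's implementation.

```python
import math

def sigmoid(v):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def parameterize_bernoulli(x, weights, probability=0.5):
    """Return P(action = True) for input x under a linear logit model."""
    # Bias chosen so that all-zero weights reproduce `probability` exactly.
    bias = math.log(probability) - math.log(1.0 - probability)
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(logit)
```

With all weights at zero the output equals the bias probability regardless of the input, which is one way to read the "distribution bias" arguments of the constructors.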
tf_regularization_loss()

Creates the TensorFlow operations for the distribution regularization loss.

Returns: Regularization loss tensor.
tf_sample(distr_params, deterministic)

Creates the TensorFlow operations for sampling an action based on a distribution.

Parameters:
  • distr_params – Tuple of distribution parameter tensors.
  • deterministic – Boolean input tensor indicating whether the maximum likelihood action should be returned.
Returns:

Sampled action tensor.
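
The role of the deterministic flag can be sketched for a discrete distribution in pure Python: when it is set, the most likely action is returned, otherwise an action is drawn at random according to the probabilities. The function name and the list-of-probabilities input are assumptions made for this example. The library operates on TensorFlow tensors instead.

```python
import random

def sample_action(probabilities, deterministic, rng=random):
    """Return the most likely index if deterministic, else a random draw."""
    indices = range(len(probabilities))
    if deterministic:
        # Maximum likelihood action: index of the largest probability.
        return max(indices, key=probabilities.__getitem__)
    # Stochastic action: draw one index weighted by its probability.
    return rng.choices(indices, weights=probabilities, k=1)[0]
```

Deterministic sampling is typically useful at evaluation time, while stochastic sampling provides exploration during training.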

tensorforce.core.distributions.gaussian module

class tensorforce.core.distributions.gaussian.Gaussian(shape, mean=0.0, log_stddev=0.0, scope='gaussian', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Gaussian distribution, for unbounded continuous actions.

__init__(shape, mean=0.0, log_stddev=0.0, scope='gaussian', summary_labels=())

Gaussian distribution.

Parameters:
  • shape – Action shape.
  • mean – Optional distribution bias for the mean.
  • log_stddev – Optional distribution bias for the log standard deviation.
from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()
get_variables(include_nontrainable=False)
state_action_value(distr_params, action)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)
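
The following pure-Python sketch shows the standard closed-form expressions for a univariate Gaussian parameterized by its mean and log standard deviation, matching the argument names of the constructor. It illustrates what the log probability, entropy and KL divergence methods compute mathematically. The function names are hypothetical and this is not the library's TensorFlow code.

```python
import math

def gaussian_log_probability(mean, log_stddev, action):
    """Log density of `action` under N(mean, exp(log_stddev)^2)."""
    z = (action - mean) / math.exp(log_stddev)
    return -0.5 * z * z - log_stddev - 0.5 * math.log(2.0 * math.pi)

def gaussian_entropy(log_stddev):
    """Differential entropy in nats; it depends only on the spread."""
    return log_stddev + 0.5 * math.log(2.0 * math.pi * math.e)

def gaussian_kl_divergence(mean1, log_stddev1, mean2, log_stddev2):
    """KL(N1 || N2) for two univariate Gaussians."""
    var1 = math.exp(2.0 * log_stddev1)
    var2 = math.exp(2.0 * log_stddev2)
    return (log_stddev2 - log_stddev1
            + (var1 + (mean1 - mean2) ** 2) / (2.0 * var2) - 0.5)
```

Working with the log standard deviation keeps the standard deviation positive for any real-valued parameter.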

Module contents

class tensorforce.core.distributions.Distribution(shape, scope='distribution', summary_labels=None)

Bases: object

Base class for policy distributions.

__init__(shape, scope='distribution', summary_labels=None)

Distribution.

Parameters: shape – Action shape.
static from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()

Returns the TensorFlow summaries reported by the distribution.

Returns: List of summaries.
get_variables(include_nontrainable=False)

Returns the TensorFlow variables used by the distribution.

Returns: List of variables.
tf_entropy(distr_params)

Creates the TensorFlow operations for calculating the entropy of a distribution.

Parameters: distr_params – Tuple of distribution parameter tensors.
Returns: Entropy tensor.
tf_kl_divergence(distr_params1, distr_params2)

Creates the TensorFlow operations for calculating the KL divergence between two distributions.

Parameters:
  • distr_params1 – Tuple of parameter tensors for first distribution.
  • distr_params2 – Tuple of parameter tensors for second distribution.
Returns:

KL divergence tensor.

tf_log_probability(distr_params, action)

Creates the TensorFlow operations for calculating the log probability of an action for a distribution.

Parameters:
  • distr_params – Tuple of distribution parameter tensors.
  • action – Action tensor.
Returns:

Log probability tensor.

tf_parameterize(x)

Creates the TensorFlow operations for parameterizing a distribution conditioned on the given input.

Parameters: x – Input tensor which the distribution is conditioned on.
Returns: Tuple of distribution parameter tensors.
tf_regularization_loss()

Creates the TensorFlow operations for the distribution regularization loss.

Returns: Regularization loss tensor.
tf_sample(distr_params, deterministic)

Creates the TensorFlow operations for sampling an action based on a distribution.

Parameters:
  • distr_params – Tuple of distribution parameter tensors.
  • deterministic – Boolean input tensor indicating whether the maximum likelihood action should be returned.
Returns:

Sampled action tensor.

class tensorforce.core.distributions.Bernoulli(shape, probability=0.5, scope='bernoulli', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Bernoulli distribution, for binary (boolean) actions.

__init__(shape, probability=0.5, scope='bernoulli', summary_labels=())

Bernoulli distribution.

Parameters:
  • shape – Action shape.
  • probability – Optional distribution bias.
from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()
get_variables(include_nontrainable=False)
state_action_value(distr_params, action=None)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)

class tensorforce.core.distributions.Categorical(shape, num_actions, probabilities=None, scope='categorical', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Categorical distribution, for discrete actions.

__init__(shape, num_actions, probabilities=None, scope='categorical', summary_labels=())

Categorical distribution.

Parameters:
  • shape – Action shape.
  • num_actions – Number of discrete action alternatives.
  • probabilities – Optional distribution bias.
from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()
get_variables(include_nontrainable=False)
state_action_value(distr_params, action=None)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)

class tensorforce.core.distributions.Gaussian(shape, mean=0.0, log_stddev=0.0, scope='gaussian', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Gaussian distribution, for unbounded continuous actions.

__init__(shape, mean=0.0, log_stddev=0.0, scope='gaussian', summary_labels=())

Gaussian distribution.

Parameters:
  • shape – Action shape.
  • mean – Optional distribution bias for the mean.
  • log_stddev – Optional distribution bias for the log standard deviation.
from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()
get_variables(include_nontrainable=False)
state_action_value(distr_params, action)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)

class tensorforce.core.distributions.Beta(shape, min_value, max_value, alpha=0.0, beta=0.0, scope='beta', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Beta distribution, for bounded continuous actions.

__init__(shape, min_value, max_value, alpha=0.0, beta=0.0, scope='beta', summary_labels=())

Beta distribution.

Parameters:
  • shape – Action shape.
  • min_value – Minimum value of continuous actions.
  • max_value – Maximum value of continuous actions.
  • alpha – Optional distribution bias for the alpha value.
  • beta – Optional distribution bias for the beta value.
from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()
get_variables(include_nontrainable=False)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)