tensorforce.core.distributions package

Submodules

tensorforce.core.distributions.bernoulli module

class tensorforce.core.distributions.bernoulli.Bernoulli(shape, probability=0.5, scope='bernoulli', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Bernoulli distribution for binary actions.

get_summaries()
get_variables(include_non_trainable=False)
state_action_value(distr_params, action)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)

tensorforce.core.distributions.beta module

class tensorforce.core.distributions.beta.Beta(shape, min_value, max_value, alpha=0.0, beta=0.0, scope='beta', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Beta distribution, for bounded continuous actions.

get_summaries()
get_variables(include_non_trainable=False)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)
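As an illustration of how a Beta distribution can serve bounded continuous actions, the sketch below draws a sample on the unit interval and rescales it to [min_value, max_value]. It uses only the Python standard library and is not Tensorforce's implementation. The function name `sample_bounded_action` is hypothetical, and the sketch assumes strictly positive `alpha` and `beta` shape parameters.

```python
import random


def sample_bounded_action(alpha, beta, min_value, max_value, rng=random):
    """Draw one action in [min_value, max_value] from a rescaled Beta.

    Hypothetical helper for illustration; assumes alpha > 0 and beta > 0.
    """
    # A Beta sample always lies in the unit interval [0, 1].
    unit_sample = rng.betavariate(alpha, beta)
    # Affine rescaling maps [0, 1] onto [min_value, max_value].
    return min_value + (max_value - min_value) * unit_sample


rng = random.Random(0)
actions = [sample_bounded_action(2.0, 3.0, -1.0, 1.0, rng=rng) for _ in range(1000)]
```

Every sampled action stays inside the bounds by construction, which is the property that makes the Beta a natural fit for bounded action spaces.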

tensorforce.core.distributions.categorical module

class tensorforce.core.distributions.categorical.Categorical(shape, num_actions, probabilities=None, scope='categorical', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Categorical distribution, for discrete actions.

get_summaries()
get_variables(include_non_trainable=False)
state_action_value(distr_params, action)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)

tensorforce.core.distributions.distribution module

class tensorforce.core.distributions.distribution.Distribution(scope='distribution', summary_labels=None)

Bases: object

Base class for policy distributions.

static from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.
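The general pattern behind a specification-dict factory can be sketched as follows. This is a toy stand-in, not Tensorforce's code: the `registry` attribute, the `'type'` key, and the way `kwargs` is merged over the spec are all assumptions made for illustration.

```python
class Distribution:
    """Toy stand-in for a distribution base class (not Tensorforce's code)."""

    # Hypothetical registry mapping a 'type' string to a class.
    registry = {}

    def __init__(self, scope="distribution"):
        self.scope = scope

    @staticmethod
    def from_spec(spec, kwargs=None):
        # Copy so the caller's dict is not mutated.
        spec = dict(spec)
        # Assumption: the spec names its class under a 'type' key.
        cls = Distribution.registry[spec.pop("type")]
        # Assumption: extra keyword arguments are merged over the spec entries.
        spec.update(kwargs or {})
        return cls(**spec)


class Bernoulli(Distribution):
    def __init__(self, shape, probability=0.5, scope="bernoulli"):
        super().__init__(scope=scope)
        self.shape = shape
        self.probability = probability


Distribution.registry["bernoulli"] = Bernoulli

spec = {"type": "bernoulli", "probability": 0.3}
dist = Distribution.from_spec(spec, kwargs={"shape": (2,)})
```

Keeping the class name in the dict lets a configuration file choose the distribution without touching code.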

get_summaries()

Returns the TensorFlow summaries reported by the distribution.

Returns: List of summaries.
get_variables(include_non_trainable=False)

Returns the TensorFlow variables used by the distribution.

Returns: List of variables.
tf_entropy(distr_params)

Creates the TensorFlow operations for calculating the entropy of a distribution.

Parameters: distr_params – Tuple of distribution parameter tensors.
Returns: Entropy tensor.
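For intuition about the quantity being computed, the entropy of a categorical distribution can be sketched in plain Python. This is a scalar illustration rather than the TensorFlow graph the method builds, and the function name `categorical_entropy` is hypothetical.

```python
import math


def categorical_entropy(probabilities):
    """Entropy H(p) = -sum_i p_i * log(p_i), in nats."""
    # Terms with p_i == 0 contribute 0 by convention, so they are skipped.
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


uniform = categorical_entropy([0.25, 0.25, 0.25, 0.25])
peaked = categorical_entropy([1.0, 0.0, 0.0, 0.0])
```

A uniform distribution over four actions attains the maximum entropy log(4), while a distribution that always picks one action has zero entropy.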
tf_kl_divergence(distr_params1, distr_params2)

Creates the TensorFlow operations for calculating the KL divergence between two distributions.

Parameters:
  • distr_params1 – Tuple of parameter tensors for first distribution.
  • distr_params2 – Tuple of parameter tensors for second distribution.
Returns:

KL divergence tensor.
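As a worked example of the quantity involved, the closed-form KL divergence between two univariate Gaussians can be written in plain Python. The sketch assumes each parameter tuple is laid out as (mean, log_stddev), mirroring the Gaussian constructor arguments; that layout and the function name `gaussian_kl_divergence` are assumptions, not the library's documented behaviour.

```python
import math


def gaussian_kl_divergence(distr_params1, distr_params2):
    """KL(N1 || N2) for univariate Gaussians given as (mean, log_stddev)."""
    mean1, log_stddev1 = distr_params1
    mean2, log_stddev2 = distr_params2
    # Variances recovered from the log standard deviations.
    variance1 = math.exp(2.0 * log_stddev1)
    variance2 = math.exp(2.0 * log_stddev2)
    # log(s2 / s1) + (s1^2 + (m1 - m2)^2) / (2 * s2^2) - 1/2
    return (
        log_stddev2
        - log_stddev1
        + (variance1 + (mean1 - mean2) ** 2) / (2.0 * variance2)
        - 0.5
    )


same = gaussian_kl_divergence((0.0, 0.0), (0.0, 0.0))
shifted = gaussian_kl_divergence((0.0, 0.0), (1.0, 0.0))
```

The divergence is zero for identical distributions and is not symmetric in its arguments, so the order of distr_params1 and distr_params2 matters.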

tf_log_probability(distr_params, action)

Creates the TensorFlow operations for calculating the log probability of an action for a distribution.

Parameters:
  • distr_params – Tuple of distribution parameter tensors.
  • action – Action tensor.
Returns:

Log probability tensor.
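To make the log-probability computation concrete, here is a scalar sketch for a univariate Gaussian. It assumes the parameter tuple is (mean, log_stddev); that layout and the name `gaussian_log_probability` are illustrative assumptions rather than the library's API.

```python
import math


def gaussian_log_probability(distr_params, action):
    """Log density of `action` under a Gaussian given as (mean, log_stddev)."""
    mean, log_stddev = distr_params
    stddev = math.exp(log_stddev)
    # Standardised distance of the action from the mean.
    z = (action - mean) / stddev
    # log N(a; m, s) = -z^2 / 2 - log(s) - log(2 * pi) / 2
    return -0.5 * z * z - log_stddev - 0.5 * math.log(2.0 * math.pi)


at_mean = gaussian_log_probability((0.0, 0.0), 0.0)
one_stddev_away = gaussian_log_probability((0.0, 0.0), 1.0)
```

Working with log_stddev instead of stddev keeps the standard deviation positive without any explicit constraint.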

tf_parameterize(x)

Creates the TensorFlow operations for parameterizing a distribution conditioned on the given input.

Parameters: x – Input tensor which the distribution is conditioned on.
Returns: Tuple of distribution parameter tensors.
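One common way to condition a distribution on an input is to pass the input through a linear layer and squash the result into valid parameters. The sketch below does this for a Bernoulli distribution in plain Python. The weights, the bias, the (logit, probability) return layout, and the name `parameterize_bernoulli` are all hypothetical and are not taken from Tensorforce.

```python
import math


def parameterize_bernoulli(x, weights, bias):
    """Map an input vector to Bernoulli parameters (logit, probability).

    Hypothetical sketch: a single linear unit followed by a sigmoid.
    """
    # Linear layer: logit = w . x + b
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    # Sigmoid squashes the logit into a probability in (0, 1).
    probability = 1.0 / (1.0 + math.exp(-logit))
    return logit, probability


logit, probability = parameterize_bernoulli([1.0, 2.0], weights=[0.0, 0.0], bias=0.0)
```

With all-zero weights and bias the logit is zero, so the resulting probability is exactly 0.5 regardless of the input.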
tf_regularization_loss()

Creates the TensorFlow operations for the distribution regularization loss.

Returns: Regularization loss tensor.
tf_sample(distr_params, deterministic)

Creates the TensorFlow operations for sampling an action based on a distribution.

Parameters:
  • distr_params – Tuple of distribution parameter tensors.
  • deterministic – Boolean input tensor indicating whether the maximum likelihood action should be returned.
Returns:

Sampled action tensor.
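The role of the deterministic flag can be illustrated for a categorical distribution in plain Python: when it is true, the maximum likelihood action is returned, otherwise an action is drawn at random. The function name `sample_categorical` and the use of Python's random module in place of TensorFlow operations are assumptions made for this sketch.

```python
import random


def sample_categorical(probabilities, deterministic, rng=random):
    """Return an action index for a categorical distribution."""
    indices = range(len(probabilities))
    if deterministic:
        # Maximum likelihood action: index of the largest probability.
        return max(indices, key=lambda i: probabilities[i])
    # Otherwise draw an index in proportion to its probability.
    return rng.choices(indices, weights=probabilities, k=1)[0]


probabilities = [0.1, 0.7, 0.2]
greedy_action = sample_categorical(probabilities, deterministic=True)
random_action = sample_categorical(probabilities, deterministic=False, rng=random.Random(0))
```

Deterministic sampling is typically what one wants at evaluation time, while stochastic sampling provides exploration during training.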

tensorforce.core.distributions.gaussian module

class tensorforce.core.distributions.gaussian.Gaussian(shape, mean=0.0, log_stddev=0.0, scope='gaussian', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Gaussian distribution, for unbounded continuous actions.

get_summaries()
get_variables(include_non_trainable=False)
state_action_value(distr_params, action)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)

Module contents

class tensorforce.core.distributions.Distribution(scope='distribution', summary_labels=None)

Bases: object

Base class for policy distributions.

static from_spec(spec, kwargs=None)

Creates a distribution from a specification dict.

get_summaries()

Returns the TensorFlow summaries reported by the distribution.

Returns: List of summaries.
get_variables(include_non_trainable=False)

Returns the TensorFlow variables used by the distribution.

Returns: List of variables.
tf_entropy(distr_params)

Creates the TensorFlow operations for calculating the entropy of a distribution.

Parameters: distr_params – Tuple of distribution parameter tensors.
Returns: Entropy tensor.
tf_kl_divergence(distr_params1, distr_params2)

Creates the TensorFlow operations for calculating the KL divergence between two distributions.

Parameters:
  • distr_params1 – Tuple of parameter tensors for first distribution.
  • distr_params2 – Tuple of parameter tensors for second distribution.
Returns:

KL divergence tensor.

tf_log_probability(distr_params, action)

Creates the TensorFlow operations for calculating the log probability of an action for a distribution.

Parameters:
  • distr_params – Tuple of distribution parameter tensors.
  • action – Action tensor.
Returns:

Log probability tensor.

tf_parameterize(x)

Creates the TensorFlow operations for parameterizing a distribution conditioned on the given input.

Parameters: x – Input tensor which the distribution is conditioned on.
Returns: Tuple of distribution parameter tensors.
tf_regularization_loss()

Creates the TensorFlow operations for the distribution regularization loss.

Returns: Regularization loss tensor.
tf_sample(distr_params, deterministic)

Creates the TensorFlow operations for sampling an action based on a distribution.

Parameters:
  • distr_params – Tuple of distribution parameter tensors.
  • deterministic – Boolean input tensor indicating whether the maximum likelihood action should be returned.
Returns:

Sampled action tensor.

class tensorforce.core.distributions.Bernoulli(shape, probability=0.5, scope='bernoulli', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Bernoulli distribution for binary actions.

get_summaries()
get_variables(include_non_trainable=False)
state_action_value(distr_params, action)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)

class tensorforce.core.distributions.Categorical(shape, num_actions, probabilities=None, scope='categorical', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Categorical distribution, for discrete actions.

get_summaries()
get_variables(include_non_trainable=False)
state_action_value(distr_params, action)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)

class tensorforce.core.distributions.Gaussian(shape, mean=0.0, log_stddev=0.0, scope='gaussian', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Gaussian distribution, for unbounded continuous actions.

get_summaries()
get_variables(include_non_trainable=False)
state_action_value(distr_params, action)
state_value(distr_params)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)

class tensorforce.core.distributions.Beta(shape, min_value, max_value, alpha=0.0, beta=0.0, scope='beta', summary_labels=())

Bases: tensorforce.core.distributions.distribution.Distribution

Beta distribution, for bounded continuous actions.

get_summaries()
get_variables(include_non_trainable=False)
tf_entropy(distr_params)
tf_kl_divergence(distr_params1, distr_params2)
tf_log_probability(distr_params, action)
tf_parameterize(x)
tf_regularization_loss()
tf_sample(distr_params, deterministic)