Optimizers

Default optimizer: MetaOptimizerWrapper

class tensorforce.core.optimizers.ClippingStep(name, optimizer, threshold, mode='global_norm', summary_labels=None)[source]

Clipping-step meta optimizer, which clips the updates of the given optimizer (specification key: clipping_step).

Parameters:
  • name (string) – Module name (internal use).
  • optimizer (specification) – Optimizer configuration (required).
  • threshold (parameter, float >= 0.0) – Clipping threshold (required).
  • mode ('global_norm' | 'norm' | 'value') – Clipping mode (default: ‘global_norm’).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
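As a sketch, a clipping-step optimizer might be specified via its specification key as a nested dict; the inner Adam optimizer and all hyperparameter values below are illustrative, not defaults:

```python
# Illustrative specification for a clipping_step meta optimizer
# wrapping an Adam inner optimizer (values are arbitrary examples).
optimizer = dict(
    type='clipping_step',
    optimizer=dict(type='adam', learning_rate=1e-3),
    threshold=1.0,        # clip updates whose global norm exceeds 1.0
    mode='global_norm'
)
```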
class tensorforce.core.optimizers.Evolutionary(name, learning_rate, num_samples=1, unroll_loop=False, summary_labels=None)[source]

Evolutionary optimizer, which samples random perturbations and applies them either as positive or negative update depending on their improvement of the loss (specification key: evolutionary).

Parameters:
  • name (string) – Module name (internal use).
  • learning_rate (parameter, float >= 0.0) – Learning rate (required).
  • num_samples (parameter, int >= 0) – Number of sampled perturbations (default: 1).
  • unroll_loop (bool) – Whether to unroll the sampling loop (default: false).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
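A minimal specification sketch for the evolutionary optimizer (hyperparameter values are illustrative):

```python
# Illustrative specification for the evolutionary optimizer,
# which evaluates random perturbations of the variables.
optimizer = dict(
    type='evolutionary',
    learning_rate=1e-2,
    num_samples=5         # evaluate 5 sampled perturbations per update
)
```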
class tensorforce.core.optimizers.GlobalOptimizer(name, optimizer, summary_labels=None)[source]

Global meta optimizer, which applies the given optimizer to the local variables, then applies the update to a corresponding set of global variables, and subsequently updates the local variables to the value of the global variables; will likely change in the future (specification key: global_optimizer).

Parameters:
  • name (string) – Module name (internal use).
  • optimizer (specification) – Optimizer configuration (required).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
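A specification sketch for the global meta optimizer; the nested Adam configuration is an illustrative choice, not a default:

```python
# Illustrative specification: apply Adam locally, then propagate the
# update to the corresponding global variables.
optimizer = dict(
    type='global_optimizer',
    optimizer=dict(type='adam', learning_rate=1e-3)
)
```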
class tensorforce.core.optimizers.MetaOptimizerWrapper(name, optimizer, multi_step=1, subsampling_fraction=1.0, clipping_threshold=None, optimizing_iterations=0, summary_labels=None, **kwargs)[source]

Meta optimizer wrapper (specification key: meta_optimizer_wrapper).

Parameters:
  • name (string) – Module name (internal use).
  • optimizer (specification) – Optimizer configuration (required).
  • multi_step (parameter, int > 0) – Number of optimization steps (default: single step).
  • subsampling_fraction (parameter, 0.0 < float <= 1.0) – Fraction of batch timesteps to subsample (default: no subsampling).
  • clipping_threshold (parameter, float > 0.0) – Clipping threshold (default: no clipping).
  • optimizing_iterations (parameter, int >= 0) – Maximum number of line search iterations (default: no optimizing).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
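Since this wrapper is the default optimizer, a typical use is to combine several of its features in one specification. The nested Adam optimizer and all values below are illustrative, not defaults:

```python
# Illustrative specification for the meta optimizer wrapper, combining
# multi-step updates, batch subsampling and update clipping.
optimizer = dict(
    type='meta_optimizer_wrapper',
    optimizer=dict(type='adam', learning_rate=1e-3),
    multi_step=10,             # 10 optimization steps per update
    subsampling_fraction=0.5,  # subsample half of the batch timesteps
    clipping_threshold=0.5     # clip updates above this threshold
)
```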
class tensorforce.core.optimizers.MultiStep(name, optimizer, num_steps, unroll_loop=False, summary_labels=None)[source]

Multi-step meta optimizer, which applies the given optimizer for a number of times (specification key: multi_step).

Parameters:
  • name (string) – Module name (internal use).
  • optimizer (specification) – Optimizer configuration (required).
  • num_steps (parameter, int >= 0) – Number of optimization steps (required).
  • unroll_loop (bool) – Whether to unroll the repetition loop (default: false).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
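A specification sketch for the multi-step meta optimizer (inner optimizer and values illustrative):

```python
# Illustrative specification: apply the inner Adam optimizer
# five times per update.
optimizer = dict(
    type='multi_step',
    optimizer=dict(type='adam', learning_rate=1e-3),
    num_steps=5
)
```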
class tensorforce.core.optimizers.NaturalGradient(name, learning_rate, cg_max_iterations=10, cg_damping=0.001, cg_unroll_loop=False, summary_labels=None)[source]

Natural gradient optimizer (specification key: natural_gradient).

Parameters:
  • name (string) – Module name (internal use).
  • learning_rate (parameter, float >= 0.0) – Learning rate as KL-divergence of distributions between optimization steps (required).
  • cg_max_iterations (int >= 0) – Maximum number of conjugate gradient iterations (default: 10).
  • cg_damping (0.0 <= float <= 1.0) – Conjugate gradient damping factor (default: 1e-3).
  • cg_unroll_loop (bool) – Whether to unroll the conjugate gradient loop (default: false).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
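A specification sketch for the natural gradient optimizer; note that here the learning rate is interpreted as a KL-divergence bound rather than a step size (values illustrative):

```python
# Illustrative specification for the natural gradient optimizer.
optimizer = dict(
    type='natural_gradient',
    learning_rate=1e-2,    # interpreted as a KL-divergence constraint
    cg_max_iterations=10,
    cg_damping=1e-3
)
```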
class tensorforce.core.optimizers.OptimizingStep(name, optimizer, ls_max_iterations=10, ls_accept_ratio=0.9, ls_mode='exponential', ls_parameter=0.5, ls_unroll_loop=False, summary_labels=None)[source]

Optimizing-step meta optimizer, which applies line search to the given optimizer to find a more optimal step size (specification key: optimizing_step).

Parameters:
  • name (string) – Module name (internal use).
  • optimizer (specification) – Optimizer configuration (required).
  • ls_max_iterations (parameter, int >= 0) – Maximum number of line search iterations (default: 10).
  • ls_accept_ratio (parameter, 0.0 <= float <= 1.0) – Line search acceptance ratio (default: 0.9).
  • ls_mode ('exponential' | 'linear') – Line search mode, see line search solver (default: ‘exponential’).
  • ls_parameter (parameter, 0.0 <= float <= 1.0) – Line search parameter, see line search solver (default: 0.5).
  • ls_unroll_loop (bool) – Whether to unroll the line search loop (default: false).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
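A specification sketch for the optimizing-step meta optimizer; pairing it with a natural gradient inner optimizer (as in TRPO-style updates) is an illustrative choice, not a requirement:

```python
# Illustrative specification: line-search over the step size of a
# nested natural gradient optimizer.
optimizer = dict(
    type='optimizing_step',
    optimizer=dict(type='natural_gradient', learning_rate=1e-2),
    ls_max_iterations=10,
    ls_accept_ratio=0.9,
    ls_mode='exponential',
    ls_parameter=0.5
)
```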
class tensorforce.core.optimizers.Plus(name, optimizer1, optimizer2, summary_labels=None)[source]

Additive combination of two optimizers (specification key: plus).

Parameters:
  • name (string) – Module name (internal use).
  • optimizer1 (specification) – First optimizer configuration (required).
  • optimizer2 (specification) – Second optimizer configuration (required).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
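A specification sketch for the additive combination; the two nested optimizers below are illustrative choices:

```python
# Illustrative specification: add the updates of a gradient-based
# and an evolutionary optimizer.
optimizer = dict(
    type='plus',
    optimizer1=dict(type='adam', learning_rate=1e-3),
    optimizer2=dict(type='evolutionary', learning_rate=1e-3)
)
```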
class tensorforce.core.optimizers.SubsamplingStep(name, optimizer, fraction, summary_labels=None)[source]

Subsampling-step meta optimizer, which randomly samples a subset of batch instances before applying the given optimizer (specification key: subsampling_step).

Parameters:
  • name (string) – Module name (internal use).
  • optimizer (specification) – Optimizer configuration (required).
  • fraction (parameter, 0.0 <= float <= 1.0) – Fraction of batch timesteps to subsample (required).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
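A specification sketch for the subsampling-step meta optimizer (inner optimizer and fraction illustrative):

```python
# Illustrative specification: apply Adam to a random quarter of the
# batch timesteps.
optimizer = dict(
    type='subsampling_step',
    optimizer=dict(type='adam', learning_rate=1e-3),
    fraction=0.25
)
```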
class tensorforce.core.optimizers.Synchronization(name, sync_frequency=1, update_weight=1.0, summary_labels=None)[source]

Synchronization optimizer, which updates variables periodically to the value of a corresponding set of source variables (specification key: synchronization).

Parameters:
  • name (string) – Module name (internal use).
  • optimizer (specification) – Optimizer configuration (required).
  • sync_frequency (parameter, int >= 1) – Interval between updates which also perform a synchronization step (default: every update).
  • update_weight (parameter, 0.0 <= float <= 1.0) – Update weight (default: 1.0).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
class tensorforce.core.optimizers.TFOptimizer(name, optimizer, learning_rate=0.0003, gradient_norm_clipping=1.0, summary_labels=None, **kwargs)[source]

TensorFlow optimizer (specification key: tf_optimizer, adadelta, adagrad, adam, adamax, adamw, ftrl, lazyadam, nadam, radam, ranger, rmsprop, sgd, sgdw)

Parameters:
  • name (string) – Module name (internal use).
  • optimizer (adadelta | adagrad | adam | adamax | adamw | ftrl | lazyadam | nadam | radam | ranger | rmsprop | sgd | sgdw) – TensorFlow optimizer name, see TensorFlow docs and TensorFlow Addons docs (required unless given by specification key).
  • learning_rate (parameter, float >= 0.0) – Learning rate (default: 3e-4).
  • gradient_norm_clipping (parameter, float >= 0.0) – Clip gradients by the ratio of the sum of their norms (default: 1.0).
  • summary_labels ('all' | iter[string]) – Labels of summaries to record (default: inherit value of parent module).
  • kwargs – Arguments for the TensorFlow optimizer, special values “decoupled_weight_decay”, “lookahead” and “moving_average”, see TensorFlow docs and TensorFlow Addons docs.