Optimizers¶

The default optimizer is OptimizerWrapper, which offers additional update
modifier options. So, instead of using TFOptimizer directly, a customized
Adam optimizer can be specified via:
Agent.create(
    ...
    optimizer=dict(
        optimizer='adam', learning_rate=1e-3, clipping_threshold=1e-2,
        multi_step=10, subsampling_fraction=64, linesearch_iterations=5,
        doublecheck_update=True
    ),
    ...
)
class tensorforce.core.optimizers.OptimizerWrapper(optimizer, *, learning_rate=0.001, clipping_threshold=None, multi_step=1, subsampling_fraction=1.0, linesearch_iterations=0, doublecheck_update=False, name=None, arguments_spec=None, optimizing_iterations=None, **kwargs)¶

Optimizer wrapper, which performs additional update modifications; the argument order indicates modifier nesting from outside to inside (specification key: optimizer_wrapper).

Parameters:
- optimizer (specification) – Optimizer (required).
- learning_rate (parameter, float > 0.0) – Learning rate (default: 1e-3).
- clipping_threshold (parameter, float > 0.0) – Clipping threshold (default: no clipping).
- multi_step (parameter, int >= 1) – Number of optimization steps (default: single step).
- subsampling_fraction (parameter, int > 0 | 0.0 < float <= 1.0) – Absolute/relative fraction of batch timesteps to subsample; if subsampling_fraction is relative, update_frequency * multi_step should be at least 1 (default: no subsampling).
- linesearch_iterations (parameter, int >= 0) – Maximum number of line search iterations, using a backtracking factor of 0.75 (default: no line search).
- doublecheck_update (bool) – Whether to check that the update has decreased the loss, and to reverse it otherwise (default: false).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
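Per the outside-to-inside nesting order described above, the introductory wrapper example corresponds roughly to the following explicitly nested modifier specification. This is a sketch with the same illustrative values, assuming the usual type key for nested specification dicts, not a verbatim expansion of the wrapper's internals:

optimizer = dict(
    type='clipping_step', threshold=1e-2,
    optimizer=dict(
        type='multi_step', num_steps=10,
        optimizer=dict(
            type='subsampling_step', fraction=64,
            optimizer=dict(
                type='linesearch_step', max_iterations=5,
                optimizer=dict(
                    type='doublecheck_step',
                    optimizer=dict(type='adam', learning_rate=1e-3)
                )
            )
        )
    )
)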
class tensorforce.core.optimizers.TFOptimizer(*, optimizer, learning_rate, gradient_norm_clipping=None, name=None, arguments_spec=None, **kwargs)¶

TensorFlow optimizer (specification key: tf_optimizer, adadelta, adagrad, adam, adamax, adamw, ftrl, lazyadam, nadam, radam, ranger, rmsprop, sgd, sgdw).

Parameters:
- optimizer (adadelta | adagrad | adam | adamax | adamw | ftrl | lazyadam | nadam | radam | ranger | rmsprop | sgd | sgdw) – TensorFlow optimizer name, see TensorFlow docs and TensorFlow Addons docs (required unless given by specification key).
- learning_rate (parameter, float > 0.0) – Learning rate (required).
- gradient_norm_clipping (parameter, float > 0.0) – Clip gradients by the ratio of the sum of their norms (default: 1.0).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
- kwargs – Arguments for the TensorFlow optimizer, special values "decoupled_weight_decay", "lookahead" and "moving_average", see TensorFlow docs and TensorFlow Addons docs.
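A minimal sketch of a plain TFOptimizer specification; the values are illustrative, and beta_1 stands in for any keyword argument forwarded via kwargs to the underlying TensorFlow Adam optimizer:

optimizer = dict(
    type='adam', learning_rate=1e-3,
    gradient_norm_clipping=0.5,  # clip gradients before applying the update
    beta_1=0.9  # forwarded to the TensorFlow optimizer via kwargs
)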
class tensorforce.core.optimizers.NaturalGradient(*, learning_rate, cg_max_iterations=10, cg_damping=0.1, only_positive_updates=True, name=None, arguments_spec=None)¶

Natural gradient optimizer (specification key: natural_gradient).

Parameters:
- learning_rate (parameter, float > 0.0) – Learning rate as KL-divergence of distributions between optimization steps (required).
- cg_max_iterations (int >= 1) – Maximum number of conjugate gradient iterations (default: 10).
- cg_damping (0.0 <= float <= 1.0) – Conjugate gradient damping factor (default: 0.1).
- only_positive_updates (bool) – Whether to only perform updates with positive improvement estimate (default: true).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
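A minimal sketch with illustrative values; note that learning_rate here bounds the KL-divergence between successive optimization steps rather than scaling raw gradients:

optimizer = dict(
    type='natural_gradient', learning_rate=1e-2,
    cg_max_iterations=20, cg_damping=0.1
)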
class tensorforce.core.optimizers.Evolutionary(*, learning_rate, num_samples=1, name=None, arguments_spec=None)¶

Evolutionary optimizer, which samples random perturbations and applies them as either a positive or negative update, depending on their improvement of the loss (specification key: evolutionary).

Parameters:
- learning_rate (parameter, float > 0.0) – Learning rate (required).
- num_samples (parameter, int >= 1) – Number of sampled perturbations (default: 1).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
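A minimal sketch with illustrative values; num_samples > 1 evaluates additional random perturbations per update:

optimizer = dict(type='evolutionary', learning_rate=1e-3, num_samples=5)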
class tensorforce.core.optimizers.ClippingStep(*, optimizer, threshold, mode='global_norm', name=None, arguments_spec=None)¶

Clipping-step update modifier, which clips the updates of the given optimizer (specification key: clipping_step).

Parameters:
- optimizer (specification) – Optimizer configuration (required).
- threshold (parameter, float > 0.0) – Clipping threshold (required).
- mode ('global_norm' | 'norm' | 'value') – Clipping mode (default: 'global_norm').
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
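A minimal sketch, wrapping an illustrative Adam optimizer so that its update is clipped by global norm:

optimizer = dict(
    type='clipping_step', threshold=0.5, mode='global_norm',
    optimizer=dict(type='adam', learning_rate=1e-3)
)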
class tensorforce.core.optimizers.MultiStep(*, optimizer, num_steps, name=None, arguments_spec=None)¶

Multi-step update modifier, which applies the given optimizer a given number of times (specification key: multi_step).

Parameters:
- optimizer (specification) – Optimizer configuration (required).
- num_steps (parameter, int >= 1) – Number of optimization steps (required).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
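A minimal sketch with illustrative values, applying an inner Adam optimizer ten times per update:

optimizer = dict(
    type='multi_step', num_steps=10,
    optimizer=dict(type='adam', learning_rate=1e-3)
)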
class tensorforce.core.optimizers.DoublecheckStep(*, optimizer, name=None, arguments_spec=None)¶

Double-check update modifier, which checks whether the update of the given optimizer has decreased the loss and otherwise reverses it (specification key: doublecheck_step).

Parameters:
- optimizer (specification) – Optimizer configuration (required).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
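A minimal sketch with an illustrative inner optimizer whose update is reversed if it fails to decrease the loss:

optimizer = dict(
    type='doublecheck_step',
    optimizer=dict(type='evolutionary', learning_rate=1e-3)
)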
class tensorforce.core.optimizers.LinesearchStep(*, optimizer, max_iterations, backtracking_factor=0.75, name=None, arguments_spec=None)¶

Line-search-step update modifier, which performs a line search on the update step returned by the given optimizer to find a potentially superior smaller step size (specification key: linesearch_step).

Parameters:
- optimizer (specification) – Optimizer configuration (required).
- max_iterations (parameter, int >= 1) – Maximum number of line search iterations (required).
- backtracking_factor (parameter, 0.0 < float < 1.0) – Line search backtracking factor (default: 0.75).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
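A minimal sketch with illustrative values; refining a natural-gradient step with a backtracking line search is a common, TRPO-style combination:

optimizer = dict(
    type='linesearch_step', max_iterations=10, backtracking_factor=0.75,
    optimizer=dict(type='natural_gradient', learning_rate=1e-2)
)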
class tensorforce.core.optimizers.SubsamplingStep(*, optimizer, fraction, name=None, arguments_spec=None)¶

Subsampling-step update modifier, which randomly samples a subset of batch instances before applying the given optimizer (specification key: subsampling_step).

Parameters:
- optimizer (specification) – Optimizer configuration (required).
- fraction (parameter, int > 0 | 0.0 < float <= 1.0) – Absolute/relative fraction of batch timesteps to subsample (required).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
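A minimal sketch with illustrative values, subsampling a quarter of the batch (a relative fraction, since it is a float in (0.0, 1.0]) before each inner update:

optimizer = dict(
    type='subsampling_step', fraction=0.25,
    optimizer=dict(type='adam', learning_rate=1e-3)
)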
class tensorforce.core.optimizers.Synchronization(*, update_weight, sync_frequency=None, name=None, arguments_spec=None)¶

Synchronization optimizer, which updates variables periodically to the value of a corresponding set of source variables (specification key: synchronization).

Parameters:
- update_weight (parameter, 0.0 < float <= 1.0) – Update weight (required).
- sync_frequency (parameter, int >= 1) – Interval between updates which also perform a synchronization step (default: every update).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
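A minimal sketch with illustrative values; update_weight=1.0 copies the source variables outright, while a smaller weight would blend only partially toward them:

optimizer = dict(
    type='synchronization', update_weight=1.0, sync_frequency=100
)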
class tensorforce.core.optimizers.Plus(*, optimizer1, optimizer2, name=None, arguments_spec=None)¶

Additive combination of two optimizers (specification key: plus).

Parameters:
- optimizer1 (specification) – First optimizer configuration (required).
- optimizer2 (specification) – Second optimizer configuration (required).
- name (string) – (internal use).
- arguments_spec (specification) – (internal use).
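A minimal sketch with illustrative values, additively combining a gradient-based and a gradient-free optimizer:

optimizer = dict(
    type='plus',
    optimizer1=dict(type='adam', learning_rate=1e-3),
    optimizer2=dict(type='evolutionary', learning_rate=1e-3)
)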