Networks

The default network is a LayeredNetwork, whose default argument is layers, so a plain list serves as a short-form specification of a sequential layer-stack network architecture:

Agent.create(
    ...
    policy=dict(network=[
        dict(type='dense', size=64, activation='tanh'),
        dict(type='dense', size=64, activation='tanh')
    ]),
    ...
)
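
The short form can also be generated programmatically. The following is a minimal sketch in plain Python: dense_stack is a hypothetical helper, not part of Tensorforce, and the explicit dict(type='layered', ...) form is an assumption based on the LayeredNetwork specification key and layers argument documented in the class reference.

```python
def dense_stack(sizes, activation='tanh'):
    """Build a sequential layer-stack specification (hypothetical helper).

    Returns a plain list of dense-layer dicts, i.e. the short form that
    is interpreted as a LayeredNetwork.
    """
    return [dict(type='dense', size=size, activation=activation) for size in sizes]


# Short form: a plain list of layer specifications.
network = dense_stack([64, 64])

# Explicit form (assumed equivalent): name the network type and pass the
# same list as its layers argument.
explicit_network = dict(type='layered', layers=network)

# Either value would then be passed as policy=dict(network=...) to Agent.create.
print(network)
```

Keeping the specification as plain data makes it easy to vary depth or width from a configuration file without touching the agent code.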

The AutoNetwork automatically configures a suitable network architecture based on input types and shapes, and offers high-level customization.
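
As a sketch of this high-level customization, an AutoNetwork can be requested via its specification key auto, with keyword arguments taken from the AutoNetwork class signature. The concrete values here are arbitrary illustrations, not recommendations.

```python
# Illustrative AutoNetwork specification; all values are arbitrary choices.
auto_network = dict(
    type='auto',
    size=32,         # layer size per state, before concatenation
    depth=3,         # number of layers per state, before concatenation
    final_size=64,   # layer size after concatenation
    final_depth=1,   # number of layers after concatenation
    rnn=False        # do not add an LSTM cell as last layer
)

# Would be passed as policy=dict(network=auto_network) to Agent.create.
```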

Details about the network layer architecture (policy, baseline, state-preprocessing) can be accessed via agent.get_architecture().

Note that the final action/value layer of the policy/baseline network is added implicitly. The network output can therefore be of arbitrary size and use any activation function; it is only required to be a rank-one embedding vector or, in the case of a higher-rank action shape, optionally to have the same shape as the action.

Multi-input and other non-sequential networks are specified as a nested list of lists of layers, where each inner list forms a sequential component of the overall network architecture. The following example illustrates how to specify such a more complex network, using the special layers Register and Retrieve to combine the sequential components:

Agent.create(
    states=dict(
        observation=dict(type='float', shape=(16, 16, 3), min_value=-1.0, max_value=1.0),
        attributes=dict(type='int', shape=(4, 2), num_values=5)
    ),
    ...
    policy=[
        [
            dict(type='retrieve', tensors=['observation']),
            dict(type='conv2d', size=32),
            dict(type='flatten'),
            dict(type='register', tensor='obs-embedding')
        ],
        [
            dict(type='retrieve', tensors=['attributes']),
            dict(type='embedding', size=32),
            dict(type='flatten'),
            dict(type='register', tensor='attr-embedding')
        ],
        [
            dict(
                type='retrieve', aggregation='concat',
                tensors=['obs-embedding', 'attr-embedding']
            ),
            dict(type='dense', size=64)
        ]
    ],
    ...
)
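
Because the components are connected only through tensor names, a typo in a register or retrieve layer is easy to miss. The following sketch checks a nested specification before it is handed to the agent. check_tensor_wiring is a hypothetical helper, not part of Tensorforce, and it assumes that components are processed in list order, so a tensor must be registered before it is retrieved.

```python
def check_tensor_wiring(network, state_names):
    """Check that every retrieved tensor is a state or was registered earlier.

    Hypothetical helper, not part of Tensorforce. `network` is a list of
    lists of layer specifications; returns the list of unresolved names.
    """
    available = set(state_names)
    unresolved = []
    for component in network:
        for layer in component:
            if layer.get('type') == 'retrieve':
                unresolved.extend(
                    name for name in layer['tensors'] if name not in available
                )
            elif layer.get('type') == 'register':
                available.add(layer['tensor'])
    return unresolved


# Deliberately incomplete specification: nothing registers 'attr-embedding'.
network = [
    [
        dict(type='retrieve', tensors=['observation']),
        dict(type='flatten'),
        dict(type='register', tensor='obs-embedding')
    ],
    [
        dict(
            type='retrieve', aggregation='concat',
            tensors=['obs-embedding', 'attr-embedding']
        ),
        dict(type='dense', size=64)
    ]
]

missing = check_tensor_wiring(network, state_names=['observation', 'attributes'])
print(missing)  # ['attr-embedding']
```

An empty result means every retrieve layer refers to a state name or to a tensor registered by an earlier layer.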

In the case of multiple action components, some policy types, like parametrized_distributions, support the specification of additional network outputs for some/all actions via registered tensors:

Agent.create(
    ...
    actions=dict(
        action1=dict(type='int', shape=(), num_values=5),
        action2=dict(type='float', shape=(), min_value=-1.0, max_value=1.0)
    ),
    ...
    policy=dict(
        type='parametrized_distributions',
        network=[
            dict(type='dense', size=64),
            dict(type='register', tensor='action1-embedding'),
            dict(type='dense', size=64)
            # Final output implicitly used for remaining actions
        ],
        single_output=False
    ),
    ...
)

class tensorforce.core.networks.AutoNetwork(*, size=64, depth=2, final_size=None, final_depth=1, rnn=False, device=None, l2_regularization=None, name=None, inputs_spec=None, outputs=None, internal_rnn=None)

Network whose architecture is automatically configured based on input types and shapes, offering high-level customization (specification key: auto).

Parameters:
  • size (int > 0) – Layer size, before concatenation if multiple states (default: 64).
  • depth (int > 0) – Number of layers per state, before concatenation if multiple states (default: 2).
  • final_size (int > 0) – Layer size after concatenation if multiple states (default: value of size).
  • final_depth (int > 0) – Number of layers after concatenation if multiple states (default: 1).
  • rnn (false | parameter, int >= 0) – Whether to add an LSTM cell with internal state as last layer, and if so, horizon of the LSTM for truncated backpropagation through time (default: false).
  • device (string) – Device name (default: inherit value of parent module).
  • l2_regularization (float >= 0.0) – Scalar controlling L2 regularization (default: inherit value of parent module).
  • name (string) – internal use.
  • inputs_spec (specification) – internal use.
  • outputs (iter[string]) – internal use.

class tensorforce.core.networks.LayeredNetwork(layers, *, device=None, l2_regularization=None, name=None, inputs_spec=None, outputs=None)

Network consisting of Tensorforce layers (specification key: custom or layered), which can be specified either as a list of layer specifications in the case of a standard sequential layer-stack architecture, or as a list of lists of layer specifications in the case of a more complex architecture consisting of multiple sequential layer-stacks.

Note that the final action/value layer of the policy/baseline network is added implicitly. The network output can therefore be of arbitrary size and use any activation function; it is only required to be a rank-one embedding vector or, in the case of a higher-rank action shape, optionally to have the same shape as the action.

Parameters:
  • layers (iter[specification] | iter[iter[specification]]) – Layers configuration, see the layers documentation (required).
  • device (string) – Device name (default: inherit value of parent module).
  • l2_regularization (float >= 0.0) – Scalar controlling L2 regularization (default: inherit value of parent module).
  • name (string) – internal use.
  • inputs_spec (specification) – internal use.
  • outputs (iter[string]) – internal use.

class tensorforce.core.networks.KerasNetwork(*, model, device=None, l2_regularization=None, name=None, inputs_spec=None, outputs=None, **kwargs)

Wrapper class for networks specified as a Keras model (specification key: keras).

Parameters:
  • model (tf.keras.Model) – Keras model (required).
  • device (string) – Device name (default: inherit value of parent module).
  • l2_regularization (float >= 0.0) – Scalar controlling L2 regularization (default: inherit value of parent module).
  • name (string) – internal use.
  • inputs_spec (specification) – internal use.
  • outputs (iter[string]) – internal use.
  • kwargs – Arguments for the Keras model.