Layers

[Figure: a feedforward neuron combining weighted inputs from the previous layer.]

In a standard feedforward neural network layer, each node \(i\) in layer \(k\) receives inputs from all nodes in layer \(k-1\), then transforms the weighted sum of these inputs:

\[z_i^k = \sigma\left( b_i^k + \sum_{j=1}^{n_{k-1}} w^k_{ji} z_j^{k-1} \right)\]

where \(\sigma: \mathbb{R} \to \mathbb{R}\) is an activation function.
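
To make the indexing concrete, here is a minimal numpy sketch of this transform (the function and variable names are illustrative, not part of theanets):

import numpy as np

def feedforward(z_prev, W, b, sigma=np.tanh):
    # z_prev: activations z^{k-1} of the previous layer, shape (n_{k-1},)
    # W: weight matrix w^k, shape (n_{k-1}, n_k)
    # b: bias vector b^k, shape (n_k,)
    return sigma(b + np.dot(z_prev, W))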

In addition to standard feedforward layers, other types of layers are also commonly used:

Available Layers

This module contains classes for different types of network layers.

Layer([name]): Base class for network layers.
Input([name, ndim, sparse]): A layer that receives external input data.
Concatenate([name]): Concatenate multiple inputs along the last axis.
Flatten([name]): Flatten all but the batch index of the input.
Product([name]): Multiply several inputs together elementwise.
Reshape([name]): Reshape an input to have different numbers of dimensions.

Feedforward

Feedforward layers for neural network computation graphs.

Classifier(**kwargs): A classifier layer performs a softmax over a linear input transform.
Feedforward([name]): A feedforward neural network layer performs a transform of its input.
Tied(partner, **kwargs): A tied-weights feedforward layer shadows weights from another layer.

Convolution

Convolutional layers “scan” over input data.

Conv1(filter_size[, stride, border_mode]): 1-dimensional convolutions run over one data axis.
Conv2(filter_size[, stride, border_mode]): 2-dimensional convolutions run over two data axes.
Pool1([name]): 1-dimensional pooling over one data axis.
Pool2([name]): 2-dimensional pooling over two data axes.
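
For instance, a 1-dimensional convolution must be given its filter_size when it is created. A hedged sketch, following the theanets.Layer.build pattern shown later on this page (the unit count and filter width are made up, and exact constructor requirements may vary by version):

import theanets

# Build a Conv1 layer of 8 units with a width-3 convolution filter.
layer = theanets.Layer.build('conv1', size=8, filter_size=3)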

Recurrent

Recurrent layers introduce time dependencies into the computation graph.

RNN([h_0]): Standard recurrent network layer.
RRNN([rate]): An RNN with an update rate for each unit.
MUT1([h_0]): “MUT1” evolved recurrent layer.
GRU([h_0]): Gated Recurrent Unit layer.
LSTM([c_0]): Long Short-Term Memory (LSTM) layer.
MRNN([factors]): A recurrent network layer with multiplicative dynamics.
SCRN([rate, s_0, context_size]): Structurally Constrained Recurrent Network layer.
Clockwork(periods, **kwargs): A Clockwork RNN layer updates “modules” of neurons at specific rates.
Bidirectional([worker]): A bidirectional recurrent layer runs worker models forward and backward.
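
Selecting one of these forms works just like any other layer specification. A brief sketch, assuming the recurrent model classes in theanets.recurrent (the sizes here are made up):

import theanets

# A recurrent regressor: 10 inputs, an LSTM hidden layer of 20 units,
# and an output layer of 2 units. The string 'lstm' in the layer
# tuple selects the LSTM form listed above.
net = theanets.recurrent.Regressor(layers=(10, (20, 'lstm'), 2))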

Layer Attributes

Now that we’ve seen how to specify layers in your model, we’ll look at the attributes that can be customized for each one. For many of these settings, you’ll want to use a dictionary (or create a theanets.Layer instance yourself) to specify non-default values; see the sketch following this list.

  • size: The number of “neurons” in the layer. This value must be specified when creating the layer, either as a plain integer or as a tuple containing an integer.

  • form: A string specifying the type of layer to use (see above). This defaults to “feedforward” but can be the name of any existing theanets.Layer subclass (including Custom Layers that you have defined).

  • name: A string name for the layer. If this isn’t provided when creating a layer, the layer will be assigned a default name. The default names for the first and last layers in a network are 'in' and 'out' respectively, and the layers in between are assigned the name 'hidN', where N is the number of existing layers.

    If you create a layer instance manually, the default name is 'layerN' where N is the number of existing layers.

  • activation: A string describing the activation function to use for the layer. This defaults to 'relu'.

  • inputs: An integer or dictionary describing the sizes of the inputs that this layer expects. This is normally optional and defaults to the size of the preceding layer in a chain-like model. However, providing a dictionary here permits arbitrary layer interconnections. See Computation Graphs for more details.

  • mean: A float specifying the mean of the initial parameter values to use in the layer. Defaults to 0. This value applies to all parameters in the model that don’t have mean values specified for them directly.

  • mean_ABC: A float specifying the mean of the initial parameter values to use in the layer’s 'ABC' parameter. Defaults to 0. This can be used to specify the mean of the initial values used for a specific parameter in the model.

  • std: A float specifying the standard deviation of the initial parameter values to use in the layer. Defaults to 1. This value applies to all parameters in the model that don’t have standard deviations specified directly.

  • std_ABC: A float specifying the standard deviation of the initial parameter values to use in the layer’s 'ABC' parameter. Defaults to 1. This can be used to specify the standard deviation of the initial values used for a specific parameter in the model.

  • sparsity: A float giving the proportion of parameter values in the layer that should be initialized to zero. Nonzero values in the parameters will be drawn from a Gaussian with the specified mean and standard deviation as above, and then an appropriate number of these parameter values will randomly be reset to zero to make the parameter “sparse.”

  • sparsity_ABC: A float giving the proportion of values in the layer’s 'ABC' parameter that should be initialized to zero. This can be used to set the initial sparsity level for a specific parameter in the layer.

  • diagonal: A float or vector of floats used to initialize the parameters in the layer. If this is provided, weight matrices in the layer will be initialized to all zeros, with this value or values placed along the diagonal.

  • diagonal_ABC: A float or vector of floats used to initialize the parameters in the layer’s 'ABC' parameter. If this is provided, the relevant weight matrix in the layer will be initialized to all zeros, with this value or values placed along the diagonal.

  • rng: An integer seed or numpy random number generator. If specified, the given generator (or a generator seeded with the given integer) will be used to create the initial values for the parameters in the layer. This can be useful for repeatable runs of a model.

In addition to these configuration values, each layer can also be provided with keyword arguments specific to that layer. For example, the MRNN recurrent layer type requires a factors argument, and the Conv1 1D convolutional layer requires a filter_size argument.
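
Putting several of these attributes together, here is a hedged sketch of specifying a non-default hidden layer with a dictionary (the model class, sizes, and values are chosen only for illustration):

import theanets

net = theanets.Regressor(layers=[
    10,                      # input layer: a bare integer gives just a size
    dict(size=20,            # hidden layer with non-default attributes
         activation='tanh',  # override the default 'relu'
         std=0.1,            # draw initial weights from N(0, 0.1^2)
         sparsity=0.9,       # then reset 90% of them to zero
         name='hid1'),
    2,                       # output layer
])

Layer-specific keyword arguments ride along in the same dictionary; a Conv1 layer’s filter_size, for example, would be included alongside its size and form.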

Custom Layers

Layers are the real workhorses in theanets; custom layers can be created to do all sorts of fun stuff. To create a custom layer, just subclass theanets.Layer and give it the functionality you want.

As a very simple example, let’s suppose you wanted to create a normal feedforward layer but did not want to include a bias term:

import theanets
import theano.tensor as TT

class NoBias(theanets.Layer):
    def transform(self, inputs):
        # Compute the layer output: a weighted sum of the inputs,
        # with no bias term added.
        return TT.dot(inputs, self.find('w'))

    def setup(self):
        # Create only a weight matrix; a standard feedforward layer
        # would also add a bias vector here.
        self.add_weights('w', nin=self.input_size, nout=self.output_size)

Once you’ve set up your new layer class, it will automatically be registered and made available through theanets.Layer.build under the lowercase name of your class:

layer = theanets.Layer.build('nobias', size=4)

or, while creating a model:

net = theanets.Autoencoder(
    layers=(4, (3, 'nobias', 'linear'), (4, 'tied', 'linear')),
)

This example shows how easy it is to create a PCA-like model that learns the subspace of your dataset with the most variance, the same subspace spanned by the principal components.
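
To see the PCA analogy in action, you could train this model on some data and inspect the learned weights. A hedged sketch (the random data serves only as a stand-in for a real dataset):

import numpy as np

# Fake dataset: 1000 samples of 4-dimensional input, float32 for theano.
samples = np.random.randn(1000, 4).astype('f')

# Train the autoencoder defined above with one of the optimization
# algorithms that theanets provides.
net.train([samples], algo='rmsprop')

# The hidden layer's weights should approximately span the same
# subspace as the top principal components of the data.
weights = net.find('hid1', 'w').get_value()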