This code is an implementation of the feedforward neural network described in:
L. K. Saul and M. I. Jordan (2000). Attractor dynamics in feedforward neural networks. Neural Computation 12:6, 1313-1335.
Also chapter 5 of the book "Graphical Models: Foundations of Neural Computation".
The Code
The code is written in Java 2 using SDK v1.4.
There are 3 source files:
There are two utility files:
API Summary:
[see also the example "main" in SJNet]
You can create a net with any number of layers, each with any number of nodes.
To train (using the attractor dynamics discussed in the paper), you essentially use three methods:
1) Clamp(mask, data)
2) ComputeNet()
3) AdjustWeights()
where "data" and "mask" are ragged arrays of doubles and booleans respectively, with the same dimensions as the net. Visible (or "evidence") nodes are "clamped" wherever mask holds a "true" value. Learning rates, time step size and the number of iterations per data presentation are passed in to these functions, but default values are available (see below).
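The clamp / relax / adjust cycle described above can be sketched as follows. This is a minimal sketch, not the SJNet source: the class ClampSketch, the interface TrainableNet and the helper names are hypothetical, and the three methods are shown without the optional learning-rate, time-step and iteration arguments, whose exact signatures are not given here. It only shows how a ragged mask and data array with the same shape as the net might be built and passed through the three calls.

```java
// Hypothetical sketch: builds ragged mask/data arrays and drives the three
// training calls. None of these names are taken from the SJNet source.
class ClampSketch {

    // Stand-in for the three training methods named in this README
    // (assumed signatures, default parameters only).
    interface TrainableNet {
        void Clamp(boolean[][] mask, double[][] data);
        void ComputeNet();
        void AdjustWeights();
    }

    // Ragged mask with the net's shape; only the chosen layer is clamped.
    static boolean[][] maskForLayer(int[] layerSizes, int clampedLayer) {
        boolean[][] mask = new boolean[layerSizes.length][];
        for (int l = 0; l < layerSizes.length; l++) {
            mask[l] = new boolean[layerSizes[l]]; // defaults to false (unclamped)
            if (l == clampedLayer) {
                for (int i = 0; i < mask[l].length; i++) {
                    mask[l][i] = true;
                }
            }
        }
        return mask;
    }

    // Ragged data array with the net's shape; the pattern fills one layer,
    // all other entries stay 0.0 and are ignored where the mask is false.
    static double[][] dataForLayer(int[] layerSizes, int clampedLayer, double[] pattern) {
        if (pattern.length != layerSizes[clampedLayer]) {
            throw new IllegalArgumentException("pattern does not fit the clamped layer");
        }
        double[][] data = new double[layerSizes.length][];
        for (int l = 0; l < layerSizes.length; l++) {
            data[l] = new double[layerSizes[l]];
        }
        System.arraycopy(pattern, 0, data[clampedLayer], 0, pattern.length);
        return data;
    }

    // For every pattern in every epoch: clamp, relax, then update the weights.
    static void train(TrainableNet net, int[] layerSizes, int clampedLayer,
                      double[][] patterns, int epochs) {
        boolean[][] mask = maskForLayer(layerSizes, clampedLayer);
        for (int e = 0; e < epochs; e++) {
            for (int p = 0; p < patterns.length; p++) {
                net.Clamp(mask, dataForLayer(layerSizes, clampedLayer, patterns[p]));
                net.ComputeNet();
                net.AdjustWeights();
            }
        }
    }
}
```

To clamp individual nodes rather than a whole layer, set only the corresponding entries of the mask to true.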
To run (or test) the net, you can use either
ForwardRun()
which uses the standard one-pass feed-forward computation, where each node value is the weighted sum of nodes in the previous layer,
or
ComputeNet()
which uses the same attractor dynamics that are used for training. ComputeNet() can be used, for example, to "run the net backwards" by clamping the output layer and then examining the input layer. With ComputeNet(), any nodes in any layer can be treated as input or output (by clamping a pattern as "input" with an appropriate mask, and examining the others).
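As an illustration of the one-pass computation, the sketch below propagates an input through the layers using plain weighted sums, as described above. It is not the ForwardRun() source: the class name ForwardPassSketch and the weight layout weights[l][j][i] (weight from node i of layer l to node j of layer l + 1) are assumptions, and any squashing function or bias term that SJNet may apply is left out because it is not specified here.

```java
// Hypothetical sketch of a one-pass feed-forward computation.
class ForwardPassSketch {

    // Returns the values of every layer; values[0] is a copy of the input.
    // weights[l][j][i] is the assumed weight from node i of layer l
    // to node j of layer l + 1.
    static double[][] forward(double[] input, double[][][] weights) {
        double[][] values = new double[weights.length + 1][];
        values[0] = (double[]) input.clone();
        for (int l = 0; l < weights.length; l++) {
            values[l + 1] = new double[weights[l].length];
            for (int j = 0; j < weights[l].length; j++) {
                double sum = 0.0;
                for (int i = 0; i < values[l].length; i++) {
                    sum += weights[l][j][i] * values[l][i];
                }
                values[l + 1][j] = sum; // plain weighted sum, no squashing
            }
        }
        return values;
    }
}
```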
To retrieve values from the net after either kind of run, use
GetValLayer(layer_num)
- after a ForwardRun() run
or
GetMuLayer(layer_num)
- after a ComputeNet() run
Note:
1) These nets are typically used (unlike the example "main" in SJNet) just as HMMs are used in speech recognition. A large collection of models is trained, each on a different class of data. The collection is then used by presenting new data to each model and choosing the model that is most likely to have generated the input signal.
Thus, in the case of this network, each class of training data is used to train a different net by "clamping" the output layer, "relaxing" the net and updating the weights over several epochs through the data. The test data is then presented to each network in the collection, and the Lyapunov value is used as a stand-in for the likelihood. The net that generates the best Lyapunov score is chosen as the net that identifies the input.
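The selection step can be sketched as a simple search over one score per trained net. This is a minimal sketch under stated assumptions: the class ModelSelectionSketch and the method bestModel are hypothetical, the scores are assumed to have been collected already (how the Lyapunov value is read from a net is not covered here), and since this note does not say whether a lower or a higher value counts as "best", the direction is left as a parameter.

```java
// Hypothetical sketch: pick the net whose score is best.
class ModelSelectionSketch {

    // scores[k] is the score of net k on the test input.
    // lowerIsBetter selects which direction counts as "best" (an assumption
    // the caller has to settle). Ties keep the earliest index.
    static int bestModel(double[] scores, boolean lowerIsBetter) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("no scores given");
        }
        int best = 0;
        for (int k = 1; k < scores.length; k++) {
            boolean better = lowerIsBetter
                    ? scores[k] < scores[best]
                    : scores[k] > scores[best];
            if (better) {
                best = k;
            }
        }
        return best;
    }
}
```

The returned index identifies the class whose net was trained on data most like the test input.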
2) Normalize greyscale inputs to the range [0,1].
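One way to do this normalization, assuming 8-bit greyscale pixels in the range 0 to 255 (the pixel format is an assumption; the class and method names are hypothetical), is to divide each value by 255:

```java
// Hypothetical sketch: scale 8-bit greyscale pixels into [0,1].
class GreyscaleSketch {

    static double[] normalize(int[] pixels) {
        double[] out = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            if (pixels[i] < 0 || pixels[i] > 255) {
                throw new IllegalArgumentException("pixel out of 8-bit range: " + pixels[i]);
            }
            out[i] = pixels[i] / 255.0; // 0 maps to 0.0, 255 maps to 1.0
        }
        return out;
    }
}
```

The resulting array can then be copied into the layer of the data array that is to be clamped.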