## SOM Algorithm

The SOM consists of two layers (Figure 7): the first (input layer) is connected to each vector of the data set, and the second (output layer) forms a two-dimensional array of nodes (computational units). In the output layer, the units of the grid (reference vectors) give an ordered representation of the distribution of the data set. The input and output layers are connected by connection intensities stored in the reference vectors. When an input vector x is sent through the network, each neuron k computes the distance between its weight vector w and the input vector x. The output layer consists of D output neurons (units), which are usually arranged into a two-dimensional grid for better visualization. Input vectors are assigned to output units according to their similarity, so the output units can be considered virtual units for the input vectors. In a two-dimensional map, rectangular and hexagonal configurations are commonly used. A hexagonal lattice is usually preferred, however, because it does not favor the horizontal and vertical directions as much as a rectangular array does.
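As a small sketch of this architecture (the grid size, species count, and random data below are invented for illustration, not taken from the source), each output unit holds a weight vector of the same dimension as the input, and an incoming sample is compared against every unit:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: a 5x4 output grid (D = 20 units), inputs with 6 species.
n_rows, n_cols, n_species = 5, 4, 6
weights = rng.random((n_rows * n_cols, n_species))  # one weight vector per unit

x = rng.random(n_species)  # one input vector (a sample's species abundances)

# Each output unit computes its Euclidean distance to the input vector;
# the unit with the smallest distance is the best matching unit (BMU).
distances = np.linalg.norm(weights - x, axis=1)
bmu = int(np.argmin(distances))
print(bmu, distances[bmu])
```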

Figure 6 Relations between observed values and calculated values by the MLP model: (a) training data set, (b) validation data set, and (c) testing data set. (d)-(f) Residuals of output values for the training, validation, and testing data sets, respectively.

Figure 7 A two-dimensional SOM. Each sphere symbolizes a neuron in the input layer or the output layer.

Among all D output neurons, the best matching unit (BMU), the one with the minimum distance between its weight vector and the input vector, is the winner. As in other clustering algorithms, many different distance measures can be applied; Euclidean distance is one of the most common. For ecological data, which have high variance and many zero-valued variables, the Bray-Curtis distance can also be applied, as in cluster analysis. For the BMU and its neighborhood neurons, the weight vectors w are updated by the SOM learning rule as follows:

w_ij(t + 1) = w_ij(t) + α(t) h_jc(t) [x_i(t) − w_ij(t)]

where w_ij(t) is the weight between a neuron i in the input layer and a neuron j in the output layer at iteration time t, α(t) is a learning-rate factor that is a decreasing function of the iteration time t, and h_jc(t) is a neighborhood function (a smoothing kernel defined over the lattice points) that defines the size of the neighborhood of the winner neuron c to be updated during the learning process. The explicit form of h_jc(t) is

h_jc(t) = exp(−‖r_j − r_c‖² / 2σ²(t))

where σ(t) is a decreasing function of iteration time t defining the width of the kernel, and ‖r_j − r_c‖ is the distance in the output map between the winner neuron c and its neighbor neuron j.
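The learning rule and the Gaussian neighborhood kernel can be sketched as follows (function names, the rectangular lattice coordinates, and parameter values are illustrative assumptions, not part of the source):

```python
import numpy as np

def neighborhood(grid_coords, winner, sigma):
    """Gaussian kernel h_jc = exp(-||r_j - r_c||^2 / (2 sigma^2)) over the lattice."""
    d2 = np.sum((grid_coords - grid_coords[winner]) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def som_update(weights, x, winner, grid_coords, alpha, sigma):
    """One step of w_ij(t+1) = w_ij(t) + alpha(t) h_jc(t) [x_i(t) - w_ij(t)]."""
    h = neighborhood(grid_coords, winner, sigma)
    return weights + alpha * h[:, None] * (x - weights)
```

Note that the kernel equals 1 at the winner itself and decays with lattice distance, so the BMU moves furthest toward the input while distant units are barely changed.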
In other words, the neighborhood function h_jc(t) depends on the iteration time t and on the radius distance r in the map between the winning unit and the unit to be updated. Various neighborhood functions are available for the choice of the shape and the adaptation of the width of the neighborhood. If the SOM is not very large, the selection of process parameters is not very crucial. Special caution, however, is required in the choice of the size of the neighborhood function. If the neighborhood is too small to start with, the map will not be ordered globally; the initial radius can even be more than half the diameter of the network. The learning process continues until a stopping criterion is met, usually when the weight vectors stabilize or when a set number of iterations is completed. This process trains the network to pattern the input vectors and preserves the connection intensities in the weight vectors. Training is usually done in two phases: first a rough training for ordering with a large neighborhood radius, and then fine-tuning with a small radius. For good statistical accuracy, the number of iteration steps must be at least 500 times the number of network units. The algorithm is summarized in Box 2.

Training the SOM classifies the input vectors by the weight vectors they are closest to and produces virtual communities in each virtual unit (output unit) of the SOM. Each unit represents a typical type of the input vectors (i.e., species composition) assigned to it, consisting of the connection intensities of each species. In other words, the weight vectors of each SOM output unit are representative of the input vectors assigned to that unit.

Box 2 SOM learning algorithm

1. Initialize the weights, w_ij(0), to small random values.
2. Present an input vector (x).
3. Compute the distance between the weight vectors and the input vector, d_j(t), for all units.
4. Determine the best matching unit (BMU; winner node), j*, for the input vector such that d_j*(t) = min(d_j(t)).
5. Determine the neighbors whose distance to the winner node on the feature map of the network is less than or equal to r(t). If a node is the winner or one of its neighbors, assign z_j = 1; otherwise z_j = 0.
6. Update the weights, w_ij(t); α(t) and r(t) are decreased with time as convergence is reached.
7. Go to step 2 and repeat the process for all input vectors until the total distance between the winner nodes and all input vectors is sufficiently small.
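The steps of Box 2 can be combined into a minimal training loop. This sketch uses a rectangular lattice, linear decay schedules for the learning rate and radius, and a Gaussian neighborhood in place of the hard 0/1 membership z_j; the function name, parameters, and schedules are illustrative choices, not prescribed by the source.

```python
import numpy as np

def train_som(data, n_rows, n_cols, n_iter=100, alpha0=0.5, seed=0):
    """Minimal SOM training loop following the steps of Box 2 (illustrative)."""
    rng = np.random.default_rng(seed)
    # Lattice coordinates of the output units (rectangular grid for simplicity).
    grid = np.array([(r, c) for r in range(n_rows) for c in range(n_cols)], float)
    # Step 1: initialize weights to small random values.
    weights = rng.random((n_rows * n_cols, data.shape[1])) * 0.1
    sigma0 = max(n_rows, n_cols) / 2.0  # large initial neighborhood radius
    for t in range(n_iter):
        frac = t / n_iter
        alpha = alpha0 * (1.0 - frac)            # learning rate decreases with t
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
        for x in data[rng.permutation(len(data))]:   # step 2: present inputs
            d = np.linalg.norm(weights - x, axis=1)  # step 3: distances
            c = int(np.argmin(d))                    # step 4: BMU (winner)
            # Step 5: Gaussian neighborhood instead of the hard 0/1 membership.
            h = np.exp(-np.sum((grid - grid[c]) ** 2, axis=1) / (2.0 * sigma**2))
            weights += alpha * h[:, None] * (x - weights)  # step 6: update
    return weights, grid
```

The two-phase training described above corresponds to the early iterations (large sigma, rough ordering) and the late iterations (small sigma, fine-tuning) of this single loop.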