The aim of this assignment is for you to demonstrate your competence in implementing Kohonen networks and your understanding of the underlying principles and issues.
You must write a program, in Java, C, C++, Python, or Matlab (other languages are OK with permission, if you have a good reason), which implements a Kohonen network for a 2-dimensional map. The input data set is a list of 2-dimensional vectors in the file data24.txt. There are 200 vectors in the file, one per line.
Your first task is to plot each of the inputs using a package such as gnuplot, or a similar plotting package that does scatter plots. I don't mind whether you use gnuplot or some other plotting tool: gnuplot is available on the CSE Linux workstations. Produce a scatter plot, where each point in the plot is a vector from the input. You will notice that the inputs form a number of clusters.
[To produce your scatter plot in gnuplot, get yourself a copy of e.g. data24.txt, and then, while in the same working directory as data24.txt, type gnuplot at the Unix prompt, and then

    plot "data24.txt"

at the gnuplot> prompt. If you had two files of plot data, say "data24.txt" and "result.dat", then you could plot the two sets of data together with

    plot "data24.txt", "result.dat"

The two data sets get different colours.
To print your plot, one way is to do the following:

    set output "filename.ps"
    set terminal postscript
    replot
    set output
    set terminal x11

where filename.ps is the name of the output file. Then, back at the Unix prompt:

    lpr -P <your favourite printer> filename.ps

Type exit to leave gnuplot.]
The Unix utility ps2epsi may be useful in assembling your plots into a submittable document.
If you prefer .png image files, you can do it this way:

    set output "filename.png"
    set terminal png
    plot "data24.txt", "myresults.dat"
The program that you write will produce a self-organised map to cover the clusters. The output map is a two-dimensional array of neurons of size OutputSizeX by OutputSizeY. The weights connecting the input units to the output map should be initialised to random values in the range -0.1 to 0.1.
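As an illustration, the map and its weights could be initialised as follows. This is a minimal Python sketch; the dict-of-coordinates representation and the constant names are just one possible choice, not something the assignment mandates:

```python
import random

INPUT_SIZE = 2      # dimensionality of each input vector
OUTPUT_SIZE_X = 4   # map width  (illustrative values)
OUTPUT_SIZE_Y = 2   # map height

# weights[(x, y)][i] = weight from input i to map unit (x, y),
# initialised uniformly at random in [-0.1, 0.1]
weights = {
    (x, y): [random.uniform(-0.1, 0.1) for _ in range(INPUT_SIZE)]
    for x in range(OUTPUT_SIZE_X)
    for y in range(OUTPUT_SIZE_Y)
}
```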
Recall that for each input vector, a Kohonen network adapts by finding the winning (best-matching) output neuron and then adjusting the weights of the winner and its neighbours. The algorithm you will use (which is described below) differs in some regards from the one presented in lectures: it is a little simpler, the main change being that the convergence phase is omitted.
For the purposes of explaining the algorithm, let w[i][j] be the weight from input i to output j, and let x[i] be the value of the i-th component of an input vector x. To compute the similarity match, compute the following for each output (i.e. map) neuron,
m[j] = sqrt( sum over i of (w[i][j] - x[i])^2 )
The winning output unit is the one with the smallest value for m.
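The similarity match and winner selection could be sketched in Python like this (assuming, purely for illustration, that weights maps each unit's (x, y) map coordinates to its weight vector):

```python
import math

def find_winner(x, weights):
    """Return the map coordinates of the unit whose weight vector is
    closest, in Euclidean distance, to the input vector x."""
    best, best_m = None, float("inf")
    for j, w in weights.items():
        # m[j] = sqrt( sum over i of (w[i][j] - x[i])^2 )
        m = math.sqrt(sum((w[i] - x[i]) ** 2 for i in range(len(x))))
        if m < best_m:
            best, best_m = j, m
    return best
```

Since sqrt is monotonic, comparing squared distances would pick the same winner and save the sqrt calls.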
To adjust the weights, for each weight, w[i][j], you will need to do the following
w[i][j] := w[i][j] + LR * h * (x[i] - w[i][j])
The parameter LR will have an initial value and decrease throughout processing. The parameter h is a function of the distance between j and the winning neuron, and is given by
h = exp( -d^2 / (2 * nbdwidth^2) )
where d is the Euclidean distance, in the map, between j and the winning neuron. For example, if the winning neuron was (1,2) and j is (0,4), then
d = sqrt((1 - 0)^2 + (2 - 4)^2) = sqrt(5) = 2.236
The parameter nbdwidth will have an initial value and decrease throughout processing.
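Putting the winner search, neighbourhood function, and weight adjustment together, one adaptation step for a single input vector might look like the following sketch (the dict-of-coordinates representation is an illustrative assumption; lr and nbdwidth are the current parameter values):

```python
import math

def update_weights(x, weights, lr, nbdwidth):
    """One Kohonen adaptation step for a single input vector x:
    find the winning unit, then move every unit's weight vector
    toward x, scaled by lr and the neighbourhood function h."""
    # Winner: the unit with the smallest (squared) distance to x.
    winner = min(
        weights,
        key=lambda j: sum((weights[j][i] - x[i]) ** 2 for i in range(len(x))),
    )
    for j, w in weights.items():
        # d^2 = squared Euclidean distance, in the map, from j to the winner
        d2 = (j[0] - winner[0]) ** 2 + (j[1] - winner[1]) ** 2
        h = math.exp(-d2 / (2 * nbdwidth ** 2))
        for i in range(len(x)):
            # w[i][j] := w[i][j] + LR * h * (x[i] - w[i][j])
            w[i] += lr * h * (x[i] - w[i])
```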
The assignment will use exponentially decaying values for LR and nbdwidth. The initial values should be set to InitialLR and InitialNbdWidth respectively. At the end of each epoch, adjust the parameters using

    LR = LR * (1 - LRDecay)
    nbdwidth = nbdwidth * (1 - NbdWidthDecay)
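The decay rule can be sketched as below. Note how quickly the parameters shrink: with the run-1 values given later (InitialLR = 0.1, LRDecay = 0.001), the learning rate after 5000 epochs is roughly 0.1 * 0.999^5000, about 6.7e-4.

```python
def decay_run(initial_lr, lr_decay, initial_nbd, nbd_decay, epochs):
    """Apply the end-of-epoch decay rule for the given number of
    epochs and return the final (lr, nbdwidth) values."""
    lr, nbd = initial_lr, initial_nbd
    for _ in range(epochs):
        lr *= 1 - lr_decay
        nbd *= 1 - nbd_decay
    return lr, nbd
```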
Your program should define at least the following constants (the third column gives the corresponding symbol from lectures):

    InputSize        The size of the input vectors (2)               m
    OutputSizeX      The size of the output map in the x dimension
    OutputSizeY      The size of the output map in the y dimension
    InitialLR        The initial value for learning rate             η0
    LRDecay          The decay rate for learning rate                1 - exp(-1/τ2)
    InitialNbdWidth  The initial value for neighbourhood width       σ0
    NbdWidthDecay    The decay rate for neighbourhood width          1 - exp(-1/τ1)
You will need to write code to read the input file and write out the weight matrix. The output should be one line for each output unit, j, in the following format:

    (MapX[j],MapY[j]) w[1][j] w[2][j]

where MapX[j] is the x position of j in the output map, MapY[j] is the y position of j in the output map, and w[1][j] w[2][j] are the weights from the inputs to unit j. For an example, see sampleout.txt. The weight matrices, in the format shown in sampleout.txt, should be included in your report.
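Writing the weight matrix in that format might look like the following sketch (the exact spacing and number formatting of sampleout.txt may differ from what is assumed here; adjust to match the sample):

```python
def write_map(filename, weights):
    """Write one line per output unit: its (x,y) map position
    followed by its weights, space-separated."""
    with open(filename, "w") as f:
        for (x, y) in sorted(weights):
            w = weights[(x, y)]
            f.write(f"({x},{y}) " + " ".join(f"{wi:.6f}" for wi in w) + "\n")
```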
For each run, you should use 5000 epochs. An epoch is one presentation of each input vector (with the consequent updates of the weights).
Presenting the input vectors in a fixed order may bias the learning process, so you should randomise the order of the input vectors at the beginning of each epoch.
Randomisation is, in a sense, the opposite of sorting. You can adapt the selection-sort algorithm to randomise your vectors. Here is pseudocode for randomising N objects:

    for i = 1 to N-1 do
        select an object at random from among object[i] ... object[N]
        swap the selected object with object[i]
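That randomisation is a Fisher-Yates shuffle; in Python it might look like this (note that position i itself must be among the candidates for the swap, otherwise the resulting permutations are biased):

```python
import random

def randomise(objects):
    """In-place Fisher-Yates shuffle: at each step, swap position i
    with a randomly chosen position in i..N-1 (including i itself,
    so that every permutation is equally likely)."""
    n = len(objects)
    for i in range(n - 1):
        k = random.randint(i, n - 1)
        objects[i], objects[k] = objects[k], objects[i]
```

Python's built-in random.shuffle does the same job, but coding it yourself is in the spirit of the assignment.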
Run your program on the provided test data, data24.txt, for 5000 epochs using a 4 x 2 output map, with the following parameter values:

    InitialLR        0.1
    LRDecay          0.001
    InitialNbdWidth  2.0
    NbdWidthDecay    0.001
Use your plotting application to plot the final weights (i.e., a scatter plot of columns 2 and 3 of the output). How do the weights relate to the clusters? What is the relationship between the output map and clusters?
Change the output map to 4 x 4, and re-run. Plot the output weights as before and describe how the new map relates to the clusters. What is your explanation for the results?
Run the program again with a 2 x 3 map. Plot the output weights as before and describe how the weights map to the clusters. What is your explanation for the results?
Reconfigure your program to use a 4 x 2 output map (as per run 1), but use the input file data33.txt. What observations do you make? In particular, what is the relationship between the topology of the input cluster and the topology of the resulting map?
Extend your program to handle 1-D SOMs, and run 1-D SOM training on the input file data33.txt. What observations do you make?
You may want to refer to the course notes on Kohonen networks.
One of the aims of the assignment is to help you wrap your head around self-organising maps, and you will probably get the most benefit from it if you code it yourself from scratch. There will be a question on SOMs on the final exam, and doing this assignment will help you prepare for it. Whatever programming language you use, you must write useful comments and use meaningful identifiers in your code!
This assignment will contribute to 10% of your final mark (subject to the assessment scheme for this course).
You will be assessed on the following points:
Your code should not depend on library code that provides SOM support.
It is not necessary that you use arrays in your program in the same fashion as the formulas presented above. It is up to you to select appropriate data structures.
Note: you will not be assessed on the volume submitted! It is expected that your results and discussion will be no more than 5 pages (including diagrams) and your program less than 5 pages. Any more than this and you may be wasting your time and making it difficult for the marker to give you a good mark.
Remember the aim of the assignment. This is your opportunity to demonstrate your competence in implementing Kohonen networks and your understanding of the principles and issues. Filling your report with reams of output or trivial analysis will not help your case.
You should aim for depth of analysis in your report. Do not treat it as a question and answer exercise. The questions above are meant to be prompts to encourage you to think about what is happening in these explorations.
You will need to submit, via give (see below), your code and also a PDF report containing your results and discussion, including the weight matrices and plots described above.
Submit your work using the following give command on a CSE Linux workstation:

    give cs9444 kohonen som.c som.pdf

(you can replace som.c with som.java, som.cpp, etc.)
Deadline: 11.55pm on Friday 1 November 2013
© Bill Wilson and the University of New South Wales, 2013