It is best not to read the answers until you've tried to answer the questions yourself.
Answer: Two or more layers:
Layer 1: input layer - units are undifferentiated, and take binary
values (active and inactive).
Layer 2 and beyond: units are partitioned into clusters of mutually inhibitory,
winner-take-all units: activations lie between 0 and 1.
Inter-layer Connections: every unit in a non-input layer has an
incoming connection from every unit in the layer below.
Incoming weights to each unit from the layer below sum to 1.
All such weights are positive.
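The constraints above can be illustrated with a minimal sketch. It is not taken from the original answer: the function names (init_weights, cluster_winner), the network size, and the convention that the winner's activation is set to 1 and the losers' to 0 are all assumptions made for illustration.

```python
import random

def init_weights(n_inputs, rng):
    """Random incoming weights for one unit: all positive, summing to 1."""
    # The small offset keeps every weight strictly positive (illustrative choice).
    raw = [rng.random() + 1e-9 for _ in range(n_inputs)]
    total = sum(raw)
    return [r / total for r in raw]

def cluster_winner(cluster_weights, pattern):
    """Index of the unit in one cluster with the largest net input."""
    nets = [sum(w * x for w, x in zip(unit, pattern)) for unit in cluster_weights]
    return max(range(len(nets)), key=nets.__getitem__)

rng = random.Random(0)
# One 2-unit cluster fully connected to 4 binary input units (sizes are assumed).
cluster = [init_weights(4, rng) for _ in range(2)]
pattern = [1, 0, 1, 0]  # binary input pattern: 1 = active, 0 = inactive

winner = cluster_winner(cluster, pattern)
# Assumed winner-take-all outcome: winner gets 1, every other unit in the cluster gets 0.
activations = [1.0 if j == winner else 0.0 for j in range(len(cluster))]
```

Each cluster in a layer would run its own competition in the same way, so a layer with several clusters has exactly one winner per cluster.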
Answer: Weights are initialised randomly (subject to the constraints mentioned in Q1). Only winning units learn. Learning occurs by adjusting the weights between layers: each winning unit gives up a proportion g of each of its incoming weights, and the total given up is redistributed equally among its connections from active units. Because what is removed is exactly what is redistributed, the incoming weights still sum to 1. The rule can be modified so that units that never win still learn.
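As a rough sketch of the update for a single winning unit, assuming the redistribution is shared equally among the active inputs: the function name update_winner and the example values are hypothetical, and the modification for units that never win is not shown.

```python
def update_winner(weights, pattern, g):
    """Return the updated incoming weights of a winning unit.

    weights: current incoming weights (positive, summing to 1)
    pattern: binary input pattern (1 = active, 0 = inactive)
    g:       proportion of each weight that the unit gives up
    """
    n_active = sum(pattern)
    if n_active == 0:
        # No active inputs to receive weight; leave the unit unchanged (assumed).
        return list(weights)
    # Every weight loses g * w; each active line gains an equal share g / n_active.
    return [w - g * w + g * x / n_active for w, x in zip(weights, pattern)]

old = [0.25, 0.25, 0.25, 0.25]
new = update_winner(old, [1, 0, 1, 0], g=0.2)
# Weights on the two active lines rise to about 0.3; the inactive ones fall to about 0.2.
```

The total given up is g times the sum of the weights, which is g, and the total handed back is also g, so the sum of the incoming weights stays at 1 after every update.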
Answer: Clusters learn to act as feature detectors, and a k-unit cluster becomes a detector for a k-valued feature.
Copyright © Bill Wilson, 2009.