Abstract:
In recent years it has been shown that first order recurrent neural networks
trained by gradient-descent can learn not only regular but also simple
context-free and context-sensitive languages.
However, the success rate was generally low and severe instability issues
were encountered.
...
Comparative experiments with and without incremental learning indicated that
incremental learning can accelerate and facilitate training.
Furthermore, incrementally trained networks generally produced
monotonic trajectories in hidden unit activation space, while the
trajectories of non-incrementally trained networks oscillated.
The non-incrementally trained networks were more likely to generalise.
Abstract:
Loanword formation seems to provide a good test bed for the growing
field of computational phonology, since it occurs in a more tightly
controlled environment than other language processing tasks.
We show how feedforward neural networks and decision trees can be
trained to predict the phonological structure of English loanwords
in Japanese, and compare the performance of the two paradigms.
In each case the system produces a phonemic representation of the
Japanese form, after receiving as input the phonological feature
matrix of the current and surrounding phonemes.
Performance is improved by including information about the stress
pattern, the orthography of reduced vowels, and the location of word
boundaries.
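As a rough illustration of the input/output framing described above, a
sliding window of phoneme feature vectors can be mapped to a target
Japanese phoneme with a small feed-forward classifier. The feature
dimensions, window size, layer sizes and the one-phoneme-per-position
alignment below are illustrative assumptions, not the paper's actual
configuration.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    N_FEATURES = 12      # assumed size of one phonological feature vector
    WINDOW = 3           # current phoneme plus one neighbour on each side

    def window_input(features, i):
        """Concatenate feature vectors around position i, padding word edges."""
        pad = np.zeros((1, N_FEATURES))
        padded = np.vstack([pad, features, pad])
        return padded[i:i + WINDOW].ravel()

    def make_dataset(words):
        """words: list of (feature_matrix, target_phonemes) pairs; assumes one
        target Japanese phoneme per input position for simplicity."""
        X, y = [], []
        for features, targets in words:
            for i, target in enumerate(targets):
                X.append(window_input(features, i))
                y.append(target)
        return np.array(X), np.array(y)

    clf = MLPClassifier(hidden_layer_sizes=(40,), max_iter=500)
    # clf.fit(*make_dataset(training_words))   # train on the loanword corpus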
Abstract:
Recent work by Siegelmann has shown that the computational
power of neural networks matches that of Turing Machines.
Proofs are based on a fractal encoding of states to simulate
the memory and operations of stacks.
In the present work, it is shown that similar stack-like
dynamics can be learned in recurrent neural networks from
simple sequence prediction tasks.
Two main types of network solutions are found and described
qualitatively as dynamical systems: damped oscillation
and entangled spiraling around fixed points.
The potential and limitations of each solution type are
established in terms of generalization on two different
context-free languages.
Both solution types constitute novel stack implementations
- generally in line with Siegelmann's theoretical work
- which supply insights into how embedded structures
of languages can be handled in analog hardware.
Abstract:
Following Tesauro's work on TD-Gammon, we used a 4000-parameter
feed-forward neural network to develop a competitive backgammon
evaluation function. Play proceeds by a roll of the dice,
application of the network to all legal moves, and choosing the
move with the highest evaluation. However, no back-propagation or
reinforcement learning methods were employed. Instead we apply
simple hill-climbing in a relative fitness environment. We start
with an initial champion of all zero weights and proceed simply by
playing the current champion network against a slightly mutated
challenger and changing weights if the challenger
wins. Surprisingly, this worked rather well. We investigate how
the peculiar dynamics of this domain enabled a previously
discarded weak method to succeed, by preventing suboptimal
equilibria in a `meta-game' of self-learning.
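A minimal sketch of the champion/challenger loop described above is
given below; the mutation scale, the number of generations, the
play_game placeholder and the full-replacement acceptance rule are
assumptions for illustration rather than the paper's exact procedure.

    import numpy as np

    rng = np.random.default_rng(0)
    champion = np.zeros(4000)          # initial champion: all-zero weights

    def play_game(weights_a, weights_b):
        """Placeholder for one game: roll dice, score every legal move with
        the network, pick the highest-valued move; a coin flip stands in."""
        return rng.random() < 0.5

    for generation in range(1_000):    # far more generations in practice
        challenger = champion + rng.normal(scale=0.05, size=champion.shape)
        if play_game(challenger, champion):   # challenger beats the champion
            champion = challenger             # accept the mutated weights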
Abstract:
There is growing evidence to suggest that the success of a
co-evolutionary learning system may depend critically on the nature of
the environment in which the learner is placed, and on certain
attributes of the task domain, rather than the details of the
particular learning algorithm employed. We discuss how a learning
system can be modeled as a meta-level game between abstract entities
which we call performer, infiltrator
and evaluator. Learning can sometimes fail due to
collusive suboptimal equilibria in this meta-game of
learning. But some domains have special attributes which
seem to prevent such collusions and thereby facilitate co-evolutionary
advancement. A better understanding of these issues may help to
improve the design of co-evolutionary learning systems in the future.
Abstract:
Pollack (1991) demonstrated that second-order recurrent neural
networks can act as dynamical recognizers for formal languages
when trained on positive and negative examples, and observed both
phase transitions in learning and IFS-like fractal state
sets. Follow-on work focused mainly on the extraction and
minimization of a finite state automaton (FSA) from the trained
network. However, such networks are capable of inducing languages
which are not regular, and therefore not equivalent to any
FSA. Indeed, it may be simpler for a small network to fit its
training data by inducing such a non-regular language. But when is
the network's language not regular? In this paper, using a low
dimensional network capable of learning all the Tomita data sets,
we present an empirical method for testing whether the language
induced by the network is regular or not. We also provide a
detailed ε-machine analysis of trained networks for both
regular and non-regular languages.
Abstract:
A framework for the study of path integrals on
adelic spaces is developed, and it is shown
that a family of path space measures on the
localizations of an algebraic number field may,
under certain conditions, be combined to form a ...
Abstract:
In this paper, we present a real-time algorithm that plans mostly optimal
trajectories for multiple robots in a dynamic environment.
This approach combines the use of a Delaunay triangulation to discretise the
environment, and a novel cubic spline representation for a robot trajectory
that meets the kinematic and dynamic constraints of the robot.
We show that for complex environments the shortest-distance path is not
always the shortest-time path due to these constraints.
The algorithm has been implemented on real robots, and we present
experimental results in cluttered environments.
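The two main ingredients can be sketched with standard SciPy routines,
as below; the sampled points, waypoints and timing are placeholders,
and the plain natural cubic spline shown stands in for the paper's
novel constraint-aware trajectory representation.

    import numpy as np
    from scipy.spatial import Delaunay
    from scipy.interpolate import CubicSpline

    # Discretise the environment: triangulate sampled points / obstacle vertices.
    points = np.random.default_rng(1).uniform(0.0, 10.0, size=(50, 2))
    tri = Delaunay(points)
    print(len(tri.simplices), "triangles")    # candidate cells for path search

    # Represent a candidate trajectory as a cubic spline through waypoints,
    # parametrised by time so velocity and acceleration limits can be checked.
    times = np.array([0.0, 1.0, 2.5, 4.0])
    waypoints = np.array([[0.0, 0.0], [2.0, 3.0], [5.0, 4.0], [9.0, 8.0]])
    path = CubicSpline(times, waypoints)
    velocity_mid = path.derivative(1)(1.5)    # compare against kinematic limits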
Abstract:
We present a new paradigm extending the Iterated Prisoner's Dilemma
to multiple players. Our model is unique in granting players information
about past interactions between all pairs of players - allowing for
much more sophisticated social behaviour. We provide an overview of
preliminary results and discuss the implications in terms of the
evolutionary dynamics of strategies.
Abstract:
This paper outlines our project to construct an architecture for implementing
characters with believable personality in interactive software fiction.
There is a need for a detailed model of personality to drive the actions of
simulated character agents in areas such as simulation and the computer game
industry. While there are a number of related projects in similar areas,
this project's emphasis is on creating a versatile and efficient model,
suitable for use on standard hardware such as a home personal computer.
The paper also outlines the prototypes being developed and briefly
describes the project's future goals.
Abstract:
We report simulations of Cohen, Dunbar and McClelland's (CDM) model of the
Stroop effect, showing how damage to the network can replicate empirical
data on older individuals and those with dementia of the Alzheimer's type (DAT).
The study is significant on three counts: firstly, the specific patterns of
damage required are theoretically justifiable, leading to a deeper understanding
of the role of connectionist models in understanding DAT; secondly, the simulations
show that in specific neurodegenerative conditions, the CDM model can replicate
behavioral effects, providing increased confidence in the model and countering
recently published claims to the contrary; and thirdly, a new component of the
model, the preliminary time constant, is identified as a major factor in the
ability of our damaged model to successfully replicate empirical data, indicating
a potential role for deficits in attention to task instructions.
Abstract:
As a test-bed for studying evolutionary and other machine
learning techniques, we have developed a simulated hockey game
called Shock, in which players attempt to shoot
a puck into their enemy's goal during a fixed time period.
Multiple players may participate -
one can be controlled by a human user, while
the others are guided by artificial controllers.
In previous work, we introduced the
Shock environment and presented players that
received global input (as if from an overhead camera)
and were trained on a restricted task,
using an evolutionary hill-climbing algorithm,
with a staged learning approach.
Here, we expand upon this work by developing
players which instead receive input from local,
Braitenberg-style sensors.
These players are able to learn the task with fewer restrictions,
using a simpler fitness measure based purely
on whether or not a goal was scored.
Moreover, they evolve to develop robust strategies
for moving around the rink and scoring goals.
Abstract:
Robustness has long been recognised as a critical issue for
co-evolutionary learning. It has been achieved in a number of cases,
though usually in domains which involve some form of
non-determinism. We examine a deterministic domain
- a pseudo real-time two-player game called Tron - and evolve a
neural network player using a simple hill-climbing algorithm. The
results call into question the importance of determinism as a
requirement for successful co-evolutionary learning, and provide a
good opportunity to examine the relative importance of other factors.
Abstract:
A simulated hockey environment is introduced as a test bed for
studying adaptive behavior and evolution of robot controllers. A
near-frictionless playing surface is employed, partially mimicking
zero gravity conditions. We show how a neural network using a simple
evolutionary algorithm can develop nimble strategies for moving about
the rink and scoring goals quickly and effectively.
Abstract:
We consider a new form of co-evolutionary learning in which human
students and software tutors become partners in a cooperative learning
process. While the students are learning from the tutors, the tutors
will also be learning how to do their job more effectively through
their interactions with the students. Earlier work on game-playing
machine learners has brought to light important issues for
co-evolutionary learning; in particular, that these interactions can
be modeled as a meta-game between teachers and
students, and that learning may fail due to suboptimal
equilibria - for example, because of
collusion between the individual players - in this
meta-game of learning. We discuss some of the challenges that these
issues may raise for the designers of intelligent software tutoring
systems and describe a prototype Java applet that we have built in an
effort to explore how these challenges may best be met.
Abstract:
A simple recurrent neural network is trained on a one-step
look-ahead prediction task for symbol sequences of the
context-sensitive a^n b^n c^n language.
Using an evolutionary hill-climbing strategy for incremental learning,
the network learns to predict sequences of strings up to depth n = 12.
The experiments and the algorithms used are described. The activation
of the hidden units of the trained network is displayed in a 3-D graph
and analysed.
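A small sketch of the one-step prediction framing is given below; the
end-of-string marker and the way strings are enumerated are assumptions
for illustration only.

    def anbncn(n):
        return "a" * n + "b" * n + "c" * n

    def prediction_pairs(max_depth):
        """Yield (prefix, next_symbol) pairs for strings up to the given
        depth; '#' is an assumed end-of-string marker."""
        for n in range(1, max_depth + 1):
            s = anbncn(n) + "#"
            for i in range(1, len(s)):
                yield s[:i], s[i]

    for prefix, target in prediction_pairs(2):
        print(f"{prefix!r:>10} -> {target!r}")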
Abstract:
We develop a new framework for studying the biases that
recurrent networks bring to language processing tasks.
A semantic concept represented by a point in Euclidean space
is translated into a symbol sequence by an encoder network.
This sequence is then fed into a decoder network which
attempts to translate it back to the original concept.
We show how a pair of recurrent networks acting as encoder
and decoder can develop their own symbolic language that is
serially transmitted between them, either forwards or backwards.
The encoder and decoder bring different constraints to the task,
and the conflicting nature of these constraints, reflected in
the language that ultimately emerges, may provide important
clues to the structure of human languages.
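An untrained forward-pass sketch of such an encoder/decoder pair is
shown below; the concept dimension, hidden size, symbol inventory,
sequence length and argmax discretisation are assumptions, not the
architecture actually used in the work described.

    import numpy as np

    rng = np.random.default_rng(0)
    CONCEPT_DIM, HIDDEN, N_SYMBOLS, LENGTH = 2, 8, 4, 6

    class SimpleRNN:
        def __init__(self, n_in, n_hidden, n_out):
            self.Wx = rng.normal(scale=0.5, size=(n_in, n_hidden))
            self.Wh = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
            self.Wo = rng.normal(scale=0.5, size=(n_hidden, n_out))

        def step(self, x, h):
            h = np.tanh(x @ self.Wx + h @ self.Wh)
            return h, h @ self.Wo

    encoder = SimpleRNN(CONCEPT_DIM, HIDDEN, N_SYMBOLS)
    decoder = SimpleRNN(N_SYMBOLS, HIDDEN, CONCEPT_DIM)

    concept = rng.uniform(-1, 1, size=CONCEPT_DIM)   # point in Euclidean space
    h = np.zeros(HIDDEN)
    symbols = []
    for _ in range(LENGTH):                          # encoder emits a sequence
        h, logits = encoder.step(concept, h)
        symbols.append(np.argmax(logits))            # discretise to a symbol

    h = np.zeros(HIDDEN)
    for s in symbols:                                # decoder reads it back
        h, decoded = decoder.step(np.eye(N_SYMBOLS)[s], h)
    print(concept, decoded)                          # original vs. recovered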
Abstract:
Loanword phonology seeks to model the process by which foreign
words are `nativised' or incorporated into the phonological
system of the borrowing language ...
In this paper, we report the results obtained from a feed-forward
neural network, trained on an 1100-word corpus of American English
loanwords borrowed into Japanese...
Abstract:
Although recurrent neural networks can be trained to recognize formal
languages from positive and negative examples, it is difficult for
them to learn large and complex data sets due to the geometry of their
state space, which tends to push activations into the corners and
dilute the effect of differentials backpropagated through multiple
levels of recurrence. We propose a new framework for dynamical
recognizers in which the state space is the surface of an
n-sphere and the maps are either orthogonal or ...
Abstract:
This study shows experimentally how first
order recurrent neural networks learn to predict the mildly
context-sensitive language MA3 = {a^n b^n c^n; n >= 1}.
The training algorithm is evolutionary hill climbing with an
incremental learning strategy. The experiments explore the
generalisation abilities of the networks emerging in the process of
evolution. Generalisation with respect to the depth n of the language
strings and to their ordering in the training sequence is evaluated.
Experiments without the incremental learning strategy indicate that
incremental learning accelerates or facilitates training.
Interestingly, networks trained incrementally and non-incrementally
show qualitatively different patterns of hidden unit activity.
Training on the context-free a^n b^n language and the
context-sensitive a^n b^n c^n language is compared. The results show that
even when the networks are identical and the data sets have the
same size, learning of the context-free language is easier than
learning of the context-sensitive language.
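The training regime named above can be sketched as a loop like the one
below; the evaluate() stub, mutation scale, weight count and mastery
threshold are placeholders rather than the actual experimental
settings.

    import numpy as np

    rng = np.random.default_rng(0)

    def evaluate(weights, max_depth):
        """Return the network's prediction error on strings up to max_depth.
        Stub: build the recurrent net from `weights` and score it here."""
        raise NotImplementedError

    def mutate(weights):
        return weights + rng.normal(scale=0.1, size=weights.shape)

    def train(n_weights=200, max_depth=12, threshold=0.01):
        """Incremental schedule: master shallow strings before deeper ones."""
        weights = rng.normal(scale=0.1, size=n_weights)
        for depth in range(1, max_depth + 1):
            best = evaluate(weights, depth)
            while best > threshold:
                candidate = mutate(weights)
                error = evaluate(candidate, depth)
                if error < best:             # hill climbing: keep improvements
                    weights, best = candidate, error
        return weights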
Abstract:
Modifications to Recursive Auto-Associative Memory are presented,
which allow it to store deeper and more complex data structures than
previously reported. These modifications include adding extra layers
to the compressor and reconstructor networks, employing integer rather
than real-valued representations, pre-conditioning the weights and
pre-setting the representations to be compatible with them. The
resulting system is tested on a data set of syntactic trees extracted
from the Penn Treebank.
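A minimal sketch of a RAAM-style compressor and reconstructor with one
extra layer each, in the spirit of the modifications described, is
shown below; the representation size, activations and random untrained
weights are illustrative assumptions only.

    import numpy as np

    DIM = 20                              # assumed size of a node representation
    rng = np.random.default_rng(0)

    def layer(n_in, n_out):
        return rng.normal(scale=0.1, size=(n_in, n_out))

    W1, W2 = layer(2 * DIM, 2 * DIM), layer(2 * DIM, DIM)   # compressor layers
    V1, V2 = layer(DIM, 2 * DIM), layer(2 * DIM, 2 * DIM)   # reconstructor layers

    def compress(left, right):
        h = np.tanh(np.concatenate([left, right]) @ W1)     # extra hidden layer
        return np.tanh(h @ W2)                               # parent representation

    def reconstruct(parent):
        h = np.tanh(parent @ V1)
        out = np.tanh(h @ V2)
        return out[:DIM], out[DIM:]                          # recovered children

    leaf_a, leaf_b = rng.uniform(-1, 1, size=(2, DIM))
    parent = compress(leaf_a, leaf_b)
    left, right = reconstruct(parent)     # approximate children (untrained)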