| Topics of each lecture |
| Wk. |
Date |
Topic/Goal |
| 1 |
Tu 14 Jan |
Introduction: Models in general; connectionist models in
particular. (See web notes for the
first few lectures.)
- Modeling in general.
- Connectionist models in particular.
|
| Th 16 Jan |
- Overview of progression of architectures.
- Class demo of the PDP software: Its look and feel.
|
| 2 |
Tu 21 Jan |
- Details of files in PDP software.
- Example of modeling: a=F/m.
Linear associators
- Linear algebra:
1. Why care about linear functions? (The principle
of superposition.)
|
| Th 23 Jan |
- Linear algebra, continued:
2. Vector spaces and linear transformations thereof.
3. Matrix representation:
3.1. "Basis" of a vector space.
3.2. Representation of vectors w.r.t. a basis.
|
| 3 |
Tu 28 Jan |
- Linear algebra, continued:
3.3. Representation of linear transformations w.r.t. a basis.
4. Eigenvectors and eigenvalues: geometrically and in matrix
representation.
- Hebbian learning in linear networks:
0. Notation
|
| Th 30 Jan |
- Hebbian learning in linear networks, continued:
1. Local motivation - neurons.
2. Global motivation - gradient ascent on "goodness".
3. Properties of Hebbian learning:
3.1. Weight "explosion"
3.1.1. Weight decay - local and global motivations.
|
| 4 |
Tu 4 Feb |
- Hebbian learning in linear networks, continued:
3.2. Perfect recall for orthonormal inputs.
3.3. Output is always a linear combination of teacher patterns.
- An application of unsupervised Hebbian learning:
Center-surround receptive field development. Simulation and
analysis in terms of maximal information preservation, by Linsker.
(Not covered due to lack of time.)
- Learning by error reduction:
1. Global motivation (gradient descent on error) yields local
learning mechanism.
|
| Th 6 Feb |
- Learning by error reduction, continued:
2. Properties of the delta-rule in linear nets:
2.1. Perfect recall for linearly independent inputs.
2.2. Same as Hebbian for orthonormal inputs.
2.3. Output is always a linear combination of teacher patterns.
2.4. The case of auto-association.
- An application of delta-rule learning: Pattern completion in Kohonen et al.'s auto-encoder.
|
| 5 |
Tu 11 Feb |
- More properties of the delta-rule in linear nets:
Relation of delta rule to multiple linear regression.
Relation between the Hebb rule and delta rule.
- Levels of description in linear networks:
1. Feature basis and pattern basis.
|
| Th 13 Feb |
- Levels of description in linear networks, continued:
2. Change of basis.
3. Feature and pattern level descriptions
3.1. are isomorphic at asymptote.
3.2. are non-isomorphic under localized damage.
3.3. are non-isomorphic during learning.
3.4. are non-isomorphic when non-linearities are introduced.
|
| 6 |
Tu 18 Feb |
Single-layer non-linear networks: Perceptrons
- The perceptron defined.
- Computational abilities and limitations.
|
| Th 20 Feb |
- Learning: The perceptron convergence procedure.
- An application: Past tense acquisition by Rumelhart and
McClelland.
|
| 7 |
Tu 25 Feb |
Multi-layer non-linear networks: Backprop.
- Computational power of multi-layer networks.
1. Examples of complex functions computed by a random multi-layer network.
2. Theorem: A single hidden layer suffices for approximating any continuous function.
|
| Th 27 Feb |
- Learning by back-propagation of error.
1. Global motivation, viz., gradient descent on error,
results in local learning algorithm: the generalized delta rule.
|
| 8 |
Tu 4 Mar |
- An application: NETtalk by Sejnowski and Rosenberg.
- Analyzing hidden-layer representations
1. Cluster analysis in high-dimensionality layers (e.g., NETtalk)
2. Graphs for 2- or 3-D layers (e.g., 4-2-4 encoder)
|
| Th 6 Mar |
- Analyzing hidden-layer representations, continued:
3. "Hinton diagrams" for medium-dimensional spaces or topologically arrayed layers
(e.g., Lehky and Sejnowski's shape-from-shading network)
- Extensions of backprop and application to human category learning
1. Exemplar-based hidden nodes in ALCOVE
(catastrophic forgetting in standard backprop).
cf. Sparse Distributed Memory
|
| 9 |
Tu 11 Mar |
- Extensions of backprop and application to human category learning,
continued:
2. Dimensional attention on input nodes in ALCOVE
(lack of filtration advantage in standard backprop).
Multidimensional scaling is also described.
|
| Th 13 Mar |
- Extensions of backprop and application to human category learning,
continued:
3. Mixture of networks in ATRIUM (lack of rule-like extrapolation in
exemplar-based models).
|
| Br. |
18, 20 Mar |
Spring Break |
| 10 |
Tu 25 Mar |
Simple Recurrent Networks (SRN's).
- Cascaded activation for a fixed input.
- Sequential input/output patterns.
1. Jordan networks.
|
| Th 27 Mar |
- Sequential input/output patterns, continued:
2. Elman networks (a.k.a. SRN's).
- Applications of SRN's
1. Grammar learning (Elman)
|
| 11 |
Tu 1 Apr |
- Applications of SRN's, continued:
2. Course of learning in an SRN
|
| Th 3 Apr |
- Applications of SRN's, continued:
3. Modeling human sequence learning (Cleeremans and McClelland)
|
| 12 |
Tu 8 Apr |
Symmetric Recurrent Networks.
- Notion of constraint satisfaction by recurrent activation.
- Hopfield's proof of stability for discrete-valued activations.
|
| Th 10 Apr |
- Hopfield's proof of stability for continuous-valued activations.
|
| 13 |
Tu 15 Apr |
- Applications of constraint satisfaction networks.
Sentence disambiguation (Waltz and Pollack)
Analogy making (Holyoak and Thagard)
Stereopsis (Marr and Poggio)
|
| Th 17 Apr |
- More applications.
Dyslexia (Plaut, Hinton and Shallice)
- From goodness to network design.
If an application has a cost function that can be expressed
as a "quadratic form", then a Hopfield-type network can optimize it.
- Learning in Boltzmann machines.
Another case of global motivation resulting in a local learning rule!
Speculations about the function of (REM) sleep.
|
| 14 |
Tu 22 Apr |
Competitive Learning.
- Local motivation: Best representative moves closer.
The special case used in program cl.
Examples.
"Leaky" learning.
- Global motivation: Maximize representation.
|
| Th 24 Apr |
- Adaptive Resonance Theory (ART 1).
- Kohonen's self-organized feature maps.
|
| 15 |
Tu 29 Apr |
Modeling Psychological Data.
- The ia model of the word-superiority effect.
- An interactive activation model of face priming.
|
| Th 1 May |
|
| (Fin.) |
(Th 8 May) |
NO Final Exam for this course (NOT happening at 5:00-7:00 pm) |
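
Illustration for item 3.2 of the Hebbian learning lectures (Tu 4 Feb), "Perfect recall for orthonormal inputs": the sketch below is not taken from the PDP software, and the function names `hebb_train` and `recall` as well as the example patterns are hypothetical choices made only for this illustration. It assumes a linear associator whose weights are the sum of outer products of target and input patterns, and it checks numerically that each stored target is recovered when the input patterns are orthonormal.

```python
# Minimal sketch of Hebbian learning in a linear associator.
# Assumption: weights are accumulated as W += outer(target, input),
# with a learning rate of 1 and no weight decay.

def hebb_train(inputs, targets):
    """Return the weight matrix W = sum over patterns of outer(target, input)."""
    n_out, n_in = len(targets[0]), len(inputs[0])
    W = [[0.0] * n_in for _ in range(n_out)]
    for x, t in zip(inputs, targets):
        for i in range(n_out):
            for j in range(n_in):
                W[i][j] += t[i] * x[j]  # Hebbian update: output activity times input activity
    return W

def recall(W, x):
    """Linear recall: the output is the matrix-vector product W x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

# Two orthonormal input patterns (unit length, zero dot product).
inputs = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]
# Arbitrary teacher (target) patterns.
targets = [[2.0, -1.0], [0.5, 3.0]]

W = hebb_train(inputs, targets)
outputs = [recall(W, x) for x in inputs]

for x, t, o in zip(inputs, targets, outputs):
    print("input", x, "target", t, "recalled", o)
```

Because the inputs are orthonormal, the cross-talk terms (dot products between different input patterns) vanish, so each recalled output matches its target up to floating-point rounding. With inputs that are not orthogonal, the same code would show interference between the stored patterns.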