Computer programs are useful as (i) a computational tool for analysing mathematical models of cognitive processes, (ii) an algorithmic representation of cognitive processes, and (iii) a tool for parameter estimation using experimental data. In most neural network models the computer implements a learning algorithm, such as backprop, to represent the cognitive operations required to complete a task. When the network is trained on stimulus-response sequences, the parameters of the model adjust until a stable solution is achieved. Since algorithms such as backprop serve as universal function approximators, it is not surprising that this algorithm, and its recursive variant, can accommodate a wealth of cognitive data. However, does this mean that we can use the backprop algorithm as a model of cognitive processes?
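As a minimal illustration of this training regime, the following sketch trains a small two-layer network by gradient descent to approximate a nonlinear stimulus-response mapping. It is a sketch only; the target function, network size and learning rate are arbitrary choices, not a reconstruction of any published simulation.

    import numpy as np

    # Minimal two-layer backprop sketch: the "microscopic" parameters
    # (weights and biases) adjust until the stimulus-response mapping
    # stabilises. All choices below are illustrative.
    rng = np.random.default_rng(0)
    x = np.linspace(-np.pi, np.pi, 64).reshape(-1, 1)   # stimuli
    y = np.sin(x)                                       # responses

    W1, b1 = rng.normal(0, 1.0, (1, 16)), np.zeros(16)
    W2, b2 = rng.normal(0, 1.0, (16, 1)), np.zeros(1)

    lr = 0.05
    for _ in range(20000):
        H = np.tanh(x @ W1 + b1)         # hidden-layer activations
        p = H @ W2 + b2                  # network response
        dp = (p - y) / len(x)            # gradient of squared error at the output
        dH = (dp @ W2.T) * (1 - H ** 2)  # error backpropagated to the hidden layer
        W2 -= lr * H.T @ dp; b2 -= lr * dp.sum(0)
        W1 -= lr * x.T @ dH; b1 -= lr * dH.sum(0)

    print("mean squared error:", float(np.mean((p - y) ** 2)))

That a network of this kind can drive the error toward zero for almost any target is precisely the universal approximation property at issue.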
The typical approach of mathematical psychologists is to propose alternative quantitative representations of a cognitive process and compute their experimental predictions. The aim is to obtain nonparametric predictions that permit the rejection of whole classes of competing models, followed by parameter estimation for the models that survive these preliminary tests. Since a computational tool such as backprop is a universal function approximator, it can mimic almost any pattern of data and therefore cannot serve as a suitable alternative cognitive model. It could, however, be useful for parameter estimation, should it be possible to represent the mathematical model in backprop terms. It is important to note that the backprop parameters (weights, activations, biases) are microscopic, whereas the psychologically interesting parameters of associative memory models (e.g. those used in TODAM; Murdock, 1982) apply to the update of the whole memory system and so are macroscopic in comparison. Whether we can predict these microscopic parameters in any nontrivial way is an important consideration for cognitive modelers.
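To make the microscopic/macroscopic contrast concrete, the following sketch recovers a single macroscopic retention parameter from synthetic data by least squares. The exponential-decay form, and the generated "data", are illustrative assumptions only, not taken from TODAM or any published fit; a backprop account of the same data would instead carry hundreds of microscopic weights.

    import numpy as np
    from scipy.optimize import curve_fit

    def retention(lag, alpha):
        # Hypothetical macroscopic model: memory strength decays as alpha**lag.
        return alpha ** lag

    # Synthetic retention data generated from the model plus noise.
    rng = np.random.default_rng(1)
    lags = np.arange(1, 11, dtype=float)
    observed = retention(lags, 0.85) + rng.normal(scale=0.02, size=lags.size)

    # One macroscopic parameter summarises the whole memory system.
    (alpha_hat,), _ = curve_fit(retention, lags, observed, p0=[0.5])
    print("estimated retention parameter:", round(float(alpha_hat), 3))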
The need for empirical tests of competing models is highlighted by the discovery that backprop, unless suitably adapted, suffers severe retroactive memory interference (see the sketch below). The importance of a proper representation of the stimulus input is underscored by the rather more successful adaptation of backprop in the ALCOVE algorithm (Kruschke, 1993) and by the use of an empirically more reasonable representation of morphemes in Coltheart's dual-route connectionist model of reading (Coltheart, Curtis & Atkins, 1993). It is only through careful empirical investigation of human cognition, using appropriate mathematical models, that these representations can be optimised. Andrews also emphasises the importance of using the correct representation of the input.
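The interference itself is easy to demonstrate. The sketch below uses an A-B/A-C design, in which the same stimuli are re-paired with new responses after the first list is learned; the patterns and network are arbitrary illustrative choices, not a reconstruction of any published simulation.

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.eye(4)                                            # four one-hot stimuli
    Y1 = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])  # list-1 responses
    Y2 = 1.0 - Y1                                            # list-2: re-paired responses

    W1, b1 = rng.normal(0, 0.5, (4, 8)), np.zeros(8)
    W2, b2 = rng.normal(0, 0.5, (8, 2)), np.zeros(2)

    def forward():
        H = 1 / (1 + np.exp(-(X @ W1 + b1)))
        return H, 1 / (1 + np.exp(-(H @ W2 + b2)))

    def train(Y, epochs=4000, lr=0.5):
        global W1, b1, W2, b2
        for _ in range(epochs):
            H, P = forward()
            dP = (P - Y) * P * (1 - P)       # output-layer error signal
            dH = (dP @ W2.T) * H * (1 - H)   # backpropagated error signal
            W2 -= lr * H.T @ dP; b2 -= lr * dP.sum(0)
            W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(0)

    train(Y1)
    print("list-1 error after list 1:", float(np.mean((forward()[1] - Y1) ** 2)))
    train(Y2)   # learning the second list overwrites the first
    print("list-1 error after list 2:", float(np.mean((forward()[1] - Y1) ** 2)))

The list-1 error rises from near zero to near its maximum once the second list is learned, the signature of retroactive interference in an unadapted backprop network.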
My approach to memory modeling involves the derivation of a general associative memory model that has two versions: one based on a generalisation of nonlinear system identification (NSI), and the other on a generalisation of Kohonen's (1977) novelty filter idea. The latter, the Error Accumulation (EA) model (Heath, 1991b, 1993a), contains macroscopic parameters representing long-term memory permanence, short-term memory averaging and novelty sensitivity. Depending on the values of these parameters, the model can represent associative memory models such as the Matrix model (Anderson et al., 1977) and TODAM, as well as adaptive memory models such as that described in Metcalfe (1993). I have shown how these parameters can be estimated using serial position data (Heath & Fulham, 1988). In its most general version the EA model is dissipative and can predict both categorical judgements and response time distributions in recognition and cued recall tasks, as suggested by Dennis (see also Chappell & Humphreys, 1994).
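The flavour of such a model can be conveyed schematically. The sketch below follows the spirit of Kohonen's novelty filter, with two illustrative stand-in parameters, alpha for long-term permanence and beta for novelty sensitivity; the actual EA update equations differ and are given in Heath (1991b, 1993a).

    import numpy as np

    rng = np.random.default_rng(0)
    n = 8
    F = np.eye(n)            # novelty filter: F @ x is the "novel" part of x
    alpha, beta = 0.98, 1.0  # illustrative permanence and novelty-sensitivity values

    def present(x):
        # Present an item: report its novelty, then update the filter.
        global F
        novelty = F @ x
        denom = x @ novelty
        if denom > 1e-12:
            # Rank-one Kohonen-style update removes the item's novel direction,
            # weighted by beta; alpha leaks the store slowly back toward the
            # empty state, a crude stand-in for limited long-term permanence.
            F = alpha * (F - beta * np.outer(novelty, novelty) / denom) \
                + (1 - alpha) * np.eye(n)
        return float(np.linalg.norm(novelty))

    item = rng.normal(size=n)
    print("novelty on first presentation :", round(present(item), 3))
    print("novelty on second presentation:", round(present(item), 3))  # near zero

A repeated item yields almost no novelty signal, while alpha controls how quickly the stored trace fades.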
The NSI model (Heath, 1991a) is a generalisation of the Matrix model that allows for space-time interactions in memory. The memory system's response is determined by the contributions of individual items, pairs of items, triples of items, and so on. As Halford suggests, we need consider interactions only up to about fifth order, and the NSI model accordingly represents these higher-order interactions as tensors of up to fifth order. The model has been used to represent the psychomotor responses in cursive writing and, owing to its sigma-pi unit storage, can emulate the learning of nonlinear functions such as XOR and, more interestingly, the zero-crossings of the possibly chaotic sunspot series. Unfortunately, the model is heavily parameterised, and we need to combine the EA and NSI models so that the macroscopic parameters of the former can complement the large number of microscopic parameters in the latter.
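The sigma-pi idea is easily illustrated. Augmenting the input with a second-order multiplicative term, the simplest instance of the pairwise-interaction tensor, makes XOR solvable by a purely linear read-out; the code is a sketch only, and higher orders follow the same pattern.

    import numpy as np

    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([0., 1., 1., 0.])   # XOR targets

    # Sigma-pi expansion: first-order terms, the pairwise product, and a constant.
    Z = np.column_stack([X, X[:, 0] * X[:, 1], np.ones(len(X))])

    # Ordinary least squares on the expanded representation.
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    print(np.round(Z @ w, 3))   # reproduces 0, 1, 1, 0

The fit is exact because the functions 1, x1, x2 and x1·x2 span all real-valued functions on binary pairs; no such linear solution exists without the product term.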
From the mathematical modeling point of view, there is little value in fitting connectionist models with large numbers of parameters if the aim is to explicate cognitive processes. For this reason Stevens' model, although very interesting and important, provides us with a psychometric representation of the data without much elucidation of the underlying cognitive processes. The models reported by Dennis and Halford have obvious similarities to the models referenced in this note and so serve as useful candidates for comparison with competitors. Wiles has provided an excellent summary of the current state of the field, particularly from the point of view of computer science. However, it is apparent that these alternative ideas, which are all candidate procedures for nonlinear optimisation, need to be tested against data obtained from human subjects, possibly using the data structure primitives so clearly defined in Humphreys, Wiles & Dennis (1994). Finally, Latimer's plea that cognitive psychologists should use computer modeling to elucidate verbal vagaries in their theories is timely. I would extend this plea to include the need for mathematical analysis as an important complementary strategy for advancing our knowledge of human cognition.
Chappell, M., & Humphreys, M.S. (1994). An auto-associative neural network for sparse representations: Analysis and application to models of recognition and cued recall. Psychological Review, 101, 103-128.
Coltheart, M., Curtis, B., & Atkins, P. (1993). Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review, 100, 589-608.
Heath, R.A. (1991a). A nonlinear model for human associative memory based on system identification. In M. Jabri (Ed.), Proceedings of the Second Australian Conference on Neural Networks (pp. 184-188). Sydney: University of Sydney Electrical Engineering.
Heath, R.A. (1991b). A general adaptive filter model for human associative memory. In J.-P. Doignon & J.-C. Falmagne (Eds.), Mathematical psychology: Current developments (pp. 415-436). New York: Springer-Verlag.
Heath, R.A. (1993a). A nonlinear model for human associative memory based on error accumulation. In P. Leong & M. Jabri (Eds.), Proceedings of the Fourth Australian Conference on Neural Networks (pp. 130-133). Sydney: University of Sydney Electrical Engineering.
Heath, R.A., & Fulham, R. (1988). An adaptive filter model for recognition memory. British Journal of Mathematical and Statistical Psychology, 41, 119-144.
Humphreys, M.S., Wiles, J., & Dennis, S. (1994). Toward a theory of human memory: Data structures and access processes. Behavioral and Brain Sciences (in press).
Kohonen, T. (1977). Associative memory: A system-theoretical approach. Berlin: Springer-Verlag.
Kruschke, J.K. (1993). Human category learning: Implications for backpropagation models. Connection Science, 5, 3-36.
Metcalfe, J. (1993). Novelty monitoring, metacognition, and control in a composite holographic associative recall model: Implications for Korsakoff amnesia. Psychological Review, 100, 3-22.
Murdock, B.B., Jr. (1982). A theory for the storage and retrieval of item and associative information. Psychological Review, 89, 609-626.