
Connectionism, Psychology and Science

George Oliphant,
Department of Psychology
University of Sydney

Dr. Latimer in his paper made reference to Plato and John Stuart Mill and John Locke; I'd like to increment the citation count of a more neglected thinker, neglected because he hardly ever said anything or wrote anything that was of any interest whatsoever. The scholar I have in mind is Reichsmarschall Hermann Goering, who (so it is reported) once declared "Whenever I hear the word 'culture' I reach for my gun".

Whenever I hear the word "connectionism" I reach for my critical faculties. This reaction is triggered by a belief which I share with Cyril, and with all other right-thinking persons, the belief that scientific psychologists, whose profession it is to try to discover some fragment of the truth by theorising and systematically testing their theories, should in their scientific publications say clearly what their theories are, and how they are being tested. It is incumbent on any scientist to make clear in his or her publications exactly what he or she is doing, and in that respect connectionists (as I'll call psychologists whose work involves so-called neural networks) very generally fail; they do not say clearly what they are doing.

I'll shortly enumerate a number of things they might be doing, activities in which connectionist networks might play some part, but generally in connectionist publications it is unclear which, if any, of these activities the author is supposed to be engaged in, and in those reports in which there are indications that the author is under the impression that some psychological theory is being proposed and/or tested, it is generally quite unclear what the theory is supposed to be. In consequence, connectionism has contributed very little to psychology, or, indeed, to any branch of science (perhaps there has been the occasional incidental drawing of attention to some phenomenon, as in Cyril's movement illusion example - although in that case the machine was not a connectionist network); moreover, in psychology, connectionism has absorbed a great deal of time and effort and journal space which might have been put to better use; weighing the debits against the credits, the overall value of connectionism to psychology has been negative. On the debit side of the ledger also is the effect on our students; the traditional sloppiness ("traditional" is one of the qualifiers which psychologists these days apply to any activity which has been going on for at least four months; another such qualifier is "classical") - the traditional sloppiness of the connectionist literature encourages our students, those great imitators, to produce woolly-headed reports and essays in the connectionist vocabulary and style.

I don't want to distract attention from my thesis here, which is that generally the connectionists don't seem to know what they're doing, the evidence for that thesis being that they don't say what they're doing, but I would like to make two preliminary points to clear away some underbrush, two lemmas of a sort. I say "of a sort" because they are not so much new results as reminders.

(1) The first lemma concerns a proposition which can be found in a number of places in the psychological simulation literature, e.g. in Frijda's 1967 Behavioral Science paper and in Apter's 1970 book on the computer simulation of behaviour, and which Cyril echoes in his paper. The proposition is that if a machine can be constructed (or if a program can be written) to instantiate some theory (if, as Dr. Latimer puts it, the theory can be "cast . . . within the explicit, logical framework of a machine") then the theory must be logically consistent (free from internal contradiction, that is), and it must be free from ambiguity; as Dr. Latimer puts it, there is "no room for . . . vagueness and legerdemain". A theory for which there exists a simulation program or a machine model is not necessarily true, but at least it's clearly specified and it's logically sound.

That's the proposition. However, it's quite false. On the matter of consistency, one simple exercise which our Computer Applications students perform is to write a program whose input is to be an integer, and whose output is to be a statement declaring whether the input is an odd number or an even number. It's very common for students at some stage of their work on that exercise to have in the machine a program which, when they run it with some particular input, prints THIS NUMBER IS ODD, THIS NUMBER IS EVEN (both of these) - a contradiction, implied by their program as it then stands. In this case the output very clearly reveals that the program implies, and therefore embodies, a contradiction - and that is the point here; programs and equivalent machines - computer models or connectionist network models - can be logically inconsistent, and it is possible for any of them to be inconsistent without a finite number of runs of the machine exposing the logical defect.
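To make the point concrete, here is a minimal sketch in Python (my own illustration, not any student's actual program) of a parity classifier of the kind described. It harbours a contradiction, but the contradiction surfaces only for certain inputs, so a handful of test runs can easily fail to expose it.

```python
# Hypothetical buggy parity program, for illustration only.
def classify(n: int) -> None:
    if n % 2 == 0:
        print("THIS NUMBER IS EVEN")
    # Faulty "oddness" test: the last decimal digit is checked against a list
    # that mistakenly includes 6.
    if abs(n) % 10 in (1, 3, 5, 6, 7, 9):
        print("THIS NUMBER IS ODD")

classify(3)    # prints only THIS NUMBER IS ODD
classify(4)    # prints only THIS NUMBER IS EVEN
classify(16)   # prints both statements: only inputs of this kind expose the contradiction
```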

As for clarity, in psychological simulation studies any vagueness in, for example, the terms of the theory being simulated is commonly simply carried over into the program or into the machine model; e.g. in Colby's Artificial Paranoia program (Colby, 1975) one of the ambiguous terms in his theory is "frustration"; in the program a variable is labelled FRUS, and nothing in the program clarifies what the label denotes. Similarly, in any connectionist project, the construction of a network does nothing to remove whatever ambiguities there are in, e.g., the labels attached to input or to output units.
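A small hypothetical fragment in Python (my own sketch, not Colby's actual code; the trigger and the increment are invented purely for illustration) shows how such a label simply imports the ambiguity of the theoretical term rather than resolving it:

```python
# Hypothetical fragment, for illustration only; nothing here is Colby's actual rule.
FRUS = 0.0  # "frustration" - but frustration of what, measured on what scale?

def process_utterance(utterance: str) -> None:
    """Update FRUS in response to an input sentence."""
    global FRUS
    if "?" in utterance:   # an arbitrary trigger, assumed purely for illustration
        FRUS += 0.1        # the update rule says nothing about what FRUS denotes
```

The theoretical term is exactly as vague after such code is written as it was before.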

Lemma 1, then, is that the existence of a simulation program or a connectionist network model or a computer model for a theory does not imply that the theory is logically consistent and does not imply that the theory is clear.

Now, it is the case that one means of proving the consistency of some set of propositions (such as a theory, or a set of axioms for a deductive system) is to identify some real situation or "system" in which the real entities and real relations stand in certain precise correspondences with the substantive terms and the relational terms of the given set of propositions, i.e. of the theoretical system whose consistency is in question; if such a real system/situation exists the logicians call it a "model" for the theoretical system, and the existence of the model proves the internal logical consistency of that theoretical system. But creating a program or a connectionist network, for example, and just calling it a model doesn't make the program or the machine a model, in that logical sense; for the machine to be that kind of model the appropriate correspondences must obtain, and, in general, in simulation programs and in network so-called models, they do not.
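As a reminder of the logician's usage, here is a standard textbook-style illustration (my own example, not one drawn from the connectionist literature) of consistency being proved by exhibiting a model:

```latex
% Three axioms about a relation R: irreflexivity, transitivity, and "no last element".
\begin{align*}
A_1 &:\quad \forall x\,\neg R(x,x)\\
A_2 &:\quad \forall x\,\forall y\,\forall z\,\bigl(R(x,y)\wedge R(y,z)\rightarrow R(x,z)\bigr)\\
A_3 &:\quad \forall x\,\exists y\,R(x,y)
\end{align*}
% Interpreting R as "less than" over the natural numbers makes every axiom true;
% that structure is therefore a model for the axiom set, which is thereby proved
% consistent. The correspondence between R and "less than" is exact and explicit;
% merely calling some machine a "model" supplies no such correspondence.
```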

(2) So much for the first lemma. The second is that computers on the one hand and connectionist networks on the other are exactly equivalent in processing capabilities. Whatever functions one of these kinds of machine can perform, so can the other. It is well known that, without an indefinite supply of memory, the computer has the power of a Finite State Machine, and that with an indefinitely large memory it has the power of a Turing Machine; it can produce any specifiable (computable) function, and it is the most powerful conceivable kind of processor.

Now it's also the case that any computer can be constructed entirely from multiple copies of one or two very simple binary elements; in particular the elements OR and NOT are sufficient. In that case, since the elements OR (the inclusive disjunction) and NOT (the negator or inverter, i.e. the element with just one input which, when active, inhibits the otherwise positive output) are simple types of network unit, it follows that, for any specified input-output function, a network exists (i.e. can be devised) which performs that function. That is to say, the connectionist network, like the computer, is the most powerful conceivable kind of processor.
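As a toy illustration of that sufficiency (my own sketch, in Python rather than in hardware or in network diagrams), every other Boolean element can be assembled from OR and NOT alone, and hence, gate by gate, so can any circuit from which a computer is built:

```python
# OR and NOT as functions on the truth values 0 and 1.
def NOT(a: int) -> int:
    return 1 - a

def OR(a: int, b: int) -> int:
    return max(a, b)

# De Morgan's law: AND(a, b) = NOT(OR(NOT(a), NOT(b))).
def AND(a: int, b: int) -> int:
    return NOT(OR(NOT(a), NOT(b)))

# XOR assembled from the elements above; continuing in this way,
# any finite truth table can be realised.
def XOR(a: int, b: int) -> int:
    return AND(OR(a, b), NOT(AND(a, b)))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), XOR(a, b))
```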

It is also provable that whatever a connectionist network can learn (using that term in the same sense as connectionists do) so can a computer, and of course vice versa. Here I'll not give an analytic proof, but just point out that the examples of learning networks in the literature are also examples of learning computers; in those simulations the imaginary changes to imaginary networks are real changes to real computers.
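A minimal sketch in Python (my own toy example, not any published simulation) makes the observation concrete: the "learning" of a simulated network just is a sequence of assignments to variables in a running computer program.

```python
# A single threshold unit trained on the OR function (a toy task chosen for illustration).
weights = [0.0, 0.0]
bias = 0.0
lr = 0.1

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

for epoch in range(20):
    for (x1, x2), target in data:
        output = 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0
        error = target - output
        # The "change to the network" is, concretely, these three assignments:
        weights[0] += lr * error * x1
        weights[1] += lr * error * x2
        bias += lr * error

print(weights, bias)   # a real change of state in a real computer
```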

Lemma 2, then, is that, computationally, computers and connectionist networks are equivalent to each other; and each class of machine can perform, and learn to perform, any specified function.

So much for the lemmas.

To return now to the thesis, which is that connectionists do not seem to know what they are doing: as I said, there are a number of things they might be doing, several kinds of scientific and other activities in which connectionist networks might play a part; these include especially

  1. Logical/mathematical projects. Analytic investigation of network properties.
  2. Artificial Intelligence. Research into the design of networks for performing certain tasks, whether or not such networks bear on any neuroscientific or psychological facts or theories. AI is one kind of engineering.
  3. Neuroscientific research. This is a scientific activity in which theories about neurons or aggregates of neurons are proposed and tested.
  4. Psychological research. This is a scientific activity in which theories about mental processes or about behaviour are proposed and tested.
Each of these kinds of activity is different from every other. It is possible to engage in more than one of them in some single research enterprise, but in any such case the author describing the enterprise should of course make it clear which part of the enterprise is AI, which part neuroscience, which part psychology, which part mathematics.

Considering the psychological connectionist literature, then:

Of (1) there are some examples in that literature: the presentation and proof of the back-propagation algorithm, for example, and Smolensky's Harmony Theory (Smolensky, 1986). But also in that literature many authors make rather low-key, rather off-hand assertions which amount to mild claims that this or that piece of connectionist work has some mathematical or logical justification, usually as an existence proof, when in fact it has no such justification. This practice is very widespread; I'll give just two examples: in a paper in Cognitive Science Touretzky & Hinton (1988) write that one aim in their paper is to demonstrate that "distributed connectionist systems are capable of representing and using the explicit rules of a Production System", and, closer to home, Sally today said of Plaut and McClelland's (1993) model that it "establishes that a [connectionist network] can learn to respond appropriately both to words and to nonwords". But the "demonstration" in the one case and the "establishing" in the other are not part of any justification for these projects; we already have in lemma 2 a very general existence proof which includes those particular cases. We already knew from that lemma that connectionist networks "are capable of representing and using the explicit rules of a Production System" and that they "can learn to respond appropriately both to words and to nonwords", because we knew that they can perform, or learn to perform, any specifiable function.

It may be that certain properties of particular networks which perform this or that function could be of interest for reasons to do with AI, or with neuroscience, or with psychology, but that's a different matter, and any case of that kind (if there should be one) would need to be considered under one or more of those headings.

(2) AI is AI and not psychology, so I won't dwell on it, but I'll make just two remarks: (i) it may be that, for pragmatic reasons, connectionist networks are more useful than computers in some applications, although I'm not aware of any such applications; the "networks" so far investigated in AI work are actually (with odd exceptions) simulation programs; and (ii) the results of some AI exercises, such as Gorman and Sejnowski's (1988) rock and mine detector, which, it seems, can make some discriminations which people can't, seem to be welcomed by some confused psychologists as somehow helping to justify psychological connectionist projects (Bechtel and Abrahamsen, 1991, p. 128), which of course they don't.

(3) Neuroscience, again, is not psychology, but again I'll make a remark or two; connectionist psychologists do, after all, make free with the word "neural" and its cognates; "neurally inspired" they say, "neurally plausible" they say. There is, to my knowledge, no study in the psychological connectionist literature in which a neuroscientific theory is clearly stated and tested. Testing such a theory would of course require the examination of real neurons, or real aggregates of neurons, to obtain evidence for or against the theory.

Of the relationship between connectionist networks and real neural networks, the psychologist Quinlan writes, and I would like to believe that he was blushing as he wrote it, "the mapping from connectionist networks to real neural nets remains problematic" (Quinlan, 1991, p. 9). Perhaps Quinlan here is trying for the Guinness Book of Records in the "Understatements" category. Francis Crick (Crick, 1989, p. 130) is more forthright, declaring that "[connectionist] nets are unrealistic in almost every respect", and Poggio (quoted in Morris, 1989, p. 203) is even blunter - "the only thing neural networks have in common with the human brain is the word 'neural'". But let me repeat, there is to my knowledge no study in the connectionist literature in which a neuroscientific theory is clearly stated and tested. To quote Crick again, there is "no sign of [connectionists] clamouring at the doors of neuroscientists, begging them to search for [confirmatory evidence]" (Crick, 1989, p. 131). Connectionists do not, it seems, have any serious neural theory in mind, let alone in print.

(4) To psychology, then. A number of criticisms of psychological connectionism have accumulated, notably by Broadbent (1985), Massaro (1988), Fodor and Pylyshyn (1988), and Pinker and Prince (1988); part of the content of these criticisms is that existing connectionist psychological theories are false. That criticism is too generous, because there are no psychological theories clearly stated in the connectionist literature, any more than there are neuroscientific ones. A fortiori, in no connectionist paper does the author make clear what his or her network is supposed to have to do with such a theory. Instead, in connectionist papers, the preambles and discussions are besprinkled with handwaving terms and phrases such as "metaphor", "analogy", "accounting for" and, of course, "model". The authors (bless them) appear to believe that the routine reeling off of one or two of these stock phrases, declaring especially that their network is a "psychological model", or that they are investigating the question of whether the network "accounts for" a certain kind of behaviour, is enough, that the emission of those phrases licenses them to plough on for 20 or 30 pages of descriptions of network details and of alleged simulation runs, and that the final product is a scientific paper.

That kind of thing is not at all good enough, for this reason. The claim that some entity or system (say A) is a metaphor, or a model, for some other entity or system (say B) is a claim that in some respects A is the same as B, but in other respects A is different from B; it's a claim about A, and about B, and about the relationship between the two. Now, in order to be able to evaluate that claim, to be in a position to consider rationally whether it's true or false, we obviously need to know what the relevant characteristics of A are, what the relevant characteristics of B are, and exactly what the mapping is between the first set of characteristics and the second set of characteristics. In the case of connectionist network so-called models, the authors do usually spell out in some detail the characteristics of A, their particular network. But the characteristics of B, whatever psychological theory is allegedly being modelled, are not spelled out, nor of course is the relationship between the components of the network and the components of the unstated theory. Consequently, the assertion that the network is a model, in the required sense, cannot be fully understood and cannot be evaluated. There is the superficial appearance of scientific research, of proposing and testing some theory, but the reality is absent.

The attempts of psychologists to construct connectionist networks which simulate some specified behaviour, including so-called "learning" sequences, are no more than "busy work"; if any such attempt does not succeed it follows from lemma 2 that this can only be because the researcher hasn't tried hard enough, or isn't ingenious enough; we know from that lemma that it's always going to be possible to construct the required network, just as it's always going to be possible to write a computer program to do the very same simulation.

These attempts, these connectionist exercises, then, are not science, but perhaps they're not entirely pointless. There is a fifth kind of activity which I've tabulated; there was no existing agreed label to designate this activity, so I've taken the liberty of christening it

(5) WILKINS MICAWBERING. Micawbering (I went to a boys' school where we referred to each other by surname only) is the activity of constructing connectionist networks to simulate some behaviour, in the absence of any specified neuroscientific or psychological theory, in the hope that something will providentially turn up. Micawbering evidently gives many academic psychologists a great deal of relatively innocent pleasure, so I, for one, would resist any proposal actually to make it illegal, or to insist that it be carried on only by consenting adults behind locked doors. But it is surely time to stop cluttering up the scientific journals with descriptions of Micawbering.

In its short history, scientific psychology has been diverted into more than one blind alley, in which psychologists have spent extended periods busying themselves with this or that futile activity, but the Deity did give us ample warning that this was going to be the case. His warning clues are in cipher, but they are not difficult to decode. The word "Psychology" does, after all, begin with a "sigh", and end with "oh gee, why?". And "Connectionism" begins with a "con".

References

Apter, M.J. (1970) The Computer Simulation of Behaviour. London: Hutchinson.

Bechtel, W. & Abrahamsen, A. (1991) Connectionism and the Mind. Oxford: Basil Blackwell.

Broadbent, D. (1985) A question of levels: Comments on McClelland and Rumelhart. Journal of Experimental Psychology: General, 114, 189-192.

Colby, K.M. (1975) Artificial Paranoia. New York: Pergamon Press.

Crick, F.H.C. (1989) The recent excitement about neural networks. Nature, 337, 129-132.

Fodor, J.A. & Pylyshyn, Z.W. (1988) Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3-71.

Frijda, N.H. (1967) Problems of computer simulation. Behavioral Science, 12, 59-67.

Gorman, R.P. & Sejnowski, T.J. (1988) Learned classification of sonar targets using a massively parallel network. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36, 1135-1140.

Massaro, D.W. (1988) Some criticisms of connectionist models of human performance. Journal of Memory and Language, 27, 213-234.

Morris, R.G.M. (1989) Parallel Distributed Processing. Oxford: Clarendon Press.

Pinker, S. & Prince, A. (1988) On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 73-193.

Plaut, D.C. & McClelland, J.L. (1993) Generalization with componential attractors: Word and nonword reading in an attractor network. Proceedings of the 15th Annual Conference of the Cognitive Science Society.

Quinlan, P.T. (1991) Connectionism and Psychology: A Psychological Perspective on New Connectionist Research. Chicago: University of Chicago Press.

Smolensky, P. (1986) Information processing in dynamical systems: Foundations of harmony theory. In D.E. Rumelhart, J.L. McClelland and the PDP Research Group (Eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1. Cambridge, MA: MIT Press.
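
Touretzky, D.S. & Hinton, G.E. (1988) A distributed connectionist production system. Cognitive Science, 12, 423-466.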