draft: 10/12/98
I. Introduction
Many contemporary philosophers and psychologists would find the traditionally-derived philosophic perspective developed in the previous chapter completely unsatisfactory. They would object that it represents no improvement on the speculative metaphysics of previous centuries and that it is without a contemporary scientific foundation. In this chapter and the following, I examine two contemporary theories about the nature of mental images, one proposed by Pylyshyn and the other proposed by Kosslyn. As indicated in chapter 2, these theories derive primarily from an attempt within psychology to develop new paradigms for understanding imagery. To some extent, they repeat, within the context of contemporary psychology, some of the same issues that have driven the imagist/descriptivist debate within philosophy.
The primary purpose of this and the following chapter is to critically examine these two theories. I shall be primarily concerned with the following issues: (1) the internal consistency of the theories; (2) how well the theories can explain the nature of our imagery experiences; and (3) the evidence used in support of the theories. All of these issues will be discussed from the viewpoint of the computational strictures and computational models both writers have proposed. I shall confine myself largely to the published works of both writers, attempting to give a clear and fair exposition of their views. For the most part, I shall refrain from speculation about how other, derivative theories held by philosophers or psychologists might fare against the same sorts of criticisms. My aim is to establish how the two main proponents in the current imagery debate within psychology have made their cases, and what evaluation we can make of them.
There is a secondary purpose to these two chapters. This is to describe and explain the implications of the background conditions, or context, in which the debate occurs. By this means, I intend to support indirectly the view expressed in chapter 3. There are two significant general problem areas that derive from the background conditions. One is computational epiphenomenalism and its implications. This forms a background condition to both theories, since Pylyshyn adopts full epiphenomenalism with regard to images and Kosslyn a partial or provisional epiphenomenalism. I indicate briefly some of the general problems with computational epiphenomenalism at the beginning of this chapter. A second general problem area is the strictures imposed by the rules of empirical psychology adopted by the participants in the debate. I indicate the problems with these strictures in a section of chapter 5 ("Problems for Empirical Cognitive Psychology").
To develop a complete critique of computationalism, epiphenomenalism, and methodology in psychology lies beyond the scope of this work. I hope to show, however, that awareness of the background conditions of the debate (particularly when coupled with the specific results claimed by the theories) prompts this question: "Do these contemporary theories work from the right sort of general viewpoint for understanding mental images?" At the close of chapters 4 and 5, I suggest that there is something wrong with this general viewpoint. As long as the endorsement of strict epiphenomenalism (whether actual or de facto) remains part of both theories and the role of subject reports remains ambiguous, neither theory, I suggest, tells us much about imagery. I suggest that the ultimate lesson to be derived from the debate is that we are left with an amalgam view of mental images very much like the one supported in chapter 3.
In this chapter, I discuss the essentials of Pylyshyn's epiphenomenalism. I show that although it is a logical possibility and is quite consistent as a theory, it has serious problems in systematically explaining how and why our experiences are different from the form of processing thought to underlie them. I then turn to a discussion of Kosslyn's theory. Kosslyn has claimed his theory does not suffer from the problem of disparity between processes and experiences because the processes themselves are imagistic. He claims his theory does not reduce the image to an epiphenomenon. I show that computational considerations mean that his is also an epiphenomenal system.
I reach only preliminary conclusions in this chapter,
reserving final comments on the systems and the ultimate
implications of computational epiphenomenalism for the close of
Chapter 5.
II. Background: Epiphenomenal and Computational Systems
A difficulty in the common-sense view of four-way causation
arises when one is called upon to explain transition "d" (see
diagram below). Are physical states 1 and 4 causally related?
Since they are changes in the body, it certainly seems they must
be, and if so they must be subject to the laws governing the
transition from one physical state to another. If this is true,
then ultimately, in a completed science, we would need only
concern ourselves with such physical transitions. If the
scientifically explainable chain of events is strictly physical,
and the mental events that seem to follow one upon the other as
causes and effects in their own right are not, then perhaps the
mental events are not really connected to each other in a causal
chain, but only seem to be. They could be incidental effects of
the strictly physical transitions occurring in the brain. This
solution has the benefit of making all our conscious thought
fully explainable, once we have a completed physical science. It has the disadvantage of making all of our conscious life a meaningless side-show to an irrevocable chain of physical causes. This is epiphenomenalism in the strictest sense.
What I call general or strict epiphenomenalism has what the common man can only understand as astounding consequences. It must hold it technically incorrect to say such things as "I willed to move my finger, and my willing it caused my finger to move." For the present, we shall be content to observe that the common man and the traditional psychologist usually reject epiphenomenalism.
Epiphenomenal systems, when combined with computational theory and the concept of physical symbols, are reputed to solve Brentano's problem, and this is probably the greatest reason for
interest in them today. This reputed solution is accomplished, in outline, as follows. Recall that being a symbol means that "X represents Y to Z." But, it can be objected, the ontological status of the symbol as an "object" is inherently ambiguous, and this causes a problem in explaining how it could play a causal role in physical changes. Is it an abstract object? In that case it could have no causal connection with physical things. Is it to be considered "in" the subject's consciousness or a "part" of the subject in some special way (among the possibilities suggested in Chapter 3)? If so, then this "object" is wrapped up with an explanation of consciousness and consciousness itself has no explanation within physics.
I attempted to remove (or partially remove) the ambiguity by saying X is an intentional object and that X is "in" Z -- in the figurative sense, leaving open the possibility of physical correlates. But now suppose we said X is literally inscribed in the brain in a cognitive code that is used in all our cognitive processes, and these processes are all computational processes that manipulate these codes. These manipulations will all be physical and the transitions between them will all be fully explainable in terms of physics. Under this general scenario, it has now become fashionable to say X, a physical symbol, is a representation. That is, within the computational system said to comprise the mind, this physical brain state (X) represents a belief or other unit of information that the subject possesses, and X functions within that system in ways that make it represent Y. This way, the ontological status of X is unambiguous. It is a physical state in the brain and as such we do not need to worry about it going in and out of existence as it appears and disappears from consciousness or about how it can be a cause of behavior.
This is, again roughly, how Brentano's problem is "solved" by the cognitive science approach (see Pylyshyn, 1984, p. 26). But what of the semantic content of these physical symbols? In a computer, the physical states that are symbols have no semantic content. They are raw physical states, utterly meaningless in themselves, having no content at all except what they derive from users of computers.(1) Moreover, the semantic content given to us in conscious awareness through intentionality is utterly powerless to cause any change within the system. Pylyshyn observes:
As we have already noted, the semantics of representations cannot literally cause a system to behave the way it does; only the material form of the representation is causally efficacious. (Pylyshyn, 1983, p. 39)
If, then, Brentano's problem has been "solved," we are now faced with a new problem: the causal ineffectiveness of semantic content assigned to physical symbol systems. This problem has proven to be one of the most intractable in the contemporary philosophy of mind. It is particularly important in light of computational considerations, for in this case the question is not just how physical symbol systems in general might acquire semantic content, but how physical symbols that operate entirely within computational restrictions could attain content.
Searle's famous articulation of the problem is the Chinese Room argument (Searle, 1980a). Searle argues it is quite impossible for the human mind to literally operate according to the computations of a physical symbol system. In support of this, he argues that (1) all semantic content of computational systems is derivative of the interpreter's view of the system (or, semantic content is extrinsic to the system) and that (2) the semantic content of human minds is intrinsic (or, as he sometimes states it, that meanings are "in the head" but not in the form of a computer program). Searle believes that intrinsic semantic content is unique to biological systems. Since the source of semantic content in a computer is utterly different from the source of semantic content in a biological organism, Searle concludes that (3) computationalism cannot be the basis for a true theory of mind.
Needless to say, Searle's notions of what semantic content is and how it is obtained are controversial (Searle, 1980b, 1990). Neither assertion (1) nor (2) above is universally accepted. Theories about how physical symbols might obtain semantic content in computational systems (e.g., functional role semantics) have been proposed. Likewise, it is by no means obvious that meanings are "in the head." If they were, what we mean by "understanding a meaning" could be identified with having certain internal representations in the mind, but it is not obvious that we make such an identification. My internal representations of a beech tree and an elm tree, for example, might be identical, but this does not imply that I mean the same things by thinking of them. By differentiating them, it has been argued, I mean to imply that I rely not on my own internal representations, but on the meaning given to "beech" and "elm" by a wider community of scientific experts (see Putnam, 1975, pp. 215-271).
The above points might be pursued indefinitely. A complete treatment of the origin of semantic content is not our concern, nor is every possible defense of computationalism. Rather, we shall be concerned with how the problem of semantic content has been addressed in Pylyshyn's theory and the implications his approach has for the meaning of "imagery" in his system. Although Kosslyn's conclusions are quite different, his understanding of the basic restraints imposed by functionalism and the restraints imposed by empiricism is similar. This brief treatment of Pylyshyn's assumptions will therefore provide an understanding of the background considerations that apply to both theories.
Pylyshyn appears to accept the first of Searle's propositions (Pylyshyn, 1983, p. 43). He agrees that in a computational model the symbolic codes themselves are devoid of semantic content. In a computational model, all the semantic content given to the symbolic computational states and their results derives from the assignment of meaning by the theorist who either designs or observes the system:
The particular interpretation placed on the states, however, appears to be extrinsic to the model, inasmuch as the model would behave in exactly the same way if some other interpretation had been placed on them. (Pylyshyn, 1984, p. 43)
Furthermore, Pylyshyn acknowledges that if Searle's view is correct, then it "undermines" Pylyshyn's claim that "functional models...address the issue of representation-grounded behavior" (Pylyshyn, 1984, p. 43). Yet, this does not force us to accept Searle's second proposition, Pylyshyn suggests, because (among other reasons) concerns about the source of "meanings" may simply not be relevant to empirical psychology. Pylyshyn suggests that for the purposes of empirical psychology we can focus on another sort of question:
In my view, the question of whether the semantic interpretation resides in the head of the theorist or in the model itself is the wrong question to ask. A better question is: What fixes the semantic interpretation of functional states? or, What latitude does the theorist have in assigning a semantic interpretation to the states of the system? (Pylyshyn, 1983, pp. 43-44)
Pylyshyn suggests that if there were robots who behaved like humans, we would be constrained to attribute semantic content to the internal physical states that governed their behavior. Pylyshyn states that "it would be perverse to deny that these states had the semantic content assigned to them by the theory [that described the robot's behavior]" (Pylyshyn, 1983, p. 44).
Pylyshyn's suggestion that the robot case is appropriate opens up a huge topic for debate on the principles of psychology. We can only consider a few of its implications here. As Pylyshyn himself acknowledges, the claim that it would be "perverse" not to assign semantic content to the functional states presumed to govern a robot's behavior is itself highly contentious. It is a claim about which we are likely to "run into differences of opinion that could prove unresolvable" (Pylyshyn, 1983, p. 44). We need, therefore, some way to limit the appeal to intuitions about what attributions we might make to robots and concentrate on what Pylyshyn's suggestion implies for the study of imagery in psychology.
There is one sense in which Pylyshyn's suggestion is reasonable. If one feels that we must be constrained to obtain knowledge about the subject matter of psychology through an objective, third-person point of view, then the robot case is useful. Recall, however, that according to traditional psychology, we should not have any expectation that the entire nature (or even the essential features) of the phenomenon in question could be investigated in this way. Wundt cautioned against a purely behavioral approach and demonstrated that even at the base level of behavioral response (reaction time) we cannot expect to eliminate the influence of private mental states (chapter 1). Nor should we, in Wundt's view (or Richardson's, or in the view I developed in chapter 3), claim that psychology would still be the same science if it were concerned only with behavior (or the possible models for behavior) and did not address itself to conscious contents. I suggest that what Pylyshyn's sort of robot theory inclines us toward is something quite different from an investigation of consciousness and its objects: it is an investigation of behavior and its possible explanations given a certain set of assumptions about causality, namely, that all causal interactions occur at the physical level.(2)
Thus, the background consideration we must bear in mind (and allow to surface again from time to time) is this: once any "obscure" or "metaphysical" forms of conscious intentionality have been eliminated from the picture and a new set of criteria proposed, what sort of images do these new criteria leave us with? The answer to this question is quite clear, at least in Pylyshyn's theory. In his view, since all operations in any computational system must be performed using logico-linguistic elements, there is no room at all for images in the classical or traditional sense either to exist or to be causally-generative items in the system. There may be mechanisms that transform visual input into usable logico-linguistic data, or even those that transform data as if they retained image-like features, but as images themselves are utterly non-computable, they disappear entirely from the system. (The details of this deduction appear in the section on Pylyshyn, below.)
The concern about this background issue in computational psychology can be put another way. It is sometimes asserted that if we were in possession of a fully-reductive physical science we would not be prompted to attribute mental states to human beings as the causally-effective states in their behavior. Suppose there were a robot whose behavior was sufficiently complex when using visual inputs that we attributed to it the property of "having a mental image." Despite the claims we might make about what is real at the functional level, however, we would know that at the physical level "having an image" was merely an empty attribution. We are already in possession of a fully-reductive science of robots; we would know that at the base level of causation all the micro-events would be of the same "on/off" physical type and would involve no images.
Suppose that we were in possession of a fully-reductive explanation of human behavior. Would we have to say that "having an image" would also be an utterly empty attribution? I submit we would not, because it is not only the behavior of human beings we need to account for but the conscious states they are in. As long as we accept that consciousness is a factor to be accounted for, the addition of reductive science does not allow us to fully equate the explanatory situation with the robot case.(3) This can be seen by considering the consequences of the following: suppose that it can be shown that the reductive neural (or other) states are the sole causes of the conscious states in which images seem to inhere and that the conscious states themselves are utterly without causal effectiveness. In this case, we are still left with the possibility of a metaphysical analysis of the being of consciousness and its contents, since the knowledge of the causes of our conscious mental images alone does not capture their nature. Nor could these reductive states be identified with mental images, since, by hypothesis, they differ from their effects. Note also that except for the fact that we would then know that mental images are true epiphenomena, we would not be in an essentially different explanatory situation from the one in which we presently find ourselves: we presume that there are physical causes of our mental states (though these are not the only ones, since we acknowledge mental causes as well), but none of these physical states themselves have the properties of apparent size, color or location in space that mental images have. We would still need an explanation or analysis of these presented properties. Finally, we would also need an explanation of where the apparent intentional content of conscious mental images came from. The question about whether content is intrinsic or extrinsic would reassert itself.
Another point to bear in mind when considering computational
models is that there is an incipient dualism in framing the
explanation of how physical symbols are supposed to relate to
conscious interpreters, or subjects. All of the semantic content
derives from subjects and none from physical symbols themselves.
But what is the subject? Is it not, in the final analysis,
supposed to be identical with, that is, be composed of and
operate according to, some system of physical symbols? How then
does the subject reappear in the system, gaining access to the
interpretive stance outside the system through which meaning and
interpretations are made possible?
III. Fundamentals of the Contemporary Debate
Two fundamental terminological distinctions structure the contemporary imagery debate.
(1) The phenomenal image is what we experience, i.e., what we "see" through the mind's eye.
(2) The internal brain representation is the brain state or process that is understood to be physically instantiated in the brain, and is designated as the cause (or sometimes less precisely, the explanatory factor or correlate) of the phenomenal image and the behavior predicated upon it.
The participants in the debate do not question the existence of the phenomenal image or the existence of internal brain representations. The only questions concern how to characterize, within psychology, the functions of each and the relation between them. The first, theoretical, issue is how to characterize the quintessential features of the internal brain representations involved in imagery experiences and imagery-related behavior:
(1) Can these internal brain representations justly be called functional images, in virtue of their distinctive role in cognition?
The second, experimental, issue is:
(2) Are there any root phenomena in the functional architecture of the brain that are intrinsically pictorial or imagistic in character?
Psychological descriptivists answer "no" as a matter of principle to the first question, and "no" as a matter of empirical fact to the second. Imagists answer "yes" to both questions. While the answers to these questions are the central issues, the two views also differ markedly on how we are to understand the role of the phenomenal image in cognition. Pylyshyn holds that the phenomenal image is strictly epiphenomenal, while Kosslyn believes that since there are functional images there is a strong reason to believe that phenomenal images are not epiphenomenal. The issue of epiphenomenalism with respect to the phenomenal image is also an aspect of the debate and sometimes assumes prominence in the description of what is at issue.
There are other ways to state what is at issue in the contemporary imagery debate. Block (1983, pp. 7-9) believes the question of epiphenomenalism is a red herring from a philosophical perspective. Block argues that several of the proposed grounds for the debate involve a confusion in the meaning of the term "mental image." Block considers three possible meanings for the term and how each affects the claim made by pictorialists that images are not epiphenomenal. First, if mental images are understood to denote certain kinds of experiences, Block claims the pictorialists have no grounds for their claim that their experiments prove that our experiences have a certain character. The experimental evidence, he states, simply does not address the issue of experiences at all:
...[T]he experiments certainly do not show that what has the causal role is any kind of experience. The experiments cast no light at all on whether the experiences are epiphenomenal accompaniments of representations in the brain that are the real causes. (Block, 1981, p. 7)
Second, if "mental image" is thought to refer to either abstract entities or intentional objects, then the issue of epiphenomenalism is moot and the pictorialist's evidence is "irrelevant" (p. 8). Abstract objects, he claims, "have no causal commerce of any kind," and intentional objects also fall out of consideration because they "could not be the effects of any causes" (p. 8). Third, if images are thought to denote internal brain representations, then again the whole issue of epiphenomenalism disappears because it can't be the case, according to the imagery debate, that brain representations are causally ineffective. Both parties agree that it must be brain representations that are causally effective. Block then suggests that what the debate is about is not epiphenomenalism per se, but what the properties of these brain representations really are (p. 9). The pictorialists claim that the brain representations have pictorial properties and that these are causally effective. The descriptivists claim the brain representations do not have pictorial properties.
Block's view is partially correct, but he combines his assessment of the debate with a neutral description of it. I have great sympathy for his view, for reasons that will become apparent in chapter 5, that the empirical evidence settles very little about the nature of our experiences. On the other hand, Block also states that if the debate concerns determining the nature of the brain representations, then "some of the experiments they [the pictorialists] cite are indeed relevant" (p. 8). I try to show the problems with the view that the empirical evidence we have reveals precisely what pictorialists claim, but this general point might also be accepted. It is also quite correct to say that one of the issues, the central one, concerns the properties thought to inhere in the brain representations.
I depart from Block's analysis in his assertion that the
nature and causal effectiveness of experiences is not an issue at
all. This may be Block's judgment about what the disputants are,
or should be, disagreeing about, but does not, in my view,
precisely reflect what Kosslyn has claimed. I have tried to
state what is at issue from the perspective of the disputants,
both from the theoretical and experimental points of view. In
this chapter, we shall be concerned with the first question
(above) at issue in the opposing views: in what theoretical
sense, if any, can there be said to be functional "images" in
psychology, given the strictures of computationalism?
IV. Descriptivism: Pylyshyn's Theory
A. Internal Descriptions and Visual Information
Pylyshyn understands the mechanism of cognition to consist of computational procedures that operate on a specific form of representation that is not actually verbal, but proto-verbal, i.e., a base mental language that can only be suggestively characterized as "propositional," "abstract" or "descriptive." Pylyshyn's view is that when we store the information that something before us is, say, the Mona Lisa, we store a single type of information that enables us in future contexts both to remember features of the visual image and to verbally identify other images as being similar.
The term "image" is to be avoided in describing this unconscious code. Pylyshyn writes:
Because the representation is so obviously selective and conceptual in nature, referring to it as an image -- a term that has pictorial or projective connotations -- is very misleading. Although there are some who have no objections to speaking of "conceptual images," I prefer the term "description" or "structural description" because it carries certain desirable connotations. (Pylyshyn, 1978, p. 173)
Among the desirable connotations Pylyshyn lists are that the representations are constructed on the basis of concepts, not raw appearances, and they are understood to attain their function by referring to objects rather than by resembling.(4) Pylyshyn adds that the relation between the representations and what they represent is "the primary reason for persisting in calling them 'descriptions'" (Pylyshyn, 1978, p. 177).
The specialized, theory-specific use of the term "proposition" appears in Pylyshyn's analysis of how sentential analogs are used and manipulated in vision. Since knowledge (knowledge in the computational system sense) is propositional and we need to have knowledge of what we see, it is necessary that visual appearances be translated into propositional knowledge. Otherwise, the appearances involved in seeing would remain inert and we would not be able to formulate any knowledge.
When we use the word "see," we refer to a bridge between a pattern of sensory stimulation and knowledge which is propositional. This is not to deny there are such things as appearances, only that if they have a role to play in cognition, the nature of such a role is at present a mystery. We cannot even talk about appearances without, in fact, talking about the propositional content of the appearances. (Pylyshyn, 1973, p. 6, emphasis added)
Why is visually-derived knowledge best understood as ultimately descriptive in nature? If we accept Pylyshyn's account, we agree that "something" stored in the brain must be responsible for our cognitive abilities.
This something is best characterized as a descriptive symbol structure containing perceptual concepts and relations, but having the abstract qualities of propositions rather than the particular qualities of pictorial images. (Pylyshyn, 1973, p. 7)
We see, then, that Pylyshyn follows the previously trodden path of philosophic descriptivists. Like Dennett, he objects to the very idea that we could store images.
In summary, Pylyshyn says that the special abstract qualities of propositions should be understood as implying three characteristics of the stored symbol structures or representations.
1. Raw sensory data are not stored. The results of sensory processes are already "highly abstracted and interpreted" in the stored representations. (Pylyshyn, 1973, p. 7, emphasis added.)
2. The representations are such that they are "not different in principle from the kind of knowledge asserted by a sentence." (Pylyshyn, 1973, p. 7, emphasis added.)
3. Since sensory events are encoded into a finite number of concepts and relations, the stored representations are "formally equivalent...to a finite...number of logically independent descriptive propositions." (Pylyshyn, 1973, p. 7, emphasis added.)
Again, the underlying reasons for all these characteristics
derive from other aspects of his theory. The representations
stored as a result of visual sensory processes must not differ in
kind from other representations; there can be only one mental
code that forms the knowledge we obtain from all sensory sources.
Since whatever knowledge we have can in principle be translated
into linguistic descriptions of some sort, this code more nearly
approximates descriptions than a picture or other form of
preserving and expressing information. Finally, it is assumed
that the processing of sensory data happens mechanically, and
that this sorting of information results in a finite, pre-processed set of data neatly filed into concepts and relations.
If sensory inputs were not processed into a finite set of
information packets, it would be impossible to perform
computational procedures on them in order to interpret them
further. Raw pictorial data are useless for any computational
procedure, since they represent a potentially infinite set of
concepts and relations. We must view sensory data as pre-processed or, as Pylyshyn would have it, "abstracted" and
"interpreted" in order for them to fit into a computational
scheme. The abstracted and interpreted data are most
conveniently understood as descriptions.
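As a rough illustration of what a "structural description" in this sense might look like, the following Python sketch encodes a scene as a finite set of propositions. The predicate names, the example scene, and the helper functions `holds` and `about` are hypothetical; none of them is drawn from Pylyshyn's text. The sketch is meant only to show that such a description is finite, already conceptually interpreted, and open to mechanical query without any stored picture.

```python
# Hypothetical sketch: a scene stored as a finite set of propositions
# (predicate, arguments) rather than as raw pictorial data.
# All predicate and object names are illustrative assumptions.
from typing import FrozenSet, List, Tuple

Proposition = Tuple[str, Tuple[str, ...]]

scene: FrozenSet[Proposition] = frozenset({
    ("is_a", ("obj1", "painting")),
    ("depicts", ("obj1", "woman")),
    ("hangs_on", ("obj1", "wall")),
    ("left_of", ("obj1", "door")),
})


def holds(description: FrozenSet[Proposition], predicate: str, *args: str) -> bool:
    """Return True if the given proposition is part of the stored description."""
    return (predicate, tuple(args)) in description


def about(description: FrozenSet[Proposition], entity: str) -> List[Proposition]:
    """Collect every stored proposition that mentions the given entity."""
    return sorted(p for p in description if entity in p[1])
```

On this way of putting it, asking whether the painting is left of the door amounts to checking for one stored proposition (`holds(scene, "left_of", "obj1", "door")`), not inspecting an appearance.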
B. A View of Memory
Pylyshyn argues that the imagist view, that memory contains raw visual data, not conceptually processed or organized, simply does not hold up. The descriptivist view, he maintains, supplies a superior account of the structure and mechanics of memory.
Pylyshyn notes several undisputed facts about memory. (1) Memories of a given topic are often both global and specific. I may recall some general features of the party last month (many well-dressed guests) as well as very specific details that relate to the same event (Julia wore a gold necklace). (2) If visual data are forgotten, they disappear in discrete conceptual parts rather than pictorial parts. We have no memories in which a pictorial part of a normally complete mental image is missing. We do not have mental images of, say, half a desk in a room, as we would if raw visual data could be forgotten in the same way part of a photograph could be torn away or mutilated. (3) The process of interpretation may not end when simple recollection takes place. It is often possible to discover additional details or make deductions based on what is recalled. The phenomenology of recall is that I may remember Julia at the party, and in the process of this recollection (which may seem to me to involve the inspection of a mental image) I may also recall that she was the one wearing a gold necklace.
Pylyshyn argues that (1) and (2) can be explained most simply if visual memory depends entirely on a network of conceptual links. This would explain the speed with which we are able to recall various details stored in memory. When asked if someone at the party last month wore a gold necklace, we may be able to respond in an instant. This suggests that something like a hierarchical index of memories exists that can be accessed in the manner of a computer data base. In this case, the indexical hierarchy might correspond to: last month, parties, place, time, hosts, visitors, unusual items, watch, gold necklace. A procedure involving a search through a data base structured in this way would be much faster than one in which entire visual scenes had to be replayed and individual items picked out in order to accomplish the recollection. If unprocessed images were involved in recalling details of this type, the imagist theory would have to hold that inspection of them occurs at an unconscious level, and if the images were searched qua images, this would imply a homunculus. As we have no conscious experience of searching through randomly sorted images for details related to various topics, the homunculus option seems the only one available to the imagist.
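The contrast Pylyshyn draws between the two retrieval procedures can be illustrated with a small sketch in Python. Everything in it is an assumption made for illustration: the nested-dictionary index, the names memory_index, stored_scenes, lookup and scan_scenes, and the sample data are hypothetical and come neither from Pylyshyn nor from any cognitive model. The sketch shows only that an indexed lookup follows a short path of conceptual keys, whereas a search over unprocessed scenes must inspect the stored items one at a time.

```python
# Hypothetical conceptual index: each key narrows the search, loosely
# following the hierarchy "last month > parties > visitors > unusual items".
memory_index = {
    "last month": {
        "parties": {
            "visitors": {
                "unusual items": {"gold necklace": "Julia"},
            },
        },
    },
}

# Hypothetical store of unindexed "scenes": 1000 irrelevant ones, then Julia's.
stored_scenes = [
    {"guest": f"guest {i}", "items": ["watch"]} for i in range(1000)
]
stored_scenes.append({"guest": "Julia", "items": ["gold necklace"]})


def lookup(index, path):
    """Follow a path of conceptual keys; return (result, steps taken)."""
    node = index
    steps = 0
    for key in path:
        node = node[key]
        steps += 1
    return node, steps


def scan_scenes(scenes, item):
    """Inspect scenes one by one; return (guest or None, scenes inspected)."""
    inspected = 0
    for scene in scenes:
        inspected += 1
        if item in scene["items"]:
            return scene["guest"], inspected
    return None, inspected


indexed_result = lookup(
    memory_index,
    ["last month", "parties", "visitors", "unusual items", "gold necklace"],
)
scanned_result = scan_scenes(stored_scenes, "gold necklace")
```

In this toy case the indexed lookup reaches "Julia" in five steps however many scenes are stored, while the scan has to pass over every earlier scene first. Whether human memory resembles either procedure is, of course, exactly what is in dispute.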
In addition to the concerns with the speed of operations based on images, Pylyshyn points out that if uninterpreted images were stored, it would place a huge burden on the storage capacity of the brain. Since images would have to be both broad enough (the party) and detailed enough (a necklace) to meet the needs of later recall, virtually every aspect of experience would have to be recorded. The more probable mechanism, Pylyshyn thinks, would involve the selective storage of discrete conceptual data.
The same considerations apply to loss of memory. The kinds of things we forget tend to fall into specific categories or features originally derived from visual impressions. We may forget the color or exact location of something, but not its size. This indicates that the features of items are structured in memory by way of discrete conceptual links rather than in holistic images.
An objection can be raised at this point. Suppose that
there is a hierarchically arranged conceptual map of the items in
memory. That does not rule out the possibility that there are
images filed, as it were, in certain spots. This would combine
the speed of a conceptually-driven search with the possibility
that the search could terminate in an image. This would account
for the speed and storage problems and still maintain images that
could conform to our experience. Pylyshyn responds that this is
a pointless arrangement. The content of an image has to be
interpreted in the first place in order for it to be properly
placed in the set of conceptual links. Once the content of the
image is established, what is the point of maintaining the image
in storage?
C. Models, Recall and Problem Solving
A major objection Pylyshyn has to the imagist view is that it appears to rest on ordinary experience and simple examples rather than a scientific theory:
This "image retrieval before perception" view is phenomenally very powerful and is implicit in the everyday sense of the word "image." It is also present in all the illustrative examples used by psychologists to persuade their colleagues of the reality of images. (Pylyshyn, 1973, p. 9)
The various improvements to the imagist position during the period since 1973, including a massive amount of experimental data, have not changed Pylyshyn's assessment of the imagist position. In a more recent book, Pylyshyn again charges that the pictorialist theory just reduces to the phenomenological truth that mental imagery is similar to seeing (Pylyshyn, 1984, p. 251).
As a case in point demonstrating how intuition takes the place of theory, Pylyshyn mentions the influential window-counting example (originally used by Shepard): one is asked to count the number of windows in one's house; almost invariably, subjects report having phenomenal images of interior views of rooms or of exterior views of the house (Pylyshyn, 1973, p. 18). Citing examples of this sort, imagists have claimed that since the phenomenal image appears to be a necessary element in the process of recall, images must be part of the operative cognitive architecture of the mind.
Pylyshyn theorizes that so-called necessary imagery is explained entirely by the operation of unconscious algorithms, or computational procedures, on previously stored and interpreted perceptual information. Consciously, it seems to us that we extract new information from images, but this can be explained by unconscious procedures operating on previously stored information. Strictly speaking, no new information is being generated; we can not create more information than is already stored. We can only improve upon or transform the levels of access we have to previously stored information. The various procedures applied to the information contained in the model yield information that was not previously made explicit, resulting in the impression that the image is being inspected or interpreted directly.
Why do visual images appear if they are not necessary from a computational standpoint? When we encounter a problem that is too difficult to solve by conceptual memory or simple deduction alone, more cognitive resources are required. Pylyshyn suggests that the problem parameters are moved to a "workspace" or area where there are sufficient cognitive resources to solve the problem, and in which a "more detailed representation may be generated...than is, in fact, called for by a particular cognitive task" (Pylyshyn, 1973, p. 19). In a computer, a workspace can be created by simply allocating more free memory to a problem. In the human brain, Pylyshyn suggests, the workspace may involve the physical locations responsible for producing visual presentations. The generation of a visual presentation is an automatic process that uses the information derived from the application of algorithms to the original descriptive information, but the generated image itself does not play a part in obtaining a solution to the problem being worked on. The solution is arrived at by means of other processes, and the image is simply a left-over manifestation of that fact.
We can understand what Pylyshyn means by analogy to the case of a computer instructed to deduce the consequences of Euclid's postulates. The proofs themselves proceed by means of procedures applied to symbolic encodings that represent relations such as equality, inequality, greater than, as well as items such as lines, points, distances, and so on. If these symbols were then translated to instructions to produce visual images by means of a mechanical writing pen, it is obvious that the accompanying drawings would have nothing to do with the computing processes themselves. Pylyshyn's suggestion is that human mental imagery is like the computer case: mental figures, mental drawings, the entire subjective feeling of the "use" of mental images, are entirely epiphenomenal.
Pylyshyn identifies this theory of image generation as the procedural model theory. One can recognize that the mind produces models or visual representations of cognitive processes without implying that these models are themselves operative in cognitive processes. Although this point has already been established in theory, it is one that needs further emphasis and clarification. It might be thought that in allowing that images are models, the view Pylyshyn is promoting is not very different from the imagist position after all. Pylyshyn is well aware of this. He writes:
Such considerations might suggest that we are tending towards the view (favored, e.g., by Chase and Clark, 1972) that while picture-like entities are not stored in memory, they can be constructed during processing, used for making new interpretations (i.e., propositional representations) and then discarded. ...There is little harm in using the metaphor in this context so long as one can resist the temptation of assuming that the relation of the model to its cognitive representation is like the relation of any physical object to its representation. (Pylyshyn, 1973, p. 19)
Pylyshyn means that we must not think there is any use of an inner visual model that corresponds to perceiving a physical stimulus. Chase and Clark take the imagist view on this point. They state, "...the subjects are in fact abstracting new information from the visual image in much the same way they would from a physical stimulus" (Chase and Clark, 1972, p. 229). The problem with such analogies, Pylyshyn points out, is that a physical model of something has properties that are independent of whatever knowledge was used to construct it. A drawing produced on paper or a computer screen is of a certain size, a certain color, and is visible to the eye. The knowledge used to construct it has none of these properties. Furthermore, the physical properties of a model, its size, color, and other visible features, are completely irrelevant in deducing additional properties about what it represents. What it represents is determined by the conceptual information from which it was derived. For these reasons, Pylyshyn argues the model can not enter into the process of the interpretation of knowledge already stored except in performing the ancillary function of "making what was implicit in the [stored] description more explicit, accessible, and manipulable" (Pylyshyn, 1973, p. 19).
It is clear from these considerations that in Pylyshyn's theory the translation from description to model does not represent any shift in the kind or form of knowledge represented. There can be no creation of new, specifically visual information in the model. This would be creation of information ex nihilo. The model, just as the original data from which it was formed, can be reduced to a list of propositions. The processes that operate on the model are not perceptual processes, but formal, computational processes. Thus, Pylyshyn concludes, "the representation corresponding to the 'image' is more like a description than a picture" and "there is nothing in the representation corresponding to the notion of 'appearance'" (Pylyshyn, 1973, p. 22).
In the procedural model theory of mental images, the use of
models in computational theory forms the basis for describing the
meaning of our experience of mental images. Like a visual model
generated on a computer screen during a computer's computations,
the image has no causal role in our own consciousness.
D. Critique of Pylyshyn's Theory
1. Knowledge, Abstract Objects, and Explanation
I suggest that much of the plausibility of Pylyshyn's account trades on the philosophic idea of abstract entities. Pylyshyn follows philosophers in pointing out that no specific encoding or expression of a proposition is identical with its content or meaning. Neither Pylyshyn nor philosophers are inclined to doubt that expressions have meaning. But philosophers are circumspect about how specific expressions acquire the meaning. Some kind of theory about meaning is required, whether or not it ultimately ends in including propositions as abstract objects. Pylyshyn, on the other hand, assumes that while meaning and its expression are separate, meaning is just equivalent to what has been derived from sensory inputs and coded into a finite set of descriptive elements in a mental interlingua. If this were true, meaning would be encompassed in a finite set of particulars, regardless of how "abstract," "interpreted," or "propositional" one may choose to understand them. This just reestablishes the problem of meaning on a new level. Instead of explaining how individual verbal expressions acquire meaning, we now have to explain how each expression in the mental interlingua acquires meaning.
The confusion reflects, I believe, a fundamental difficulty in a computational, materialistic theory of the mind. If the theory is true, some arrangement of concrete material parts and the processes involved with them must be responsible for the phenomenon of knowledge. The problem is that it is far from obvious how this can be possible. The suggestion that content is due to the syntactic, rule-governed operations over the representations, in such a way that the function of the representations mirrors the content they would have when viewed from outside the system, results in a number of problems.
Consider just two problems. First, there is the problem of
establishing the meaning of fundamental real-world relations.
Suppose there is a brain representation of a fundamental relation
such as "is a part of." How is the meaning of the relation
itself generated? One might propose, as some have, that there are
certain primitives, such as "is a part of," "to the left of," and
"is the opposite of," innately coded in the brain. This appears
to be an ad hoc assumption made to make the system conform to the
demands of the real world. If it is assumed that the system
"learns" these fundamental relations, this leads to a second
problem. All the fundamental spatial relations are inherently
ambiguous and context sensitive in application. Is my hand "on"
the table when only one finger is touching? If the cat has only
its hind legs on the mat is the cat on the mat? This requires a
propitious selection of even more primitive terms -- more
primitive than gross object terms such as "cat" and "mat," or
gross relation terms, such as "on." No automatic mechanism, as
far as I know (PDP networks excluded -- they lie beyond the
present scope of concerns) has been proposed to explain how this
can be achieved.
2. The Structure of Memory
Pylyshyn's argument about the structure of memory has a number of strong points. It is true that there are no "broken" or partial images that appear in consciousness in the manner of torn photographs. It is true that memory information is retrieved in more or less complete conceptual units very rapidly and with no consciousness of searching through a huge number of randomly filed images. This latter point can be amplified. If memory were only a collection of images, there would be no way to tell what they were images of. All this points to the necessity of some form of conceptual organization to memory.
Yet, Pylyshyn's argument does not offer much that is really new in the history of theories of memory. It has been recognized from the earliest times that memory requires some consciousness of the time and context of prior events in personal history, and therefore some conceptual organization. Pylyshyn's complaint that raw visual data can not be stored, because the storage capacity of the brain would not allow it, is simply an assumption. As Kosslyn points out in rebuttal to this argument, we simply do not know what the storage capacity of the brain is (see Kosslyn, 1980, p. 20). There is no reason to set an arbitrary limit on the scope or extent of memory. Similar considerations apply to Pylyshyn's objection that there is no point to storing an image once the conceptual information about it is stored. There may be no logical or computational reason, but there are plenty of reasons, derived from evolutionary adaptations, for animals to store information that is currently useless but may have enormous survival value in some future context. Again, we simply do not know precisely what memory actually consists of or why everything in memory is stored the way it is.
Pylyshyn's arguments about memory, then, are not so much about how memory is as about how it must be if the principles of computationalism are true. If all information must be propositional in nature there will be no place in memory for non-propositional and non-computable elements such as images. If images are understood to be raw appearances only, having no conceptual content, then they cannot be translated into the computable, "propositional" form necessary for computationally-inspired cognitive psychology. Kosslyn, and others in cognitive science against whom the argument is directed, often speak as if images were raw appearances, yet they also claim to accept the strictures of computationalism. Therefore, Pylyshyn's arguments about the implications of computationalism should be given careful consideration when his theory is pitted against the other side in the imagery debate.
Some of Pylyshyn's comments actually support the view advanced in Chapter 3 that memory has no specific format. Pylyshyn first states that the amodal information could in principle be stated in a description; later he softens this somewhat, allowing for the possibility of "conceptual images." He agrees that memory information is to be thought of as something that is neither words nor images. While I do not agree that it is the format of brain representations (if there can be said to be such) that we need to worry about, this at least corresponds roughly with the idea that the cognitive capacity (i.e., conscious capacity) of memory is a readiness for multi-modal interaction.
In sum, then, Pylyshyn's view of the structure of memory
just tells us how it must be if his theory is true. As for it
being impossible for there to be images, qua visually inspectable
items in memory, this follows as a matter of course from the
strictures of his theory, the nature of computational
representations, and what he presumes to be the content of
original perceptive input. The challenge to the computational
imagist is to show how to circumvent this difficulty.
3. Problems for the Procedural Model Theory
According to Pylyshyn, when we count the windows in the mental image of our house, we are relating several concepts, such as number, windows, and my house, and applying a set of procedures (Pylyshyn, 1973, p. 18). Pylyshyn suggests there is some procedure, let us call it count, that simultaneously applies to and relates the above concepts, resulting in the output of a specific number. In discussing this example, Pylyshyn implies that since the above concepts and procedure would require considerable computing, they might all be moved to a workspace, resulting in the phenomenal image of a house when performing the count procedure.
Is Pylyshyn's account of window counting credible? Pylyshyn holds the number of windows is not explicit knowledge:
It would be unreasonable to hold that "number of windows in my house" is a static term in my store of explicit knowledge. Indeed it could not be the case in general since there is no limit to the number of propositions of the type "number of Xs in N" which a person potentially possesses. (Pylyshyn, 1973, p. 18)
According to Pylyshyn's theory, any knowledge is either implicit
or explicit. The number of windows is therefore knowledge that
is implicit and that is made explicit by the count procedure.
How does this occur? Presumably, I automatically store
information about the relations of some primitive constructs that
compose what the visual input is. In the case of a house, I
store information about rooms or exterior sides in terms of
selected primitives, including windows. Now, since the total
number of windows is by hypothesis unknown to me, it must be
presumed that there are what correspond to cognitive slots for
the concept "instances of windows in room (side) X." We can
imagine that within a set of visual data corresponding to a
particular scene, each instance of "windows" is, as it were,
chalked up on a mental score card. The windows are not actually
counted during storage, but each instance is recorded. The
resultant stored knowledge can be symbolized as indicated below.
*
Windows * *
* * * *
* * * * *
Location 1 2 3 4 5 6
For each location, marks serve to record the instances without indicating totals. A procedure, count, will count the marks at each location, yielding a number that can then be used to keep a running subtotal.
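The count procedure, as construed here, can be sketched in Python. The tallies below are hypothetical and are not a transcription of the score card above; the names score_card and count are likewise assumptions introduced only for illustration, and nothing in the sketch is drawn from Pylyshyn's own text.

```python
# Hypothetical score card: one mark per recorded window instance,
# keyed by location (room or exterior side). No totals are stored.
score_card = {
    1: ["*", "*"],
    2: ["*"],
    3: ["*", "*", "*"],
    4: [],
}


def count(card):
    """Count the marks at each location, keeping a running subtotal.

    Returns the final total and a trace of
    (location, marks at that location, subtotal so far).
    """
    subtotal = 0
    trace = []
    for location in sorted(card):
        marks_here = len(card[location])
        subtotal += marks_here
        trace.append((location, marks_here, subtotal))
    return subtotal, trace


total, trace = count(score_card)
```

Nothing in the stored structure states the total. The number becomes explicit only when count is run, which is the sense in which the knowledge is implicit on this construal.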
If this is an accurate construal of how to understand a descriptivist explanation of the window-counting example, it has the advantage of being rather simple in execution. It also betrays some cleverness in selecting and designing data structures to fit the task. And this, I suggest, is something of a problem. There is too much cleverness or fitness for purpose here. Having no knowledge of any possible future uses (except possibly to answer questions on psychological tests), why is my cognitive system so clever as to provide this interior score card for my counting exercises?
Let us try to consider how such memory problems are consciously solved. Our memory experiences seem to be triggered not by any consciousness of the concept of windows or of windows as numerical ciphers, but by recall of many experiences about interactions with various windows (such as numerous cleanings, openings, etc.) that allow us to form visual mental images. As we recall these various experiences, we are enabled, within the context of a larger constellation of memories, to keep track of windows specifically. These memories generally owe strength to a set of experiences with particular objects at various times. If we do not interact with a particular window in a large house, we are likely to forget it entirely.
According to Pylyshyn, our memory experiences do not figure into the matter. The count procedure does all the work. What degree of disparity is to be allowed between our subjective experiences and any presumed computational explanation? Pylyshyn concludes his 1973 article with the remark that we will make no progress until we "remove the image metaphor" and replace it by the "fine detail information processing model whose relation to the experience of imagery is...really quite a secondary matter" (Pylyshyn, 1973, p. 22).
Some (Block, 1983; Tye, 1991) accept the logical possibility that our experiences could be of one form while the processing that accounts for them could be of another. Block points out that as a general principle, it is not correct to assume that like causes produce like effects. We can not reason that because our visual experiences are "picture like" or seem to involve images that there must be literally picture-like representations that produce them. It is indeed a logical possibility, and it may be one philosophers can accept as grounds for entertaining a descriptivist view. As a logical possibility, we give the computational theory-builder a free hand to create the formal representational constraints in any way necessary for a successful model. The thrust of the imagist complaint, I believe, comes from psychologists who want more than what logic allows for in the relation between our experiences and the underlying representations. They ask, in effect, "what constraints, relative to the nature of our experiences, should the theory-builder allow in describing the form of the internal representations?" Suppose that instead of supplying no information on cognitive processes, our subjective experiences supply clues (at least) as to the internal functions of our brains (Kosslyn, Shepard, and others believe this is true, see discussion of Kosslyn's views on epiphenomenalism, below). How does this fit with the hypothesis of epiphenomenalism? Consider the window counting example. I experience quasi-visual images of windows when I count them. I do not experience counting the settings of neurons in my brain. But my cognitive system is going through a procedure that actually counts real states of some sort of individually marked particulars in my brain. Why do I not experience numbers or ciphers in my visual consciousness instead of windows? This would also be consistent with the purported internal processing.
Some rationale for the kind and extent of the experiences generated in the process of recollection seems to be needed within the descriptivist account.
Aside from the nature of our own conscious experiences in memory tasks, there is another element in Pylyshyn's argument that needs further explanation. We mentioned the need to explain the foresight necessary to catalog windows as items to be enumerated. Pylyshyn tells us there are an infinite number of things I might record that would ultimately fit the form "the number of X's in N." While my cognitive system was chalking off instances of window, it might just as well have been checking instances of brick, shingles, and lighting fixtures. Why should any one thing be selected as a primitive unit for recording instances?
It seems clear, as I suggested in Chapter 3, there must be
some account of human interest that structures the priority of
the information to be stored as a percept. Once we appeal to
human interest, we again need to explain how this is to be
accounted for in computational terms. Pylyshyn and many others
have recognized this problem and identified it as "the robot
frame problem." This is the problem of how to know when to apply
general or background knowledge when interpreting input or
solving problems. An ordinary (contemporary) "seeing" robot can
not always distinguish between object and background or between
objects it should test by compressing with, say, 500 pounds of
force and those it shouldn't (hands) without enormous
computational resources. As for accounting for interest, the
computationalist is forced to supply either an innate set of
"interest codes" or a long series of other instructions with, at
best, only a dubious connection to the ordinary
(biological/intentional) meaning of "interest."
V. Pictorialism: Kosslyn's Theory
Overall, Kosslyn's position is relatively clear. He means
to salvage the common-sense notion that images are a part of our
mental life, retaining a place for images in a science of the
mind that does not reduce images to mere epiphenomena. At the
same time, Kosslyn means to defend a specific scientific idea,
directly opposed to Pylyshyn's, namely, that there can be
specific functional structures that have pictorial properties.
In this section I clarify Kosslyn's claims about
epiphenomenalism, and then show that his attempt to salvage
common sense runs aground because of his commitment to
computationalism. Neither his theory of depictive data
structures nor his theory of inspection shows there are any true,
image-based representations in his system.
A. Kosslyn's Anti-Epiphenomenalism
There are two tracks in Kosslyn's discussions about epiphenomenalism. As these are not always kept separate in his own discussions, we need to discern what they are and how Kosslyn understands their relation in making his claims about epiphenomenalism.
The first track invokes a folk-psychologic concern. Kosslyn asks, "Are the spatial, picture-like images people report experiencing a functional part of mental life?" (Kosslyn, 1980, p. 29, my emphasis). This, of course, is part of the question posed in our own inquiry, as we stated in Chapter 1. I believe what Kosslyn wants to say about this sort of epiphenomenalism is relatively clear. Like the traditionalists in psychology, Kosslyn wants to defend the idea that our conscious mental images are causal factors -- that our basic intuitions about mental images are usually sound.
The second track involves a scientific concern. He holds there should be unconscious representations in the brain that function distinctively as images. These unconscious representations are variously referred to as the "functional images," "quasi-pictorial image representations," the underlying "data structures," or the "images per se." Kosslyn opposes Pylyshyn's idea that our experiences and the underlying representations do not match, insisting that our experiences, rather than being deceptive, are mirrors of the form and function of the underlying representations. He wants to maintain there is a parallelism in form and function between the phenomenal and functional images. This position is clear from a remark Kosslyn makes in summary of his position:
It is important to note that the foregoing claims [about the general properties of images] not only posit properties of the representation per se [i.e., the functional representations in the brain], but also that we have the interpretive processes to "see" these properties. (Kosslyn, 1980, p. 34)
The functional representations, as he puts it, "give rise" to the experiences we have and our experiences "index" the depictive properties of the underlying data structures (Kosslyn, 1980, p. 30).
While espousing that there should be parallelism between functional and phenomenal states, Kosslyn is also conspicuously careful about the following claim, in which he recognizes the logical possibility that parallelism may be false:
... As a working hypothesis I began with the notion that the characteristics of an image evident in the experience of imagery did in fact index characteristics of the underlying data-structure, but this need not be so. (Kosslyn, 1980, p. 30, note)
Obviously then, Kosslyn's other reasons for thinking that the epiphenomenalism of experiences is false give way, in the final analysis, to the scientific requirement that he remain neutral on the issue until more evidence is at hand.
What, then, are Kosslyn's actual claims? I believe they amount to the following. First, he claims that it is theoretically possible for there to be data structures and a procedural model that incorporates images, therefore giving him stronger reasons to suppose his parallelism hypothesis is true. Second, he claims that it turns out to be true, based on experimental evidence, that there are image-like data structures.
In summary, there is a level at which Kosslyn's view
represents an attempt to follow common sense and restore mental
images to psychology, particularly cognitive psychology. This
level takes second place, however, to the pursuit of a scientific
ideal, according to which cognitive phenomena are to be explained
in terms of computations on data of a specific type, having
imagistic properties.
B. The Theory of Depictive Functional Images
1. Basics of the Theory
Kosslyn claims that the functional mental images exist in a "spatial" medium. By this he means that mental images are significantly constrained by the computations that can be performed on them. He claims that images can be scanned or rotated only at certain rates determined by the operational parameters of the structures and processes operating in the brain at the time. The "space" in which images are said to exist forms a mental architecture that will place permanent, significant constraints on the way shapes are represented and manipulated by the mind.
Kosslyn states that the computational space for images is like a coordinate space:
Images occur in a spatial medium that is functionally equivalent to a (perhaps Euclidian) coordinate space (Kosslyn, 1980, p. 33)
If the space is Euclidian, this means that the locations of parts of the object imaged are accessed such that the original (perceived) distances between parts of the object are functionally maintained during recall or imaginary reconstruction. Again, this does not mean that the parts of the representation in the brain are literally in the same relative position and same distances apart, only that they are handled by the mind as if they were.
Kosslyn claims this conception of an inner space for mental representations is consistent with the computational notion of a functional space:
A perfect example of this is a simple two-dimensional array stored in a computer's memory: There is no physical matrix in the memory banks, but because of the way in which cells are retrieved, one can sensibly speak of the intercell relations in terms of adjacency, distance, and other geometric properties. (Kosslyn, 1980, p. 33)
Since Kosslyn uses this example in support of his claim that functional images depict, it is appropriate to analyze it in more detail.
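The computational point in the quoted passage can first be illustrated with a short Python sketch. It is an assumption made for illustration and is not Kosslyn's own model: the grid dimensions and the names WIDTH, HEIGHT, memory, cell and are_adjacent are all hypothetical. The grid is stored as a flat list, and adjacency is nowhere recorded in storage; it is defined entirely by the arithmetic used to retrieve cells.

```python
WIDTH, HEIGHT = 4, 3                    # hypothetical grid dimensions
memory = list(range(WIDTH * HEIGHT))    # flat storage: no physical matrix


def cell(row, col):
    """Retrieve the cell at (row, col) by row-major index arithmetic."""
    return memory[row * WIDTH + col]


def are_adjacent(a, b):
    """Two (row, col) positions are functionally adjacent if they
    differ by exactly one step along a single axis."""
    (r1, c1), (r2, c2) = a, b
    return abs(r1 - r2) + abs(c1 - c2) == 1
```

Positions (0, 3) and (1, 0) occupy neighboring slots in the flat list (indices 3 and 4), yet are_adjacent reports that they are not adjacent. The geometry belongs to the retrieval scheme, not to the layout of storage.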
Let the following list represent the contents of a set of
memory cells. Each cell, numbered 1 through 12, contains two
values. In computer terminology, this is a two-dimensional
array. (Also see Tye, 1991, Chapter 3, pp. 33-60, for a similar
treatment of arrays in Kosslyn's theory. I am indebted to Tye's
exegesis for my understanding of this aspect of Kosslyn's theory.
Tye, however, accepts the viability of this aspect of Kosslyn's
theory, while I understand the treatment of arrays to demonstrate
the essential conceptual flaw in the computational aspect of
Kosslyn's theory.)
     1.  2,2          7.  4,2
     2.  2,3          8.  4,5
     3.  2,4          9.  5,2
     4.  2,5         10.  5,3
     5.  3,2         11.  5,4
     6.  3,5         12.  5,5
By themselves, the values in each cell have no meaning. A system
of interpretation is necessary to convert the array values into
some meaningful information. For illustration, we shall suppose
the values are interpreted as row and column numbers on a grid.
We shall suppose the grid is finite, consisting of six rows and
six columns. Assume the cells indicate the locations in the grid
that are to be filled with a visible point or marker. Hence, the
information in cell number one indicates that grid location row
2, column 2, is to be filled. This scheme allows the numbers to
be used to indicate the outline of a shape within the grid. The
resulting grid, when filled in by an "X" at each row and column
indicated appears as follows.
         1   2   3   4   5   6
     1
     2       X   X   X   X
     3       X           X
     4       X           X
     5       X   X   X   X
     6
It is obvious that the conversion scheme used has transformed the list into a depiction of a geometrical figure.
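The conversion scheme just described can be sketched in a few lines of Python. This is only an illustration of the example above, not Kosslyn's own program: the names CELLS and render are hypothetical, the use of "." for an unfilled location is my own convention, and cell 8 is taken to hold 4,5, as the grid shown above requires.

```python
# Contents of memory cells 1 through 12, each read as a (row, column) pair.
# Assumption: cell 8 holds (4, 5), which is what the filled-in grid requires.
CELLS = [(2, 2), (2, 3), (2, 4), (2, 5), (3, 2), (3, 5),
         (4, 2), (4, 5), (5, 2), (5, 3), (5, 4), (5, 5)]


def render(cells, size=6):
    """Apply the interpretive scheme: mark an X at each listed grid location."""
    grid = [["." for _ in range(size)] for _ in range(size)]
    for row, col in cells:
        # Rows and columns are numbered from 1 in the text, from 0 in Python.
        grid[row - 1][col - 1] = "X"
    return ["".join(line) for line in grid]


for line in render(CELLS):
    print(line)
```

Note that the list itself is unchanged by the procedure; render only produces a second structure whose layout a viewer may then read as a shape.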
The functional intercell relationships to which Kosslyn
refers can now be understood in terms of this example. We may
say that the contents of cell number one are functionally above
those of cell five because the application of the prescribed
procedure results in a depiction wherein the information in cell
one produced a mark that is above the mark produced by the
information in cell five. Hence, the original data, combined with
a translation scheme, yield a functional space in which spatial
relations apply.
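The dependence of such functional relations on the translation scheme can be made explicit with a small sketch. The two interpretive functions below are hypothetical; the second (reading each pair as column, then row) is not part of the example above and is introduced only to show that the same stored values support a different relation under a different reading.

```python
def as_row_col(cell):
    """The reading used in the text: first value is the row, second the column."""
    row, col = cell
    return row, col


def as_col_row(cell):
    """A hypothetical alternative reading: first value is the column."""
    col, row = cell
    return row, col


def functionally_above(a, b, interpret):
    """True if a's mark lands in a higher row than b's under the given reading."""
    return interpret(a)[0] < interpret(b)[0]


cell_one, cell_five = (2, 2), (3, 2)
print(functionally_above(cell_one, cell_five, as_row_col))  # True
print(functionally_above(cell_one, cell_five, as_col_row))  # False
```

Under the first reading cell one is "above" cell five; under the second the two marks fall in the same row. Nothing in the stored pairs settles which reading applies.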
2. Critique of Functional Depiction
So much for what I take to be the essentials of how Kosslyn means to make depiction compatible with computation. We must now ask, in what sense does the property of depiction belong to the representational data type he calls the image? It is important to recall that according to Kosslyn's theory and his own analogy to the computer array, the image properties need to inhere in the list and what it is used for in the system, not in any actual display, drawing, or even any inner mental "depiction" at all. Has Kosslyn succeeded? Not at all. Clearly, our example demonstrates that the list and what it results in require a further interpretation. It is not obvious that the outcome of the procedure is an "O" shape. We had to bend the truth of the matter when we stated above that the resultant depiction was "obviously" a geometric shape. The so-called outcome is what it is only to a trained, conscious, and presumably human interpreter. The outcome can only be obtained by supposing a true publicly-visible visual particular. Therefore, the entire sequence falls prey to the homunculus objection.
From the computational standpoint, the series of transformations from list to a series of marks on a physical surface accomplishes nothing. Just as the drawing machine produced images that were irrelevant to the computations made from Euclid's postulates (see discussion of Pylyshyn, above), so the resultant "depiction" here is entirely outside of the computational flow. The sense that the result is an image is brought to us by the insertion of a supposed mechanism by which numerical values were transformed into physical actions that produced visible marks, followed by the imposition of how human consciousness would accept what is physically present in the visual field.
And what if the image were produced in the visual cortex itself? This changes nothing. From a physiological standpoint, Kosslyn simply establishes the possibility of setting up certain "points" in the visual cortex.(5) These "points" are causally inert until the subject interprets them. This is precisely Pylyshyn's objection. The interpreter can not be the whole conscious mind; that would suggest that the entire visual system is activated and the subject actually sees the image. The homunculus problem recurs. The "interpretation" of the activated points, to the extent there can be said to be any, must come from non-visual content already inhering in the computational state that generated the "O," namely, the content "this figure is an 'O'." (Tye comes to the same conclusion that this is a necessity of computational schemes. He defines images as a "symbol-filled array" to which a "sentential interpretation" is affixed (1991, p. 90).)
Again, Kosslyn's argument is that an uninterpreted symbol array itself counts as a depiction because of the way it is used to generate a phenomenal appearance to a conscious subject. It is clear, however, that a list of numbers can not be said to depict anything in and of itself. The application of a computational procedure to it does not change its nature. Therefore, depictive properties of assumed list-like data (or other data formats) presumed to exist as the basis for image generation can only be said to be depictive in a metaphorical sense. The list data is metaphorically depictive because it ends up being used to generate an image.
I conclude that although data structures in the cognitive
system (assuming they exist) might be said to be depictive in
virtue of their results in consciousness, their own properties
are not inherently depictive or spatial in any sense that serves
as an explanation for phenomenal appearances. To say these
abstract functional qualities of a system are "depictive"
represents a word choice that is appropriate only by analogy to
ordinary vision. Therefore, Kosslyn does not show that there are
computational entities that are, or should be, recognized as
having the properties of images.
C. Kosslyn's Computational Model for Inspection
1. Basics of the Inspection Model
What shape are Snoopy's ears? Does a Volkswagen Bug have a vent window? Questions like these are likely to elicit imagery experiences in which a mental image is, ostensibly, inspected by the mind's eye. From 1977 through 1994, Kosslyn developed many computer models of the process of inspection. Each one is intended to demonstrate how these imagery experiences can be completely explained in terms of the flow of information and the interpretation of data within a computer. Kosslyn sought to avoid the homunculus objection by showing that the mind's eye in the computational process "is equated with a set of procedures that operate to categorize spatial patterns; it is an interface between propositional representations and depictive representations" (Kosslyn, 1980, p. 160). This strategy, I show, also fails to generate any computational use of images.
Kosslyn's strategy is to introduce information processing modules into his computer emulation that are intended to perform functions analogous to human cognitive processes. Some of these modules are based on conscious processes we are directly familiar with. If we imagine a penny on the sidewalk seen from ten feet away, we can "zoom in" on this image until the penny image becomes large enough to "see" the details on the face of the penny. We can also have experiences in which a mental image is rotated, or our point of view seems to shift across the surface of the image. Kosslyn gave the names ZOOM, ROTATE, and SCAN to the modules in the program that he believes emulate these human capabilities.
These modules operate on information data files designed to emulate information in short term memory (STM). The information in STM data files is derived from data files representing information in long-term memory (LTM). All the information is controlled and accessed through computational modules designed to meet the demands of computation and to be as consistent as possible with human experiences and capabilities.
The process of inspecting an image begins by recalling the information necessary to generate an image from the LTM files. These data files consist of both positional information and propositional information. Each sort of object that can be imagined has both sorts of information stored in LTM. The positional or "literal" information includes lists of locations of points that are to be "filled" in the row/column structure of the STM matrix. The positional information, then, generates a display or "surface image" in which appearance and geometric properties become explicit.
The propositional information consists of a set of lists of information about the object. The information embodied in these lists is extensive. We shall not give a detailed account of each type of list, but these include lists containing information on the parts associated with each object, the locations of various parts relative to other parts, the sizes of objects and parts, the categories to which the object belongs, and so on.
When we are asked a question about the presence of a part on an object, the model, as Kosslyn describes it, handles the question through a long series of complicated steps. The name of the object is located in LTM, a skeletal image is generated in the surface matrix (STM) from the positional information, the part is searched for in the lists of parts for that object, the part is located in the skeletal image using mathematical rules, and so on. At this point, the part in question is located on the surface image, but is not yet "seen" by the mind's eye. The reason for this is that all the parts of the image are not capable of being seen or resolved in the same degree at the same time. This is meant to correspond to the introspective fact that we inwardly scan across a mental image, bringing in one part then another under the scrutiny of the mind's eye. Kosslyn parallels this phenomenon by including what corresponds to an attention window in his simulation. The part in the image is not seen until a procedural module called SCAN is activated. This module translates the surface image across the spatial matrix until the part in question is in the center of the matrix. This corresponds to our being attentive to the part in the image generated for the mind's eye.
In following the process so far we have only succeeded in
locating a set of points at the attention window. But the
activated cells being "attended to" are still simply points in a
representational space. How is a part recognized? How can the
region of activated cells in a portion of memory be known to
represent a particular thing?
LTM INFORMATION ----------> SURFACE IMAGE (STM INFORMATION)
                                          |
                                          |
              active cells ------------------------------------
                                          ^
                                          |
                                          |
                                      ATTENTION
                                       WINDOW
The answer lies in the use of information in LTM that provides the description of the part sought. The FIND process module initiates "a set of procedures that test for various spatial configurations in the surface matrix" (Kosslyn and Schwartz, 1977, p. 276). These tests on the surface image are derived from the description originally retrieved from LTM.
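The sequence of steps described above can be put in the form of a toy sketch. It is emphatically not Kosslyn's program: the module names SCAN and FIND come from his account, but the data layout, the 7 x 7 matrix size, the "box" object, and the single "corner" test are all assumptions of mine, chosen only to make the flow of information concrete.

```python
SIZE = 7  # hypothetical surface matrix size (odd, so there is a center cell)

# Hypothetical LTM entry holding both kinds of information about one object.
LTM = {
    "box": {
        # Positional ("literal") information: locations to be filled.
        "points": {(1, 1), (1, 2), (1, 3), (2, 1), (2, 3),
                   (3, 1), (3, 2), (3, 3)},
        # Propositional information: parts, their locations, descriptions.
        "parts": {"top-left corner": {"at": (1, 1), "description": "corner"}},
    }
}


def generate(name):
    """Fill the surface matrix (STM) from the positional list in LTM."""
    return set(LTM[name]["points"])


def scan(surface, target):
    """SCAN: translate the surface image so the target sits at the center."""
    center = SIZE // 2
    dr, dc = center - target[0], center - target[1]
    return {(r + dr, c + dc) for r, c in surface}


def find(surface, description):
    """FIND: test the cells at the attention window against a description."""
    center = SIZE // 2
    if description == "corner":
        # A filled center cell with filled neighbours to the right and below.
        wanted = {(center, center), (center, center + 1), (center + 1, center)}
        return wanted <= surface
    return False


def inspect(name, part):
    """Answer 'does the object have this part?' by way of the surface image."""
    entry = LTM[name]["parts"].get(part)
    if entry is None:
        return False
    surface = scan(generate(name), entry["at"])
    return find(surface, entry["description"])


print(inspect("box", "top-left corner"))  # True
print(inspect("box", "handle"))           # False
```

In this sketch the test applied by find is selected by the description retrieved from LTM, which is the feature of the model taken up in the critique that follows.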
2. Critique of the Inspection Model
You have been given the job of displaying information via a matrix of electric lights at Times Square. You are shown how to program a computer that will turn the circuit to each light bulb on or off at the correct moment in order to form letters or other shapes on the display. You are meticulously careful in compiling lists of electric circuits and the times at which each is to open or close. You check your program against another previously-stored program that reads your program, checking for errors. You even have a simulation program that shows you, on your private CRT, what the result will look like when your program is placed in action. After all these steps have been taken, have you accomplished your mission?
One might be tempted to answer the above question with a resounding "Yes!". A moment's reflection will show that this can not possibly be the correct answer. What was the mission? The mission was to display information via the electric lights on an actual, physical, publicly visible installation -- not to write a program about hypothetical circuit-closings and simulated lights turning on. For all you know, every single light bulb in the physical display may be burned out, or fifty percent of the relays closing the electrical circuits may not be functioning. Your program is irrelevant to the actual events happening on the matrix of lights at Times Square. The only way to be sure your mission is accomplished is to visually inspect the light output of the matrix itself.
This, I believe, is a telling analogy to Kosslyn's inspection model. His description of inspecting images involves processes that are completely unnecessary for an unadorned information retrieval process. When the question is asked "does an X (object) contain a Y (part)" the information is already in LTM. The parts that definitively belong to an object are given in lists in propositional form. No additional process described in the model improves on the ability to answer the question of whether or not the part is believed to be a part of the object.
The use of descriptions to categorize the patterns in STM is unnecessary because the description of the part and the description of various categories ("round," "pointed," etc.) both exist in LTM. In theory, they could be compared directly without the translation into spatial information. The process represents a needless circumlocution of computational steps; the information sought is available at the beginning of the process.
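The point can be illustrated with a minimal sketch, assuming (as the model itself does) that the parts belonging to an object are listed in propositional form in LTM. The names LTM and has_part, and the schematic object "X" with part "Y", are hypothetical placeholders.

```python
# Hypothetical LTM entry: a propositional parts list plus positional points.
LTM = {
    "X": {
        "parts": ["Y", "W"],
        "points": [(2, 2), (2, 3), (3, 2), (3, 3)],
    }
}


def has_part(ltm, obj, part):
    """Answer 'does an X contain a Y?' from the propositional list alone."""
    return part in ltm.get(obj, {}).get("parts", [])


print(has_part(LTM, "X", "Y"))  # True
print(has_part(LTM, "X", "Z"))  # False
```

The positional "points" entry is never consulted: on this sketch, the question is settled before any surface matrix would be generated.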
The spatial "image" itself, or the pattern of activation in STM, never actually enters into the flow of information. Just as the actual appearance of the light bulbs on the Times Square matrix remains out of the picture until actually viewed, the actual pattern of the cells activated in STM by Kosslyn's FIND procedure is irrelevant to the computation. The description of the pattern is analyzed, not the pattern itself. The actual states of the matrix cells in STM are never part of the process.
Just as in the Times Square analogy, we might presume that the cells have been activated according to a positional information pattern. But in order to test the actual states, we need to perceive the cells of the matrix being filled.
In Kosslyn's model, the information used to "activate" the cells is just a form of template matching with information already in LTM. This is precisely Pylyshyn's objection to introducing "imagery" steps in the computational procedures. Inasmuch as they are not actually perceived, but only conceptually interpreted in terms of categories available for computation or matched against other data already existing, they are no different than any other process. If they are different in the way appropriate to their being classified as images, then, just as Pylyshyn has argued, they require not a figurative, but a literal mind's eye.
Kosslyn attempts to show too much from the model. A
cognitive model need not explain experiences or establish a link
between forms of processing and experiences, but Kosslyn has
implied that his model does. Kosslyn strongly implies (and
sometimes directly states, see 1980, p. 160) that the model is
actually instantiated in the human mind. His explanation of how
images are inspected in the model shows us how he understands the
actual inspection of mental images in the mind. Equating the
mind's eye process with a set of procedures only disguises the
fact that Kosslyn's model is like a shell game, simply shifting
the same information and the same procedures from place to place
without ever generating new information. The model does not
demonstrate that imagery experiences are, or could be, part of
the information flow in the machine (and since we are supposed to
be the machine, he needs to show this). In his remarks about the
phenomenal images we experience, he never indicates that they
amount to more than superficial and ineffective effects of the
"truly functional" underlying processes he is interested in
uncovering. As a result, Kosslyn's entire theory of the human
mind becomes consistent with epiphenomenalism.
VI. Summary and Conclusions
A. Inner Space: Merging the Theories
Kosslyn has suggested that mental images exist in a space. In one sense, this is surely a reasonable way to put it. I suggested in the previous chapter that we have forms of spatial awareness that mediate tasks, and that some of the actual neurophysiological mechanisms involved may be the same ones used during vision. Whatever automatic mechanisms are used for computing(6) distance, relative size, and the interrelation of parts in vision may be activated in imagery. Some of these mechanisms may enable the holistic spatial rotation of image representations (on a subjective size scale, it turns out, that must be much smaller than any so far revealed by empirical psychology -- see the next chapter). Whatever these innate mechanisms are, they may be said to be features of, or limits to, the representational medium through which they appear; i.e., when understood as characteristics literally due to some physiological, rather than conceptual, limit. We can not, for example, visually represent to ourselves four-dimensional geometric objects, even though we can conceive of them. Pylyshyn's theory is also compatible with the notion, since he accepts that physiology determines such limits and that there can always be an epiphenomenal activation of the visual centers during imagery tasks. Images as existing in a space in this sense, surely, is a substantial part of what Kosslyn tries to appeal to.
On the other hand, as we have seen, we can not derive the
notion of experienced space from computational models, and we can
not say that brain representations exist in a space or that they
depict or have colors. They (the brain representations) are not
what we are aware of and their physical properties are utterly
unlike anything we experience. It is the mental images we are
aware of that represent and exist in mental space.
B. Epiphenomenalism
It is readily seen that whatever the nature of the structures and processes invoked to explain imagery in systems like Kosslyn's or Pylyshyn's, the phenomenal image is left intact. Neither side denies the existence of or any of the significant features of the phenomenal image. From the perspective developed in Chapter 3, neither theory has produced much in the way of illuminating the nature of our experiences. It might be that computational models could emulate and because of this predict some features of human cognitive processing, but this appears to have very limited potential for explaining how we solve problems, think or imagine. And, needless to say, creating a computer emulation of cognitive processes is an entirely different project than claiming that computational structures and procedures are actually instantiated in the brain -- and it is the latter claim that results in the problems of epiphenomenalism we encountered in this chapter.
We should not be so quick to dismiss these theories,
however. Epiphenomenalism is a logical possibility and we know
that common sense is not always right. Despite the theoretical
difficulties, there may be something that emerges from these
theories and from the study of empirical data on mental imagery.
We explore this possibility in the next chapter.
1. Hence, in my view, the perversity of designating them "symbols" or "representations" in the first place. It is true that the computational scheme fixes the status of the object X, but now it is the meaning of "to Z," i.e., the relation of the object to the agent that is utterly ambiguous. As the convention of referring to physical symbols in computational systems has been so widely adopted, it is almost hopeless to fight it. In describing such systems below, I will accept this convention, but the reader is advised that all of the new language of symbols, representations, and their cognates is not to be confused with the meanings previously proposed as consistent with traditional psychology.
2. This is not to be confused with the claim, which Pylyshyn has often made, that there should be three autonomous levels of the description of phenomena in cognitive science: the physical, the symbolic, and the semantic. While, ostensibly, Pylyshyn's brand of cognitive science allows the validity of higher level (semantic or symbolic) attributions of causal effectiveness, his descriptions of what actually counts as a cause (as far as I have been able to discover) invariably revert to the physical. Pylyshyn's account of how thought may affect digestion is a case in point. Pylyshyn says that a cognitive state, such as a belief, can be a cause of poor digestion because "every state of a cognitive system is simultaneously a biological state and a cognitive state" (p. 143). But since, in Pylyshyn's view, cause and effect must fall under the same domain of description, a belief can not be shown to alter digestion unless we "first do something like discover a relevant biological property that happens to be coextensive with a certain class of cognitive descriptions" (p. 144).
3. Pylyshyn has also recognized the salience of this point. He cites the fact that robots are unconscious as one of the reasons some are unwilling to accept the comparison between robots and humans as an adequate tool for empirical psychology. Pylyshyn points out that in developing a cognitive theory concerns of this sort have caused theorists to leave aside "questions about what constitutes qualia" (Pylyshyn, 1983, pp. 44-45).
4. Referring not, obviously, in the sense of a subject's intentional reference to content, but in the sense of an internal symbol corresponding to the represented domain.
5. Kosslyn's explanation of inner "seeing", stripped of computationalism, amounts to the view I advanced in Chapter 3. It just states that there are probably neural correlates for this in the restimulated visual system.
6. In the biological, not the computational theory sense of "compute." The impulses of two nerves may stimulate a third, producing a single output, but this does not mean the two nerve pulses were added. The only theory I have about the nature of inner neural connections is that they almost certainly can not embody anything like contemporary computational architectures, with registers, stacks, fully addressable memory, communication buses, clocks, and so on. If they did, it would be nothing short of miraculous.