THE EFFECT OF PERCEPTUAL INFORMATION ON THE ACTIVATION OF SCENE GIST: THE INFLUENCE OF COLOR AND STRUCTURE

By

Monica Sofia Castelhano

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Psychology

2005

ABSTRACT

THE EFFECT OF PERCEPTUAL INFORMATION ON THE ACTIVATION OF SCENE GIST: THE INFLUENCE OF COLOR AND STRUCTURE

By Monica Sofia Castelhano

Previous studies have shown that scene gist perception occurs extremely rapidly; however, many of these studies rely on participants’ explicit reports, making it unclear how soon after onset scene gist is able to influence subsequent behavioral responses. This dissertation examined the onset of scene gist activation and investigated two possible sources of information using a new methodology: the Contextual Bias paradigm. The Contextual Bias paradigm relies on the tendency of participants to affirm having perceived a target object that is consistent with the scene gist, and to disconfirm having perceived an object that is inconsistent. In this paradigm, participants judged whether an object (either consistent or inconsistent with the scene’s gist) was present in a scene that was briefly shown previously. If the scene presented is processed to the level of gist, then participants should be more likely to respond “yes” to consistent and “no” to inconsistent targets. If, however, the scene gist has not been processed, then participants should respond “yes” to both target types equally. Seven experiments were conducted to explore scene gist activation and how color and scene structure contribute to this activation. Experiments 1-3 demonstrate that scene gist is activated after 42 ms of scene presentation, and that the strength of the response bias (the response difference between consistent and inconsistent objects) increases with longer scene presentation durations. Experiments 4-6 examined the contribution of color and structure by manipulating the color (color vs. monochrome) and sharpness (sharp vs. blurred) of the scenes. Results showed that color influences gist activation later (80 ms), and only when structure was degraded (blurred). Thus, color plays a role in rapid gist activation, but only when the scene’s structural information is relatively more difficult to extract. Experiment 7 examined whether color acts as a boundary segmenter or is directly associated with gist information. Abnormally colored scenes were used to provide segmentation for equiluminant regions, but without any association with the scene’s gist. Results support the gist information hypothesis.
A weighted input framework of the interaction between structure and color is proposed to explain the results of the current and previous studies.

Copyright by
Monica Sofia Castelhano
2005

To my mother and father, who taught me to dream big and that being somebody means being true to yourself.

ACKNOWLEDGMENTS

As with any accomplishment (big or small), it couldn’t have been done without the help, love and support of a whole army of people. I have many people to thank. First, I’d like to thank my advisor John Henderson for all his guidance and support over the past five years. I also wish to thank Tom Carr, Erik Altmann, and Fred Dyer for serving as my guidance committee and for their words of advice over these past years. Also, to Aude Oliva for her advice and coffee in times of need.

I would’ve never made it through without my family. I thank my mother, Gloria Castelhano, who encouraged me to follow my dreams even when it meant making a hard situation even harder. I thank my sister, Marta Castelhano, who has always been there to cheer me up and listen to my gripes for hours on end. George Frade, whose kind words pushed me onto this path. To the rest of my family, who have always given me love and support despite my absence. I also thank my friends, Teresa Silva, Ana Cavacas, Patricia Ponte, Nancy Pinto, and Rui Vicente, who despite everything have kept me sane and kept believing in me.

Of course, in the absence of my blood relations, I have formed a great dysfunctional family of friends at graduate school. First, Mareike Wieth. I do not have the ability to properly thank you, because your kindness, love and support in my times of anxiety are more than I will ever be able to repay. Tom Wagner, although I’ve only recently been a plague on your life, you have already helped me in more ways than I can count. Sian Beilock, Matt Husband, Chrissy Velarde, Lisa Helder, Chris Chan, Laurie Carr and Christy Miscisin, I thank you for putting up with me and for all your encouragement and support. Then there were P. Barrel and T. Twist for the sudsy libations and sweet treats that helped me get through the tough parts. I also wish to thank my fellow lab mates, Dan Gajewski and Aaron Pearson, for their help and advice over the last few years. And I also thank Gary Schrock and Dave McFarlane for their help with all the technical stuff associated with doing research. In addition, there are a number of graduate students who, although they have moved on in recent years, nevertheless offered me great advice and encouragement. I want to thank Kiel Christianson, Carrick Williams, Catherine Arrington, and Karl Bailey.

Finally, I’d like to thank Zed the cat, for it would not do for me to forget to mention my furry companion for the past four years. And despite her not knowing many tricks or even being an avid hunter, she has kept my feet warm during late night sessions and on many occasions has allowed herself to be subjected to lazy petting. Also, the recent additions to our pet family, Flitzer the hamster, Beta the beta fish, and the late Long John Silver, the long-living tetra, for being the subjects of endless hours of mindless staring when work became too much to bear.

Thanks to you all. I would have never been able to do this without you.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
    Rapid Scene Perception
    Perception of Scenes and Structure
    Perception of Scenes and Color
    Studying Scene Gist Onset: The Contextual Bias Paradigm
    Overview of the Current Research
THE ONSET OF SCENE GIST PERCEPTION
    Experiment I
        Method
        Results
        Discussion
    Experiment II
        Method
        Results
        Discussion
    Experiment III
        Method
        Results
        Discussion
AN EFFECT OF COLOUR ON SCENE GIST PERCEPTION
    Experiment IV
        Method
        Results
        Discussion
    Experiment V
        Method
        Results
        Discussion
THE INTERACTION BETWEEN COLOUR AND STRUCTURE ON SCENE GIST PERCEPTION
    Experiment VI
        Method
        Results
        Discussion
THE ROLE OF COLOR IN RAPID SCENE GIST PERCEPTION
    Experiment VII
        Method
        Results
        Discussion
GENERAL DISCUSSION
    Implications for Scene Perception
    The Response Bias and Long-term Memory for Scenes
    Conceptual vs. Visual Representations of Scenes
    Conclusion
APPENDIX
REFERENCES

LIST OF TABLES

Table 1. Mean (Standard Deviation) of Proportion of “yes” responses for Experiment 6

Table 2. Mean (Standard Deviation) of Proportion of “yes” responses for Experiment 7

LIST OF FIGURES

Figure 1. Trial Sequence. Consistent object was “Spatula” and inconsistent object “Wrench”.

Figure 2. Proportion “yes” responses to consistent target objects (blue bars) and to inconsistent target objects (red bars) for each duration condition in Experiment 1. Error bars represent Standard Error of the Mean and asterisks indicate a significant difference between target conditions (p < 0.05).

Figure 3. Proportion “yes” responses to consistent target objects (blue bars) and to inconsistent target objects (red bars) for each duration condition in Experiment 2. Error bars represent Standard Error of the Mean and asterisks indicate a significant difference between target conditions (p < 0.05).

Figure 4. Proportion “yes” responses to consistent target objects (blue bars) and to inconsistent target objects (red bars) for each duration condition in Experiment 3. Error bars represent Standard Error of the Mean.

Figure 5. Depicts the Color (A) and Monochrome (B) conditions for Experiment 4.

Figure 6. Proportion “yes” responses to consistent target objects (darker bars) and to inconsistent target objects (lighter bars) for each duration condition in Experiment 4: (A) Colored scene condition, (B) Monochrome scene condition. Error bars represent Standard Error of the Mean.

Figure 7. Difference scores (Inconsistent − Consistent) for the Colored (blue) and Monochrome (red) conditions. Error bars represent Standard Error of the Mean.

Figure 8. Depicts the Color (A) and Monochrome (B) conditions for Experiment 5. All scenes were filtered; spatial frequencies higher than 1 deg/image (or 17 cycles/image) were removed, leaving only medium and low spatial frequency information.

Figure 9. Proportion “yes” responses to consistent target objects (darker bars) and to inconsistent target objects (lighter bars) for each duration condition in Experiment 5: (A) Colored scene condition, (B) Monochrome scene condition. Error bars represent Standard Error of the Mean.

Figure 10. Difference scores (Inconsistent − Consistent) for the Colored (blue) and Monochrome (red) conditions. Error bars represent Standard Error of the Mean.

Figure 11. Example of stimulus color conditions used in Experiment 6: (A) Sharp-Colored condition, (B) Sharp-Monochrome condition, (C) Blurred-Colored condition, (D) Blurred-Monochrome condition.
Figure 12. Proportion “yes” responses to consistent target objects (darker bars) and to inconsistent target objects (lighter bars) for each duration condition in Experiment 6: (A) Sharp-Colored scene condition, (B) Sharp-Monochrome scene condition, (C) Blurred-Colored scene condition, and (D) Blurred-Monochrome scene condition. The Colored conditions are represented with blue bars and the Monochrome conditions with red bars; the Sharp conditions are represented with solid bars and the Blurred conditions with hatched bars. Error bars represent Standard Error of the Mean.

Figure 13. Difference scores (inconsistent mean − consistent mean) for each duration condition in Experiment 6. The Colored conditions are represented with blue lines and the Monochrome conditions with red lines; the Sharp conditions are represented with solid lines and the Blurred conditions with dashed lines. Error bars represent Standard Error of the Mean.

Figure 14. Example of stimulus color conditions used in Experiment 7: (A) Normal Color condition, (B) Monochrome Color condition, and (C) Abnormal Color condition.

Figure 15. Results of Experiment 7: response proportions for the (A) Colored scenes, (B) Monochrome scenes, and (C) Abnormal scenes.

Figure 16. Difference scores (inconsistent mean − consistent mean) for each duration condition in Experiment 7. The Colored condition is represented with the blue line, the Monochrome condition with the red line, and the Abnormal condition with the green line. Error bars represent Standard Error of the Mean.

Figure 17. Example of low abnormal and high abnormal stimuli used in Experiment 7: (A) Low Abnormal Color (average rating for image: 3.60; average rating for group: 3.77), (B) High Abnormal Color (average rating for image: 5.8; average rating for group: 5.84).

Figure 18. Difference scores (inconsistent mean − consistent mean) for each duration condition in Experiment 7 for the Colored (blue), Abnormal-Low (dashed green), and Abnormal-High (solid green) color conditions. Error bars represent Standard Error of the Mean.

INTRODUCTION

Despite the complex nature of the visual information that surrounds us, we seem to have an incredible ability to quickly determine what scene we are currently viewing, even when time is too limited to view specific details. This becomes most apparent when we are flipping through channels on a television set or quickly perusing through a slide show of vacation photos. The study of how our visual system processes the panorama of visual information from our environment is referred to as scene perception.

Rapid Scene Perception

What defines a scene? People have an intuitive sense of what qualifies as a scene, but few researchers have explicitly specified the qualifications. Most often, a scene is referred to as a view of a natural environment, but this definition makes it difficult to specify whether something is not a “scene”: essentially any view of the world qualifies as a scene. According to Henderson and Hollingworth (1999), a scene is defined as
“. . . a semantically coherent view of a real-world environment comprising of background elements and multiple discrete objects arranged in a spatially licensed manner” (p. 244). They further specify background elements as immovable surfaces and structure, while discrete objects are small-scale, discrete entities that are movable within the scene. Although not immediately obvious, one important aspect of scenes captured by this definition is that a scene (and its elements) is dependent on spatial scale. As a result, what qualifies as a background element and what qualifies as an object within the scene can change according to how much the view is zoomed in. For instance, a coffee table can be considered an object within a living room, but a closer view of the table may shift the coffee table to a background element, while a mug, remote control, and set of keys become the objects within that scene (for more examples, see Henderson and Hollingworth, 1999). In the present study, all scenes were views of natural environments that fit with the definition proposed by Henderson and Hollingworth. Furthermore, all scenes were scaled to a human size, because this corresponds most closely to the type of views of the environment that people experience in their everyday lives. Thus, examples such as close-ups of a desktop or satellite images of a city were not included as real-world scenes. The current work explores whether certain perceptual properties (i.e., color and structure) are important in the rapid processing of scenes.

Although many researchers have focused on the perception of objects, there has been a recent increase in the number of investigators studying real-world scenes. The study of real-world scenes was at first thought to be a simple extension of the object recognition literature, but on a quantitatively larger scale. However, the mappings between objects and scenes are not as simple as first assumed. It has become obvious from recent findings that in order to understand scene processing, real-world scenes must be the objects of study. The study of real-world scenes has also called into question many general assumptions about how the visual system processes information that originally emerged from the study of objects and other simplified stimuli (e.g., basic geometrical shapes, letters, symbols, etc.). These assumptions are: (1) that the more information that is available, the more time processing that information will take; (2) that in order to extrapolate global properties, details must first be processed and represented; and (3) that in order for a scene representation to be functional (e.g., able to produce the phenomenology of viewing the world as richly detailed), all of the rich detail must be somehow represented. In the following section, each of these assumptions will be reviewed, as well as how rapid scene perception has challenged these assumptions.

Many researchers assumed that the more information there is in the environment to process, the longer processing should take, and for many simple stimuli, this assumption holds true (e.g., parallel vs. serial visual search). For example, when searching through a random arrangement of objects on a screen, the more objects there are, the longer it takes to find the target object (Williams, Zacks, & Henderson, in press).
Although real-world scenes seem to have both a high quantity of information and a high level of complexity, the assumption that more information leads to longer processing does not seem to apply: scenes are processed very rapidly despite the amount of information available to the system. In a seminal study on scene perception, Potter (1976) demonstrated that scene gist could be derived within the first 100 ms of viewing. When participants were given a categorical label for a scene and then viewed a rapid sequence of scenes displayed for 125 ms each, detection rates were extremely high (80%). These findings are in direct opposition to this assumption because it is clear that not all the details can be processed in this short amount of time; thus, rapid scene perception must be based on other properties of the scene.

The ability to rapidly perceive a scene within the first 100 ms despite its level of complexity and familiarity has challenged the assumption that detailed information is processed before global information is perceived (Palmer, 1977). David Marr (Marr, 1982; Marr & Nishihara, 1978) proposed that visual representations are constructed from simple local properties of the scene (e.g., changes in luminance). From these local properties, surfaces, edges, and other visual properties are integrated into a progressively more complex representation (e.g., component objects). Scene representations are then the result of the integration of these increasingly complex component representations. As a result, the implication for scenes is that they cannot be categorized until their components are represented. Interpretation of a scene, therefore, would have to occur after the visual details are analyzed and grouped. However, many studies have since found that visual properties can lead directly to the recognition of scenes (Oliva & Schyns, 1997, 2000; Schyns & Oliva, 1994).

Taking Marr’s theory one step further, scenes can be conceptualized as the representation resulting from certain combinations of objects. Friedman (1979) proposed that the categorization of scenes occurred in a piecemeal process that categorized the scene based on the individual objects or a set of diagnostic objects in the image. For instance, a kitchen is categorized as such because it is a room that contains a refrigerator, stove, and toaster. This assumption of scene perception was derived from the mechanisms underlying how semantic schemas are likely to be activated, that is, through the activation of several key components (Friedman, 1979). Therefore, scene perception would have to occur after the perception of one or more objects. However, studies on the rapid perception of scenes have shown that categorizing a scene takes just as long as categorizing a single object (Biederman, 1988). Based on this finding, it is unlikely that the scene’s category is derived from recognizing a set of objects. In order for scenes to be perceived so rapidly, scene gist must be derived directly from the visual properties of the scene.

A third assumption is that the visual system must be able to process most or all of the information available in the visual field in order to produce a functional representation. This assumption is predominantly seen in theories concerning the nature of the on-line representation. Our visual system is physiologically and mechanistically constructed to acquire highly detailed visual information only from the center of the visual field.
This physical reality is in direct opposition to the subjective feeling of being able to “see” all the visual details in a scene simultaneously. To explain how the visual system produces the phenomenology of a richly detailed environment, Rayner and McConkie (1976) proposed that the visual system relies on an integrative buffer (although versions of this theory have been proposed by a number of other researchers, see also Breitmeyer, Kropfl, & Julesz, 1982; Davidson, Fox, & Dick, 1973; Duhamel, Colby, & Goldberg, 1992; Feldman, 1985; Jonides, Irwin, & Yantis, 1982; Pouget, Fisher, & Sejnowski, 1993). Visual details are stored in the buffer, and new visual information from the current fixation is aligned and stored with previous fixations on the scene. With each fixation, the visual system builds up a representation of the scene. The buffer eventually contains an analogical, veridical representation of the scene that is responsible for the subjective feeling of being able to “see” all the details in a scene. The integrative buffer has found numerous challenges (see reviews by Irwin, 1992, 1996; O’Regan, 1992; Pollatsek & Rayner, 1992), not the least being the speed of perceiving complex scenes. In addition, researchers have found that the visual system can use the initial representation to locate informative or interesting regions (Loftus & Mackworth, 1978; Mackworth & Morandi, 1979), help or hinder perception of objects within the scene (Biederman, Mezzanotte, & Rabinowitz, 1982; Davenport & Potter, 2004; De Graef, Christiaens, & d’Ydewalle, 1990; Hollingworth & Henderson, 1998, 1999), guide eye movements (Castelhano & Henderson, under review), and assist with the retrieval of associated semantic information such as schemas and event scripts (Metzger & Antes, 1983; Friedman, 1979). Therefore, the scene representation resulting from the initial visual processing is functional, despite the fact that most of its visual details and component objects are not included in the representation.

Studies on rapid scene perception have shown that despite the complexity of the environment, scene perception is not based on the processing and representation of all available visual details. The implication for scene representations is that perceiving global information about the scene is independent of gathering details, and foveating details is not necessary to form a functional representation of the scene. So what information is thought to be represented in rapid scene perception? The representation of a scene is thought to include low-level properties (e.g., color and luminance), high-level properties (e.g., semantic category), and various intermediate-level properties, such as spatial layout and some level of object content (De Graef et al., 1990; Sanocki & Epstein, 1997). Consequently, the initial representation of a scene is composed of visual information specific to the particular scene, as well as more conceptual or general semantic information.

In the current dissertation, discussions of scene gist will specifically refer to the conceptual representation that is activated when a scene is viewed. In the scene literature, scene gist is often defined as the basic-level semantic category of the scene, but in the current work scene gist is more broadly defined.
In addition to the basic category label, scene gist also includes the inferences this categorization affords, such as expected component objects, expected layout, schemas, scripts, and functions associated with a scene (Friedman, 1979; Oliva, in press; Potter, 1976, 1999). This conceptual representation of a scene is related to the semantic label of the scene category, but is thought of as the pre-stored concept that is activated before the activation of the semantic label. It is this conceptual representation that can be interpreted in multiple ways and allows for a single scene to have many labels, depending on the context of the task. The flexibility of viewing the concept of the scene as separate from a single categorical label allows the participant to mix pre-stored categories when encountering an unusual or new type of scene. The paradigm works under the assumption that as long as the objects inquired about are within this conceptual representation of the scene (or outside it, if inconsistent), then the onset of its activation can be measured. This definition acknowledges that once a conceptual representation for a scene has been activated, there is a certain set of expectations that affect how that scene is further processed. Although I acknowledge that there is more to the initial representation of a scene than simply its conceptual representation, the focus of the present dissertation is the perceptual information processed by the visual system that results in the activation of this representation.

What does scene perception say about visual processing in general? Understanding the mechanisms underlying rapid scene perception has implications for theories of scene perception and for theories on how the visual system first processes incoming information. Ultimately, theories of scene perception may lead to further insights into how visual information is translated into a conceptual representation at the interface with higher cognitive processes. Representations resulting from rapid scene perception constrain on-line interactions (such as directing attention to a target object or navigating through a room) and the information that is then stored in memory (both in level of detail and type of information). Because of the broad implications for how scenes are represented and how the visual system processes information and interacts with other cognitive systems (i.e., semantic memory), rapid scene perception has received increasing amounts of attention in the visual literature in recent years.

Perception of Scenes and Structure

Sometimes referred to as the scene’s shape, scene structure at its most basic level is the luminance information within a scene that gives rise to edges and large boundary regions. At a higher level, scene structure refers to the different surfaces and their spatial layout within the scene. Luminance information is known to be processed rapidly within the early visual system. Recent studies have investigated how luminance patterns across the whole scene can contribute directly to scene perception (Oliva & Schyns, 1997; Parraga, Brelstaff, & Troscianko, 1998; Renninger & Malik, 2004; Schyns & Oliva, 1994; Torralba & Oliva, 2003).

To examine the contribution of scene structure, Schyns and Oliva (1994) constructed hybrid scenes by combining low spatial frequency information from one scene with high spatial frequency information from another scene.
Each hybrid stimulus therefore conveyed two possible scenes, one in the low frequency channels and the other in the high frequency channels. The hybrid scenes were presented for either 30 ms or 150 ms, and participants were asked to name the scene just shown. Under extremely brief presentations (30 ms), the low spatial frequencies in the hybrids mediated scene recognition, while at longer presentation times (150 ms) high spatial frequency information was used for recognition. The low spatial frequency scenes were sufficiently blurred so that no objects in the scenes were identifiable, yet participants were able to correctly identify these scenes at brief display durations. Schyns and Oliva concluded that scene recognition is very rapid, and that coarse scene information (low spatial frequency) is extracted during the first 50 ms of scene viewing, while detailed information (high spatial frequency) is acquired in the next few tens of milliseconds. Furthermore, because objects were not identifiable in scenes composed of only low spatial frequencies, they posited that this initial rapid perception is mediated by information other than the combination of objects present in the scene.

In a follow-up study, Oliva and Schyns (1997) found that the coarse-to-fine information extraction sequence could be modified. Participants were “trained” to attend either to the high or low spatial frequency channels by having them identify hybrid scenes that were a combination of scene information and noise. That is, for each participant, scene information was consistently displayed in one channel (high or low), while the contrary channel displayed a meaningless pattern. In this way, participants implicitly learned to pay attention to the only frequency channels carrying useful information. When tested later with real hybrid scenes that included a different scene in each of the high and low spatial frequency channels, the previous training mediated which scene the participant would “see”. Interestingly, the type of training (high or low spatial frequencies) did not affect accuracy at very brief presentation rates (30 ms); in both conditions, accuracy was very high. Based on these results, Schyns and Oliva concluded that even if the system typically uses coarse-to-fine analysis, it is not a fixed processing sequence. Furthermore, it is not the case that only coarse, low spatial frequency information is available early on; instead, it seems that all spatial frequency channels are available and are selected according to task demands.

In a recent study, Torralba and Oliva (2003) demonstrated that there are statistical regularities in the images, and that the visual system can base the categorization of a scene on the differences in these regularities (see also Torralba & Oliva, 2002). Torralba and Oliva (2003) also showed that these statistical regularities exist for other higher-level categorical distinctions, such as natural vs. man-made environments, as well as whether the view is of a close or far range. The results of these studies suggest that global properties such as edge and region boundaries convey a great deal of information about a particular scene’s semantic category, as well as other properties of the scene, such as distance. These studies raise the interesting question of what other global information may be available for the identification of the scene and its properties.
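The hybrid-scene manipulation can be illustrated with a short filtering sketch. This is a minimal reconstruction offered only for concreteness, not Schyns and Oliva’s stimulus-generation code: the Gaussian cutoff, the file names, and the assumption of equal-sized RGB inputs are all placeholders.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from imageio.v2 import imread, imwrite

# Minimal sketch of a hybrid scene: low spatial frequencies from one image
# combined with high spatial frequencies from another. The blur width (sigma)
# and file names are placeholders, not the cutoffs used by Schyns and Oliva
# (1994); both inputs are assumed to be equal-sized 8-bit RGB images.
def make_hybrid(low_source, high_source, sigma=8.0):
    a = imread(low_source).astype(float)   # scene supplying the coarse structure
    b = imread(high_source).astype(float)  # scene supplying the fine detail
    blur = (sigma, sigma, 0)               # blur the two spatial axes, not the color channels
    low = gaussian_filter(a, blur)         # low-pass: keep coarse information from A
    high = b - gaussian_filter(b, blur)    # high-pass: keep fine detail from B
    return np.clip(low + high, 0, 255).astype(np.uint8)

imwrite("hybrid.png", make_hybrid("highway.png", "city.png"))
```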
Perception of Scenes and Color

Another source of visual information that has recently received some attention in scene perception is color. Psychophysical and neuroimaging studies suggest that color information is rapidly extracted (Edwards, Xiao, Keysers, Földiák, & Perrett, 2001; Livingstone, 1988; Livingstone & Hubel, 1984, 1988). However, studies examining the contribution of color to the perception of both objects and scenes have shown mixed results. Because more research has been done in the area of color effects in object perception than in scene perception, the object perception literature can illuminate what role (if any) color may have in scene perception. I will briefly review the studies investigating color and object perception before reviewing research on the effect of color on the perception of scenes.

Studies investigating the role that color may play in the initial visual processing of objects have produced highly inconsistent results. This in turn has led researchers to argue for widely different assumptions on what information is exploited by the visual system in its early stages of processing. There are two main camps concerning what information is used in the initial perception of objects: edge-based information and surface-based information (including some combination of both edge and surface). Researchers supporting an edge-based view have argued that color (as well as other surface cues) has no additive value in the initial perceptual analysis of objects (Biederman, 1972, 1987, 1988; Biederman & Ju, 1988; Davidoff, 1991; Davidoff & Ostergaard, 1988; Ostergaard & Davidoff, 1985; Ryan & Schwartz, 1956). Davidoff and Ostergaard (1988; Ostergaard & Davidoff, 1985) had participants either name or recognize a set of colored and monochrome objects. For the naming task, participants had to respond with a label for each object presented on a computer screen. For the recognition task, a series of objects were presented in which they had to detect the presence of a target object. Based on past findings, naming was thought to involve slower processes than the recognition test. Results for the naming task showed that colored objects were named slightly faster, but color had no effect on the recognition rates of the target object. From these results, Davidoff and Ostergaard argued that the only reason a color effect was found in the naming task was due to the slower processes involved in executing a response for that task, and concluded that color was not used in the initial processing of the object representation because there was no effect of color in the recognition task. Biederman and Ju (1988) compared the recognition rates of full-color photographs to line-drawing depictions of objects and found no difference in either the reaction times or accuracy performance. They concluded that structural information is sufficient for the classification of objects, and color (and other surface cues, such as texture and light gradients) is not necessary. Biederman and Ju argued that color was not a necessary cue for the initial recognition of the object, unless edge information was poor or degraded in some way (e.g., objects are occluded or filtered). In either case, the role of color is limited and irrelevant in the initial perceptual processing of objects.

In contrast, others have argued that surface cues, such as color and textures, are an important factor in the perception of objects. Furthermore, there is disagreement as to how and when surface cues matter.
More specifically, some researchers argue that color sometimes provides unique information about an object’s identity (Gegenfurtner & Rieger, 2000; Joseph, 1997; Joseph & Proffitt, 1996; Price & Humphreys, 1989; Tanaka & Presnell, 1999; Tanaka, Weiskopf, & Pepper, 2001), whereas others argue that it merely acts as additional information for boundary segmentation (Rossion & Pourtois, 2004; Wurm, Legge, Isenberg, & Luebker, 1993). The former argument is based on the assumption that color effects are seen only when color is predictive or diagnostic of the object. In other words, an object that is associated with a particular color is said to be color diagnostic because color can help by restricting the possible objects that may be associated with the structure (Joseph, 1997; Joseph & Proffitt, 1996; Price & Humphreys, 1989; Tanaka & Presnell, 1999; Tanaka et al., 2001). Price and Humphreys (1989) asked participants to classify whether a stimulus presented was a fruit or a vegetable (Experiment 2). Results showed that food items were more easily classified when presented in color, and that this effect of color was more pronounced in objects that were structurally similar (Experiment 3). Price and Humphreys concluded that color does have an effect on object perception, but that it would only be seen when color was diagnostic for a particular object and when that object’s structure did not suffice (e.g., a within-category distinction). They argued that previous studies (Biederman, 1988; Davidoff & Ostergaard, 1985; Ostergaard & Davidoff, 1988) found no effect of color because they used everyday objects that all had distinctive shapes. In the case of visually distinct objects, edge information provides a unique cue for identity, and as a result color effects are not detected.

Recently, Tanaka and Presnell (1999) investigated the effect of color on the perception of objects ranked high in color diagnosticity only. In addition to a typicality rating that asked participants to report how often a particular object is seen in a particular color, Tanaka and Presnell used a feature list ranking of the objects to determine color diagnosticity. For the feature lists, each participant rank ordered typical colors for each given object. The number of times a particular color was listed for a particular object determined whether that object was designated color diagnostic. Tanaka and Presnell found larger effects of color for diagnostic objects than non-diagnostic objects in both a naming task and a classification task. In addition, they found that when objects were matched for shape (rates of recognition for the monochrome versions of both diagnostic and non-diagnostic objects were equivalent), an effect of color for diagnostic objects was still seen, suggesting that color may contribute to the recognition of diagnostic objects independently of their shape information. Based on these results, Tanaka and Presnell concluded that if objects have a strong association with a color, then color could act as a direct indicator of object identity independently of the shape information conveyed by edges.

On the other hand, Wurm et al. (1993) found an effect of color on the recognition of food items, but did not find that color diagnosticity ratings were related to the color advantage. Wurm et al. determined the diagnosticity of food items based on how often participants associated a particular color with a particular food object.
They found that high color diagnostic foods were not recognized faster or more accurately than food items that were not typically associated with a particular color. Results also showed the color advantage was reduced for more distinctive or prototypical depictions of the food items. They concluded that the degree to which color improved reaction time and accuracy is due to the additional segmentation and contour extraction information that color provides. Wurm et al. argued that color only becomes helpful when shape or structural information is absent or degraded, and that rather than providing unique identity information, color provides auxiliary boundary segmentation when edge information alone is not sufficient.

Taken together, the findings reviewed above suggest that the role of color depends on whether structure (edge) information is sufficient. It seems possible that whether color effects are seen depends in part on the structural similarity of the stimulus set, the quality of the stimulus (whether the information is degraded due to occlusion or blurring of the image), and the diagnosticity of color for a particular object. However, how diagnosticity is defined varies from study to study, as does the definition of the structural similarity of the stimulus set. Furthermore, what role color plays in the perception of objects is not clear. It could be that color contributes unique gist information and acts to simply narrow down possible alternatives (derived from structure), or is simply acting as a fine-tuning of boundary segmentation. Overall, it seems that there is little agreement about what role, if any, color plays in object recognition.

The scene perception literature reveals a similar pattern of mixed results over a much smaller number of studies. In a recent study, Delorme, Richard, and Fabre-Thorpe (2000) investigated the contribution of color to the rapid categorization of scenes containing either animals or food. Participants were asked to detect the presence of food or an animal in rapidly presented scenes (20-40 ms). There was no effect of color on animal classifications and a small effect (10-15 ms) of color on food categorization. Delorme et al. found that the effect of color (for food items) coincided with reaction times longer than 250 ms and, therefore, concluded that the initial processing of the stimuli did not involve color information. Instead, Delorme et al. argue, the results suggest that the vital information lies in determining global regions of the scene that lead to the activation of gist information (i.e., scene structure). However, when looking at the contribution of color to the classification (naming and verification) of natural scenes and man-made scenes, Oliva and Schyns (2000) found that natural scenes (such as scenes of forests, deserts, and beaches) have a unique combination of colors in specific configurations that are associated with that particular scene category (e.g., beaches are associated with a band of blue along the upper region of the scene and a band of light brown along the lower region). In this way, natural scenes are color diagnostic because their color composition is consistent across exemplars and indicative of the scene category (i.e., unique to each category). In this study, color diagnostic scenes were determined by plotting average hue in color space. Scenes belonging to non-overlapping groups within the color space were selected as being color diagnostic.
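The diagnosticity criterion just described can be illustrated with a small sketch that computes each exemplar’s average hue. This is an illustrative reconstruction, not Oliva and Schyns’s (2000) actual analysis pipeline; the image files and category names are hypothetical, and because hue is a circular quantity the sketch averages it as a vector.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv
from imageio.v2 import imread

# Illustrative sketch of the color-diagnosticity criterion: compute the mean
# hue of each exemplar and ask whether categories occupy non-overlapping
# regions of color space. File and category names are placeholders, and the
# inputs are assumed to be 8-bit RGB images.
def mean_hue_deg(path):
    rgb = imread(path).astype(float) / 255.0
    hue = rgb_to_hsv(rgb)[..., 0] * 2 * np.pi          # hue as an angle in radians
    # hue wraps around, so average the unit vectors rather than the raw values
    angle = np.arctan2(np.sin(hue).mean(), np.cos(hue).mean())
    return np.degrees(angle) % 360

categories = {"beach": ["beach1.png", "beach2.png"],
              "forest": ["forest1.png", "forest2.png"]}
for name, files in categories.items():
    print(name, sorted(round(mean_hue_deg(f), 1) for f in files))
```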
Oliva and Schyns found that categorization of natural scenes was slower and less accurate when color information was removed, compared to man-made scenes, which showed no difference between the colored and monochrome versions. They concluded that the early processing of the scene must use color information when natural scenes are viewed, and therefore the early processing must include color information regardless of scene type. These results have been replicated in a recent study by Goffaux, Jacques, Mouraux, Oliva, Rossion, and Schyns (in press) using event-related potentials (ERPs). Goffaux et al. showed that for color diagnostic scenes, there was an earlier activation in the frontal and parietal sites when scenes were presented in normal color versus in abnormally colored or monochrome conditions, by 200 ms and 351 ms, respectively. These results support the conclusion that it is not merely the presence of chromatic cues that leads to fast processing of scene information (i.e., color merely helps with scene segmentation), but that the normal colors are cues for scene categorization. Therefore, despite the mixed results in the literature as to the nature and extent of the contribution of color to the initial processing of scenes and objects, recent studies seem to suggest that color may play a role in the initial formation of the scene representation.

Studying Scene Gist Onset: The Contextual Bias Paradigm

Previous studies have shown that when a novel scene is presented, information necessary for processing scene gist is acquired within the first fixation, and no longer than 100 ms (Intraub, 1981; Metzger & Antes, 1983; Oliva & Schyns, 2000; Potter, 1975, 1976; Potter & Levy, 1969). Although these experiments demonstrate that scene gist is processed very quickly, they do not reveal exactly how soon after onset the information necessary to activate the scene gist is acquired. The activation of scene gist means that the scene information is available to other cognitive systems and, therefore, able to influence subsequent interactions with and judgments about the scene (e.g., possible component objects). Knowing the onset of the activation of scene gist would also allow us to further investigate which scene properties are being exploited by the visual system to identify a scene (e.g., color).

To date, studies of rapid scene perception have used various methodologies to assess how quickly a scene is detected or categorized. Many of these past studies relied on naming or judgment tasks that require participants to make explicit what they thought they saw. Judgment tasks that have open-ended responses often result in high variability in responses, which then requires multiple judges to assess the accuracy of the responses. This presents a problem because it is unclear whether naming the scene is interpreted the same way for each participant. For instance, when a scene of a forest is named as trees, does that mean that the participant understands the gist, or is the participant simply naming prominent component objects? Also, some scenes are difficult to name because there is no particular category that they fit, so it is not clear how the variability in naming should be scored or whether these types of scenes should be included at all. To counteract the variability in naming tasks, some researchers prompt participants with category names before the scenes are viewed and instruct the participants to use only those names.

Verification tasks have also often been used to study rapid scene perception.
Potter and colleagues (1975, 1976; Potter & Levy, 1969) have used a Rapid Serial Visual Presentation (RSVP) paradigm, in which the task was to detect a pre-specified scene. The requirement that participants had to be told the target before seeing the series of presented scenes could have led participants to engage in a feature detection strategy or some other guessing strategy that would not necessitate the identification of the scene gist. Oliva and Schyns (1997, 2000; Schyns & Oliva, 1994) used a verification task requiring participants to indicate whether the scene matched the label shown just before its presentation. These verification tasks face the same uncertainty of labeling found with the judgment tasks. They require that the experimenter assign a label to the scene that will correspond to what the participant would label the scene; otherwise, this leads to a higher rate of errors. Another problem related to providing labels that participants must judge as appropriate arises with scenes that have a combination of features belonging to multiple categories. How close a scene is to the typical scene in any category will vary, as will the interpretation of a correct and incorrect label. This variability makes the selection of distractors difficult, and leads to an underestimation of how quickly information for the formation of the conceptual representation of a scene is gathered. Additionally, there are researchers who argue that judgment and verification tasks tap into late, rather than early, initial visual processing (Biederman & Ju, 1988; Davidoff, 1990).

To circumvent the problems associated with these tasks, Fabre-Thorpe, Thorpe, and colleagues (Delorme, Richard, & Fabre-Thorpe, 2000; Fabre-Thorpe, Delorme, Marlot, & Thorpe, 2001; Thorpe, Marlot, & Fize, 1996; Van Rullen & Thorpe, 2001) use a go/no-go categorization task in which participants are asked to detect the presence of a target object (e.g., animal, human, or vehicle). Scenes are flashed for an extremely brief duration (20-40 ms) and participants are asked to make a decision. For each trial, a scene is presented individually (unmasked) or a series of scenes is presented in an RSVP format, and one of the scenes may contain a target object. These detection tasks are thought to be better able to capture early processes, but it is unclear exactly what type of processing is necessary to complete these tasks. For instance, is it necessary to comprehend the scene gist in order to detect the presence of an animal? Recent studies have shown that this task may be performed by using a feature detection strategy and is highly dependent on the contents of the distractor scenes viewed within a trial sequence (Evans & Treisman, 2004).

All these methodologies are limited in their ability to assess how quickly the necessary categorical information can be extracted to influence subsequent behavioral responses. The present study investigates the rapid perception of scene gist by introducing a new paradigm that avoids some of the problems associated with judgment, verification, and detection tasks. The new paradigm examines scene gist perception by measuring the speed at which relevant information is extracted to initially activate the scene gist. Previous studies have found that when participants are asked about the presence of a target object in a rapidly presented scene, their responses are highly biased by the scene gist.
Participants show a tendency to affirm the presence of a target object if it is consistent with the scene gist, and to reject its presence if it is inconsistent with the scene gist (Hollingworth & Henderson, 1998, 1999). From these results, it is clear that the activation of a scene’s gist precedes the examination of specific objects within that scene, but biases participants to assume the presence of certain objects. The current study examines how quickly information necessary for the activation of scene gist is acquired by using the Contextual Bias paradigm, which capitalizes on this bias. The Contextual Bias paradigm uses response bias to examine how quickly sufficient information is extracted to activate a scene’s gist after its onset. After presenting a scene, an object name that could be consistent or inconsistent with the scene’s gist is displayed on the screen. A consistent object is associated with the scene’s gist and therefore has a high likelihood of appearing in the scene. An inconsistent object is not typically associated with that scene’s gist and has a lower likelihood of appearing in the scene. The participants are asked to indicate whether the object was present in the scene by responding “yes” or “no” to the object name via a response button. The logic of the paradigm is simply that if the perceptual information acquired from the scene within the presentation time is sufficient for the scene gist to be activated, then the responses should reveal a bias. If a scene presented for a given duration is processed to the level of gist, then participants should be more likely to respond “yes” to consistent and “no” to inconsistent targets. If, however, the scene is not presented long enough to acquire information to process it to the level of gist, then participants should respond “yes” to both target types in equal proportions. Therefore, it is the presence of the bias that indicates that the scene gist was activated and available to influence object judgments. By looking at the responses to semantic information early on, the current study hopes to ascertain how soon after onset the necessary information is acquired to form an initial conceptual representation that is functional (i.e., able to be used to make inferences). Further, the present study hopes to decipher which scene properties (i.e., the scene information that is initially extracted) most directly affect rapid scene gist activation. By manipulating scene properties (such as the quality of the scene’s structure or the presence/absence of color), activation of the scene gist can be measured in the strength of the response bias. It is assumed that if the information removed or changed in the scene affects the response bias, then that information must be used by the system to initially activate a conceptual representation of the scene.
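Throughout the experiments, this bias is summarized as the difference between the proportion of “yes” responses to consistent and to inconsistent targets at each presentation duration. The sketch below shows one way such a score could be tabulated; it is a minimal illustration in which the file name and column names are hypothetical, not the actual data files or analysis code from these experiments.

```python
import pandas as pd

# Minimal sketch of the Contextual Bias measure: proportion of "yes" responses
# to consistent vs. inconsistent target objects at each scene duration, and
# their difference. The file and column names are assumptions about a
# hypothetical trial-level data table, not the dissertation's actual data.
trials = pd.read_csv("trial_responses.csv")  # columns: subject, duration_ms, consistency, said_yes (0/1)

p_yes = (trials
         .groupby(["duration_ms", "consistency"])["said_yes"]
         .mean()
         .unstack("consistency"))

# A positive difference (more "yes" to consistent than to inconsistent targets)
# indicates that the scene was processed to the level of gist at that duration.
p_yes["bias"] = p_yes["consistent"] - p_yes["inconsistent"]
print(p_yes)
```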
Overview of the Current Research

The aim of the current work was to investigate the speed of the acquisition of scene gist information, to examine two possible sources of information in the scene likely to lead to this rapid activation (e.g., structure and color information), to investigate the nature of the interaction between these sources of information, and to introduce a novel way of investigating the onset of scene gist information. As reviewed above, scene gist information is thought to influence a number of later cognitive processes involved in interactions with that scene (e.g., navigation, memory, and visual search); however, very few studies have looked at the timing of the onset of the activation of this information or have investigated what information in a scene the system is capitalizing on in order to attain rapid activation. Unlike the methodologies used in past studies, the Contextual Bias paradigm measures the onset of scene gist by looking at the degree to which the activation of the conceptual representation influences judgments about the content of the scene. The participants’ task was to indicate whether the named target object was present in the scene (Experiments 1-3) or was likely to be present in the scene (Experiments 4-6). The degree to which participants responded differently to consistent and inconsistent target objects indicated not only whether scene gist was initially activated, but also demonstrated how this activation changed with increasing exposure durations.

Experiments 1-3 demonstrated that the Contextual Bias paradigm is a feasible method for the study of rapid scene gist perception. Results revealed that the information for the activation of scene gist is sufficiently acquired within the first 42 ms of onset, and so the extraction of the necessary information is almost instantaneous. In addition, the pattern of results showed that the influence of the scene gist activation on the response bias increases as the duration of scene presentation increases, suggesting a number of implications for the formation of scene representations.

The next set of experiments was designed to examine what perceptual factors may play a role in the rapid onset of scene gist activation. Whether color information is important during initial processing has been a topic of much debate over the past few decades. Experiments 4-6 examined the contribution of color to the onset of scene gist. These experiments demonstrated that there is an interaction between a scene’s available structure information and its color information. Results reveal that if structural information is degraded, then color information seems to contribute to the early activation of scene gist. However, when structure information is fully available (i.e., the scenes are normal), color effects are not present. Based on this pattern of results, an interaction between color and structure is proposed that emphasizes the weighted input of information from these two sources into the decision process required for the response. When structure is informative, having color information is inconsequential. However, when the availability of structure information is lessened, then color contributes to the rapid onset of scene gist activation.

There are two possible roles for color, based on the results of Experiment 6. First, color could simply be providing additional boundary segmentation information. This would indicate that color is not directly functional in the activation of scene gist, and as long as regions are distinguishable, color information is not needed. The second possibility is that color information is functional and provides a direct route to gist information. Its contribution is usually masked under normal circumstances because color is redundant with structure information and the system is extremely efficient at extrapolating the necessary information.
Experiment 7 was designed to investigate the role that color plays by presenting scenes with abnormal colors. The abnormal hues provide the segmentation information but not any direct cue to scene gist. If color acts as a segmenter only, then the initial processing of visual information is based on structure information alone, and the addition of color information (be it normal or abnormal) should not affect the onset of scene gist when structure information is fully available. If color does provide a route to gist information independent of the structure information, and presenting abnormally colored scenes interferes with the initial onset of scene gist, then it may be that the visual system processes both types of information during the initial activation of scene gist. In this case, the reason that color effects were not found in past studies and in the studies reported above (Experiments 4 and 6) is either that color provides no unique information beyond that provided by the structure information, or that the system automatically relies on structure information unless that information proves to be less useful than color. The results from Experiment 7 support the role of color as a direct contributor to the onset of scene gist activation. The interaction between color and structure information has implications for what perceptual properties are necessary for the rapid activation of scene gist, for how scene representations are constructed and stored, and, more generally, for how this information may then influence other cognitive processes.

THE ONSET OF SCENE GIST PERCEPTION

In this chapter, three experiments were conducted to investigate how quickly scene gist is activated from scene onset. Using the Contextual Bias paradigm, Experiments 1-3 investigated the nature of the response bias when scenes were presented for different durations. In every experiment, the participants' task was to indicate whether the object named after the scene had been displayed was present in that scene. Based on associated schemas, the activation of scene gist should lead them to respond "yes" more often to consistent than to inconsistent target objects. The response bias is measured as the difference in "yes" responses to consistent and inconsistent objects, and the degree of this difference indicates the strength of the related semantic activation.

Experiment 1 investigated a broad range of durations from 20 to 250 ms. The results demonstrated that the response bias has an extremely early onset and increases in strength as the duration increases. Experiment 2 was designed to investigate how early the response bias can be measured by displaying scenes in a narrower, more fine-grained range of durations than those in Experiment 1, ranging from 20 ms to 50 ms. Results reveal an earlier onset of the response bias than in Experiment 1 and again show a continued increase in the response bias as the presentation durations increase. Experiment 3 investigated the nature of this increase in response bias by examining more fine-grained durations ranging from 50 ms to 100 ms.

Experiment I

The first experiment investigated how soon after onset a scene's semantic information is available. In the Contextual Bias paradigm, participants are asked to make a judgment based on whatever information they have gathered from a brief display of a photographed scene. The span of durations varied from 20 ms to 250 ms.
The predictions are as follows: If related semantic information is available earlier than previously reported, then the effect of object consistency should be seen before 100 ms. If related semantic information is not available until later in the processing of visual information, then the object consistency effect should not appear until 100 ms after onset or later.

Methods

Participants. Twenty-four Michigan State University undergraduates participated in this experiment. All participants received credit towards an introductory psychology course.

Apparatus & Stimuli. The stimuli were full-color photographs taken from a number of sources (books, calendars, the web, and personal photos). There were a total of 80 scenes presented (10 scenes/condition). The scenes were presented on a Dell P78 Trinitron 16-in. (41.1 cm) monitor driven by an NVIDIA GeForce3 Pro super video graphics adapter card. The refresh rate was set at 100 Hz. The scenes had a resolution of 800 x 600 pixels and subtended 30° x 22.5° of visual angle viewed from a comfortable seated position 61.5 cm away. Head and body position were not restricted, and so the calculation of visual angle is based on the average distance at which participants were seated from the monitor.

Design. The experiment had a basic two-factor design (2 x 4). There were two target object conditions (consistent and inconsistent) and four duration conditions (20, 50, 100, and 250 ms).

Procedure. After the participant had signed the consent form, the experimenter explained the sequence of events for each trial and that the object of the task was to try to understand the "gist," or "what the scene was about," for each picture presented. Figure 1 depicts the events shown for any given trial. All images in this dissertation are presented in color.

Figure 1: Trial sequence. The participant initiates the trial; the scene duration varies with condition; the target word (here the consistent object "Spatula" or the inconsistent object "Wrench") remains on screen until the subject makes a "yes" or "no" response.

At the beginning of a trial, participants fixated a screen with a central fixation cross displayed for 2000 ms. The participants then viewed a photograph of a scene. The scene's presentation duration varied by condition. Each participant took part in all conditions and viewed each scene once. Scenes were counterbalanced across conditions over all participants and were presented in a random order (determined by the program for each individual). The presentation of the scene was followed by a visual mask for 50 ms. The mask was composed of a jumble of scene sections taken from the collection of scenes shown in the experiment (see Figure 1). Next, a word was displayed at the center of the screen until the participant responded. The word could name an object that was either semantically consistent or inconsistent with the scene. Object names were chosen to produce a high percentage of "yes" responses for consistent objects and a low percentage of "yes" responses for inconsistent objects. An initial norming study showed this pattern for all scenes when presented for 250 ms, which is ample time for the gist of the scene to be acquired. The named object was never present in the scene, regardless of whether it was consistent or inconsistent. Therefore, the participants' responses were always based on bias. That is, responses were based on whether the named object fit or belonged in that picture, never on having viewed that object in the scene.
The participants made their judgments by pressing a response button labeled "1" for yes or "2" for no (these labels were always presented as a reminder on the response screen below the word). The experiment took approximately 10-20 minutes to complete.

Results

Response bias for each scene duration condition was calculated and analyzed using the same method in all experiments reported in this dissertation. For each duration condition, the proportion of "yes" responses was calculated for both the consistent and inconsistent target conditions. Planned comparisons between the target conditions were carried out for each duration condition. The bias effect was defined as a statistically significant higher proportion of "yes" responses in the consistent target condition than in the inconsistent target condition for a given duration, and the onset of the response bias was the indicator that the scene's gist was activated to some threshold level that was able to affect the judgments on the target objects. The results for Experiment 1 are depicted in Figure 2.

Figure 2: Proportion of "yes" responses to consistent target objects (blue bars) and to inconsistent target objects (red bars) for each duration condition in Experiment 1. Error bars represent the standard error of the mean.

Figure 2 shows the proportion of "yes" responses by duration according to target condition for Experiment 1. An omnibus ANOVA revealed a main effect of target condition (F(1,23) = 129.21, p < 0.01, MSE = 0.0392), in which mean "yes" responses were significantly higher for consistent targets than inconsistent targets; a main effect of duration condition (F(3,69) = 129.21, p < 0.01, MSE = 0.0308), reflecting an overall increase in "yes" responses as duration increased; and a significant interaction between target condition and duration (F(3,69) = 55.37, p < 0.01, MSE = 0.0175), in which the relative difference between targets increased as a function of duration. Planned paired-sample t-tests revealed a significant difference between the consistent and inconsistent targets at each of the following duration conditions: 250 ms [Consistent: M = 0.67, SD = 0.28; Inconsistent: M = 0.07, SD = 0.09; t(23) = 10.27, p < 0.01], 100 ms [Consistent: M = 0.66, SD = 0.25; Inconsistent: M = 0.11, SD = 0.12; t(23) = 12.15, p < 0.01], and 50 ms [Consistent: M = 0.44, SD = 0.25; Inconsistent: M = 0.26, SD = 0.21; t(23) = 5.69, p < 0.01]. Scenes presented at 20 ms showed no significant difference [Consistent: M = 0.26, SD = 0.24; Inconsistent: M = 0.26, SD = 0.24; t(23) = 0.12, n.s.].

Discussion

The first experiment was designed, first, to demonstrate the usefulness of the Contextual Bias paradigm and, second, to investigate the speed at which related semantic information about a scene is retrieved when that scene is presented rapidly. The onset of semantic activation for scene gist was measured by the presence of a response bias. The response bias was calculated as the difference between the proportion of "yes" responses to consistent target objects and the proportion of "yes" responses to inconsistent target objects. Experiment 1 clearly establishes the efficacy of the Contextual Bias paradigm in determining how soon after onset the scene gist information is sufficiently extracted to influence judgments on component objects.
Results also showed that related semantic information is available by at least the first 50 ms, and they revealed an increase in the size of the bias as the scene duration was lengthened. Therefore, semantic information related to a scene becomes available prior to the 100 ms estimate proposed in previous studies (Intraub, 1981; Metzger & Antes, 1983; Oliva & Schyns, 1997; Potter, 1975, 1976; Potter & Levy, 1969; Schyns & Oliva, 1994). Further, as the presentation duration for a scene is lengthened, the response bias grows larger, suggesting that the influence of the activation on behavioral responses increases in strength. Interestingly, an informal survey of the participants suggested that the scenes viewed at such short durations did not reach conscious awareness. Most claimed that they felt they were guessing; nevertheless, the bias was present and its strength increased before participants could report conscious recognition of the scenes.

Experiment II

The second experiment investigated how soon after onset a scene's semantic information is available by examining a more fine-grained range of durations than those used in the previous experiment. In Experiment 1, the response bias was present for durations of 50 ms and higher. However, that does not indicate that it takes 50 ms for the activation to occur. Experiment 2 investigated whether the response bias was present for shorter durations than those found in Experiment 1. The span of durations varied from 20 ms to 50 ms, with an additional duration of 250 ms to ensure that participants were performing the task. At 250 ms, the scene is easily visible, so any participants who merely responded without paying attention to the screen could be identified because they would fail to show a response bias in this condition. The predictions are similar to those of Experiment 1: If related semantic information is activated earlier, then a response bias should emerge at a duration condition shorter than 50 ms. If related semantic information is not activated until after a duration of at least 50 ms, then the response bias should not appear until 50 ms.

Methods

Participants. Thirty Michigan State University undergraduates participated in this experiment. All participants received credit towards an introductory psychology course.

Apparatus & Stimuli. The stimuli and apparatus were identical to those used in Experiment 1, with the following exception. The addition of a duration condition reduced the number of images per condition to 8 scenes/condition. In addition, in order to achieve a finer gradation of times between duration conditions, the screen refresh rate was increased to 120 Hz.

Design. The design was identical to the first experiment, with the following exceptions. In Experiment 2 the duration conditions were: 25, 33, 42, 50, and 250 ms. The 250 ms condition was included in order to make certain that the participants were actually performing the task. Participants who were not responding according to the task instructions would not produce an effect at this duration.

Procedure. The procedure for Experiment 2 was identical to Experiment 1.

Results

As in Experiment 1, the proportion of "yes" responses was calculated for both the consistent and inconsistent target conditions for each duration condition. Planned comparisons between the target conditions were carried out for each duration condition.
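To illustrate how these per-condition comparisons can be computed, the following minimal Python sketch derives the proportion of "yes" responses per participant and runs a paired t-test for one duration condition. The data layout, column names, and values are hypothetical, and pandas/scipy are assumed; this is not the dissertation's actual analysis code.

    import pandas as pd
    from scipy import stats

    # Hypothetical trial-level data: one row per trial, with participant id,
    # scene duration (ms), target type, and whether the response was "yes".
    trials = pd.DataFrame({
        "subject":  [1, 1, 1, 1, 2, 2, 2, 2],
        "duration": [50, 50, 50, 50, 50, 50, 50, 50],
        "target":   ["consistent", "inconsistent"] * 4,
        "yes":      [1, 0, 1, 1, 1, 0, 1, 0],
    })

    # Proportion of "yes" responses per subject, duration, and target condition.
    props = (trials.groupby(["subject", "duration", "target"])["yes"]
                   .mean()
                   .unstack("target"))

    # Planned comparison for one duration condition: paired t-test between the
    # consistent and inconsistent proportions across subjects.
    d50 = props.xs(50, level="duration")
    t, p = stats.ttest_rel(d50["consistent"], d50["inconsistent"])
    print(t, p)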
Figure 3: Proportion of "yes" responses to consistent target objects (blue bars) and to inconsistent target objects (red bars) for each duration condition in Experiment 2. Error bars represent the standard error of the mean.

Figure 3 shows the proportion of "yes" responses by duration according to target condition for Experiment 2. An omnibus ANOVA revealed that the pattern of results mimicked that found in Experiment 1. Specifically, there was an overall effect of target condition (F(1,29) = 67.1, p < 0.01, MSE = 0.0523), a main effect of duration condition (F(4,116) = 2.6, p < 0.05, MSE = 0.0411), and a significant interaction between target condition and duration (F(4,116) = 33.21, p < 0.01, MSE = 0.0315). Planned paired-sample t-tests revealed a significant difference at durations of 250 ms [Consistent: M = 0.73, SD = 0.23; Inconsistent: M = 0.09, SD = 0.10; t(29) = 16.14, p < 0.01], 50 ms [Consistent: M = 0.56, SD = 0.23; Inconsistent: M = 0.30, SD = 0.22; t(29) = 5.89, p < 0.01], and 42 ms [Consistent: M = 0.44, SD = 0.27; Inconsistent: M = 0.31, SD = 0.22; t(29) = 2.5, p < 0.01]. Scenes presented at 33 ms [Consistent: M = 0.41, SD = 0.32; Inconsistent: M = 0.33, SD = 0.27; t(29) = 1.25, n.s.] and 25 ms [Consistent: M = 0.31, SD = 0.26; Inconsistent: M = 0.34, SD = 0.23; t(29) = -0.67, n.s.] showed no significant difference.

Discussion

Experiment 2 investigated the onset of the bias more closely by using smaller increments in the duration condition. Results showed that relevant semantic information was extracted as soon as 42 ms after onset. In comparison with the findings of previous studies (Intraub, 1981; Metzger & Antes, 1983; Oliva & Schyns, 1997; Potter, 1975, 1976; Potter & Levy, 1969; Schyns & Oliva, 1994), the activation of scene gist can be detected very early in processing. In addition, the results replicate Experiment 1 in that the response bias in the 250 ms condition was much greater than in the 42 and 50 ms conditions, indicating that the activation of scene gist is much stronger at longer durations, resulting in a stronger influence on response patterns.

Experiment III

Experiment 3 investigated the nature of the response bias. In Experiments 1 and 2, the response bias seemed to increase in strength from 50 ms to 100 ms, and then to asymptote for longer durations (there is no difference between the bias in the 100 ms and 250 ms conditions). The increase in response bias suggests that there is an accompanying increase in the activation of scene gist. So, between durations of 50 ms and 100 ms, it is possible that a maximum amount of activation is reached, leading to an asymptote in the size of the response bias. The main purpose of Experiment 3 was to investigate whether the increase in response bias reaches a maximum activation by examining durations spanning from 50 ms to 100 ms. If the increase in response bias does reach a maximum value, then there should be a point of inflection for the response biases as the duration conditions increase. However, if the increase in activation does not reach a maximum within the first 100 ms, then there should be a gradual increase in the strength of the response bias, with no obvious point of inflection at any given presentation duration.

Methods

Participants. Thirty-six Michigan State University undergraduates participated in this experiment.
All participants received credit towards an introductory psychology course.

Apparatus & Stimuli. The apparatus and stimuli were identical to those used in Experiment 2; however, another 16 photos were added to the stimulus set. These scenes were added to maintain the same image-to-condition ratio as Experiment 2 (8 scenes/condition), due to the addition of a sixth duration condition in Experiment 3.

Design. The target object conditions for Experiment 3 were identical to Experiments 1 and 2; however, an additional duration condition was included, resulting in a 2 x 6 factorial design. In this experiment, participants viewed the scenes for 50, 58, 75, 83, 92, or 100 ms.

Procedure. The procedure for Experiment 3 was identical to Experiment 2.

Results

For each duration condition, the proportion of "yes" responses was calculated for both the consistent and inconsistent target conditions. As in Experiments 1 and 2, planned comparisons between the target conditions were carried out for each duration condition.

Figure 4: Proportion of "yes" responses to consistent target objects (blue bars) and to inconsistent target objects (red bars) for each duration condition in Experiment 3. Error bars represent the standard error of the mean.

Results for Experiment 3 are shown in Figure 4. An omnibus ANOVA revealed a similar pattern of effects as the previous two experiments. There was an overall effect of target condition (F(1,35) = 476.19, p < 0.01, MSE = 0.0598), no main effect of duration condition (F(5,175) = 0.46, n.s., MSE = 0.0258), and a significant interaction between target condition and duration (F(5,175) = 17.6, p < 0.01, MSE = 0.0217). Planned paired-sample t-tests revealed a significant difference at all duration conditions: 100 ms [Consistent: M = 0.78, SD = 0.19; Inconsistent: M = 0.14, SD = 0.15; t(35) = 16.32, p < 0.01], 92 ms [Consistent: M = 0.75, SD = 0.21; Inconsistent: M = 0.13, SD = 0.09; t(35) = 18.92, p < 0.01], 83 ms [Consistent: M = 0.75, SD = 0.14; Inconsistent: M = 0.15, SD = 0.14; t(35) = 18.19, p < 0.01], 75 ms [Consistent: M = 0.72, SD = 0.16; Inconsistent: M = 0.19, SD = 0.18; t(35) = 12.73, p < 0.01], 58 ms [Consistent: M = 0.63, SD = 0.22; Inconsistent: M = 0.23, SD = 0.16; t(35) = 9.74, p < 0.01], and 50 ms [Consistent: M = 0.57, SD = 0.22; Inconsistent: M = 0.29, SD = 0.22; t(35) = 5.97, p < 0.01].

Discussion

In Experiment 3, the increase in the bias with increasing duration was investigated further, and the results revealed that the response bias increased monotonically in strength up to and including 100 ms. These results suggest that activation was increasing with longer stimulus presentations; however, no maximum in the amount of activation was uncovered. Generally speaking, there are two possible reasons for the increase in the response bias over time in this and the previous experiments (Experiments 1 and 2). The first assumes that the recognition of a scene occurs as an all-or-none or binary process, in which a scene is recognized when information supporting that semantic scene category reaches a certain threshold. The other possible explanation is that scene recognition is continuous. In this case, a scene is recognized incrementally with increased presentation durations because more supporting visual information is available with increased display times. Across the three experiments reported so far, there was a gradual increase in response bias across the range of durations.
An assumption that all scenes are processed at equal rates and reach similar levels of activation at the same time would lead to the conclusion that the increase in the response bias is due to a gradual accumulation of activation strength in all scenes simultaneously. However, given the variety in complexity and type of information available from one scene to the next, it is unlikely that this assumption is true. It is more likely that scenes are processed at different rates and reach activation of their gist information at different times. For instance, the response bias results of Experiment 2 show that the fastest a scene's gist can be retrieved is 42 ms, but this may not be the case for all scenes. Taking this assumption into account, the gradual increase in the response bias could be due both to an increase in the number of scenes that reached some level of scene gist activation and to an increase in the activation level as more visual information becomes available with increasing durations. The actual mechanism responsible for the response bias (binary or continuous) is of little consequence to the predictions of this dissertation. In either case, the increase in bias is driven by an increasing availability of information about the scene. Deciding which of the two models is correct is not within the scope of this dissertation and does not affect the predictions. Therefore, for the purposes of the present paper, we will arbitrarily adopt the view that scene recognition occurs as a continuous activation of related information that can increase over time. Adopting this view does not mean that we support it exclusively; rather, we have sided with one view in order to outline the predictions of the current investigation more clearly. For the remainder of the paper, all predictions will be outlined with the continuous activation mechanism in mind; however, that is not to say that alternative mechanisms may not be equally probable.

THE INFLUENCE OF COLOR ON SCENE GIST PERCEPTION

As reviewed in the introduction, studies of the effect of color on the perception of objects and scenes have shown mixed results. For scenes, some studies have shown no effect of color (Delorme et al., 2000), while others have shown that color can affect scene perception, but only for natural, not man-made, scenes (Goffaux et al., in press; Oliva & Schyns, 2000). We know from other studies on rapid scene perception that the structural information available globally plays an important role in the initial activation of gist (Oliva & Schyns, 1997; Schyns & Oliva, 1994; Torralba, 2003; Torralba & Oliva, 2003). If color and structural information are processed immediately and simultaneously (Edwards et al., 2003; Livingstone & Hubel, 1984a, 1984b, 1988), it is possible that the initial processing of information for a scene includes color information, but that its influence can only be seen at the very early stages of processing. The mixed results of past studies, therefore, may be due to differences in tasks (naming vs. verification of stimuli) and timing (very early vs. later processing). In the present study, the contribution of color is investigated by examining the role color plays in the initial activation of scene gist.

Experiment IV

Experiment 4 investigated the effect of color on the activation of scene gist.
Again, the Contextual Bias paradigm was used, and participants were asked to make a judgment based on whatever information they had extracted from a briefly displayed photograph of a scene. In Experiment 4, the scenes were presented either as colored or as monochrome photographs. In order to look at the early effects of color processing, the span of durations used was identical to that used in Experiment 2 (from 20 to 50 ms). As reviewed above, there are two views on what role color can play in the perception of objects and scenes. On the one hand, researchers argue that color helps to boost the perception of certain objects (either by further assisting the segmentation of the shape or form, or through a semantic association between the color and the object name). On the other hand, some researchers posit that the effect of color occurs much later, only after initial recognition, and so the initial representation of visual information is colorblind. In this case, the initial representation would contain edge information only, and perception of the scene category would be based on its structure. Therefore, if color does contribute to the activation of related semantic information during early stages of visual processing, then the bias effect should be larger for colored scenes than monochrome scenes. However, if color is not used in the early stages of processing leading to the activation of relevant semantic information, then there should be no difference between the colored and monochrome scenes.

Methods

Participants. Sixty Michigan State University undergraduates participated in this experiment for credit in an introductory psychology course.

Apparatus & Stimuli. The apparatus was identical to Experiment 1, and 64 scenes were added to the collection used in Experiment 3 (for a total of 160 scenes). For each of the 160 colored scenes, a monochrome counterpart was generated. Monochrome versions of the photographs were created by transforming the photograph from RGB to L*a*b* color mode and then discarding the chromatic components a* and b* of the colored scenes, leaving only L* (the gray levels).¹ Colored scenes were simply the original photographs. Figure 5 shows an example of the scenes used in this experiment.

¹ Thanks to Aude Oliva for the Matlab code that performs these transformations on the photographs.

Figure 5: Example of the Color (A) and Monochrome (B) conditions for Experiment 4.

Design. Experiment 4 had three factors: color, target, and duration (2 x 2 x 5). The target and duration conditions were identical to Experiment 2. The only factor that was added was the color condition (colored scenes and monochrome scenes).

Procedure. The procedure was similar to the one used in Experiment 1, with the following exceptions. In Experiment 4, participants viewed 160 scenes, half of which were full-color and the other half monochrome. The experiment lasted approximately 20-30 minutes.

Results

Two types of analyses were carried out for this experiment. The first set of analyses consisted of the planned comparisons between the target conditions for each presentation duration. This was the same analysis carried out in Experiments 1 to 3 and examines how soon after onset the response bias was present. The second analysis looked at the contribution of color. Difference scores were calculated by subtracting the target conditions (consistent and inconsistent) from each other for each duration and color condition separately.
The difference scores for each color condition were then compared at each duration condition. In this way, the differences between the biases for colored and monochrome scenes are more transparent, making the data more interpretable.

Figures 6a and 6b show the proportion of "yes" responses by duration according to target condition for the colored and monochrome conditions, respectively. An omnibus ANOVA revealed that there was no main effect of color (F(1,59) = 0.34, n.s., MSE = 0.0162), a main effect of target condition (F(1,59) = 241.3, p < 0.01, MSE = 0.0487), and a main effect of duration condition (F(4,236) = 16.56, p < 0.01, MSE = 0.0481). There was a significant interaction between target and duration conditions (F(4,236) = 109.38, p < 0.01, MSE = 0.033). Neither the interaction between color and duration (F(4,236) = 0.83, n.s., MSE = 0.0279), the interaction between target and color (F(1,59) = 3.1, p = 0.083, MSE = 0.0344), nor the three-way interaction between color, target, and duration (F(4,236) = 0.79, n.s., MSE = 0.0216) was significant.

Figure 6: Proportion of "yes" responses to consistent target objects (darker bars) and to inconsistent target objects (lighter bars) for each duration condition in Experiment 4. (A) Colored scene condition. (B) Monochrome scene condition. Error bars represent the standard error of the mean.

For the colored scenes, planned paired-sample t-tests revealed a significant difference at durations of 250 ms [Consistent: M = 0.72, SD = 0.24; Inconsistent: M = 0.07, SD = 0.13; t(59) = 18.87, p < 0.01], 50 ms [Consistent: M = 0.50, SD = 0.23; Inconsistent: M = 0.26, SD = 0.21; t(59) = 7.3, p < 0.01], and 42 ms [Consistent: M = 0.40, SD = 0.23; Inconsistent: M = 0.30, SD = 0.20; t(59) = 3.16, p < 0.01]. Scenes presented at 33 ms [Consistent: M = 0.31, SD = 0.20; Inconsistent: M = 0.26, SD = 0.23; t(59) = 1.55, n.s.] and 25 ms [Consistent: M = 0.27, SD = 0.23; Inconsistent: M = 0.22, SD = 0.19; t(59) = 1.91, n.s.] showed no significant difference. The same pattern was seen for the monochrome scenes. Planned paired-sample t-tests revealed a significant difference at durations of 250 ms [Consistent: M = 0.65, SD = 0.26; Inconsistent: M = 0.07, SD = 0.10; t(59) = 16.18, p < 0.01], 50 ms [Consistent: M = 0.47, SD = 0.28; Inconsistent: M = 0.27, SD = 0.22; t(59) = 5.5, p < 0.01], and 42 ms [Consistent: M = 0.39, SD = 0.25; Inconsistent: M = 0.28, SD = 0.21; t(59) = 3.67, p < 0.01], and no significant difference for scenes presented at 33 ms [Consistent: M = 0.31, SD = 0.22; Inconsistent: M = 0.31, SD = 0.22; t(59) = -0.01, n.s.] or 25 ms [Consistent: M = 0.24, SD = 0.22; Inconsistent: M = 0.25, SD = 0.23; t(59) = -0.29, n.s.]. Figure 7 shows the difference scores for each color condition at each duration.

Figure 7: Difference scores (consistent - inconsistent) for the Colored (blue) and Monochrome (red) conditions. Error bars represent the standard error of the mean.
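As an illustration of how the color comparison can be carried out on such difference scores, here is a minimal Python sketch building on the hypothetical data layout assumed earlier (pandas/scipy assumed; column names and values are invented, and this is not the dissertation's actual analysis code):

    import pandas as pd
    from scipy import stats

    # Hypothetical per-subject proportions of "yes" responses for one duration,
    # with one row per subject and color condition and one column per target type.
    props = pd.DataFrame({
        "subject":      [1, 1, 2, 2],
        "duration":     [80, 80, 80, 80],
        "color":        ["color", "mono", "color", "mono"],
        "consistent":   [0.70, 0.60, 0.75, 0.55],
        "inconsistent": [0.30, 0.35, 0.25, 0.30],
    })

    # Difference score = P("yes" | consistent) - P("yes" | inconsistent),
    # computed separately for each duration and color condition.
    props["bias"] = props["consistent"] - props["inconsistent"]

    # Compare the bias for colored vs. monochrome scenes at this duration with a
    # paired t-test across subjects; a color advantage shows up as a larger bias.
    wide = props.pivot(index="subject", columns="color", values="bias")
    t, p = stats.ttest_rel(wide["color"], wide["mono"])
    print(t, p)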
The premise of the current experimental design was that, when presented with the two target choices, consistent targets would elicit more "yes" responses than inconsistent targets. It is this difference between responses to the targets that is expected to change as a function of duration, depending on whether the scene was presented for long enough for its gist to be processed. By looking at the difference scores, we can see whether this change in responses over time varies as a function of the other factor of interest, namely color. The analysis of the difference scores compared the colored vs. monochrome conditions for each duration condition. If there were an advantage of color, then there should be a bigger effect for the colored scenes than the monochrome scenes at that duration. The analysis showed that at no point did the colored scenes show an advantage over the monochrome scenes (as measured by the responses to the consistent and inconsistent targets). The results for each duration condition were as follows: 250 ms (t(59) = 1.64, n.s.), 50 ms (t(59) = 0.98, n.s.), 42 ms (t(59) = -0.59, n.s.), 33 ms (t(59) = 1.13, n.s.), and 25 ms (t(59) = 1.4, n.s.).

Discussion

Experiment 4 investigated whether color contributes to the initial retrieval of related semantic information, or whether the initial representation is colorblind. In the results section, two analyses were completed: (1) an investigation of the onset of the response bias (to replicate the previous experiments), and (2) an exploration of the initial contribution of color. Each of these will be discussed in turn.

The results of the first analysis replicated the results from Experiment 2, in that the response bias appeared 42 ms after onset, although a trend at 33 ms was also present in two of the experiments (Experiments 2 and 4). To investigate whether a bias effect could be detected as early as 33 ms, a between-experiment analysis was conducted. The sharp-color condition in Experiment 4 was analyzed together with the data from Experiment 2. The ANOVA showed no main effect of experiment (F(1,107) = 1.7, n.s.), a main effect of target (F(1,107) = 226.13, p < 0.001), a main effect of duration (F(4,107) = 12.59, p < 0.001), and an interaction between target and duration (F(4,428) = 100.36, p < 0.001). No other interactions were significant. Collapsing across experiments, a planned paired-sample t-test of the 33 ms duration condition revealed that the bias effect was significant (t(108) = 2.137, p < 0.05). These post-hoc analyses indicate that sufficient information about the scene is acquired almost immediately after onset, as the effects of the activation of scene gist on the judgment of component objects can be seen with presentation durations as short as 33 ms.

The second analysis examined the contribution of color to the bias effect, and demonstrated that for full-colored scenes color had no effect on the onset of the bias. In other words, monochrome scenes produced a bias of the same magnitude as the colored scenes. The results from this experiment support the notion from previous studies that the initial construction of a visual representation is based only on edge or luminance information, which is still preserved in the monochrome photographs (Biederman & Ju, 1988; Davidoff, 1991; Davidoff & Ostergaard, 1988; Ostergaard & Davidoff, 1985).
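For reference, the monochrome stimuli described above preserve exactly this luminance information. A minimal sketch of the conversion, assuming scikit-image in place of the original Matlab code and a hypothetical file name, might look as follows:

    import numpy as np
    from skimage import io, color

    # Load an RGB photograph (hypothetical path), convert it to L*a*b* space,
    # and discard the chromatic channels a* and b*, keeping only L* (luminance).
    rgb = io.imread("scene.jpg")                 # shape (H, W, 3)
    lab = color.rgb2lab(rgb)                     # L* in [0, 100]; a*/b* chromatic
    lab[..., 1:] = 0.0                           # zero out a* and b*

    # Convert back to RGB for display; the result is the monochrome counterpart
    # of the scene, with its gray levels given by L* alone.
    mono = color.lab2rgb(lab)                    # float image in [0, 1]
    io.imsave("scene_mono.png", (mono * 255).astype(np.uint8))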
An alternative explanation for these results is that the initial representation does contain color information, but that when the semantic information can be derived from structural information, the contribution of color is masked (Price & Humphreys, 1989). On this view, the visual system is so efficient at extracting the necessary information from the structural information of the scene that performance is essentially at a ceiling that the addition of color cannot improve. This would mean that when the structural information is not as efficiently extracted (so that the visual system is no longer at its ceiling of efficiency), the contribution of color might be seen. Evidence for this alternative explanation comes from the study by Oliva and Schyns (2000) demonstrating that the effect of color is exacerbated by degrading the edge information of scenes; the same effect has been found in the object recognition literature (Wurm et al., 1993). The following experiments were designed to investigate this alternative explanation regarding the existence of color in the initial representation.

Experiment V

As reviewed in the introduction of this chapter, the relative contribution of color to the categorization of scenes may be masked by the speed of processing of structural information. Experiment 5 investigated whether the quicker processing of structural information masks the effect of color on the activation of semantic scene information. In order to investigate the contribution of color independent of that derived from the scene structure, the scenes used in Experiment 4 were filtered to remove high-spatial-frequency information (thus keeping most mid- and low-spatial-frequency information). By removing some of the structural information, it is possible that colored scenes would show an increased rate of recognition at shorter presentation durations relative to their blurred monochrome counterparts. If color is part of the information quickly extracted for the activation of scene gist, then the response bias should be larger for the blurred colored scenes than the blurred monochrome scenes. If color is not extracted and processed in the early stages of perception leading to the activation of relevant semantic information (as suggested by the results of Experiment 4), then there should be no difference between the color conditions.

Methods

Participants. Sixty Michigan State University undergraduates participated in this experiment for credit in an introductory psychology course.

Apparatus & Stimuli. For Experiment 5, an additional set of scenes was created in which each scene was low-pass filtered at 1 cycle/degree of visual angle (corresponding to 17 cycles/image). In total, the experiment comprised 160 low-pass filtered colored images and their 160 monochrome counterparts. Figure 8 shows an example of the blurred scenes used in this experiment.

Figure 8: Example of the Color (A) and Monochrome (B) conditions for Experiment 5. Spatial frequencies higher than 1 cycle/degree (17 cycles/image) were removed, leaving only mid- and low-spatial-frequency information.

Design. The design for Experiment 5 was the same as in Experiment 4, with the following exception. The duration condition included the following levels: 20, 50, 80, 100, and 250 ms. The reason for this change in the duration conditions (from Experiment 4) was that, with less information available in the blurred scenes, activation would most likely take longer than in Experiments 2 and 4.
Therefore, the duration conditions were the same as those used in Experiment 1.

Procedure. The procedure was identical to the one used in Experiment 4, with the following exception. The participants were instructed to indicate whether the target object was likely to occur in the scene just displayed (as opposed to being asked to indicate whether the object was present). The instructions were modified because the scenes were now blurred, and even if the scene was perceived, the fact that objects were harder to make out inclined a number of participants to press the "no" button.

Results

The same sets of analyses described in Experiment 4 are reported for Experiment 5. First, the relative effect of target in each duration and color condition is considered. Second, difference scores (calculated using the same methods reported in Experiment 4) are analyzed. The question of whether colored scenes have any advantage over monochrome scenes when the structural information is degraded is investigated by comparing the difference scores between color conditions. Figures 9a and 9b show the proportion of "yes" responses by duration according to target condition for the colored and monochrome conditions, respectively.

Figure 9: Proportion of "yes" responses to consistent target objects (darker bars) and to inconsistent target objects (lighter bars) for each duration condition in Experiment 5. (A) Colored scene condition. (B) Monochrome scene condition. Error bars represent the standard error of the mean.

To investigate the duration at which the scene gist was perceived, the target effect was examined for each color condition. An omnibus ANOVA revealed that there was no main effect of color (F(1,59) < 1, n.s., MSE = 0.0277), a main effect of target condition (F(1,59) = 392.97, p < 0.01, MSE = 0.0695), and a main effect of duration condition (F(4,236) = 27.72, p < 0.01, MSE = 0.0499). There was a significant interaction between target and duration conditions (F(4,236) = 135.42, p < 0.01, MSE = 0.0295). None of the interactions between color and duration (F(4,236) = 1.73, n.s., MSE = 0.0301), target and color (F(1,59) = 9.56, n.s., MSE = 0.0251), or color, target, and duration (F(4,236) = 2.26, n.s., MSE = 0.06) were significant.

As in the previous experiment, the same pattern of results was seen in both the colored and monochrome scenes. For the colored scenes, planned paired-sample t-tests revealed a significant difference at durations of 250 ms [Consistent: M = 0.86, SD = 0.14; Inconsistent: M = 0.18, SD = 0.18; t(59) = 22.13, p < 0.01], 100 ms [Consistent: M = 0.78, SD = 0.19; Inconsistent: M = 0.27, SD = 0.19; t(59) = 13.94, p < 0.01], 80 ms [Consistent: M = 0.70, SD = 0.28; Inconsistent: M = 0.33, SD = 0.21; t(59) = 11.32, p < 0.01], and 50 ms [Consistent: M = 0.45, SD = 0.24; Inconsistent: M = 0.31, SD = 0.19; t(59) = 4.74, p < 0.01], but not at 20 ms [Consistent: M = 0.34, SD = 0.24; Inconsistent: M = 0.36, SD = 0.23; t(59) = -0.82, n.s.].
Planned paired-sample t-tests for the monochrome scenes revealed a significant difference at durations of 250 ms [Consistent: M = 0.81, SD = 0.22; Inconsistent: M = 0.20, SD = 0.16; t(59) = 19.21, p < 0.01], 100 ms [Consistent: M = 0.70, SD = 0.20; Inconsistent: M = 0.31, SD = 0.21; t(59) = 11.76, p < 0.01], 80 ms [Consistent: M = 0.61, SD = 0.24; Inconsistent: M = 0.35, SD = 0.21; t(59) = 7.3, p < 0.01], and 50 ms [Consistent: M = 0.46, SD = 0.22; Inconsistent: M = 0.38, SD = 0.22; t(59) = 2.5, p < 0.01], and no significant difference for scenes presented at 20 ms [Consistent: M = 0.36, SD = 0.3; Inconsistent: M = 0.33, SD = 0.23; t(59) = -1.03, n.s.]. Figure 10 shows the difference scores for each color condition at each duration condition.

Figure 10: Difference scores (consistent - inconsistent) for the Colored (blue) and Monochrome (red) conditions. Error bars represent the standard error of the mean.

The analysis of the difference scores was carried out in the same way as in Experiment 4; difference scores for colored scenes were compared to difference scores for monochrome scenes for each duration condition. If there were an advantage of color, then there should be a bigger effect for the colored scenes than the monochrome scenes at that duration. The analysis revealed that colored scenes showed an advantage over monochrome scenes, but only for scenes shown for durations of 100 ms (t(59) = 2.69, p < 0.01) and 80 ms (t(59) = 2.67, p < 0.01). There was no difference for scenes presented at 250 ms (t(59) = 1.25, n.s.), 50 ms (t(59) = 1.55, n.s.), and 20 ms (t(59) = 4.03, n.s.).

Discussion

Prior studies have shown that structural information is important in identifying scene gist (Oliva & Schyns, 1997; Schyns & Oliva, 1994), and the speed at which gist information is extracted may be responsible for the mixed results regarding the effect of color (Davidoff, 1991). In Experiment 5, the structural information of each scene was degraded by filtering the high spatial frequency information from the scene photographs. Results showed that when filtered scenes were used, color did have an effect on the activation of scene gist: monochrome scenes showed less of a bias than colored scenes. Furthermore, the results suggest that the benefit of color occurred later, in that the effect of color started at 80 ms, which is well after the 42 ms onset of the bias effect seen in Experiments 2 and 4. Taken together, the results of Experiments 4 and 5 suggest that there is an interaction between available structure information and color, and that color may be used to categorize scenes and activate relevant semantic information when structure information is degraded and the system cannot extract structure information as easily. However, the two experiments examine different ranges of duration, and a comparison across experiments makes it difficult to draw any firm conclusions about the degree to which color contributes and the onset of its contribution. The next chapter further investigates the existence and nature of the interaction between a scene's structural information and its color information.
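Before turning to that interaction, the blurring manipulation used to create the degraded stimuli in Experiments 5 and 6 (removing spatial frequencies above 1 cycle/degree, about 17 cycles/image for these displays) can be made concrete with a rough Python sketch. A Gaussian blur from scipy is used here as a stand-in for the original Matlab filtering, so the exact kernel and the sigma-to-cutoff correspondence are assumptions rather than the dissertation's implementation:

    import numpy as np
    from scipy import ndimage
    from skimage import io

    # Stimulus parameters taken from the Methods: 800-pixel-wide scenes
    # subtending 30 degrees of visual angle, low-pass filtered at 1 cycle/degree.
    image = io.imread("scene.jpg").astype(float)   # shape (H, W, 3); hypothetical path
    pixels_per_degree = 800 / 30.0                 # ~26.7 pixels per degree
    cutoff_cycles_per_degree = 1.0

    # Approximate the cutoff with a Gaussian whose sigma (in pixels) is tied to
    # the cutoff frequency; this correspondence is an assumption for illustration.
    sigma = pixels_per_degree / (2.0 * np.pi * cutoff_cycles_per_degree)

    # Filter each color channel, leaving only low spatial frequencies.
    blurred = np.stack(
        [ndimage.gaussian_filter(image[..., c], sigma) for c in range(3)],
        axis=-1,
    )
    io.imsave("scene_blurred.png", np.clip(blurred, 0, 255).astype(np.uint8))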
THE INTERACTION BETWEEN COLOR AND STRUCTURE ON SCENE GIST PERCEPTION

Experiment VI

In Experiment 6, the effect of color on the activation of semantic scene information was assessed with a within-subject manipulation of the availability of both color and structure information. The results of Experiments 4 and 5 suggest that the contribution of color to the activation of scene information depends on the relative contribution of structure. If color contributes to the activation of scene gist only when structural information is degraded, then the bias effect should be larger for the blurred colored scenes than the blurred monochrome scenes, and no color effect should be seen between the sharp colored and sharp monochrome scenes. On the other hand, if color is not extracted and processed in the initial activation of scene gist, then there should be no difference between the color conditions for either the sharp or blurred scenes.

Methods

Participants. Eighty Michigan State University undergraduates participated in this experiment for credit in an introductory psychology course.

Apparatus & Stimuli. Experiment 6 was essentially the combination of Experiments 4 and 5 into a single, within-subject experiment. With the addition of a condition (sharpness), another 140 scenes were added to the experiment (for a total of 400 scenes) in order to keep the same number of scenes per condition. Each photograph had four versions: sharp colored, sharp monochrome, blurred colored, and blurred monochrome. Figure 11 shows an example scene in each of the sharpness and color conditions. Each version was created using the same procedures described in Experiments 4 and 5.

Figure 11: Example of the stimulus conditions used in Experiment 6: (A) Sharp-Colored condition, (B) Sharp-Monochrome condition, (C) Blurred-Colored condition, (D) Blurred-Monochrome condition.

Design. In Experiment 6, the color, target, and duration conditions were the same as those used in Experiment 5. In addition, a sharpness condition was included (scenes were either sharp or blurred), resulting in a four-factor, within-subject design: sharpness, color, target, and duration (2 x 2 x 2 x 5).

Procedure. The experimental procedures were identical to those in Experiment 5, with the following exceptions. The experiment took approximately 25 to 40 minutes to complete. Participants were encouraged to take breaks while performing the experiment, and all participants took at least two breaks.

Results

The analyses are presented in the same order as in the previous experiments. Figure 12 shows the proportion of "yes" responses for each target condition for sharp-colored scenes (Figure 12a), sharp-monochrome scenes (Figure 12b), blurred-colored scenes (Figure 12c), and blurred-monochrome scenes (Figure 12d). An omnibus ANOVA revealed significant main effects of sharpness (F(1,79) = 91.64, p < 0.01, MSE = 0.0378), color (F(1,79) = 4.99, p < 0.05, MSE = 0.0925), target (F(1,79) = 1372.33, p < 0.01, MSE = 0.093), and duration (F(4,316) = 103.122, p < 0.01, MSE = 0.044). There were several significant two-way interactions, including target and sharpness (F(1,79) = 404.93, p < 0.01, MSE = 0.0318), sharpness and duration (F(4,316) = 15.22, p < 0.01, MSE = 0.0224), target and color (F(1,79) = 13.65, p < 0.01, MSE = 0.0293), and target and duration (F(4,316) = 374.8, p < 0.01, MSE = 0.024).
There was also a significant three-way interaction between sharpness, target, and duration (F(4,316) = 24.18, p < 0.01, MSE = 0.0248). Finally, the analysis revealed a significant four-way interaction between sharpness, color, target, and duration (F(4,316) = 3.52, p < 0.01, MSE = 0.0202). No other interactions were significant.

Figure 12: Proportion of "yes" responses to consistent target objects (darker bars) and to inconsistent target objects (lighter bars) for each duration condition in Experiment 6. (A) Sharp-Colored scene condition, (B) Sharp-Monochrome scene condition, (C) Blurred-Colored scene condition, and (D) Blurred-Monochrome scene condition. The Colored condition is represented with blue bars and the Monochrome condition with red bars; the Sharp condition is represented with solid bars and the Blurred condition with hatched bars. Error bars represent the standard error of the mean.

Of theoretical interest are the effects of color as a function of sharpness. In order to simplify this analysis, difference scores were analyzed. Figure 13 shows the difference scores plotted as a function of sharpness and color by duration condition. Planned comparisons were conducted between the colored and monochrome scenes at each duration condition. Table 1 shows the means and standard deviations for all conditions.

Duration (ms)  Target        Sharp Colored  Sharp Monochrome  Blurred Colored  Blurred Monochrome
250            Consistent    0.88 (0.12)    0.85 (0.15)       0.81 (0.19)      0.74 (0.24)
250            Inconsistent  0.12 (0.13)    0.13 (0.16)       0.19 (0.16)      0.19 (0.17)
100            Consistent    0.85 (0.17)    0.81 (0.15)       0.68 (0.21)      0.61 (0.21)
100            Inconsistent  0.16 (0.14)    0.13 (0.16)       0.26 (0.17)      0.28 (0.18)
80             Consistent    0.83 (0.17)    0.82 (0.17)       0.59 (0.22)      0.53 (0.20)
80             Inconsistent  0.19 (0.16)    0.16 (0.15)       0.26 (0.17)      0.30 (0.19)
50             Consistent    0.75 (0.19)    0.69 (0.19)       0.40 (0.21)      0.41 (0.22)
50             Inconsistent  0.24 (0.27)    0.25 (0.17)       0.26 (0.16)      0.30 (0.22)
20             Consistent    0.37 (0.23)    0.34 (0.23)       0.25 (0.20)      0.27 (0.22)
20             Inconsistent  0.25 (0.22)    0.29 (0.24)       0.26 (0.21)      0.26 (0.21)

Table 1. Mean (standard deviation) of the proportion of "yes" responses for Experiment 6.

For scenes in the Sharp condition, there was no difference between colored and monochrome scenes at most durations [250 ms: t(79) = 1.34, n.s.; 100 ms: t(79) = 1.41, n.s.; 80 ms: t(79) = 1.41, n.s.; 20 ms: t(79) = 1.43, n.s.], with the exception of the 50 ms duration (t(79) = 2.12, p < 0.05), at which the colored condition had a greater bias than the monochrome condition. For the blurred scenes, however, there was a significant effect of color at durations of 250 ms (t(79) = 2.4, p < 0.05), 100 ms (t(79) = 2.93, p < 0.01), and 80 ms (t(79) = 2.94, p < 0.01). In all cases, there was an advantage for the colored scenes over the monochrome scenes. There was no difference between the colored blurred scenes and monochrome blurred scenes at durations of either 50 ms (t(79) = 0.66, n.s.) or 20 ms (t(79) = -0.72, n.s.).

Figure 13:
Difference scores (consistent - inconsistent) for each duration condition in Experiment 6. The Colored condition is represented with blue lines and the Monochrome condition with red lines; the Sharp condition is represented with solid lines and the Blurred condition with dashed lines. Error bars represent the standard error of the mean.

Discussion

Experiment 6 was designed to examine the effects of color on normal and structurally degraded scenes. These effects were examined for presentation durations ranging from 20 ms to 250 ms. Results revealed that responses to sharp photographs did not change according to whether the scenes were colored. However, blurred photographs were at an advantage when presented in color rather than monochrome. Thus, the findings of Experiment 6 replicated the results of Experiments 4 and 5. More interestingly, the contribution of color seems to occur later than the effects of structure. At a presentation duration of 50 ms, there was a definite (and not surprising) advantage for sharp scenes over blurred scenes, but there was no effect of color. The contribution of color information begins at durations of 80 ms and continues at longer durations for the blurred scenes only. Thus, it seems that structure information was available earlier than color information. Although the data do not speak to the question of why the color effects have a later onset, there remain two possibilities. On the one hand, the effect of color could onset later than structure because the system uses only luminance information in the initial stages of processing visual information (Biederman & Ju, 1988; Davidoff, 1991). In this case, the effects of color are seen only once color becomes available to the decision-making system, which occurs at most 80 ms after onset. On the other hand, it could be that both types of information are available soon after onset, but the system is biased to use one type of information over the other. Because most objects and scenes have highly variable shapes, it would be natural for the system to favor luminance information over color information as a default. In this case, the later onset of color simply reflects a change in the system's strategy: the color information is eventually weighted more heavily, but this change in the weighting of incoming information takes time and is stimulus dependent. Therefore, the shift in weighting is seen only at later onsets, namely ~80 ms.

THE ROLE OF COLOR IN SCENE GIST PERCEPTION

The experiments reported thus far have examined the relationship between color and structure in terms of the effect that color has on scene gist perception when the structure information is degraded (i.e., the scenes are blurred). Experiments 4, 5, and 6 demonstrated that when the structure was degraded, color contributed to the activation of scene gist; however, when structure was normal, there was no effect of having the scene presented in color. Yet none of these experiments addressed the question of how color contributes to gist activation, and whether it is being processed at all when color effects are not seen. The role of color in the object recognition literature is thought to be based on one of two functions: color could help with the segmentation of the shape (thus acting as an auxiliary segmenter in scenes), or color could directly act as a cue for object identity due to its association with the object in memory.
In object recognition, the latter function of color is thought to be limited to objects that are consistently perceived with a specific color (i.e., that are color diagnostic). However, when objects and distractors have similar shape or structure (Biederman & Ju, 1988; Price & Humphreys, 1989; Tanaka & Presnell, 1999; Wurm et al., 1993), color becomes an important identifying factor. In fact, many objects that are thought to be color diagnostic also share similar structure (e.g., red fruit). Therefore, when the structure information across a stimulus set is highly overlapping, the unique contribution of color seems to increase relative to the contribution of structure information.

The role of color as diagnostic of identity in object perception has recently been extended to scene perception (Goffaux et al., in press; Oliva & Schyns, 2000). In these studies, when natural scenes were presented in normal color, participants' performance in a verification task (indicating whether a scene matched a given label) was higher than when the scenes were presented in grayscale. Furthermore, natural scenes displayed with abnormal colors were at a greater disadvantage than those presented without color. Both of these studies' findings support the notion that color provides more than segmentation information. These results suggest that although color may sometimes act as an auxiliary boundary segmenter for equiluminant regions, color also directly provides scene gist information.

In light of the findings reviewed above, it is not clear whether the effect of color found for blurred scenes in Experiments 5 and 6 was due to segmentation, or whether color also provides gist information. It is possible that during the initial processing of visual information, scene gist activation is based on the scene's structure alone. In this case, color contributed only for blurred scenes because the ability of the visual system to segment region boundaries had been hampered by the filtering of high spatial frequencies. Therefore, even if the hues of the scenes are replaced with opposite hues, the activation of gist should be the same as when the normal hues are present, and should be greater than when no color is present (i.e., when the scene is presented monochromatically), because color is available to aid segmentation. On the other hand, color information could contribute to gist activation directly, independently of structure. In this case, color effects were seen for the blurred scenes of Experiment 5 and the blurred condition of Experiment 6 because the quality of structure information was lessened, and so color was able to contribute more information relative to structure information. If color supplies gist information independently of structure information, then swapping the hues of the scenes should adversely affect the activation of scene gist.

Experiment VII

Experiment 7 addressed the question of whether color directly contributes to the activation of scene gist when color effects are seen (i.e., for blurred scenes). The design was similar to Experiment 6; however, an abnormally colored scene condition was added, to be compared with the colored and monochrome scene conditions. The abnormally colored scenes are able to reveal how the system processes color information because the misplaced color hues have no link to the relevant gist information.
If the effect of color found in the previous experiments (Experiments 5 and 6) was due only to color acting as a segmenter of equiluminant regions, then abnormally colored scenes should affect the onset of scene gist activation to the same degree as their normally colored counterparts. Therefore, the response biases for abnormally colored scenes should be greater than those for monochrome scenes and just as strong as those for normally colored scenes (i.e., not statistically different from normally colored scenes). On the other hand, if color contributes directly to the activation of scene gist, then the color information in abnormally colored scenes should interfere with gist activation. In this case, abnormally colored scenes should produce less of a response bias than scenes presented in normal color. Whether there is a significant difference from monochrome scenes will depend on how much interference the abnormal color causes compared to the cost incurred by the absence of color (i.e., monochrome scenes).

Methods

Apparatus & Stimuli. A subset of 300 scenes was randomly selected from the set used in Experiment 6. Each scene had three versions: normal color, monochrome, and abnormal color. Figure 14 shows an example of a scene in each of the three color conditions. The methods used to produce the abnormal color versions are identical to those used by Oliva & Schyns (2000): the photographs are transformed into L*a*b* color space and the a* and b* channels are swapped and inverted, thereby producing hues that are opposite of the originals within the L*a*b* color space. The scenes were also blurred using the same methods described in Experiment 5.

Figure 14. Example of the stimulus color conditions used in Experiment 7: (A) Normal Color condition, (B) Monochrome condition, and (C) Abnormal Color condition.

Design. Three factors were varied for each photograph: color, target, and duration (3 x 2 x 5).

Procedure. The procedure was similar to Experiment 6, but participants were shown only blurred scenes in normal color, monochrome, or abnormal color.

Results

The analyses are organized as follows. First, the response bias patterns are analyzed for all color conditions. Then, the color effects are analyzed by comparing the difference scores across color conditions, as described in the previous experiments. Figure 15 shows the proportion of "yes" responses for each target condition for normal color (Figure 15a), monochrome (Figure 15b), and abnormal color scenes (Figure 15c). An omnibus ANOVA treating color, target, and duration (3 x 2 x 5) as within-subject factors revealed significant main effects of color (F(2,238) = 12.52, p < 0.01, MSE = 0.023), target (F(1,119) = 1265.66, p < 0.01, MSE = 0.066), and duration (F(4,476) = 83.92, p < 0.01, MSE = 0.054). In addition, the interactions between color and target (F(2,238) = 15.63, p < 0.01, MSE = 0.027) and between target and duration (F(4,476) = 332.02, p < 0.01, MSE = 0.030) were reliable. No other interactions were significant.
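For concreteness, the omnibus within-subjects analysis above could be set up as follows. This is only a schematic: it assumes the responses have already been aggregated into one proportion-"yes" score per participant and cell, and the file and column names are hypothetical.

import pandas as pd
from statsmodels.stats.anova import AnovaRM

# long format: one row per participant x color x target x duration cell
cells = pd.read_csv("exp7_cell_means.csv")  # subject, color, target, duration, prop_yes

aov = AnovaRM(
    data=cells,
    depvar="prop_yes",
    subject="subject",
    within=["color", "target", "duration"],
).fit()
print(aov)  # F, degrees of freedom, and p for each main effect and interaction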
Figure 15. Proportion of "yes" responses to consistent target objects (darker bars) and to inconsistent target objects (lighter bars) for each duration condition in Experiment 7: (A) Colored scene condition, (B) Monochrome scene condition, and (C) Abnormal scene condition. Error bars represent Standard Error of the Mean.

Of specific theoretical interest was the effect of color between the abnormal condition and the other two color conditions. The pattern of differences relative to the normal color and monochrome conditions distinguishes between the aforementioned hypotheses. The means for all conditions are presented in Table 2.

Duration (ms)   Target         Colored        Abnormal       Monochrome
250             Consistent     0.86 (0.13)    0.78 (0.17)    0.82 (0.15)
                Inconsistent   0.18 (0.14)    0.22 (0.17)    0.20 (0.16)
100             Consistent     0.78 (0.15)    0.67 (0.21)    0.71 (0.17)
                Inconsistent   0.30 (0.17)    0.31 (0.18)    0.30 (0.17)
80              Consistent     0.69 (0.18)    0.60 (0.19)    0.64 (0.19)
                Inconsistent   0.33 (0.18)    0.32 (0.18)    0.16 (0.17)
50              Consistent     0.49 (0.22)    0.45 (0.21)    0.48 (0.21)
                Inconsistent   0.34 (0.19)    0.34 (0.19)    0.34 (0.20)
20              Consistent     0.35 (0.25)    0.33 (0.25)    0.35 (0.25)
                Inconsistent   0.32 (0.24)    0.31 (0.22)    0.31 (0.24)

Table 2. Mean (Standard Deviation) proportion of "yes" responses for Experiment 7.

In order to further simplify these analyses, difference scores were calculated by subtracting the Inconsistent Target means from the Consistent Target means for each combination of color and duration conditions. Figure 16 shows the difference scores plotted as a function of color and duration conditions. As explained above, the Segmentation Hypothesis predicted that color information, regardless of hue, should aid in defining region boundaries that may have been lost with the removal of high spatial frequency information. Therefore, the abnormal color scenes should produce the same advantage over monochrome scenes as the normal color scenes, and there should be no difference between the normal color and abnormal color scenes. The Gist Information Hypothesis predicts that the abnormal color scenes should interfere with gist processing because the hue information provided is not associated with the correct scene gist, and may activate an unrelated scene gist. Therefore, the normal color scenes should produce an advantage over the abnormal color scenes. Whether the abnormal color scenes produce a cost that is equivalent to, less than, or greater than the cost produced by the monochrome scenes is unknown, because the proposed mechanisms of interference for these two scene types are different (i.e., no hues vs. misleading hues). To test the predictions of these competing hypotheses, two simplified ANOVAs were conducted, comparing normal color to abnormal color scenes and monochrome to abnormal color scenes.

Figure 16. Difference scores (Consistent - Inconsistent) for the Colored (blue), Monochrome (red), and Abnormal (green) scene conditions. Error bars represent Standard Error of the Mean.
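A minimal sketch of this difference-score computation, and of the condition means of the kind plotted in Figure 16, reusing the hypothetical cell-level data frame from the ANOVA sketch above:

import pandas as pd

cells = pd.read_csv("exp7_cell_means.csv")
wide = cells.pivot_table(
    index=["subject", "color", "duration"],
    columns="target",
    values="prop_yes",
)
# consistent minus inconsistent, per participant, color, and duration
wide["diff_score"] = wide["consistent"] - wide["inconsistent"]

# condition means (and standard errors) across participants
print(wide.groupby(["color", "duration"])["diff_score"].agg(["mean", "sem"]))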
First, the abnormal color scenes were compared to the monochrome scenes. A simplified ANOVA showed a main effect of color (F(1,119) = 8.46, p < 0.05) and a main effect of duration (F(4,476) = 220.25, p < 0.01), but no interaction (F(1,59) < 0.1, n.s.). Planned comparison t-tests showed no effect of color at any duration condition [250 ms: t(119) = 1.48, n.s.; 100 ms: t(119) = 1.95, n.s.; 80 ms: t(119) = 1.55, n.s.; 50 ms: t(119) = 1.51, n.s.; 20 ms: t(119) = 0.32, n.s.].

Second, a simplified ANOVA was conducted to compare the normal color and abnormal color scenes. The analysis revealed significant main effects of color (F(1,119) = 32.64, p < 0.01) and duration (F(4,476) = 258.45, p < 0.01), and a significant interaction (F(4,476) = 2.71, p < 0.05). Further planned comparisons using t-tests revealed an advantage of normal color over abnormal color for certain duration conditions. There was a higher response bias for normal color than abnormal color at 250 ms (t(119) = 4.04, p < 0.001), 100 ms (t(119) = 4.39, p < 0.001), and 80 ms (t(119) = 3.14, p < 0.01). There was no difference between the normal color and abnormal color conditions at durations of either 50 ms (t(119) = 1.69, n.s.) or 20 ms (t(119) = -0.30, n.s.).²

² A more conservative set of post-hoc analyses using Tukey's LSD (p < 0.05 for all comparisons collectively) was also conducted to check against possible Type I errors, but t-tests are reported in the Results for consistency with the previously reported experiments. Comparisons of the colored and abnormal scenes at each duration condition revealed a pattern identical to that reported above with the t-tests: the normal color condition had a significantly higher response bias than the abnormal color condition at durations of 80 ms, 100 ms, and 250 ms, but not at either 50 ms or 20 ms.

One interesting aspect of creating abnormal color scenes by swapping and inverting the color hues is that some of the scenes produced for the abnormal color condition could still fall within, or close to, their normal range of possible color hues. For example, many man-made scenes have a greater variety of possible colors than natural scenes (Oliva & Schyns, 2000). Because the normal hues are associated with the correct scene, and because some of the abnormal color scenes may be close to the normal range, the amount of interference produced may differ depending on how different the abnormal colors are from a scene's normal colors. To further explore the degree to which abnormal colors can differentially interfere with the activation of scene gist, a secondary analysis was conducted in which the abnormal scenes were divided into high and low abnormal groups. The abnormal color scenes were rated by a separate group of participants (n = 20) on a 7-point Likert scale indicating the strangeness of the colors for each particular scene (1 = normal, 7 = extremely strange). Participants were shown the 300 images from this experiment; half were shown in normal color and half in abnormal color. Each participant saw each scene once, and the color condition for each scene was counterbalanced across participants. Abnormal scenes with an average rating between 1.1 and 5 were designated as low abnormal, and scenes with an average rating higher than 5 were designated as high abnormal. There were 165 low abnormal scenes (average rating: 3.77) and 135 high abnormal scenes (average rating: 5.84). Figure 17 shows an example of a low and a high abnormal scene representing the average rating for each range.
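A small sketch of this post-hoc grouping, averaging the strangeness ratings per scene and splitting at the cutoff of 5 (the ratings file and column names are hypothetical):

import pandas as pd

ratings = pd.read_csv("abnormal_color_ratings.csv")  # rater, scene, rating (1-7)
scene_means = ratings.groupby("scene")["rating"].mean()

low_abnormal = scene_means[(scene_means >= 1.1) & (scene_means <= 5.0)].index
high_abnormal = scene_means[scene_means > 5.0].index
print(len(low_abnormal), len(high_abnormal))  # 165 and 135 in the reported data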
Figure 17. Example of low abnormal and high abnormal stimuli used in Experiment 7. Panels (A) and (B) show a scene in its Normal and Low Abnormal Color versions, respectively; the average rating for this image was 3.60 (group average: 3.77). Panels (C) and (D) show a scene in its Normal and High Abnormal Color versions, respectively; the average rating for this image was 5.8 (group average: 5.84).

Response bias difference scores were calculated for each abnormal group for all duration conditions. Figure 18 shows the difference scores for normally colored, low abnormal, and high abnormal scenes as a function of duration.

Figure 18. Difference scores (Consistent - Inconsistent) for the Colored (blue), Low Abnormal (solid green), and High Abnormal (dashed green) scene conditions. Error bars represent Standard Error of the Mean.

A within-subjects ANOVA was conducted for the normal color and abnormal color conditions with color and duration (3 x 5) as factors, and revealed significant main effects of color (F(2,212) = 19, p < 0.01) and duration (F(4,424) = 171.64, p < 0.01), and a significant interaction (F(4,424) = 2.51, p < 0.01). Post-hoc comparisons of the color conditions (using Tukey's LSD, p < 0.05 collectively) revealed that normally colored scenes produced a larger response bias than low and high abnormal scenes. High abnormal scenes were also significantly lower overall than low abnormal scenes. In order to look more closely at the onset of the abnormal-color differences, post-hoc comparisons between the normally colored condition and each of the abnormal conditions were conducted at each duration condition. Tukey's LSD comparisons revealed that the response bias was greater for normally colored scenes than for low abnormal scenes at durations of 100 ms and longer, and greater than for high abnormal scenes at durations of 80 ms and longer. Taken together, the results indicate that the greater the abnormal coloration of the scenes, the greater the interference with the activation of scene gist, and that the closer the abnormal colors are to normal, the later the onset of the interference.

Discussion

Experiment 7 investigated the role of color when structural information is degraded. Two hypotheses were proposed: the Segmentation Hypothesis and the Gist Information Hypothesis. The Segmentation Hypothesis holds that color aids in the extraction of scene regions, and predicts that colored scenes (both normal and abnormal) would show a greater bias than monochrome scenes and would not differ significantly from each other. Predicting that color can only help in the extraction of gist because it aids in the segmentation of scene regions assumes that the structural information of a scene is responsible for the activation of the scene gist. This hypothesis is in line with the theory that luminance information alone is important for object identification and scene gist, and that color can play only an indirect role (Biederman & Ju, 1988; Davidoff, 1991; Delorme et al., 2000). On the other hand, the Gist Information Hypothesis presupposes that color hue is associated with scene gist information and plays a direct role in the activation of scene gist.
Therefore, when the hues are swapped (as in the abnormal color scenes), the correct scene gist cannot be activated through color, and the hues that are present could activate an incorrect scene gist. The Gist Information Hypothesis predicts that normal color scenes would produce a greater bias than abnormal scenes because the hues in the abnormal scenes would be misleading and would interfere with the activation of the correct scene gist. This hypothesis assumes that color is a direct cue for the correct scene gist and carries some identity information; it is therefore in line with the color diagnosticity arguments that color can be used for identification because certain colors are diagnostic of certain stimuli (Joseph, 1997; Joseph & Proffitt, 1996; Price & Humphreys, 1989; Tanaka & Presnell, 1999; Tanaka et al., 2001).

The pattern of results for both the comparison of abnormal scenes to monochrome scenes and the comparison of abnormal color scenes to normal color scenes supported the Gist Information Hypothesis. The bias effect was greater for normal scenes than abnormal scenes at durations of 80 ms and longer, and there was no interaction between the abnormal color and monochrome scenes, although the significant main effect of color suggests that these two color conditions interfere with the activation of scene gist differently, with abnormal colors producing slightly greater interference. It is clear from this pattern that hue information contributes directly to the activation of scene gist. Although these results support the Gist Information Hypothesis, they do not necessarily rule out the Segmentation Hypothesis. It is not possible to rule out the Segmentation Hypothesis with the current data, because it could be that color is doing both, and that the observed pattern of results stems from both the cost of interference from the abnormal hues and the benefit of improved segmentation, with the overall cost being larger. Future investigation would be required to rule out the role of color as a segmenter of scene regions.

The finding that abnormal colors interfere with scene gist activation is similar to previous findings on color diagnosticity. Color diagnosticity is defined as the strength of the association between a color and an object or scene. Many researchers measure this strength by having participants name the color of a set of particular objects or rate the frequency with which an object is usually seen in a particular color. Color diagnostic stimuli are then defined as the items whose particular color is intrinsically linked to their identity and which therefore produce the most consistent responses (either frequency ratings or naming frequencies) across all participants (Price & Humphreys, 1989; Tanaka & Presnell, 1999; Wurm et al., 1993). Another method is to plot the average or most frequent hue for each scene in a color space; color diagnostic scenes are defined as those that form a tight cluster in one area of the color space with very little overlap with other types of scenes (Oliva & Schyns, 2000). Either way, the proposed role of color is the same: it contributes to identification of the stimuli because it is associated with their identity. Inherent in the concept of color diagnosticity is that there is variation in the contribution of color for any given object or scene. Some objects have a much stronger association between a color and identity than others. For example, ripe bananas are associated with yellow, while ripe apples can be red, green, or yellow.
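As a rough illustration of the hue-clustering approach (this is not a procedure used in the dissertation; the use of HSV hue, the circular statistics, and the file names are all assumptions), the tightness of a category's hue cluster could be quantified as follows:

import numpy as np
from skimage import io, color

def mean_hue(path):
    # circular mean of HSV hue across all pixels of one scene, in radians
    hsv = color.rgb2hsv(io.imread(path) / 255.0)
    angles = hsv[..., 0].ravel() * 2 * np.pi
    return np.angle(np.mean(np.exp(1j * angles)))

def hue_spread(paths):
    # 1 minus the resultant vector length of the per-scene mean hues:
    # values near 0 indicate a tight cluster (a more "color diagnostic" category)
    hues = np.array([mean_hue(p) for p in paths])
    return 1 - np.abs(np.mean(np.exp(1j * hues)))

print(hue_spread(["forest_01.bmp", "forest_02.bmp", "forest_03.bmp"]))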
The scenes in the current study were not measured according to their diagnosticity because it was assumed that for any particular scene, even one with high variation (and therefore considered to have low color diagnosticity), the variation of acceptable hues is still limited. For instance, kitchens are often colored in a wide range of hues; however, it is still possible to depict a kitchen in a hue that falls outside its normal range, such as fluorescent yellow. Although previous studies have deemed only certain objects or scenes to be color diagnostic, the assumption in the current study is that all scenes are potentially "color diagnostic" as long as the hue change falls outside the normal variation. If all hues are swapped and the resulting hues fall outside the normal variation for a particular scene, then the resulting hues interfere with processing and a cost in performance is observed.

In addition, there is the possibility that how far the abnormal scenes fall outside the normal range could affect performance. It was possible with the current set of stimuli to investigate whether abnormal hues considered to be highly abnormal produced a greater cost than those considered to be close to normal. A post-hoc analysis, in which the abnormal scenes were rated as high or low abnormal by a separate group of participants, suggested that the amount of interference produced by abnormal scenes depends on how much the abnormal hues depart from the hues considered normal for a particular scene. The bias effect was greater for the low abnormal scenes than for the high abnormal scenes. Therefore, it seems that if the hues are closer to the normal variation of hues for a given scene, the correct scene gist is activated more readily than if those hues are considered extremely abnormal. Furthermore, the analysis by duration condition shows that as the presentation duration of the abnormal scenes increases, the interference seems to increase. The pattern seen in Figure 18 suggests that the increasing difference over time is due to the growth of the bias effect for the normal color scenes over time, with no equivalent increase for the abnormal scenes. However, these results are based on only two points and are therefore interpreted only at a speculative level. It is not possible to draw any firm conclusions about the nature of the interaction or how it influences performance over time. However, the results do suggest an interesting interaction that can be investigated further in future studies.

GENERAL DISCUSSION

The seven experiments reported in the current study investigated how quickly information is extracted for the activation of scene gist after onset, whether color plays a role in its rapid activation, and what type of role color plays under the circumstances in which its effects are seen. The Contextual Bias paradigm was introduced, in which scene gist onset is measured as the bias to affirm objects that are consistent with the scene gist and to disconfirm objects that are inconsistent. This paradigm allows the onset of scene gist to be measured at the conceptual level by probing that concept through another component (likely objects), rather than by asking the participant to agree with the label chosen for that concept.
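To make the logic of this measure concrete, the following toy simulation (purely illustrative; the base rate and the size of the gist effect are invented numbers) shows how a response bias appears only to the extent that gist has been activated:

import numpy as np

rng = np.random.default_rng(0)

def prop_yes(gist_activation, consistent, n_trials=200,
             base_rate=0.35, gist_pull=0.30):
    # the "yes" rate is pulled toward consistent targets and away from
    # inconsistent ones in proportion to how strongly gist is activated
    p = base_rate + (gist_pull if consistent else -gist_pull) * gist_activation
    return rng.binomial(1, float(np.clip(p, 0, 1)), n_trials).mean()

for activation in (0.0, 1.0):
    bias = (prop_yes(activation, consistent=True)
            - prop_yes(activation, consistent=False))
    print(f"gist activation = {activation:.0f} -> simulated bias = {bias:.2f}")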
The scenes were presented for various durations, and the onset of the bias was used as an indication of when scene gist was activated. Across Experiments 1-3 (and all subsequent experiments), the magnitude of the bias effect increased with the duration of scene presentation. These results suggest that the activation of scene gist builds up over time as more visual information is acquired. The implications for scene representations and suggestions for future investigations are discussed in detail below.

Experiments 4-6 showed that removing color information from normal scenes had no effect on the activation of scene gist. However, when structural information was degraded (i.e., the scene was blurred by removing high-frequency information, thus slowing the efficiency with which scene gist information can be extracted), color did affect the activation of scene gist. Blurred scenes presented in color produced a more pronounced bias effect than those presented in monochrome. These results suggest that color plays a role in the activation of scene gist, but only when structural information is lacking. Moreover, these results suggest two possible roles for color: color may act either as a segmenter of scene regions or as a cue for scene gist. These hypotheses suggest two different architectures for how color information may contribute to the activation of scene gist. The Segmentation Hypothesis assumes that color contributes to the activation of scene gist only when the structure information is degraded, because it can help recover some of the boundary edges that were lost when the high spatial frequency information was removed. The architecture would involve an indirect route from color to scene gist activation through structure information, and thus would mean that only structure can directly activate scene gist. Alternatively, the Gist Information Hypothesis assumes that color is associated with scene gist and can therefore act as a direct cue in its activation. The architecture implied in this case would be that both structure and color information are associated with scene gist and each can contribute to its activation. To further explore the role that color plays in the activation of scene gist, abnormally colored scenes were used in Experiment 7. Abnormally colored scenes were selected because the altered hue information can differentiate between equiluminant regions and therefore provide segmentation, but the hues have no association with the correct scene gist. The results strongly supported the Gist Information Hypothesis. Response biases for the abnormal scenes were significantly lower than for the normal color scenes and did not differ significantly from monochrome scenes. Therefore, providing segmentation information alone did not contribute to the activation of the scene gist. Additionally, because the wrong hues were displayed, they could potentially have cued other scene gists and thus interfered with the activation of the correct scene gist.

Implications for Scene Perception

The interaction of structure and color, and the finding that color does contribute directly to the activation of scene gist, suggest that scene gist may be activated by a combination of weighted input from these two sources. It is possible that color information is available, but that as a default the visual system weighs structure information more because it is usually sufficient for the activation of scene gist.
Therefore, the use of color would depend on the informativeness of the scene structure. If the stimulus set has indistinct structures (i.e., the same shape), or if some structure information is irretrievable (e.g., due to occlusion or blurring), then color contributes relatively more to the activation. This type of interaction suggests that color plays an important role in the activation of scene gist, but only under specific circumstances. For instance, when stimulus sets are structurally distinct (such as most collections of man-made objects), color is less of a necessity for identification (Biederman & Ju, 1988; Davidoff, 1991). Therefore, not only do the results from the current study suggest how color and structure interact, they also strongly suggest that color has a role in the activation of scene gist.

Additionally, the interaction outlined above can highlight why one would or would not expect differences between color and monochrome stimuli. For example, as reviewed in the discussion of Experiment 7, in some cases the role that color plays in the identification of certain objects depends on the strength of the association between a particular hue and a particular item. However, previous studies have shown that the association of color with these identities alone cannot be used to predict the usefulness of color in the initial identification of an item (Wurm et al., 1993). Rather, it is the combination of structure and color that matters: given a certain structure (with many possible identity candidates), color can further narrow the possible alternatives. Thus, even though color diagnosticity alone cannot be used to predict identity, given a certain structure, color is a unique identifying feature.

Narrowing possible candidate items given a certain structure can also explain the differences found in the contribution of color for natural vs. man-made scenes (Oliva & Schyns, 2000). These two types of scenes are categorically different, but that alone cannot explain why the system would weigh color more for one particular type of scene, while weighing structure more for another, before the category is processed. From the perspective of the interaction, however, the importance of color would have to depend on the structure information. One property that natural scenes share is the inclusion of mass objects (e.g., water, sand, snow). Mass objects do not have any defining structure, but instead have defining textures and colors. Based on the findings of the current study, one may speculate that because scenes containing mass objects have similar structures, color is more informative for them. Therefore, scenes such as deserts, beaches, fields, and forests seen from a bird's-eye perspective have similar structures that arise from various types of mass objects. The interaction framework would then predict that because natural scenes have more overlapping structures, color is weighed more heavily. Although natural scenes may have stronger associations with color cues than man-made scenes, deciding to use color according to the structure would not require that the scene's categorical membership be known ahead of time. Another interesting prediction would be that man-made scenes containing mass objects should also show a benefit from color (e.g., outdoor pools, fountains, etc.). The default of weighting structure more heavily at the outset could provide the information that the system needs for adjusting the importance it assigns to the incoming perceptual information.
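A toy sketch of this weighted-input idea (all numbers are invented for illustration; this is not a fitted or formally proposed computational model): gist activation is a weighted combination of structure and color evidence, with the weight shifting toward color only as structure quality degrades.

def gist_activation(structure_evidence, color_evidence, structure_quality):
    # structure_quality in [0, 1]: 1 = sharp, distinctive structure; 0 = none.
    # By default the system leans on structure; color is weighted more
    # only as structure becomes less informative.
    w_structure = structure_quality
    w_color = 1.0 - structure_quality
    return w_structure * structure_evidence + w_color * color_evidence

# sharp scene: removing color barely changes the outcome
print(gist_activation(0.9, 0.7, structure_quality=0.9))  # ~0.88
print(gist_activation(0.9, 0.0, structure_quality=0.9))  # ~0.81
# blurred scene: removing (or mis-assigning) color now matters
print(gist_activation(0.4, 0.7, structure_quality=0.4))  # ~0.58
print(gist_activation(0.4, 0.0, structure_quality=0.4))  # ~0.16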
When structure information is poor, indistinct, or unavailable, the system could essentially tune itself toward color cues. This proposed "structure first" approach to deciding which perceptual cues to consider is consistent with the later onset of color effects seen in the current study. However, further investigation is necessary to determine the reason for the later onset of color across these experiments.

In addition to the properties of the stimulus set, the color-structure interaction framework also predicts that the default strategy of the system can be changed according to the task. If the task involves a decision that relies more on color information than on structural information (for instance, estimating the temperature of the environment or deciding indoor vs. outdoor), then removing color information should produce a cost in performance even when all structural information is available (i.e., normal, not blurred). In addition, changing the contribution of color relative to structure by changing the task could answer questions about the availability of color early on. In all experiments that showed a color effect (Experiments 5-7), the contribution of color had a later onset (~80 ms) than the onset of the response bias overall. The later onset could result from the fact that color is simply not available before 80 ms (due to a longer processing time than luminance information), or it could be that changing from the default strategy to an alternative source (i.e., from structure to color) takes time in a situation in which structure is relied upon three quarters of the time (sharp-color, sharp-monochrome, and blur-monochrome). Future studies could examine whether color effects could onset earlier with a task in which color provides the more useful information in all conditions. By changing the task, the nature of the interaction between color and structure could be further explored.

The Response Bias and Long-Term Memory for Scenes

From the current study, it is clear that certain biases are present in the initial representation of scenes. However, it is not clear how this bias changes over time and whether it is incorporated into the long-term representation of a particular scene. Furthermore, exploring how the response bias changes with increased exposure to the scene may help explain a discrepancy in the literature on memory for scenes: memory for the global properties of a scene seems to be better than memory for specific details within a particular scene. Past studies on memory for complex scenes have shown that people have an enormous capacity for remembering previously viewed scenes, even when the distractors were mirror images of previously viewed scenes (Nickerson, 1965; Shepard, 1967; Standing, 1973). Based on these studies, many researchers assumed that representations of scenes included many details, such as their component objects. The problem with this assumption is that memory for the details of a briefly presented scene is often based on the scene's associated semantic category and is affected by its associated schemas, not the actual details present in the scene (Hollingworth & Henderson, 1999). Studies have also shown that this influence of schemas is seen even when participants are given extended viewing time (Brewer & Treyens, 1981; Intraub, 1981).
For example, when a scene is viewed for either 500 ms or 5 s, participants tend to make an equal number and the same types of errors (boundary extension) when asked to reconstruct the scene or recall its details (Intraub, 1981). In another study, participants were escorted into what they thought was a graduate student's office to wait for the start of the experiment (Brewer & Treyens, 1981). In this case, participants were given several minutes to study the room, although they were not made explicitly aware that they would be tested later. Memory for the office was then tested with various measures. The most notable finding was that participants recalled non-existent objects in the room (e.g., books); these objects were inferred from the office schema rather than recalled from the room itself. Furthermore, studies examining memory for objects within scenes have led many researchers to argue that there is little or no memory for visual information. Findings such as change blindness, incremental change, and inattentional blindness have been used to argue for poor memory of scenes overall. These arguments capitalize on the fact that memory for details may be poorer than memory for distinguishing different scenes. From these studies, it seems that the biases that come with the initial perception of a scene have a strong influence on how information about a specific scene is retrieved. Despite the impressive capacity for remembering scenes, it is clear from previous studies that scene details are either poorly encoded to begin with or simply more difficult to retrieve.

From the current study, with the initial activation of the scene gist, expectations about the scene affect how its contents are later recalled. The question now becomes whether the bias seen during the initial activation of the scene is responsible for the errors made later during recall. For instance, it could be that the bias becomes part of the memory for that scene, and that further exploration adds encountered visual details to the initial representation while the initial biases persist. On the other hand, it could be that with further exploration of the scene, the biases diminish and are replaced with the information that is actually encountered. Investigating how the response bias is affected by longer viewing durations could help address this question. When asked to recall details within a scene versus recognize a previously viewed scene, participants may rely on feelings of familiarity rather than recall of actual object details. Future investigations could examine these possible differences using a technique often used in the memory literature. In this technique, often referred to as "remember/know judgments," participants are asked to make a judgment about the memories they retrieve by indicating whether they are recalling the actual instance (remember) or basing their answer on a general feeling of familiarity (know) (Inoue & Bellezza, 1998; Rajaram, 1993; Rajaram & Roediger, 1997). These types of judgments could easily be used to measure the retrieval of scene knowledge versus actual memory for the objects in a particular scene. Remember/know judgments could be an especially intriguing method for examining how an increase in encoding time affects the bias.
For instance, it could be that even when participants are given ample time to examine the scene exhaustively, the response bias will still be present and more "know" than "remember" judgments will be made. Or it could be that when enough time is allotted to sufficiently build up the representation of a particular scene, the response bias will decrease and more "remember" judgments will be made. Investigations into the variability in recall between the global properties and the details of a scene could shed some light on the reasons for the differences seen throughout the literature.

Conceptual vs. Visual Representations of Scenes

Recent controversies in the scene perception literature have led some researchers to highlight a distinction between two types of information relating to scene representations: a visual/perceptual representation and a conceptual representation of the scene (Hollingworth & Henderson, 2003; Oliva, in press; Potter, 1993; Potter et al., 2004). The perceptual representation refers to the visual information gathered and analyzed by the visual system. For instance, global visual information could include spatial layout, textures, and the most dominant hue, while specific visual details could include the orientation, shape, and color of objects. The conceptual representation, however, refers to a more abstract coding of the scene that may include related schemas and scripts, and often leads to expectations about the scene (e.g., what its component objects are and where they are likely to be found within the scene). Some researchers concentrate on the visual details of the scene: how visual information leads to the interpretation or categorization of the scene (Biederman, 1988; Homa & Viera, 1988; Marr, 1981; Oliva & Schyns, 1997, 2000; Schyns & Oliva, 1994; Torralba & Oliva, 2003), or what type of visual information is represented and stored in memory (Castelhano & Henderson, in press; Hollingworth & Henderson, 2002; Sanocki, 2003; Sanocki & Epstein, 1997). Others concentrate on the conceptual representation of the scene and how it can affect further evaluations and memory for a particular scene exemplar (Friedman, 1979; Intraub, 1981, 1988, 1999; Palmer, 1975; Potter, 1999; Potter et al., 2002, 2004). Regardless of how the representation of the scene is evaluated, studies have shown that each of these representations exists in some form within the first few hundred milliseconds of viewing. Within the first 100 ms, a representation of the scene is formed, but the representation is highly unstable (Potter, 1978; Potter et al., 2002, 2004). Potter (1999) proposed that even if scenes can be perceived within that short amount of time, one second of further processing by the system (even of a blank screen) is necessary to store a functional representation of the scene. In a recent study, Potter et al. (2002) asked participants to discriminate between scenes they had been briefly shown and scenes they had not been shown but that shared a similar conceptual representation. Results showed that participants had better memory for a conceptualization of the scene than for the specific visual details of that scene. Intraub (1981) has also shown that after a brief viewing of a scene, participants have specific biases in how the visual details of a scene are reconstructed, and have a tendency during recall to extend beyond the boundaries of a particular view of a scene.
Intraub explains these biases as a result of the related semantic information, which expands the current view because of expectations drawn from the conceptual representation of the scene. Taken together, these results have been used, erroneously, to argue that visual details of a scene are reconstructed based on the conceptual representation and that very little visual information is stored.

Other research has shown that a scene's visual information is stored and that some of this information, such as spatial layout and color, can be used to improve detection, matching, recall, and recognition performance for previously viewed scenes (Amano, Uchikawa, & Kuriki, 2002; Gegenfurtner & Rieger, 2000; Hanna & Remington, 1996; Sanocki & Epstein, 1997; Wichmann, Sharpe, & Gegenfurtner, 2002). For instance, in one study Wichmann et al. (2002) had participants recognize previously viewed scenes that were originally shown either in color or in monochrome. In one experiment, in which images were shown at test in their original state (color or monochrome), color images showed an advantage in recognition accuracy. Importantly, the images were selected from four categories (forests, flowers, rock formations, and man-made scenes), and a color advantage was found across all categories. These results indicate that the associated semantic information (e.g., forests tend to be green) may not be as important for recognition as the remembered hues in episodic memory. In another experiment, the color state of the images was swapped between study and test. Results showed an overall performance cost for showing a different state at test than was seen at study, which is consistent with the encoding specificity principle. However, the cost of removing color was much greater than that of adding it, indicating that the color in scenes is processed and stored regardless of whether it is used in the initial conceptual identification of a particular scene.

Other studies investigating the encoding and storage of visual information suggest that this information is not iconic in nature, but rather abstracted away from the percept. In one study, Amano et al. (2002) had participants view images for a memory test that had been modified by changing the color hues in a specific manner. The hues could either change but remain in the same category (e.g., one shade of red for another) or change categories completely (e.g., from red to orange). For the within-category changes, the hue could be replaced with a border hue (e.g., orangish-red) identified within the RGB color space, or with the ideal hue, defined as the center of the color category within the color space (e.g., pure red). Memory for images was enhanced when colors were replaced with the ideal hue, slightly impaired when replaced with the border colors, and greatly impaired when replaced with a hue from another category. Amano et al. concluded that memory for the color in images is not absolute, but rather more abstract. Furthermore, when participants were asked to compare these modified images to the originals, they were able to make the distinction; it is therefore not the case that these modifications produced the same subjective percept. Taken together, the results from many of these memory studies show that visual features are stored to some degree during encoding. However, the system does not encode these features as exact visual copies, but rather as abstract visual information.
Change blindness was once thought to demonstrate that the visual features of an image currently being viewed are not encoded or permanently stored (for reviews, see Hollingworth & Henderson, 2003; Rensink, 2000; Simons, 2000). However, research scrutinizing how visual information is encoded and retrieved revealed that visual information for scenes does exist in some form (Hollingworth & Henderson, 2002; Simons, Chabris, Schnur, & Levin, 2002). Further studies by Hollingworth (Hollingworth, 2003, in press; Hollingworth & Henderson, 2002) have shown that, when directly tested, participants were able to distinguish between a previously viewed object and one sharing many similar visual features. Furthermore, this ability to recall or recognize previously viewed objects remains intact even when the stimuli have been absent from view for some time (Castelhano & Henderson, in press; Hollingworth & Henderson, 2002). More recently, studies have shown that this information is stored incidentally (Castelhano & Henderson, in press; Williams et al., in press). The phenomenon of change blindness is explained as the failure either to encode the original visual feature or to retrieve the previously encoded representation and compare it to the current view (Hollingworth, 2003, in press). Therefore, not only are visual features stored during the on-line exploration of a scene, but this visual information is also stored in a more permanent representation that can potentially be accessed when the scene is encountered again.

Many of the studies reported above investigated the nature of scene representations by examining memory for the two types of information (i.e., perceptual and conceptual). The current study was designed to look specifically at the initial perception of the conceptual representation and at what perceptual factors may affect how quickly the conceptual representation is activated. Although the current study examined only the initial conceptual representation of scenes, this is not to say that some visual properties of the scene are not encoded into memory as well. Given that certain visual features act as important memory cues (e.g., color), it is intuitive to think that the information used to activate the conceptual representation of the scene is also consolidated and stored as a visual representation. However, the relation between the initial processing of visual information and the encoding of visual information into a more permanent store is still unclear. There may be differences in how the conceptual and visual information of scenes is perceived, encoded (e.g., consolidation times may vary), and retrieved. Future investigations will be needed to examine whether the initially processed visual information is in fact stored and consolidated into a long-term representation, and what the nature of that visual information is when only a brief amount of viewing time is allowed.

Conclusions

The present work examines the contribution of two perceptual factors (color and structure) to the rapid activation of scene gist. Previous research showed mixed results for the contribution of color to the initial identification of objects and scenes. As a result, researchers have been divided on whether color plays an important role in the initial processing of visual stimuli. The current study sought to explain these differences as the result of an interaction between color and structure.
Rather than an all-or-none role for color, the results from the current study suggest that the role of color may be explained in terms of the properties of the stimulus set and the task selected. One way to conceptualize this is in a dynamic visual system that shifts its use of information received from early visual processes according to the task goals. In addition, the Contextual Bias paradigm was introduced for the study of scene gist. Instead of providing a label for each scene (e.g., kitchen, bedroom, park, etc.), participants were asked to make judgments on objects that are related or unrelated to the scene. In this way, the Contextual Bias paradigm avoids the problem of subjectivity in labeling images and is still able to access the activation of the conceptual representation of the scene. Exploring the interaction between structure and color may help to guide fiiture investigations into how the visual system takes incoming visual information and activates the appropriate conceptual representation. In this way, research into the junction between early perceptual processes and later cognitive processes can help to illuminate possible strategies (either explicit or implicit in the visual system dynamics) that are used for 94 finding and using the most useful incoming information fi'om all that is made available in order to complete the current task or set of tasks most efficiently. 95 APPENDIX 96 APPENDIX A list the scenes and the accompanying consistent and inconsistent objects used in all experiments. There are 400 total scenes listed here, and all experiments with fewer stimuli were a subset taken fiom this list. These photographs were taken fiprn magazines, books, calendars, and the Internet and were formatted as 800 x 600 bmp files. Item Scene Description Consistent Inconsistent Number Object Object l mountain road truck coat rack 2 white building with courtyard water fountain couch 3 restaurant patio waiter fishing net 4 bedroom clock stove 5 city line with bridge statue barn 6 amusement park ferris wheel train station 7 shipping yard fishing net couch 8 living room bookshelf car 9 kid's bedroom books lawn mower 10 alloy street lamp water fountain 1 1 garden wind chime motorcycle 12 garden with fountain sun dial office building 13 kitchen microwave couch 14 living room lamp doll house 15 bathroom bath towel flag 16 city bridge traffic sign washing machine 1 7 cemetery gate bathtub 18 fishing boats fish stop sign 19 store fire hydrant bridge 20 building in forest flag area rug 21 bikes bike helmet cows 22 garden birdfeeder streetlamp 23 venice street balcony television 24 harbor lifejacket painting 25 front of house mailbox stop sign 26 living room TV remote bridge 27 park frisbee television 28 street street sign wardrobe 29 living room book cashier 30 bathroom toilet paper calculator 3 1 bathroom lotion skateboard 32 bathroom toothpaste dictionary 33 bathroom mouthwash grill 34 greenroom rug stove 35 city bridge helicopter ice machine 36 city billboard ocean liner 97 37 38 39 40 41 42 43 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 garden bar outdoor cafe patio living room bedroom backyard with pool mountainous coast cliff on ocean city street castle cemetery cemetery dining room street corner kitchen park statues skyline cliff construction site living room field cemetery courtyard field with cows courtyard street dam dining room dining room dining room dining room dining room dining room dining room dining 
room dining room dining room dining room dining room dining room dining room dining room dining room dining room dining room dining room 98 fountain martini glass bicycle couch bedside table swimsuit ocean liner sailboat motorcycle fountain flowers flowers wine glass mailbox refrigerator park bench flag ship crane lamp cows vase birdfeeder barn sun dial street lamp boat N9 china cabinet water pitcher tea cup wine bottle area rug painting fruit bowl teapot candelabra place mat wine bottle candle salt and pepper shakers coffee pot bread basket curtains chandelier portrait painting monkey bars bed microwave dresser balcony fishing boat skyscraper street lamp sandbox ocean liner street sign beach ball mouthwash hammock motorcycle tractor painting oven mug dumpster water fountain doll house dumpster ice machine parking meter couch sun dial toaster computer grill toaster bird's nest crib baseball mitt skateboard phonebook printer toy chest scale mouthwash rose bush lawn mower oven stereo toilet paper life jacket 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 dining room dining room dining room/kitchen dining room [living room dining room dining room kitchen kitchen outdoors dining room dining room outdoors dining room outdoors dining room dining room dining room dining room dining room/kitchen dining room dining room dining room dining room dining room dining room dining room dining room desk bedroom living room living room artsy un-contemporary dining room living room living room dining room bedroom bedroom living room living room kitchen street forest highway houses living room city classroom living room kitchen 99 oranges blinds garbage can area rug bookshelf place mat refrigerator stove pitcher statue bird feeder tea pot tea cup silverware salt and pepper shakers coffee pot candles napkins portrait painting flower bouquet painting clock blinds water pitcher phone waste basket painting recliner lamp television magazine rack curtains alarm clock dresser bookshelf couch stove fire hydrant deer speed limit sign mailbox bookcase harbor globe fireplace fruit bowl flag street sign skateboard toothbrush buoy bike rack toilet dresser mirror dishwasher microphone sandbox moss rake couch lawn mower chalkboard bicycle overhead projector recliner swing set pinball machine phone book printer rocking horse blender overhead projector playpen speed limit sign hammock blender toothpaste mailbox fire truck bed toaster sleeping bag harbor recycling bin television coffee maker towel chalkboard microwave sun dial globe 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 house ruins fountain in city plaza fountain in city plaza fountain in city plaza kitchen kitchen garden ancient roman\greek ruins bedroom golden gate bridge living room den rafts in canyon cemetery greenroom docked sail boat harbor/outdoor cafe underwater reef dining room bedroom pond mountain tunnel forest home office office office bedroom art studio office office office home office office backyard kitchen store kitchen living room living room classroom classroom classroom living room bedroom living room 100 flowerbed stairs hedges bench garbage can - refrigerator phone bench bird's nest toy chest sailboat coffee table lamp life jacket wreath magazines life jacket buoy sting ray china cabinet mirror 
Stimulus items 175-400 (appendix, continued). Each numbered item pairs a scene category with a target object that is consistent with the scene's gist and a target object that is inconsistent with it. Scene categories in this portion of the list include kitchen, dining room, living room, bedroom, bathroom, office, closet, attic, garden shed, temple, kiosk, restaurant, store, street, alley, city square, harbor, bridge, backyard, porch, deck, field, forest, pond, and construction site. Target objects drawn from this set include oven, cutting board, toothbrush, coffee table, television, parking meter, fire hydrant, street sign, and lawn chair.
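Although no computer code appears in the original materials, the row structure just described (item number, scene category, consistent target, inconsistent target) can be made concrete with a brief sketch. The snippet below is purely illustrative: the class name StimulusItem, the condition labels, and the example pairings are hypothetical and are not taken from the dissertation's actual stimulus assignments.

from dataclasses import dataclass

@dataclass
class StimulusItem:
    # One row of the stimulus table (illustrative only; names and pairings are hypothetical).
    item: int          # item number, e.g., 175
    scene: str         # scene category, e.g., "kitchen"
    consistent: str    # target object consistent with the scene gist
    inconsistent: str  # target object inconsistent with the scene gist

    def target(self, condition: str) -> str:
        # Return the probe object for a trial in the given condition ("consistent" or "inconsistent").
        return self.consistent if condition == "consistent" else self.inconsistent

# Hypothetical example rows, not the dissertation's actual item-object assignments.
stimuli = [
    StimulusItem(175, "bathroom", "scale", "parking meter"),
    StimulusItem(176, "kitchen", "oven", "fire hydrant"),
]

for s in stimuli:
    print(s.item, s.scene, s.target("consistent"), s.target("inconsistent"))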