Behavior Research Methods, Instruments, & Computers 2002, 34 (4), 455-470
A breadth-first survey of eye-tracking applications
ANDREW T. DUCHOWSKI Clemson University, Clemson, South Carolina
Eye-tracking applications are surveyed in a breadth-first manner, reporting on work from the following domains: neuroscience, psychology, industrial engineering and human factors, marketing/advertising, and computer science. Following a review of traditionally diagnostic uses, emphasis is placed on interactive applications, differentiating between selective and gaze-contingent approaches.
Eye-tracking research is entering its fourth era, distinguished by the emergence of interactive applications. Rayner (1998) summarized the characteristics of eye tracking's first three eras: The first (ca. 1879–1920) was defined by the discovery of many basic eye movement facts, including saccadic suppression, saccade latency, and the size of the perceptual span; the second (ca. 1930–1958) was characterized by a more applied research focus, coinciding with the behaviorist movement in experimental psychology; the third (ca. 1970–1998) was marked by improvements in eye movement recording systems, facilitating increasingly accurate and easily obtained measurements. A wide variety of eye-tracking applications now exist, which can broadly be dichotomized, from a system analysis point of view, as diagnostic or interactive.
In its diagnostic role, the eye tracker provides objective and quantitative evidence of the user's visual and (overt) attentional processes. In this capacity, eye movements are generally recorded to ascertain the user's attentional patterns over a given stimulus. Diagnostic applications are typically characterized by unobtrusive use of the eye-tracking device. Furthermore, the stimulus being displayed does not usually need to change or react to the viewer's gaze. In this scenario, the eye tracker is used to record eye movements for posttrial, off-line assessment of the viewer's gaze during the experiment.
Bolstered by advancements in computational power, richness of graphical displays, and robustness of interactive styles, as an interface modality, the eye tracker serves as a powerful input device in a host of visually mediated applications. An interactive system must respond to or interact with the user on the basis of observed eye movements. Such interactive systems may fall into two application subtypes: selective and gaze-contingent. Selective systems use the point of gaze as analogous to a pointing device, such as the mouse, whereas gaze-contingent systems exploit knowledge of the user's gaze to facilitate the rapid rendering of complex displays (e.g., graphical environments). The latter can be further delineated in terms of display processing, as shown in the hierarchy in Figure 1.
Material in this paper is condensed from Duchowski (2003), adapted with permission. This work was supported in part by University Innovation Grant 1-20-1906-51-4087, NASA Ames Task NCC 2-1114, and NSF CAREER Award IIS-9984278. I am grateful for helpful suggestions concerning the manuscript made by Keith Rayner and a second, anonymous referee. Correspondence concerning this article should be addressed to A. T. Duchowski, Department of Computer Science, Clemson University, 451 Edwards Hall, Clemson, SC 29634-0974 (e-mail: andrewd@cs.clemson.edu).
This paper reports on eye-tracking methodologies spanning three disciplines: neuroscience, psychology, and computer science, with brief remarks about industrial engineering and marketing. A breadth-first survey is given, proceeding from the diagnostic use of eye trackers to interactive applications.
NEUROSCIENCE
Neuroscientific research has identified numerous interconnected neural components of vision, starting at the retinal photoreceptors and (more or less) ending in the cortical regions implicated in low-level vision. Retinogeniculate pathways of vision, as well as deeper visual brain regions, have also been identified. In the context of vision and eye movements, knowledge of the physiological organization of the optic tract, as well as of the cognitive and behavioral aspects of vision, is indispensable in obtaining a complete understanding of human vision (for examples, see the papers by Robinson, 1968, and Findlay & Walker, 1999, which present frameworks for understanding human saccadic eye movement generation motivated by consideration of the convergence between low-level studies of the saccadic system from a physiological standpoint and studies investigating saccadic eye movements in a behavioral context). Current neuroscientific trends related to the study of eye movements will be briefly discussed, touching on exemplary work in which attention and brain imaging have been investigated.
Figure 1. Hierarchy of eye-tracking applications.
Attentional Neuroscience
The dissociation of the "spotlight of attention" (Posner, Snyder, & Davidson, 1980) from ocular fixation poses a problem for eye-tracking researchers. It is possible to visually fixate one location while simultaneously diverting attention to another. For example, astronomers perform this fairly regularly when looking for faint stars or star clusters with the naked eye. When examining a scanpath over a visual stimulus, we can often say that specific regions were looked at, perhaps even fixated (following analysis of eye movements); however, we cannot be fully confident that these specific regions were fully perceived. There is (currently) no simple way of telling what the brain is doing during a particular visual scan of the scene. Ideally, we would have to record not only the point of one's gaze, but also one's brain activity.
Investigating neuronal activity related to fixational eye movements (saccades and intersaccadic [drift] intervals), Snodderly, Kagan, and Gur (2001) showed that responses of a monkey's V1 neurons to fixational eye movements are specific and diverse. Some cells are activated only by saccades, others discharge during drifts, and most show a mixture of these two influences. The patterns of activity reflect the interactions among the stimulus, the receptive-field activating region, the temporal response characteristics of the neuron, and the retinal positions and image motions imparted by eye movements. The diversity of the activity patterns suggests that during natural viewing of a stationary scene, some cortical neurons carry information about saccadic occurrences and directions, whereas other neurons code details of the retinal image. Snodderly et al. argued that, theoretically, saccade neurons could participate in saccadic suppression by inhibiting other neurons that carry stimulus information (reducing signal strength) or by adding noise to the signal, thereby raising thresholds and making stimulus events undetectable at the times of saccades.
The prefrontal (PF) cortex is largely thought to be central to the ability to shift attention and choose actions appropriate not only to the sensory information at hand, but also to the specific sensory, motor, and cognitive demands (the behavioral context) in which the task is encountered (Asaad, Rainer, & Miller, 2000). Asaad et al. reported that most of the 305 neurons recorded in 2 monkeys displayed a task-dependent change in overall activity, particularly in the fixation interval preceding cue presentation. Results showed that, for many PF neurons, activity was influenced by the task being performed.
Asaad et al. suggested that the formal demands of behavior are represented within PF activity and, thus, supported the hypothesis that one PF function is the acquisition and implementation of task context and the “rules” used to guide behavior.
Eye Movements and Brain Imaging
Recently, eye movement recording and functional brain imaging have been used to track a subject's fixation point while simultaneously recording cortical activation during attentional tasks, in order to identify functional brain structures implicated in attentional behavior. Presently, possibly owing to prohibitive cost, combined eye-tracking and brain-imaging equipment is not in widespread use, although such devices are beginning to appear.
Özyurt, DeSouza, West, Rutschmann, and Greenlee (2001) compared the neural correlates of visually guided saccades in the step and gap paradigms while recording saccadic eye movements during task performance. The results from Özyurt et al.'s study indicated significant task-related activity in the striate and extrastriate cortex, the frontal eye fields, the supplementary motor area, the parietal cortex and angular gyrus, the frontal operculum, and the right prefrontal area 10. This type of research helps identify functional brain structures that participate in attentional mechanisms.
PSYCHOLOGY
Perhaps the first well-known use of eye trackers in the study of human (overt) visual attention occurred during reading experiments. Rayner (1992, 1998) has provided an excellent survey of eye-tracking applications in reading and other information-processing tasks.
Reading
Rayner (1998) synthesized over 100 years of research. Although the reader is referred to Rayner's (1998) article for the complete review, three interesting examples of eye movement characteristics during reading will be summarized here. First, when English is read, eye fixations last about 200–250 msec, and the mean saccade size is from seven to nine letter spaces. Second, eye movements are influenced by textual and typographical variables—for example, as text becomes conceptually more difficult, fixation duration increases, and saccade length decreases. Factors such as the quality of print, line length, and letter spacing influence eye movements. Third, eye movements differ somewhat when one reads silently from when one reads aloud: Mean fixation durations are longer when one reads aloud or while one listens to a voice reading the same text than when one reads silently. There is, of course, a good deal more that has been learned (e.g., Reichle, Pollatsek, Fisher, & Rayner, 1998, summarized some basic data on reading and eye movements related to their subsequent effort to model the reading process); here, the methodology behind such discoveries is what is of primary interest.
In addition to descriptive studies, in which eye movements are simply recorded during the reading of text, eye trackers have been used to modulate the stimulus display in real time, depending on where the reader is looking. Three experimental paradigms, the moving window, the boundary, and the foveal mask, have been developed to explore eye movements (see Reichle et al., 1998, for a concise review and examples of these techniques). Although first developed for reading studies, these paradigms have since been adapted to other contexts, such as scene perception (see below).
In the moving window paradigm, or the gaze-contingent display (GCD) change paradigm, developed by McConkie and Rayner (1975), a window is sized to include a number of characters (e.g., 14) to the left and right of a fixated word. The assumption with this technique is that when the window is as large as the region from which the reader can obtain information, there will be no difference between reading in that situation and reading when there is no window.
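For illustration, the moving-window display logic reduces to a few lines of code. The following is a minimal sketch, assuming the gaze sample has already been mapped to a character index on a single line of text; the window half-width, mask character, and function name are illustrative choices, not parameters of McConkie and Rayner's implementation. The same routine, inverted, yields the foveal mask described next.

```python
def gaze_contingent_line(text: str, gaze_char: int, half_width: int = 14,
                         mask_char: str = "x", foveal_mask: bool = False) -> str:
    """Render one line of text contingent on the fixated character.

    Moving window: characters within half_width of the fixated
    character are shown; everything outside is masked. Foveal mask:
    the inverse, masking the fixated span to create an artificial
    scotoma while leaving the periphery intact.
    """
    lo = max(0, gaze_char - half_width)
    hi = min(len(text), gaze_char + half_width + 1)
    if foveal_mask:
        return text[:lo] + mask_char * (hi - lo) + text[hi:]
    return mask_char * lo + text[lo:hi] + mask_char * (len(text) - hi)

line = "Eye movements reveal the perceptual span during reading."
print(gaze_contingent_line(line, gaze_char=20))                    # moving window
print(gaze_contingent_line(line, gaze_char=20, foveal_mask=True))  # foveal mask
```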
Conversely, a foveal mask, developed by Rayner and Bertera (1979; see also Bertera & Rayner, 2000), is placed over a number of fixated characters (e.g., seven), creating an artificial foveal scotoma. Eye movement behavior in this situation is quite similar to the eye movement behavior of patients with real scotomas (Rayner, 1998).
In the boundary technique, developed by Rayner (1975) to investigate the use of peripheral information in reading, the stimulus changes as fixation crosses a predefined boundary. In this technique, a word or nonword letter string is initially presented in a target location. When the reader's gaze moves to that location, the initially presented stimulus is replaced by the target word. The fixation duration on the target allows the experimenter to make inferences about the type of information acquired from the target location when it was in parafoveal vision.
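The boundary technique's display-change rule can likewise be sketched in code. This is a schematic illustration only: the class name, the preview and target strings, and the representation of gaze as a character index are assumptions of the sketch, not details of Rayner's (1975) apparatus.

```python
class BoundaryDisplay:
    """Swap a parafoveal preview for the target word once gaze
    crosses an invisible boundary in the text."""

    def __init__(self, boundary: int, preview: str, target: str):
        self.boundary = boundary  # character index of the invisible boundary
        self.preview = preview    # letter string shown before the boundary is crossed
        self.target = target      # target word shown afterward
        self.crossed = False

    def stimulus(self, gaze_char: int) -> str:
        # The change is triggered during the saccade that crosses the
        # boundary and is not undone afterward.
        if gaze_char >= self.boundary:
            self.crossed = True
        return self.target if self.crossed else self.preview

display = BoundaryDisplay(boundary=25, preview="chcrt", target="chart")
print(display.stimulus(gaze_char=10))  # preview still shown
print(display.stimulus(gaze_char=26))  # boundary crossed: target shown
```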
The motivation behind Rayner's (1975, 1998) use of eye movements for the study of reading was due, in part, to his having found tachistoscopic methods to be inadequate. Tachistoscopic (strobe-like) presentation of letters and words relies on the presentation of material for very brief exposures, to exclude the possibility of an eye movement during the presentation. Prior to eye-tracked reading studies, the tachistoscopic exposure was often thought of as being analogous to a single fixation during reading. Rayner argued that what subjects report from a tachistoscopic presentation cannot be taken as a complete specification of what they saw. The argument for eye movement recording over tachistoscopic displays carries over to scene perception and will be discussed in greater detail in the next section.
Scene Perception
Although certain reading patterns are easily recognized (e.g., left to right and top to bottom for English readers or right to left for Hebrew readers), no apparent strategies for scene viewing have been easily discerned. Contrary to reading, there appears to be no canonical scanpath for particular objects (Kennedy, 1992). There may be context differences at play. Kroll (1992) stated that although there may be similarities between reading and scene-viewing tasks, the tasks are very different.
According to Henderson and Hollingworth (1998), there are at least three important reasons to understand eye movements in scene viewing. First, eye movements are critical for the efficient and timely acquisition of visual information during complex visual-cognitive tasks, and the manner in which eye movements are controlled to service information acquisition is a critical question. Second, how we acquire, represent, and store information about the visual environment is a critical question in the study of perception and cognition. The study of eye movement patterns during scene viewing contributes to an understanding of how information in the visual environment is dynamically acquired and represented. Third, eye movement data provide an unobtrusive, on-line measure of visual and cognitive information processing.
Rayner (1998) recounted the traditionally held belief that examining the fine details of eye movements during scene perception is a high-cost, low-yield endeavor. Experiments in which tachistoscopic presentations and eye movement recordings have been used have led to the conclusion that subjects get the gist of a scene very early in the process of looking, sometimes even from a single brief exposure. Thus, it has been advocated that the gist of the scene is abstracted on the first few fixations, and the remainder of the fixations on the scene are used to fill in details. Rayner (1998) reviewed several findings that support the contention that important conclusions about temporal aspects of scene perception can be obtained from eye movement analysis.
Loftus (1981) presented results of a masked tachistoscopic study suggesting a model of picture encoding that incorporates the following propositions: (1) A normal fixation on a picture is designed to encode some feature of the picture, (2) the duration of a fixation is determined by the amount of time required to carry out the intended feature encoding, and (3) the more features are encoded from a picture, the better the recognition memory for the picture will be. A major finding of Loftus's experiments was that with exposure time held constant, recognition performance increased with increasing numbers of fixations. When eye fixations were simulated tachistoscopically and their durations experimentally controlled, all traces of this phenomenon disappeared. Moreover, Loftus's experiments suggest that within a fixation, visual information processing ceases fairly early—that is, acquired information reaches asymptote soon after the start of a fixation. The problem identified with tachistoscopic exposures, which were meant to simulate saccades to new places, was that there was no guarantee that new places in the picture were fixated; it is entirely possible that the subjects were simply holding their eyes steady throughout all tachistoscopic flashes. From an eye-tracking experiment, Loftus drew the argument that given more places to look at in the picture, more information can be acquired from the picture. Additional (tachistoscopic) flashes are useful only insofar as they permit acquisition of information from additional portions of the picture.
Rayner and Pollatsek (1992) conceded that much of the global information about the scene background or setting is extracted during the initial fixation. Some information about objects or details throughout the scene can be extracted far from fixation. However, if an object is important, it is usually fixated. Rayner and Pollatsek indicated that this foveal identification is aided significantly by the information extracted extrafoveally and concluded that it is necessary to study eye movements to achieve a full understanding of scene perception.
Henderson and Hollingworth (1998) suggested several metrics for the evaluation of the relative informativeness of scene regions. For a macro-level analysis, the total time that a region is fixated in the course of scene viewing (the sum of the durations of all fixations in that region) is correlated with the number of fixations in that region. For a micro-level analysis, several commonly used measures include first-fixation duration (the duration of the initial fixation in a region), first-pass gaze duration (the sum of all fixations from first entry to first exit in a region), and second-pass gaze duration (the sum of all fixations from second entry to second exit in a region). Generally, first-pass gaze durations are longer for semantically informative (i.e., inconsistent) objects. Semantically informative objects also tend to draw longer second-pass and total fixation durations. The influence of semantic informativeness on the duration of the very first fixation on an object is less clear. That is, scene context has an effect on eye movements: Fixation duration on an object that does not belong in the scene is longer than fixation duration on an object that does belong (Rayner, 1998). However, it is not clear whether the longer fixations on objects in violation of the scene reflect longer times to identify those objects or longer times to integrate them into a global representation of the scene (it could also reflect amusement at the absurdity of the violating object in the given context). Recently, Greene and Rayner (2001) showed that familiarity with distractors around an unfamiliar target facilitates visual search. There were comparably long, but fewer, fixations when distractors were familiar, contradicting the theory that unfamiliar distractors need longer processing.
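These region-based measures follow mechanically from an ordered sequence of fixations, as the following sketch shows. It assumes fixations arrive as (region, duration) pairs in temporal order, a simplification of real scoring pipelines; the function and variable names are illustrative.

```python
def gaze_measures(fixations, region):
    """Compute first-fixation duration, first-pass and second-pass
    gaze durations, and total time for one region, given an ordered
    list of (region_id, duration_ms) fixations."""
    passes = []       # one summed duration per visit (pass) to the region
    first_fix = None
    prev = None
    for reg, dur in fixations:
        if reg == region:
            if first_fix is None:
                first_fix = dur   # duration of the very first fixation in the region
            if prev != region:
                passes.append(0)  # entering the region begins a new pass
            passes[-1] += dur
        prev = reg
    return {
        "first_fixation": first_fix,
        "first_pass_gaze": passes[0] if len(passes) > 0 else 0,
        "second_pass_gaze": passes[1] if len(passes) > 1 else 0,
        "total_time": sum(passes),
    }

# Fixations over regions A and B (durations in msec).
scan = [("A", 230), ("A", 180), ("B", 250), ("A", 300)]
print(gaze_measures(scan, "A"))
# {'first_fixation': 230, 'first_pass_gaze': 410, 'second_pass_gaze': 300, 'total_time': 710}
```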
Henderson (1992) offered two criticisms of the eye movement paradigm. First, it is likely that global measures of fixation time, such as the total time spent on an object during the course of scene viewing, and the gaze duration on an object (the time of all initial fixations on an object prior to leaving that object for the first time) reflect postidentification processes. Thus, it is likely that gaze duration in scene processing reflects other processes beyond object identification. Henderson suggested that the preferred fixation measure is the true first-fixation duration, or the duration of time from the initial landing of the eyes on an object until the eyes move to any other location, including another location on the object. Second, the basic premise of the eye movement paradigm is that the results will reflect normally occurring visual-cognitive processes, because subjects can view scenes in a natural manner. However, unlike reading, where the overall task is arguably transparent, subjects must be given an orienting task when they view a scene. Unfortunately, viewing behavior and eye movement patterns change as a function of the viewing task given to the subjects (Yarbus, 1967). One way to address the orienting task issue would be to give subjects a task that did not force the creation of a coherent memory representation for the scene and to look for similar scene context effects on fixation time across tasks.
Perception of art. A particularly interesting subset of scene perception studies is the examination of gaze over a specific set of contextual images—namely, art. The first systematic exploration of fixation positions in scenes was reported by Buswell (1935, as cited by Henderson & Hollingworth, 1998). An important observation was that fixation positions were found to be highly regular and related to the information in the pictures (Henderson & Hollingworth, 1998). These data provided some of the earliest evidence that eye movement patterns during complex scene perception are related to the information in the scene and, by extension, to perceptual and cognitive processing of the scene.
Another example of a study of eye movements directed at works of art, by Molnar (1981, as reported by Solso, 1999), showed small differences in scanpaths between groups of subjects viewing the artwork for its semantic meaning and those viewing it for its aesthetic appeal. Remarkably, however, both sets of scanpaths were very similar in terms of fixated image features. Fixations made by two groups of fine art students, given different sets of questions pertaining to the artwork, appeared to coincide on important elements, even though the order of fixations might differ.
A recent large-scale eye-tracking study of art was conducted by Wooding (2002). Over 3 months, eye movements were successfully collected from 5,638 subjects while they viewed digitized images of paintings from the National Gallery collection. Since a composite representation of eye movements from so many subjects posed a problem, Wooding devised a fixation map method of analysis, which might be descriptively termed a landscape or terrain map of fixations and is, in fact, similar to the landscape map developed independently by Velichkovsky, Pomplun, and Rieser (1996). The value at any point on the map indicates the height or amount of a particular property at that point (e.g., the number of fixations).
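Computationally, such a fixation map amounts to accumulating a smoothed count surface over the image. Here is a minimal sketch using NumPy; the Gaussian spread and grid size are arbitrary illustrative choices, not Wooding's actual parameters.

```python
import numpy as np

def fixation_map(fixations, width, height, sigma=30.0):
    """Build a terrain-style fixation map: each fixation deposits a
    Gaussian mound, and the summed surface gives a landscape whose
    height at any point reflects how often that point was fixated."""
    ys, xs = np.mgrid[0:height, 0:width]
    surface = np.zeros((height, width))
    for fx, fy in fixations:
        surface += np.exp(-((xs - fx) ** 2 + (ys - fy) ** 2) / (2 * sigma ** 2))
    return surface / surface.max()  # normalize heights to [0, 1]

# Three fixations on a 320 x 240 display.
fmap = fixation_map([(100, 120), (110, 125), (250, 60)], width=320, height=240)
print(fmap.shape)  # (240, 320)
```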
Approaching art perception from a different starting point, eye movements have recently been considered in the generation of art by computers. Perceptual principles and eye-tracking methodology (among other techniques) were employed by DeCarlo and Santella (2002) to process aesthetically pleasing images by computer. Additional exemplary uses of eye trackers in computer graphics will be presented in subsequent sections.
Perception of film. An interesting example of an eye-tracking study over film, essentially a dynamic form of artistic media, is given by d'Ydewalle, Desmet, and Van Rensbergen (1998), who distinguished three levels of film-editing errors in sequencing successive shots. First-order editing errors refer either to small displacements of the camera position or to small changes of the image size, disturbing the perception of apparent movement and leading to the impression of jumping. Second-order editing errors follow from a reversal of the camera position, leading to a change of the left–right position of the main actors (or objects) and a complete change of the background. With third-order editing errors, the linear sequence of actions in the narrative story is not obeyed. The experiment of d'Ydewalle et al. showed that there was an increased spatial distribution of eye movements 200–400 msec after both second- and third-order editing errors. Such an increase was not obtained after a first-order editing error, suggesting that the increased distribution of eye movements after second- and third-order editing errors is due to postperceptual, cognitive effects.
Visual Search
How humans perceive a visual scene in natural or free tasks, such as picture viewing, can be modeled by the visual search task. This task requires a report of the presence of a target in a display. In comparison with reading, there have not been nearly as many studies dealing with visual search (Rayner, 1998).
Because the nature of the search task influences eye movement behavior, any statement about visual search and eye movements needs to be qualified by the characteristics of the search task. Specifically, visual search tasks vary widely, and tasks in which eye movements have been monitored consist of at least the following: search (1) through text or textlike material, (2) with pictorial stimuli, (3) with complex arrays such as X-rays, and (4) with randomly arranged arrays of alphanumeric characters or objects. There is considerable variability in fixation time and saccade length as a function of the particular search task (Rayner, 1998). When eye movements are recorded during extended search, fixations tend to be longer than in reading.
As was demonstrated by Yarbus (1967) and then by Noton and Stark (1971a, 1971b), eye-tracked scanpaths strongly suggest the existence of a serial component to the picture-viewing process. However, serial scanpaths do not adequately explain the brain's uncanny ability to integrate holistic representations of the visual scene from piecemeal (foveal) observations. That is, certain perceptual phenomena are left unaccounted for by scanpaths, including perception of illusory images, such as the Kanizsa (1976) figure or the Necker (1832) cube. Although scanpaths cast doubt on a purely Gestalt view of visual perception, it would seem that some sort of holistic mechanism is at work that is not revealed by eye movements alone. Models of visual search attempt to resolve this dilemma by proposing a parallel component, which works in concert with the serial counterpart exhibited by eye movements.
The consensus view is that a parallel, preattentive stage detects four basic features: color, size, orientation, and presence and/or direction of motion (Doll, 1993; Wolfe, 1993). Todd and Kramer (1993) suggested that attention (presumably in the periphery) is captured by sudden-onset stimuli, uniquely colored stimuli (to a lesser degree than sudden onset), and bright and unique stimuli. However, there is doubt in the literature as to whether human visual search can be described as an integration of independently processed features (Van Orden & DiVita, 1993). Visual search, even over natural stimuli, is at least partially deterministic, rather than completely random (Doll, Whorter, & Schmieder, 1993). Determinism stems from either or both of two sources: The observer's strategy determines the search pattern (as in reading), and/or the direction of the next saccade is based on information gained through peripheral vision about surrounding stimuli. Features that are likely to be fixated include edges, corners, and spatially high-frequency components, but not plain surfaces.
Relatively few studies have addressed the relationship between eye movements and the search process (Findlay & Gilchrist, 1998). Findlay and Gilchrist argued that the tradition, in search research, of paying little attention to eye movements and, instead, using the concept of covert visual attention movements (redirecting attention without moving the eyes, an important component of the feature integration theory of visual attention; Treisman & Gelade, 1980) is misguided. Findlay and Gilchrist demonstrated that when viewers are free to move their eyes, no additional covert attentional scanning occurs. They showed that unless instructions explicitly prevent eye movements, subjects in a search task show a natural propensity to move their eyes, even in situations in which it would be more efficient not to do so. Findlay and Gilchrist suggested that the reason for this preference is that, in naturally occurring search situations, eye movements form the most effective way of sampling the visual field.
Findlay (1997) recorded eye movements during tasks involving a simple feature search and a color–shape feature conjunction search. Findlay's work provides an impressive confirmation that search for a prespecified color target can be carried out in parallel. When two targets are presented simultaneously in neighboring positions, the first saccade is directed toward some "center of gravity" position. Results suggest that the control of the initial eye movement during both simple and conjunction searches is through a spatially parallel process.
Results from a conjunction search experiment (color and shape) are particularly relevant, since this is a situation in which serial scanning would be expected according to the classical search theory of Treisman and Gelade (1980). If a rapid serial scanning with covert attention could occur before the saccade is initiated, it is not clear why incorrect saccades would occur as frequently as was observed in Findlay's (1997) experiment. Moreover, Findlay argued that the data place constraints on the speed of any hypothetical serial scanning process, since it would be necessary for a number of locations to be scanned before the target is located, given the accuracy obtained. Alternative accounts of the visual search process have appeared that assign much more weight to the parallel processes and avoid the postulation of rapid serial scanning (see, e.g., Findlay & Walker, 1999; Wolfe, 1994; Wolfe & Gancarz, 1996).
Recently, Bertera and Rayner (2000) had viewers search through a randomly arranged array of alphanumeric characters (i.e., letters and digits) for the presence of a target letter. They used both the moving window technique and the moving mask technique. In the moving window study, search performance was equivalent for all windows at least as large as 5°. The moving mask had a deleterious effect on search time and accuracy, indicating the importance of foveal vision during search: the larger the mask, the longer the search time. The moving window paradigm for scene perception and visual search will be discussed below as an instance of GCD technology.
Natural Tasks
Studies of visual search are expanding to consider more complex stimuli, such as natural scenery (Hughes, Nozawa, & Kitterle, 1996). However, viewing of pictures projected on a laboratory display still constitutes a somewhat artificial task. Recent advancements in wearable and virtual displays now allow the collection of eye movements in more natural situations, usually involving the use of generally unconstrained eye, head, and hand movements.
Important work in this area has been reported by Land, Mennie, and Rusted (1999) and Land and Hayhoe (2001). The aim of the first study was to determine the pattern of fixations during the performance of a well-learned task in a natural setting (making tea) and to classify the types of monitoring action that the eyes performed. Results of this study indicated that even automated routine activities require a surprising level of continuous monitoring. Foveal direction was always close to the object being manipulated, and very few fixations were irrelevant to the task. Land et al. concluded that although the actions of tea making are "automated" and proceed with little conscious involvement, the eyes closely monitor every step of the process.
Investigating a similar natural task, Land and Hayhoe (2001) examined the relations of eye and hand movements in extended food preparation tasks: making tea and making peanut butter and jelly sandwiches. Gaze usually reached the next object in the sequence before any sign of manipulative action, indicating that the eye movements were planned into the motor pattern and led each action. The eyes usually fixated the same object throughout the action upon it, although they often moved on to the next object in the sequence before completion of the preceding action. General conclusions provided by Land and Hayhoe are that the eyes provide information on an "as needed" basis and that the relevant eye movements usually precede the motor acts they mediate by a fraction of a second. Eye movements are thus in the vanguard of each action plan and are not simply responses to circumstances. Land and Hayhoe concluded that their studies lend no support to the idea that the visual system builds up a detailed model of the surroundings and operates from that. Rather, most information is actively obtained from the scene as it is needed.
Ballard, Hayhoe, and Pelz (1995) investigated the use of short-term memory in the course of a natural eye–hand task. The investigation focused on the minimization of subjects' use of short-term memory by employing deictic primitives through serialization of the task with eye movements (e.g., using the eyes to "point to" objects in a scene in lieu of memorizing all of the objects' positions and other properties). Ballard et al. argued that a deictic strategy in a pick-and-place task involves a more efficient use of a frame of reference centered at the fixation point, rather than a viewer-centered reference frame, which might require memorization of objects in the world, relative to coordinates centered at the viewer. By recording eye movements during the block pick-and-place task, Ballard et al. were able to show that subjects frequently directed gazes to the model pattern before arranging blocks in the workspace area, suggesting that information was acquired incrementally during the task and was not acquired in toto at the beginning of the task. That is, the subjects appeared to use short-term memory frugally, acquiring information just prior to its use, and did not appear to memorize the entire model block configuration before making a copy of the block arrangement.
In a similar block-moving experiment, Smeets, Hayhoe, and Ballard (1996) were able to show that horizontal movements of gaze, head, and hand followed a coordinated pattern. A shift of gaze was generally followed by a movement of the head, which preceded the movement of the hand. This relationship is, to a large extent, task dependent. In goal-directed tasks in which future points of interest are highly predictable, Smeets et al. hypothesized that although gaze and head movements may decouple, the actual position of the hand is a likely candidate for the next gaze shift.
A recent example of such intentional visual attention was demonstrated by Pelz, Canosa, and Babcock (2000). Using a wearable eye tracker, Pelz et al. observed intentionally based eye movements, which they termed "look-ahead" eye movements. During a simple hand-washing task, recorded eye movements showed that gaze moved to a location (soap dispenser) prior to the movement of the hands to the same location.
To further examine issues raised by observations of natural behavior, Hayhoe et al. (2002) have recently begun using complex virtual environments that can be manipulated by the experimenter at critical points during task performance. In a virtual environment where subjects copy toy models, Hayhoe et al. showed that regularities in the spatial structure are used by subjects to control eye movement targeting. Other experiments in a virtual environment with haptic feedback have shown that even simple visual properties such as size are not continuously available or processed automatically by the visual system but are dynamically acquired and discarded according to the momentary task demands.
Auditory Language Processing
In this paradigm, eye movements are recorded as people listen to a story or follow instructions regarding an array they are looking at. Cooper (1974) found that when people are simultaneously presented with spoken language and a visual field containing elements semantically related to the informative items of speech, they tend to spontaneously direct their line of sight to the elements that are most closely related to the meaning of the language currently heard.
The eye movement paradigm has also been applied to auditory language processing by Allopenna, Magnuson, and Tanenhaus (1998). Subjects followed spoken instructions to manipulate either real objects or pictures displayed on a computer screen. Eye movements to objects in the workspace were closely time-locked to referring expressions in the unfolding speech stream, providing a sensitive and nondisruptive measure of spoken language comprehension during continuous speech. Allopenna et al. argued that the sensitivity of the response measure, coupled with a clear linking hypothesis between lexical activation and eye movements, indicates that this methodology will be invaluable in exploring questions about the microstructure of lexical access during spoken word recognition.
Other Information-Processing Tasks
Rayner (1998) has provided a comprehensive review of eye movement work from multiple domains, including (1) mathematics, numerical reading, and problem solving, (2) dual-task situations, (3) face perception, (4) brain damage, and (5) dynamic situations, such as driving, basketball foul shooting, golf putting, table tennis, baseball, gymnastics, walking in uneven terrain, mental rotation, and human–computer interaction. Some of the latter applications will be covered in the following sections.
INDUSTRIAL ENGINEERING AND HUMAN FACTORS
Eye tracking is particularly important for evaluating present and future environments in which humans do and will work. Traditional measurement methods of human performance often include measures of reaction time and accuracy—for example, how fast a person completes a task and how well this task is performed. To study the steps taken to perform the tasks requires analysis of the individual procedures performed. For this analysis, eye movements are particularly interesting, since they present measures that can provide insights into the visual, cognitive, and attentional aspects of human performance. Here, three broad experimental domains will be presented in which eye tracking can play an important analytical role: aviation, driving, and visual inspection.
Aviation
An example of a recent combined use of relatively new eye-tracking technology in a sophisticated flight simulator was reported by Anders (2001). Eye and head movements of professional pilots were recorded under realistic flight conditions in an investigation of human–machine interaction behavior relevant to information selection and management, as well as to situation and mode awareness, in a modern glass cockpit. Analysis of eye movements illustrates the importance of the primary flight display (PFD) as the primary source of information during flight (the PFD is the familiar combined artificial horizon and altimeter display for combined altitude and attitude awareness). As a proof of concept, this study shows the potential of eye movements for judgment of pilots' performance and future training of novice pilots.
Eye movements have also been used to evaluate the usability of specific instruments, such as newly developed electronic maps. In their study, Ottati, Hickox, and Richter (1999) compared eye movement patterns on different terrain features between experienced and novice pilots during a visual flight rules simulation. On the basis of previous studies of pilots' eye movements during instrument flight rules navigation, the authors expected experienced aviators to spend less time finding and fixating on their navigational landmarks, whereas novices were expected to have greater difficulty finding landmarks and extracting useful data from them, causing greater dwell times. As was expected, novice pilots showed a greater tendency to "fly out of the window" (i.e., to devote visual attention outside the cockpit) than did experienced pilots.
In a study of electronic maps for taxiing, Graeber and Andre (1999) suggested that training is necessary to assure proper usage of and optimal visual attention interaction with electronic moving maps (EMMs). The main objective of Graeber and Andre's study was to understand how pilots visually interact with the EMM. Results indicated that atmospheric visibility significantly affected the amount of time pilots dwelled on the EMM, with dwell time significantly higher under high-visibility conditions. That is, as visibility degrades, pilots spend more time eyes-out and less time dwelling on the EMM, with no loss in taxi performance. A potential explanation for this surprising result is that pilots need to be eyes-out to maintain lateral and directional loop closure, scan for hazards, and maintain information gathering "out the window."
Driving
It is widely accepted that deficiencies in visual attention are responsible for a large proportion of road traffic accidents (Chapman & Underwood, 1998). Eye movement recording and analysis provide important techniques for understanding the nature of the driving task and are important for developing driver training strategies and accident countermeasures. Chapman and Underwood indicated that at their simplest, drivers' fixation patterns on straight roads can be described as concentration on a point near to the focus of expansion (the point in the visual field in front of the driver, where objects appear stationary), with occasional excursions to items of road furniture and road edge markers. It is clear that dangerous events generally evoke long fixation durations.
Dishart and Land (1998) discussed previous work showing that experienced drivers obtain visual information from two sections of their view of the road ahead, in order to maintain a correct position in lane while steering their vehicle around a curve. The more distant of these two sections is used to predict the road's future curvature. This section is optimally 0.75–1.00 sec ahead of the driver and is used by a feedforward (anticipatory) mechanism that allows the driver to match the curvature of the road ahead. The other, nearer section is about 0.5 sec ahead of the driver and is used by a feedback (reactive) mechanism to "fine tune" the driver's position in lane. Eye-tracking experiments have shown that the feedback mechanism is present in most people regardless of their driving experience (although its accuracy is higher in those with experience) but that the feedforward mechanism is learned through experience of steering tasks (which can include riding a bicycle, computer driving games, etc.).
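The division of labor between the two road sections suggests a simple two-term control law. The sketch below is a toy illustration in the spirit of two-level steering models, not Dishart and Land's model; all gains, variable names, and the exact form of the law are invented for illustration.

```python
def steering_rate(d_theta_far: float, d_theta_near: float, theta_near: float,
                  k_far: float = 0.9, k_near: float = 0.6,
                  k_pos: float = 0.2) -> float:
    """Toy two-mechanism steering law.

    d_theta_far:  rotation rate of the far road section (~0.75-1.00 sec
                  ahead): feedforward, anticipates upcoming curvature.
    d_theta_near: rotation rate of the near section (~0.5 sec ahead):
                  feedback, reacts to drift in lane.
    theta_near:   angular offset of the near section: feedback term
                  that trims lane position.
    Returns a steering-wheel rate command (illustrative gains).
    """
    return k_far * d_theta_far + k_near * d_theta_near + k_pos * theta_near
```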
A fundamental problem in visual search in driving research is defining and controlling demand on the driver as an independent variable (Crundall, Underwood, & Chapman, 1998). Possible confounding factors are increases in visual demand, such as an increase in visual clutter or complexity, and increases in cognitive demand, such as an increase in the processing demands of a particular stimulus, perhaps owing to an increase in its relevance to a current context. Despite the lack of consistency in the manipulations of visual demand, it seems fairly well documented that general increases in task demands and visual complexity tend to reduce mean fixation duration and increase the sampling rate.
In a driving study in which the effects of clutter, luminance, and aging were examined, Ho, Scialfa, Caird, and Graw (2001) recorded subjects' visual search for traffic signs embedded in digitized images of driving scenes. On average, older adults were less accurate than younger ones. Accuracy for daytime scenes was independent of target presence, but errors for nighttime scenes were more common on target-present trials than on target-absent trials. In daytime scenes, errors were generally more common in high clutter than in low clutter, whereas in nighttime scenes, no consistent clutter effect was detected. Fixation number data, a measure that is strongly correlated with reaction time, indicated that older adults made more fixations than did younger adults and that more fixations were needed for high-clutter and target-absent scenes than for low-clutter and target-present scenes.
Experiments in which gaze is monitored in a simulated driving environment have demonstrated that visibility of task-relevant information depends critically on active search initiated by the observer according to an internally generated schedule, which depends on learned regularities in the environment (Hayhoe et al., 2002). Hayhoe et al.'s findings suggest that fixation patterns and attentional control in normal vision are learned.
Research on the effects of mental activity during driving suggests the convenience of raising drivers' awareness about the possible consequences of driving while their attention is focused on their own thoughts, unrelated to driving (Recarte & Nunes, 2000). Recarte and Nunes studied the consequences of performing verbal and spatial imagery tasks on visual search when driving. Visual functional-field size decreased horizontally and vertically, particularly for spatial imagery tasks. As compared with ordinary driving, fixations were longer during the spatial imagery task. With regard to driving performance, glance frequency at mirrors and the speedometer decreased during the spatial imagery tasks. Performing mental tasks while driving caused an increased attentional workload on ordinary thought, as shown by pupillary dilation. With regard to the implications for driving, Recarte and Nunes suggested that the spatial reduction of the visual inspection window, including the reduction of the inspection of mirrors, could be interpreted as a predictor of decreased probability of detecting traffic events, particularly when performing mental spatial imagery tasks.
Visual Inspection
Schoonard, Gould, and Miller (1973) stated that "visual inspection pervades the lives of all people today. From poultry, meat, and fish inspection, to drug inspection, to medical X-ray inspection, to production line inspection, to photo interpretation, the consequences of inspection directly affect people's lives through their effects on the quality and performance of goods and services" (p. 365).
Tracking eye movements during visual inspection may lead to predictive analyses, if certain recurring patterns or statistics can be found in collected scanpaths. For example, an expert inspector's eye movements may clearly exhibit a systematic pattern. If so, this pattern may be one that can be used to train novice inspectors. In their study of visual inspection of integrated circuit chips, Schoonard et al. found that good inspectors are characterized by relatively high accuracy and relatively high speed and make many brief eye fixations (as opposed to fewer longer ones) during the time they have to inspect.
In a survey of eye movements in industrial inspection, Megaw and Richardson (1979) identified the following relevant eye movement parameters: fixation times, number of fixations, spatial distribution of fixations, interfixation distances, direction of eye movements, and sequential indices (i.e., scanpaths). Megaw and Richardson reviewed previous inspection studies in which eye movements were recorded, including inspection of sheet metal, empty bottles, integrated circuits, and tapered roller bearings.
Besides being useful for gauging inspection performance, eye movements may play a part in training visual search strategy (Wang, Lin, & Drury, 1997). Visual search strategy training can be effective in fostering the adoption of a desirable systematic search strategy. To train visual search strategy, eye movements may be used both as a feedback mechanism and as confirmation of adoption of the new search strategy.
Duchowski, Medlin, Gramopadhye, Melloy, and Nair (2001) investigated the utility of eye movements for search training in a virtual environment simulating aircraft inspection. The user's gaze direction, as well as head position and orientation, was tracked to allow recording of the user's fixations within the environment. Analysis of eye movements led to two observations. First, mean fixation times did not appear to change significantly following training. Second, the number of fixations appeared to decrease following training. These results generally appear to agree with the expectation of a reduced number of fixations with the adoption of an improved visual search strategy (e.g., owing to learning or familiarization with the task; Drury, Gramopadhye, & Sharit, 1997).
Recently, Reingold, Charness, Pomplun, and Stampe (in press) reviewed the motivation behind the use of chess as an ideal task environment for the study of skilled performance. Reingold, Charness, et al. studied visual span as a function of chess skill (expert vs. intermediate vs. novice) and configuration type (chess configuration vs. random configuration), using a gaze-contingent window technique. The paper extends classic work demonstrating that after viewing structured, but not random, chess positions for 5 sec, chess masters reproduced these positions much more accurately than did lesser skilled players. These results provide strong evidence for a perceptual encoding advantage for experts, attributable to chess experience, rather than to a general perceptual or memory superiority.
MARKETING/ADVERTISING
Eye tracking can aid in the assessment of ad effectiveness in such applications as copy testing in print, images, video, or graphics and in disclosure research involving perception of fine print within print media and within television displays. Eye tracking can provide insight into how the consumer disperses visual attention over different forms of advertising. Applied research organizations may routinely examine the eye movements of consumers as they look at advertisements; however, this work tends to be proprietary (Rayner, Rotello, Stewart, Keir, & Duffy, 2001).
Copy Testing
A particularly good example of analysis of eye movements over advertisements in the Yellow Pages is given by Lohse (1997). In this experiment, eye movement data were collected while consumers chose businesses from telephone directories. The study addressed (1) what particular features caused people to notice an ad, (2) whether people viewed ads in any particular order, and (3) how viewing time varied as a function of particular ad features. Eye movement analysis revealed that, consistent with previous findings, ad size, graphics, color, and copy all influenced attention to advertisements.
Print Advertising
In a study of consumers' visual attention over print advertisements, conducted by Rosbergen, Wedel, and Pieters (1990), an eye tracker was used to gain insight into attentive processes over repeated exposure to print advertisements. The authors explored the phenomenon of repeated advertising's "wearout"—that is, consumers' diminishing attentional devotion to ads with increased repetition. Analyses showed that whereas duration decreased and attention onset accelerated during each additional exposure to the print ad, the attentional scanpath remained constant across advertising repetitions and across experimentally varied conditions.
In a recent study of eye movements over advertisements, Wedel and Pieters (2000) reported that across two magazines, fixations to the pictorial and the brand systematically promoted accurate brand memory but that text fixations did not. Brand surface had a particularly prominent effect. The more information was extracted from an ad during fixations, the shorter the latency of brand memory was (i.e., recollection was faster following longer fixations). A systematic recency effect was found: When subjects were exposed to an ad later, they tended to identify it better. The effect of the ads location on the right or left of the page depended on the advertising context.
Considering text and pictorial information in advertisement, Rayner et al. (2001) performed a study in which viewers looked at print advertisements as their eye movements were recorded. Rayner et al. found that viewers tended to spend more time looking at the text than at the picture part of the ad, although they did spend more time looking at the type of ad they were instructed to pay attention to. Both fixation durations and saccade lengths were longer on the picture part of the ad than on the text, but more fixations were made on the text regions. Viewers did not alternate fixations between the text and the picture part of the ad, but they tended to read the large print, then the smaller print, and then they looked at the picture (although some viewers did an initial cursory scan of the picture).
Rayner et al. (2001) suggested that the presented data have some striking implications for applied research and advertisement development and indicate that consumers may be paying much more attention to the text in ads than was previously thought. Rayner et al. noted, however, that the data also indicated that although an advertisement captures and holds participants' attention, this may be caused by the instructions given to those participants. Rayner et al. suggested that participants' goals must be considered in future research conducted by advertising agencies and researchers in the area.
COMPUTER SCIENCE
Following the hierarchy of eye-tracking systems given earlier, this section focuses on two types of interactive applications: selective and gaze contingent. The former approach uses an eye tracker as an input device, similar in some ways to a mouse. The latter, gaze-contingent application is a type of display system wherein the information presented to the viewer is generally manipulated to match human processing capability, often matching foveo-peripheral perception in real time. It should be noted that a good deal of the work discussed in this section is based on conference proceedings.
Selective Systems
Interactive uses of eye trackers typically employ gaze as a pointing modality—for example, using gaze in a manner similar to a mouse pointer. Prominent applications involve selection of interface items (menus, buttons, etc.), as well as selection of objects or areas in virtual reality (VR). A prototypical application of gaze as an interactive modality is eye typing, particularly for handicapped users. Other uses involve gaze as an indirect pointing modality—for example, as a deictic reference in the context of collaborative systems—or as an indirect pointing aid in user interfaces. Diagnostic uses of eye trackers are being adopted for usability studies—that is, testing the effectiveness of interfaces, as evidenced by where users look on the display.
Eye-based interaction. One of the first eye-based interactive systems, introduced by Jacob (1990), demonstrated an intelligent gaze-based informational display. In Jacob's system, a text window would scroll to show information on visually selected items. Jacob's paper was one of the first to use eye-tracking technology interactively and is well known for its identification of an important problem in eye-based interactive systems: the Midas Touch problem. Essentially, if the eyes are used in a manner similar to a mouse, a difficulty arises in determining intended activation of foveated features (the eyes do not register button clicks!). To avoid the Midas Touch problem, Jacob discussed several possible solutions, including blinks, finally promoting the use of dwell time to act as a selection mechanism.
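Dwell-time selection can be expressed as a small state machine over the stream of gaze samples. The sketch below is illustrative, not Jacob's implementation; the 500-msec threshold and the assumption that each gaze sample has already been resolved to an interface target are assumptions of the sketch.

```python
import time

class DwellSelector:
    """Fire a selection only after gaze has rested on the same target
    for dwell_ms, avoiding the Midas Touch problem of activating
    everything the eyes merely pass over."""

    def __init__(self, dwell_ms: float = 500.0):
        self.dwell_ms = dwell_ms
        self.current = None  # target currently under gaze
        self.since = None    # time at which gaze entered that target

    def update(self, target, now=None):
        """Feed the target under the latest gaze sample (or None).
        Returns the target exactly once, when its dwell threshold is met."""
        now = time.monotonic() if now is None else now
        if target != self.current:
            self.current, self.since = target, now  # gaze moved: restart the clock
            return None
        if target is not None and (now - self.since) * 1000.0 >= self.dwell_ms:
            self.since = float("inf")  # fire once; require re-entry to fire again
            return target
        return None

selector = DwellSelector(dwell_ms=500.0)
print(selector.update("OK button", now=0.0))  # None: dwell just started
print(selector.update("OK button", now=0.6))  # 'OK button': 600 msec elapsed
```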
Another early eye-tracked interactive system was presented by Starker and Bolt (1990). This system provided the user with gaze-controlled navigation in a three-dimensional graphics world. The graphical environment contained story world characters who responded in interesting ways to the user's gaze. Fixations activated character "behaviors," since the system would maintain and increase the user's visual interest level in a fixated character. When fixated, characters would blush and/or provide a verbal narrative. Unlike Jacob's (1990) use of dwell time for selection, in this system dwell time was used to zoom into the graphics world.
Recently, Tanriverdi and Jacob (2000) presented a new gaze-based interactive system, this time with gaze acting as a selective mechanism in VR. In this system, Tanriverdi and Jacob compared eye-based interaction with hand-based interaction and found that performance with gaze selection was significantly faster than with hand pointing, especially in environments where objects were placed far away from the user's location. In contrast, no performance difference was found in "close" environments. Furthermore, although pointing speed may increase with gaze selection, there appears to be a cognitive tradeoff for this gain in efficiency: Subjects had more difficulty recalling locations they interacted with when using gaze-based selection than when using hand selection.
The archetypical gaze-based pointing application is eye typing, which has long been a useful communication modality for the severely handicapped. Most eye-typing systems present the user with a virtual keyboard, either on a typical computer monitor or, in some cases, projected onto a wall. On the basis of an analysis of tracked gaze, the system determines which letter the user is looking at and decides (e.g., by dwell time) whether or not to type that letter. The system may provide feedback to the user by visual or auditory means or by a combination of both. Majaranta and Räihä (2002) have provided an excellent survey of the selection and feedback techniques employed in eye-typing systems.
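Stripped of feedback and word prediction, the core of such a system is a hit test from gaze coordinates to a key, combined with the dwell logic sketched above. The key layout and sizes below are invented for illustration and do not describe any published system:

    # Hypothetical virtual-keyboard hit test for eye typing (illustrative).
    KEY_W, KEY_H = 120, 120      # assumed on-screen key size in pixels;
    ROWS = ["QWERTYUIOP",        # eye-typing keys are typically large to
            "ASDFGHJKL",         # tolerate tracker inaccuracy
            "ZXCVBNM"]

    def key_at(x, y):
        """Map a gaze sample (x, y) to the letter under it, or None."""
        row, col = int(y // KEY_H), int(x // KEY_W)
        if 0 <= row < len(ROWS) and 0 <= col < len(ROWS[row]):
            return ROWS[row][col]
        return None

Passing key_at as the hit_test argument of dwell_select above yields a minimal eye-typing loop; practical systems add per-key dwell feedback so that the user can abort an unintended selection.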
Gaze-based communication systems, such as those featuring eye typing, offer certain (sometimes obvious) advantages but also raise problems. Gaze may often provide a faster pointing modality than a mouse or some other pointing device, especially if the targets are sufficiently large. However, gaze location is not as precise as with a mouse, since the size of the fovea limits the accuracy of the measured point of regard. Another significant problem is the accuracy of the eye tracker itself. Following initial calibration, the measurement may exhibit significant drift, in which the measured point of regard gradually departs from the actual point of gaze. Together with the Midas Touch problem, drift remains a significant obstacle for gaze input.
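One common remedy, sketched here as an assumption rather than a description of any particular tracker, is to fold the residual between measured gaze and a known fixation target (e.g., a button the user has just selected) into a running offset estimate:

    # Hypothetical drift compensation (an assumed remedy, not a standard API).
    # Whenever the user demonstrably fixates a target of known position,
    # the measured-vs.-true residual updates a slowly adapting offset.
    ALPHA = 0.2   # assumed smoothing factor for the offset update

    offset_x = offset_y = 0.0

    def corrected(x, y):
        """Apply the current drift estimate to a raw gaze sample."""
        return x + offset_x, y + offset_y

    def update_drift(measured_xy, true_xy):
        global offset_x, offset_y
        (mx, my), (tx, ty) = measured_xy, true_xy
        offset_x += ALPHA * ((tx - mx) - offset_x)
        offset_y += ALPHA * ((ty - my) - offset_y)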
Zhai, Morimoto, and Ihde (1999) have taken another approach to gaze-based interaction, testing the use of gaze as a predictive pointing aid rather than as a direct effector of selection. This is a particularly interesting and significant departure from “eye pointing,” since the strategy is based on the assertion that loading the visual perception channel with a motor control task seems fundamentally at odds with users' natural mental model, in which the eye searches for and takes in information while coordinating with the hand for manipulation of external objects. In their paper on manual and gaze input cascaded (MAGIC) pointing, Zhai et al. presented an acceleration technique in which a (two-dimensional) cursor is warped to the vicinity of a fixated target. The warp is either triggered immediately by eye movement (liberal MAGIC pointing) or delayed until the mouse moves as well (conservative MAGIC pointing). Zhai et al. reported that although the speed advantage over manual (mouse) pointing was not obvious, almost all users subjectively felt faster with either MAGIC pointing technique.
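The two triggering policies can be sketched in a few lines (the warp radius and names are assumptions for illustration; see Zhai et al., 1999, for the actual designs):

    # Hypothetical sketch of the two MAGIC pointing variants.
    import math

    WARP_RADIUS = 120   # assumed: warp only when the cursor is far from gaze

    def maybe_warp(cursor, gaze, mouse_moved, conservative=True):
        """Return a new cursor position, possibly warped near the gaze point.
        Liberal: warp on every new fixation. Conservative: wait for the mouse."""
        if conservative and not mouse_moved:
            return cursor                    # conservative MAGIC: no warp yet
        if math.dist(cursor, gaze) > WARP_RADIUS:
            return gaze                      # coarse positioning done by the eye
        return cursor                        # fine positioning stays manual

Either way, the final, precise selection is still made with the hand, so the eye is never burdened with the motor task itself.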
Usability. Besides the use of gaze for interactive means, diagnostic eye tracking is gaining acceptance within the human-computer interaction (HCI) community as another means of testing the usability of an interface. It is believed that eye movements can significantly enhance the observation of users' strategies while using computer interfaces (Goldberg & Kotval, 1999). Among various experiments, eye movements have been used to evaluate the grouping of tool icons, to compare gaze-based and mouse interaction techniques (Sibert & Jacob, 2000), to evaluate the organization of click-down menus, and, more recently, to test the organization of Web pages.
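Such evaluations typically reduce a recorded scanpath to a handful of summary measures. The computation below is a simplification (the fixation list format is assumed) of constructs of the kind cataloged by Goldberg and Kotval (1999):

    # Simplified usability measures over an offline fixation list.
    # Assumed input format: fixations = [(x, y, duration_ms), ...].
    import math

    def scanpath_metrics(fixations):
        n = len(fixations)
        total_dur = sum(d for _, _, d in fixations)
        path_len = sum(math.dist(a[:2], b[:2])
                       for a, b in zip(fixations, fixations[1:]))
        return {"fixation_count": n,             # many fixations: inefficient search
                "mean_fixation_ms": total_dur / n if n else 0.0,
                "scanpath_length_px": path_len}  # long scanpaths: poor layout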
Byrne, Anderson, Douglass, and Matessa (1999) tested the arrangement of items during visual search of click-down menus, contrasting two computational cognitive models designed to predict latency, accuracy, and ease of learning for a wide variety of HCI-related tasks: executive process interactive control (EPIC; Kieras & Meyer, 1995) and adaptive control of thought-rational (ACT-R; J. R. Anderson, personal communication, July 2002). The EPIC architecture provides a general framework for simulating a human interacting with the environment to accomplish a task (Hornof & Kieras, 1997). ACT-R is a framework for understanding human cognition whose basic claim is that cognitive skill is composed of production rules (Anderson, 1993). These models specifically predict, in different ways, the relationship between eye and mouse movement. A visual search experiment, conducted by Byrne et al., presented a target item prior to display of a click-down menu. On the basis of the number of fixations, both models were supported, since in some cases visual search exhibited exclusively top-down search (ACT-R) and in others both top-down
and random patterns were observed (EPIC). This particular study is informative for two reasons. First, it shows the importance of a good model that can be used for designing user interfaces. Lacking such a model can lead to a design based on inaccurate assumptions. Second, the paper points out that eye tracking is an effective usability modality and that a new model of visual search over menus is needed.
In a usability study of Web pages, Goldberg, Stimson, Lewnstein, Scott, and Wichansky (2002) derived specific recommendations for a prototype Web interface tool and discussed gaze-based evaluation of Web pages with a system that permits free navigation across multiple pages. This is a significant advancement for usability studies of Web browsers, since prior to this study, recording gaze over multiple Web pages had been difficult, owing to the problem of synchronizing gaze with windows that scroll or hide from view.
Collaborative systems. Gaze can also be utilized to aid multiparty communication in collaborative systems. In Vertegaal's (1999) GAZE Groupware system, an eye tracker is used to convey gaze direction in a multiparty teleconferencing and document-sharing system, providing a solution to two problems in multiparty mediated communication and collaboration: knowing who is talking to whom and who is talking about what. The system displays two-dimensional images of remotely located subjects in a virtual world. These images rotate to depict gaze direction, alleviating the problem of turn-taking in multiparty communication systems. Furthermore, a gaze-directed “lightspot” is shown over a shared document, indicating the user's fixated regions and thereby providing a deictic (“look at this”) reference.
Gaze-Contingent Displays

In general, eye-based interactive applications can be thought of as selective, since gaze is used to select or point to some aspect of the display, whether it is two-dimensional (e.g., a desktop), collaborative, or immersive (such as a virtual environment). Gaze-contingent displays (GCDs) mix directly interactive and indirectly “passive” usage styles of gaze. Here, gaze is used not so much as a pointing device but as a passive indicator of the viewer's point of regard. Given the user's point of regard, a system can tailor the display so that the most informative details are generated at the point of gaze but are degraded in some way in the periphery. The purpose of these displays is usually to minimize bandwidth requirements, as in video telephony applications or in graphical applications in which complex data sets cannot be fully displayed in real time, by matching the display's spatial frequency content to functional visual acuity across the visual field.
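In its simplest screen-based form, each frame is composited from two resolution levels around the measured gaze point. The sketch below assumes a precomputed low-resolution (e.g., blurred) copy of every frame and a circular window; both are illustrative choices:

    # Minimal two-region gaze-contingent compositing sketch (illustrative).
    import numpy as np

    def composite(high, low, gaze, radius=100):
        """high, low: H x W x C frames; gaze: (x, y); radius: window size in px.
        Pixels inside the window come from 'high'; the periphery from 'low'."""
        h, w = high.shape[:2]
        ys, xs = np.mgrid[0:h, 0:w]
        inside = (xs - gaze[0]) ** 2 + (ys - gaze[1]) ** 2 <= radius ** 2
        return np.where(inside[..., None], high, low)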
When evaluating GCDs, it is often necessary to distinguish between two main types of influences of fixation-dependent changes: those affecting perception and those affecting performance. As a general rule, perception is more sensitive than performance. That is, it may be possible to degrade a display to a quite noticeable extent without necessarily degrading performance. In either case, one of the main difficulties that must be addressed is the latency of the system. Without predictive capabilities, most GCDs will lag behind the user's gaze, usually by an amount of time determined by both the gaze measurement (which may take up to the duration of a single video frame, typically 16 msec for a 60-Hz video-based tracker) and the subsequent refresh of the GCD (another 33 msec for a system with an update rate of 30 frames per second).
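Taking those figures at face value as a worked example, the worst-case lag is simply the sum of the two stages: 1/60 sec + 1/30 sec, or about 16.7 + 33.3 ≈ 50 msec between the eye landing on a new location and the display catching up; any reduction must come from faster sampling, faster rendering, or saccade prediction.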
Two main types of gaze-contingent applications are discussed: screen-based and model-based. The former deals with image (pixel) manipulation, whereas the latter is concerned with the manipulation of graphical objects or models prior to rendering.
Screen-based displays. Loschky and McConkie (2000) conducted experiments on a GCD, investigating spatial, resolutional, and temporal parameters affecting perception and performance. Two key issues addressed were the timing of GCDs and the detectability of the peripherally degraded component of the GCD. In all the experiments, monochromatic photographic scenes were used as stimuli, with a circular, high-resolution window surrounded by a degraded peripheral region. In one facet of the experiment, it was found that for an image change to go undetected, it must be started within 5 msec after the end of the eye movement; detection likelihood rose quickly beyond that point. In another facet of the study, concerning detection of peripheral degradation, results showed that the mildest peripheral degradation went undetected even at the smallest window size (2º), whereas the opposite was true of the highest level of degradation: It was quite detectable at even the largest window size (5º). Overall, generating an imperceptible GCD proved considerably more difficult than generating a GCD that did not degrade performance. Although greater delays (e.g., 15 msec) and greater degradation produced detectable visual artifacts, they appeared to have minimal impact on performance of visual tasks when there was a 4.1º high-resolution area centered at the point of gaze.
Parkhurst, Culurciello, and Niebur (2000) investigated the behavioral effects of a two-region gaze-contingent display. A central high-resolution region, varying from 1º to 15º, was presented at the instantaneous center of gaze during a visual search task. Parkhurst et al. found that reaction time and accuracy covaried as a function of the central region size and noted this as a clear indicator of a strategic speed/accuracy tradeoff, in which subjects favor speed in some conditions and accuracy in others. For small central region sizes, slow reaction times were accompanied by high accuracy. Conversely, for large central region sizes, fast reaction times were accompanied by low accuracy. A secondary finding indicated that fixation duration varied as a function of central region size. For small central region sizes, subjects tended to spend more time at each fixation than they did under normal viewing conditions. For large central regions, fixation durations tended to be closer to normal. In agreement with the reaction time and accuracy results, fixation duration was approximately normal (comparable to that seen for uniform-resolution displays) with a central region size of 5º.
By distinguishing the effects of gaze-contingent windowing and peripheral filtering, Reingold and Loschky (2002) have recently shown that peripheral target detection with GCDs may be negatively impacted not only by the loss of perceptual detail owing to filtering, but also by the mere presence of the gaze-contingent window (the window effect). An important consideration in Reingold and Loschky's study was the saliency of the gaze-contingent window boundary: Whether the window was sharply defined or smoothly graded appeared to have no effect on peripheral target acquisition.
For screen-based VR rendering, the work of Watson, Walker, Hodges, and Worden (1997) is particularly relevant. Watson et al. studied the effects of level-of-detail (LOD) peripheral degradation on visual search performance, evaluating both spatial and chrominance detail degradation effects in head-mounted displays. To sustain acceptable frame rates, two polygons were texture mapped in real time to generate a high-resolution inset within a low-resolution display field. Watson et al. suggested that visual spatial and chrominance complexity can be reduced by almost half without degrading performance.
In an approach similar to that of Watson et al. (1997), Reddy (1998) used a view-dependent LOD technique to evaluate both perceptual effects and system performance gains, reporting a perceptually modulated LOD system that affords a factor of 4.5 improvement in frame rate.
For excellent reviews of GCDs, see Reingold, Loschky, McConkie, and Stampe (in press) as well as Parkhurst and Niebur (in press).
Model-based graphical displays. As an alternative to the screen-based peripheral degradation approach, model-based methods aim at reducing resolution by directly manipulating the model geometry prior to rendering. The technique of simplifying the resolution of geometric objects as they recede from the viewer, originally proposed by Clarke (1976), is now standard practice, particularly in real-time applications such as VR (Vince, 1995). Clarke's original criterion of using the projected area covered by the object to descend the object's LOD hierarchy is still widely used today. However, as Clarke suggested, the LOD management typically employed by these polygonal simplification schemes relies on precomputed fine-to-coarse hierarchies of an object. This leads to uniform, or isotropic, degradation of object resolution.
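The standard criterion amounts to a table lookup driven by projected screen area; the thresholds below are invented for illustration:

    # Hypothetical projected-area LOD selection in the spirit of Clarke (1976).
    # LEVELS maps a minimum projected area (fraction of the viewport) to a
    # precomputed mesh in the object's fine-to-coarse hierarchy.
    LEVELS = [(0.25, "mesh_full"), (0.05, "mesh_half"), (0.0, "mesh_coarse")]

    def select_lod(projected_area_fraction):
        for min_area, mesh in LEVELS:
            if projected_area_fraction >= min_area:
                return mesh   # one mesh for the whole object: isotropic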
A gaze-contingent, model-based adaptive rendering scheme was proposed by Ohshima, Yamamoto, and Tamura (1996), in which three visual characteristics were considered: central/peripheral vision, kinetic vision, and fusional vision. The LOD algorithm generated isotropically degraded objects at different visual angles. Although the use of a binocular eye tracker was proposed, the system as discussed used only head tracking as a substitute for gaze tracking.
Isotropic object degradation is not always desirable, especially when large objects at close distances are viewed. In this case, traditional LOD schemes will display an LOD mesh at its full resolution even though the mesh may cover the entire field of view. Since acute resolvability of human vision is limited to the central 5º, object resolution need not be uniform. This is the central tenet of gaze-contingent systems.
Numerous multiresolution mesh modeling techniques suitable for gaze-contingent viewing have recently been developed (Zorin & Schröder, 2000). Owing to these advances in multiresolution modeling and to the increased affordability of eye trackers, it is now becoming feasible to extend the LOD approach to GCDs in which models are rendered nonisotropically.
An early example of a nonisotropic, model-based gaze-contingent system, in which gaze direction was applied directly to the rendering algorithm, was presented by Levoy and Whitaker (1990): A spatially adaptive, near real-time ray tracer for volume data displayed an eye-slaved region of interest (ROI) by modulating both the number of rays cast per unit area on the image plane and the number of samples drawn per unit length along each ray as a function of local retinal acuity. The ray-traced image was sampled by a nonisotropic convolution filter to generate a 12º foveal ROI within a 20º mid-resolution transitional region. On the basis of preliminary estimates, Levoy and Whitaker suggested a reduction in image generation time by a factor of up to five.
For environments containing significant topological detail, such as virtual terrains, rendering with multiple levels of detail, where the level is based on user position and gaze direction, is essential to providing an acceptable combination of surface detail and frame rate (Danforth, Duchowski, Geist, & McAliley, 2000). Danforth et al. used an eye tracker as an indicator of gaze in a gaze-contingent multiresolution terrain navigation environment. A surface, represented as a quadrilateral mesh, was divided into fixed-size (in number of vertices) subblocks, allowing rendering at variable LOD on a per-subblock basis. The resolution level was chosen per subblock, on the basis of viewer distance and direction of gaze. To exaggerate the gaze-contingent effect, fractal mountains disappeared when not in view.
More recent work on gaze-contingent LOD modeling has been carried out by Luebke and Erikson (1997), who presented a view-dependent LOD technique suitable for gaze-contingent rendering. Although simplification of individual geometric objects was discussed in their work, it appears that the strategy was ultimately directed toward solving the interactive “walkthrough” problem (Funkhouser & Séquin, 1993). In this application, the view-dependent LOD technique seems more suitable to the (possibly) gaze-contingent rendering of an entire scene or environment. Recently, Luebke, Hallen, Newfield, and Watson (2000) developed a gaze-directed LOD technique to facilitate the gaze-contingent display of geometric objects. To test their rendering approach, Luebke et al. employed a table-mounted monocular eye tracker to measure the viewer's real-time location of gaze over a desktop display.
A new object-based LOD method has been developed by Murphy and Duchowski (2001). The technique is similar to Luebke and Erikson's (1997) and to Ohshima et al.'s (1996) in that objects were modeled for gaze-contingent viewing; unlike the approach of Ohshima et al., however, resolution degradation was applied nonisotropically. The spatial degradation function for LOD selection differed significantly from the area-based criterion originally proposed by Clarke (1976): Instead of evaluating the screen coverage of the projected object, the degradation function was based on the evaluation of visual angle in world coordinates. A three-dimensional spatial degradation function was obtained from human subject experiments in an attempt to display spatially degraded geometric objects imperceptibly. System performance measurements indicated an approximate overall 10-fold average frame rate improvement during gaze-contingent viewing.
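The flavor of angle-based selection can be sketched as follows; the acuity thresholds are placeholders, not Murphy and Duchowski's measured degradation function:

    # Hypothetical eccentricity-based LOD: resolution is chosen from the
    # angle between the gaze direction and the direction to a mesh region,
    # computed in world coordinates rather than from screen coverage.
    import math

    def eccentricity_deg(eye, gaze_dir, point):
        """Visual angle (deg) between gaze_dir (unit vector) and eye-to-point."""
        to_pt = [p - e for p, e in zip(point, eye)]
        norm = math.sqrt(sum(c * c for c in to_pt)) or 1.0
        cos_a = sum(g * c for g, c in zip(gaze_dir, to_pt)) / norm
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))

    def lod_for(angle_deg):
        if angle_deg < 5.0:     # roughly the acute central field noted above
            return "high"
        if angle_deg < 15.0:    # placeholder transitional band
            return "medium"
        return "low"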
Another interesting approach to gaze-contingent modeling for real-time graphics rendering was taken by O'Sullivan and Dingliana (2001) and O'Sullivan, Dingliana, and Howlett (2002). Instead of degrading the spatial resolution of peripherally located geometric objects, O'Sullivan and Dingliana considered a degradable collision-handling mechanism to effectively limit object collision resolution outside a foveal ROI. When a viewer is looking directly at a collision, it is given higher (computational) priority than are collisions occurring in the periphery. Object collisions given higher priority are allocated more processing time, so that the contact model and the resulting response are more believable. O'Sullivan and Dingliana tested viewers' sensitivity to collision resolution (the size of the gap between colliding objects) as a function of eccentricity and noted a significant fall-off in detection accuracy at about 4º of visual angle. On the basis of these psychophysical findings, O'Sullivan et al. developed a gaze-contingent collision-handling system. Two variants of the system were compared, each containing a high-priority ROI wherein collisions were processed at a higher resolution than outside the ROI. In the tracked case, the high-priority ROI was synchronized to the viewer's tracked gaze position, whereas in the random case, the ROI position was determined randomly every five frames. O'Sullivan et al. reported an overall improvement in the perception of the tracked simulation.
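Crudely, such a scheduler spends its per-frame collision budget in order of proximity to gaze; the priority rule and all names below are assumptions in the spirit of, not a description of, O'Sullivan et al.'s system:

    # Hypothetical gaze-prioritized collision scheduling (illustrative).
    import math

    def refine_contact(contact):
        """Placeholder for a finer (more believable) contact-model step."""

    def process_collisions(collisions, gaze, budget_ms, cost_ms=0.5):
        """collisions: list of (x, y) screen-space contact points.
        Refine the contacts closest to gaze until the frame budget runs out;
        the remainder fall back to the coarse collision model."""
        for c in sorted(collisions, key=lambda c: math.dist(c, gaze)):
            if budget_ms < cost_ms:
                break
            refine_contact(c)
            budget_ms -= cost_ms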
CONCLUSION
As the present review demonstrates, eye trackers have traditionally shown themselves to be valuable in diagnostic studies of reading and other information-processing tasks. The diagnostic use of an eye tracker, as exemplified by the research reviewed here, can be considered the eye tracker's mainstay application at the present time and probably in its near future. As is also documented in this review, because eye trackers are able to provide a quantitative measure of real-time overt attention, they are valuable components of interactive systems. Several key problems and solutions have been identified in interactive eye-tracking systems (e.g., the Midas Touch problem). For disabled users, an eye-tracking interface may be an indispensable form of communication (e.g., eye typing). In a more general interactive setting, however, there is some debate as to whether it makes sense to overload a perceptual organ (the eye) with a motor task (e.g., mouselike pointing). As an auxiliary interface modality, the eye tracker may serve better as an indirect indicator of the user's future selective intent (i.e., serving as a mouse-pointing accelerator, rather than as the mouse pointer itself). This type of indirect use of eye movements may also be exploited in gaze-contingent scenarios in which gaze is not used as an interaction modality per se (i.e., no particular action is performed) but, rather, is used to alter the scene in some manner, either to manage computational or bandwidth resources or to test human visual perception or cognitive skill. In the latter sense, although the beginning of the fourth eye-tracking era may coincide with an increased number of interactive applications driven by increasingly sophisticated imagery (e.g., real-time video or VR), gaze-contingent applications are simply extensions of gaze-contingent paradigms of the third era. However, owing to the richness and flexibility of graphical environments, it is likely that novel interactive uses of eye trackers within increasingly complex contextual situations will allow investigation of a broader class of applications than has been seen in the past.
REFERENCES
Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory & Language, 38, 419-439.
Anders, G. (2001). Pilot's attention allocation during approach and landing: Eye- and head-tracking research in an A330 full flight simulator. In Proceedings of the 11th International Symposium on Aviation Psychology. Retrieved July 12, 2002, from http://www.geerdanders.de/literatur/2001_ohio.html.
Anderson, J. R. (1993). Rules of the mind. Hillsdale, NJ: Erlbaum.
Asaad, W. F., Rainer, G., & Miller, E. K. (2000). Task-specific neural activity in the primate prefrontal cortex. Journal of Neurophysiology, 84, 451-459.
Ballard, D. H., Hayhoe, M. M., & Pelz, J. B. (1995). Memory representations in natural tasks. Journal of Cognitive Neuroscience, 7, 66-80.
Bertera, J. H., & Rayner, K. (2000). Eye movements and the span of the effective stimulus in visual search. Perception & Psychophysics, 62, 576-585.
Buswell, G. T. (1935). How people look at pictures. Chicago: University of Chicago Press.
Byrne, M. D., Anderson, J. R., Douglass, S., & Matessa, M. (1999). Eye tracking the visual search of click-down menus. In Human factors in computing systems: CHI 99 conference proceedings (pp. 402-409). New York: ACM Press.
Chapman, P. R., & Underwood, G. (1998). Visual search of dynamic scenes: Event types and the role of experience in viewing driving situations. In G. Underwood (Ed.), Eye guidance in reading and scene perception (pp. 369-394). Amsterdam: Elsevier.
Clarke, J. H. (1976). Hierarchical geometric models for visible surface algorithms. Communications of the ACM, 19, 547-554.
Cooper, R. M. (1974). The control of eye fixation by the meaning of spoken language: A new methodology for the real-time investigation of speech perception, memory, and language processing. Cognitive Psychology, 6, 84-107.
Crundall, D. E., Underwood, G., & Chapman, P. R. (1998). How much do novice drivers see? The effects of demand on visual search strategies in novice and experienced drivers. In G. Underwood (Ed.), Eye guidance in reading and scene perception (pp. 395-418). Amsterdam: Elsevier.
Danforth, R., Duchowski, A., Geist, R., & McAliley, E. (2000). A platform for gaze-contingent virtual environments. In Smart graphics (Papers from the 2000 AAAI spring symposium, Tech. Rep. SS-00-04, pp. 66-70). Menlo Park, CA: AAAI.
DeCarlo, D., & Santella, A. (2002). Stylization and abstraction of photographs. ACM Transactions on Graphics, 21, 769-776.
Dishart, D. C., & Land, M. F. (1998). The development of the eye movement strategies of learner drivers. In G. Underwood (Ed.), Eye guidance in reading and scene perception (pp. 419-430). Amsterdam: Elsevier.
Doll, T. J. (1993). Preattentive processing in visual search. In Proceedings of the Human Factors and Ergonomics Society, 37th annual meeting (pp. 1291-1249). Santa Monica, CA: Human Factors & Ergonomics Society.
Doll, T. J., Whorter, S. W., & Schmieder, D. E. (1993). Simulation of human visual search in cluttered backgrounds. In Proceedings of the Human Factors and Ergonomics Society, 37th annual meeting (pp. 1310-1314). Santa Monica, CA: Human Factors & Ergonomics Society.
Drury, C. G., Gramopadhye, A. K., & Sharit, J. (1997). Feedback strategies for visual inspection in airframe structural inspection. International Journal of Industrial Ergonomics, 19, 333-344.
Duchowski, A. T. (2003). Eye tracking methodology: Theory & practice. London: Springer-Verlag.
Duchowski, A. T., Medlin, E., Gramopadhye, A., Melloy, B., & Nair, S. (2001). Binocular eye tracking in VR for visual inspection training. In Virtual reality software & technology (VRST). New York: ACM Press.
d'Ydewalle, G., Desmet, G., & Van Rensbergen, J. (1998). Film perception: The processing of film cuts. In G. Underwood (Ed.), Eye guidance in reading and scene perception (pp. 357-368). Amsterdam: Elsevier.
Findlay, J. M. (1997). Saccade target selection during visual search. Vision Research, 37, 617-631.
Findlay, J. M., & Gilchrist, I. D. (1998). Eye guidance and visual search. In G. Underwood (Ed.), Eye guidance in reading and scene perception (pp. 295-312). Amsterdam: Elsevier.
Findlay, J. M., & Walker, R. (1999). A model of saccade generation based on parallel processing and competitive inhibition. Behavioral & Brain Sciences, 22, 661-721.
Funkhouser, T. A., & Séquin, C. H. (1993). Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments. In Computer Graphics (SIGGRAPH 93) (pp. 247-254). New York: ACM Press.
Goldberg, J. H., & Kotval, X. P. (1999). Computer interface evaluation using eye movements: Methods and constructs. International Journal of Industrial Ergonomics, 24, 631-645.
Goldberg, J. H., Stimson, M. J., Lewnstein, M., Scott, N., & Wichansky, A. M. (2002). Eye tracking in Web search tasks: Design implications. In Proceedings of the symposium on eye tracking research & applications (ETRA) 2002 (pp. 51-58). New York: ACM Press.
Graeber, D. A., & Andre, A. D. (1999). Assessing visual attention of pilots while using electronic moving maps for taxiing. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the Tenth International Symposium on Aviation Psychology (pp. 791-796).
Greene, H. H., & Rayner, K. (2001). Eye movements and familiarity effects in visual search. Vision Research, 41, 3763-3773.
Hayhoe, M. M., Ballard, D. H., Triesch, J., Shinoda, H., Aivar, P., & Sullivan, B. (2002). Vision in natural and virtual environments. In Proceedings of the symposium on eye tracking research & applications (ETRA) 2002 (pp. 7-13). New York: ACM Press.
Henderson, J. M. (1992). Object identification in context: The visual processing of natural scenes. Canadian Journal of Psychology, 46, 319-341.
Henderson, J. M., & Hollingworth, A. (1998). Eye movements during scene viewing: An overview. In G. Underwood (Ed.), Eye guidance in reading and scene perception (pp. 269-294). Amsterdam: Elsevier.
Ho, G., Scialfa, C. T., Caird, J. K., & Graw, T. (2001). Visual search for traffic signs: The effects of clutter, luminance, and aging. Human Factors, 43, 194-207.
Hornof, A. J., & Kieras, D. E. (1997). Cognitive modeling reveals menu search is both random and systematic. In Human factors in computing systems: CHI 97 conference proceedings (pp. 107-114). New York: ACM Press.
Hughes, H. C., Nozawa, G., & Kitterle, F. (1996). Global precedence, spatial frequency channels, and the statistics of natural images. Journal of Cognitive Neuroscience, 8, 197-230.
Jacob, R. J. (1990). What you look at is what you get: Eye movement-based interaction techniques. In Human factors in computing systems: CHI 90 conference proceedings (pp. 11-18). New York: ACM Press.
Kanizsa, G. (1976, April). Subjective contours. Scientific American, 234, 48-52, 138.
Kennedy, A. (1992). The spatial coding hypothesis. In K. Rayner (Ed.), Eye movements and visual cognition: Scene perception and reading (pp. 379-396). New York: Springer-Verlag.
Kieras, D., & Meyer, D. E. (1995). An overview of the EPIC architecture for cognition and performance with application to human-computer interaction (EPIC Tech. Rep. No. 5, No. TR-95/ONR-EPIC-5). Ann Arbor: University of Michigan, Electrical Engineering and Computer Science Department.
Kroll, J. F. (1992). Making a scene: The debate about context effects for scenes and sentences. In K. Rayner (Ed.), Eye movements and visual cognition: Scene perception and reading (pp. 284-292). New York: Springer-Verlag.
Land, M. F., & Hayhoe, M. (2001). In what ways do eye movements contribute to everyday activities? Vision Research, 41, 3559-3565.
Land, M. F., Mennie, N., & Rusted, J. (1999). The roles of vision and eye movements in the control of activities of daily living. Perception, 28, 1307-1432.
Levoy, M., & Whitaker, R. (1990). Gaze-directed volume rendering. In Computer Graphics (SIGGRAPH 90) (pp. 217-223). New York: ACM Press.
Loftus, G. R. (1981). Tachistoscopic simulations of eye fixations on pictures. Journal of Experimental Psychology: Human Learning & Memory, 7, 369-376.
Lohse, G. L. (1997). Consumer eye movement patterns on Yellow Pages advertising. Journal of Advertising, 26, 61-73.
Loschky, L. C., & McConkie, G. W. (2000). User performance with gaze contingent multiresolutional displays. In Proceedings of the symposium on eye tracking research and applications (ETRA) 2000 (pp. 97-103). New York: ACM Press.
Luebke, D., & Erikson, C. (1997). View-dependent simplification of arbitrary polygonal environments. In Computer Graphics (SIGGRAPH 97) (pp. 199-208). New York: ACM Press.
Luebke, D., Hallen, B., Newfield, D., & Watson, B. (2000). Perceptually driven simplification using gaze-directed rendering (Tech. Rep. CS-2000-04). Charlottesville: University of Virginia.
Majaranta, P., & Räihä, K.-J. (2002). Twenty years of eye typing: Systems and design issues. In Eye tracking research & application: Proceedings of the symposium on ETRA 2002 (pp. 15-22). New York: ACM Press.
McConkie, G. W., & Rayner, K. (1975). The span of the effective stimulus during a fixation in reading. Perception & Psychophysics, 17, 578-586.
Megaw, E. D., & Richardson, J. (1979). Eye movements and industrial inspection. Applied Ergonomics, 10, 145-154.
Molnar, F. (1981). About the role of visual exploration in aesthetics. In H. Day (Ed.), Advances in intrinsic motivation and aesthetics. New York: Plenum.
Murphy, H., & Duchowski, A. T. (2001). Gaze-contingent level of detail. In J. Roberts (Ed.), Eurographics (short presentations) (pp. 219-228). Manchester, U.K.: University of Manchester.
Necker, L. A. (1832). Observations on some remarkable optical phaenomena seen in Switzerland, and on an optical phaenomenon which occurs on viewing a figure or a crystal or geometrical solid. Philosophical Magazine & Journal of Science, 1, 329-337.
Noton, D., & Stark, L. (1971a). Eye movements and visual perception. Scientific American, 224, 34-43.
Noton, D., & Stark, L. (1971b). Scanpaths in saccadic eye movements while viewing and recognizing patterns. Vision Research, 11, 929-942.
Ohshima, T., Yamamoto, H., & Tamura, H. (1996). Gaze-directed adaptive rendering for interacting with virtual space. In Proceedings of VRAIS 96 (pp. 103-110). Los Alamitos, CA: IEEE Computer Society Press.
O'Sullivan, C., & Dingliana, J. (2001). Collisions and perception. ACM Transactions on Graphics, 20(3), 151-168.
O'Sullivan, C., Dingliana, J., & Howlett, S. (2002). Gaze-contingent algorithms for interactive graphics. In J. Hyönä, R. Radach, & H. Deubel (Eds.), The mind's eye: Cognitive and applied aspects of eye movement research. Amsterdam: Elsevier.
Ottati, W. L., Hickox, J. C., & Richter, J. (1999). Eye scan patterns of experienced and novice pilots during visual flight rules (VFR) navigation. In Proceedings of the Human Factors and Ergonomics Society, 43rd annual meeting (pp. 66-70). Santa Monica, CA: Human Factors & Ergonomics Society.
Özyurt, J., DeSouza, P., West, P., Rutschmann, R., & Greenlee, M. W. (2001, August). Comparison of cortical activity and oculomotor performance in the gap and step paradigms. Paper presented at the European Conference on Visual Perception (ECVP), Kusadasi, Turkey.
Parkhurst, D. [J.], Culurciello, E., & Niebur, E. (2000). Evaluating variable resolution displays with visual search: Task performance and eye movements. In Eye tracking research & application: Proceedings of the symposium on eye tracking research and applications 2000 (pp. 105-109). New York: ACM Press.
Parkhurst, D. J., & Niebur, E. (in press). Variable resolution displays: A theoretical, practical, and behavioral evaluation. Human Factors.
Pelz, J. B., Canosa, R., & Babcock, J. (2000). Extended tasks elicit complex eye movement patterns. In Proceedings of the symposium on eye tracking research and applications (ETRA) 2000 (pp. 37-43). New York: ACM Press.
Posner, M. I., Snyder, C. R. R., & Davidson, B. J. (1980). Attention and the detection of signals. Journal of Experimental Psychology: General, 109, 160-174.
Rayner, K. (1975). The perceptual span and peripheral cues in reading. Cognitive Psychology, 7, 65-81.
Rayner, K. (Ed.) (1992). Eye movements and visual cognition: Scene perception and reading. New York: Springer-Verlag.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372-422.
Rayner, K., & Bertera, J. H. (1979). Reading without a fovea. Science, 206, 468-469.
Rayner, K., & Pollatsek, A. (1992). Eye movements and scene perception. Canadian Journal of Psychology, 46, 342-376.
Rayner, K., Rotello, C. M., Stewart, A. J., Keir, J., & Duffy, S. A. (2001). Integrating text and pictorial information: Eye movements when looking at print advertisements. Journal of Experimental Psychology: Applied, 7, 219-226.
Recarte, M. A., & Nunes, L. M. (2000). Effects of verbal and spatial-imagery tasks on eye fixations while driving. Journal of Experimental Psychology: Applied, 6, 31-43.
Reddy, M. (1998). Specification and evaluation of level of detail selection criteria. Virtual Reality: Research, Development & Application, 3, 132-143.
Reichle, E. D., Pollatsek, A., Fisher, D. L., & Rayner, K. (1998). Toward a model of eye movement control in reading. Psychological Review, 105, 125-157.
Reingold, E. M., Charness, N., Pomplun, M., & Stampe, D. M. (in press). Visual span in expert chess players: Evidence from eye movements. Psychological Science.
Reingold, E. M., & Loschky, L. C. (2002). Saliency of peripheral targets in gaze-contingent multiresolutional displays. Behavior Research Methods, Instruments, & Computers, 34, 491-499.
Reingold, E. M., Loschky, L. C., McConkie, G. W., & Stampe, D. M. (in press). Gaze-contingent multi-resolutional displays: An integrative review. Human Factors.
Robinson, D. A. (1968). The oculomotor control system: A review. Proceedings of the IEEE, 56, 1032-1049.
Rosbergen, E., Wedel, M., & Pieters, R. (1990). Analyzing visual attention to repeated print advertising using scanpath theory (Tech. Rep. No. 97B32). University Library Groningen, SOM Research School.
Schoonard, J. W., Gould, J. D., & Miller, L. A. (1973). Studies of visual inspection. Ergonomics, 16, 365-379.
Sibert, L. E., & Jacob, R. J. (2000). Evaluation of eye gaze interaction. In Human factors in computing systems: CHI 2000 conference proceedings (pp. 281-288). New York: ACM Press.
Smeets, J. B. J., Hayhoe, H. M., & Ballard, D. H. (1996). Goal-directed arm movements change eye-head coordination. Experimental Brain Research, 109, 434-440.
Snodderly, D. M., Kagan, I., & Gur, M. (2001). Selective activation of visual cortex neurons by fixational eye movements: Implications for neural coding. Visual Neuroscience, 18, 259-277.
Solso, R. L. (1999). Cognition and the visual arts (3rd ed.). Cambridge, MA: MIT Press.
Starker, I., & Bolt, R. A. (1990). A gaze-responsive self-disclosing display. In Human factors in computing systems: CHI 90 conference proceedings (pp. 3-9). New York: ACM Press.
Tanriverdi, V., & Jacob, R. J. K. (2000). Interacting with eye movements in virtual environments. In Human factors in computing systems: CHI 2000 conference proceedings (pp. 265-272). New York: ACM Press.
Todd, S., & Kramer, A. F. (1993). Attentional guidance in visual attention. In Proceedings of the Human Factors and Ergonomics Society, 37th annual meeting (pp. 1378-1382). Santa Monica, CA: Human Factors & Ergonomics Society.
Treisman, A., & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97-136.
Van Orden, K. F., & DiVita, J. (1993). Highlighting with flicker. In Proceedings of the Human Factors and Ergonomics Society, 37th annual meeting (pp. 1300-1304). Santa Monica, CA: Human Factors & Ergonomics Society.
Velichkovsky, B., Pomplun, M., & Rieser, J. (1996). Attention and communication: Eye-movement-based research paradigms. In W. H. Zangemeister, H. S. Stiehl, & C. Freksa (Eds.), Visual attention & cognition (pp. 125-154). Amsterdam: Elsevier.
Vertegaal, R. (1999). The GAZE groupware system: Mediating joint attention in multiparty communication and collaboration. In Human factors in computing systems: CHI 99 conference proceedings (pp. 294-301). New York: ACM Press.
Vince, J. A. (1995). Virtual reality systems. Reading, MA: Addison-Wesley.
Wang, M.-J. J., Lin, S.-C., & Drury, C. G. (1997). Training for strategy in visual search. International Journal of Industrial Ergonomics, 20, 101-108.
Watson, B., Walker, N., Hodges, L. F., & Worden, A. (1997). Managing level of detail through peripheral degradation: Effects on search performance with a head-mounted display. ACM Transactions on Computer-Human Interaction, 4, 323-346.
Wedel, M., & Pieters, R. (2000). Eye fixations on advertisements and memory for brands: A model and findings. Marketing Science, 19, 297-312.
Wolfe, J. M. (1993). Guided Search 2.0: The upgrade. In Proceedings of the Human Factors and Ergonomics Society, 37th annual meeting (pp. 1295-1299). Santa Monica, CA: Human Factors & Ergonomics Society.
Wolfe, J. M. (1994). Visual search in continuous, naturalistic stimuli. Vision Research, 34, 1187-1195.
Wolfe, J. M., & Gancarz, G. (1996). Guided Search 3.0: A model of visual search catches up with Jay Enoch 40 years later. In V. Lakshminarayanan (Ed.), Basic and clinical applications of vision science (pp. 189-192). Dordrecht: Kluwer.
Wooding, D. S. (2002). Fixation maps: Quantifying eye-movement traces. In Proceedings of the symposium on eye tracking research & applications (ETRA) 2002 (pp. 31-36). New York: ACM Press.
Yarbus, A. L. (1967). Eye movements and vision. New York: Plenum.
Zhai, S., Morimoto, C., & Ihde, S. (1999). Manual and gaze input cascaded (MAGIC) pointing. In Human factors in computing systems: CHI 99 conference proceedings (pp. 246-253). New York: ACM Press.
Zorin, D., & Schröder, P. (2000). Course 23: Subdivision for modeling and animation (SIGGRAPH 2000 course notes). New York: ACM. Retrieved December 30, 2000, from http://www.mrl.nyu.edu/dzorin/sig00course/.
(Manuscript received March 14, 2002; revision accepted for publication August 4, 2002.)