IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 28, NO. 5, MAY 2006
Meticulously Detailed Eye Region Model and Its Application to Analysis of Facial Images
Tsuyoshi Moriyama, Member, IEEE, Takeo Kanade, Fellow, IEEE, Jing Xiao, Member, IEEE, and Jeffrey F. Cohn, Member, IEEE
Abstract—We propose a system that is capable of detailed analysis of eye region images in terms of the position of the iris, degree of eyelid opening, and the shape, complexity, and texture of the eyelids. The system uses a generative eye region model that parameterizes the fine structure and motion of an eye. The structure parameters represent structural individuality of the eye, including the size and color of the iris, the width, boldness, and complexity of the eyelids, the width of the bulge below the eye, and the width of the illumination reflection on the bulge. The motion parameters represent movement of the eye, including the up-down position of the upper and lower eyelids and the 2D position of the iris. The system first registers the eye model to the input in a particular frame and individualizes it by adjusting the structure parameters. The system then tracks motion of the eye by estimating the motion parameters across the entire image sequence. Combined with image stabilization to compensate for appearance changes due to head motion, the system achieves accurate registration and motion recovery of eyes.
Index Terms—Computer vision, facial image analysis, facial expression analysis, generative eye region model, motion tracking, texture modeling, gradient descent.
1 INTRODUCTION
In facial image analysis for expression and identity recognition, eyes are particularly important [1], [2], [3], [4]. Gaze tracking plays a significant role in human-computer interaction [5], [6], and the eye region provides useful biometric information for face and intention recognition [7], [8]. The Facial Action Coding System (FACS [9]), the de facto standard for coding facial muscle actions in behavioral science [10], defines many action units (AUs) for eyes [11], [12].

Automated analysis of facial images has found eyes still to be a difficult target [13], [14], [15], [16], [17], [18], [19], [20], [21]. The difficulty comes from the diversity in the appearance of eyes due to both structural individuality and motion of eyes, as shown in Fig. 1. Past studies have failed to represent these diversities adequately. For example, Tian et al. [22] used a pair of parabolic curves and a circle as a generic eye model, but parabolic curves have too few parameters to represent the complexity of eyelid shape and motion. Statistical models have been deployed to represent such individual differences for the whole eye region [23], [24], [25], but not for subregions, such as the eyelids, due in part to limited variation in training samples.

In this paper, we propose and evaluate a generative eye region model that can meticulously represent the detailed appearance of the eye region for eye motion tracking. The model parameterizes both the structural individualities and the motions of eyes. Structural individualities include the size and the color of the iris; the width and the boldness of the eyelid, which may have a single or double fold; the width of the bulge below the eye and the furrow below it; and the width of the illumination reflection on the bulge. Eye motion includes the up-down positions of the upper and lower eyelids and the 2D position of the iris. The input image sequence is first stabilized to compensate for appearance change due to head motion. The system then registers the eye region model to the input eye region, individualizes it by adjusting the structure parameters, and accurately tracks the motion of the eye.

2 EYE REGION MODEL

We define a rectangular region around the eye as an eye region for analysis. We exploit a 2D, parameterized, generative model that consists of multiple components corresponding to the anatomy of an eye. These components include the iris, the upper and lower eyelids, a white region around the iris (sclera), dark regions near the inner and outer corners of the white region, a bulge below the lower eyelid, a bright region on the bulge, and a furrow below the bulge (the infraorbital furrow). The model for each component is rendered in a separate rectangular layer. When overlaid, these layers represent the eye region as illustrated in Fig. 2. Within each layer, pixels that render a component are assigned color intensities or transparency, so that the color in a lower layer appears in the final eye region model if all the upper layers above it have transparent pixels at the same locations.

. T. Moriyama is with the Department of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kouhoku-ku, Yokohama-shi, Kanagawa 223-8522, Japan. E-mail: moriyama@ozawa.ics.keio.ac.jp.
. T. Kanade is with the Robotics Institute, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213-3890. E-mail: tk@cs.cmu.edu.
. J. Xiao is with the Epson Palo Alto Laboratory, Epson Research and Development, Inc., Palo Alto, CA 94304. E-mail: xiaoj@erd.epson.com.
. J.F. Cohn is with the University of Pittsburgh, 4327 Sennott Square, Pittsburgh, PA 15260. E-mail: jeffcohn@pitt.edu.

Manuscript received 4 Jan. 2005; revised 1 Aug. 2005; accepted 6 Sept. 2005; published online 13 Mar. 2006. Recommended for acceptance by J. Goutsias.
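To make the layer compositing just described concrete, here is a minimal sketch in Python/NumPy, assuming gray-scale layers with boolean opacity masks; the layer names, shapes, and intensities are illustrative, not the paper's implementation.

```python
import numpy as np

def composite_layers(layers):
    """Composite gray-scale layers bottom-to-top: a pixel shows the
    intensity of the topmost opaque layer, so a lower layer is visible
    only where every layer above it is transparent."""
    out = np.zeros_like(layers[0][0])
    for intensity, opaque in layers:      # bottom layer first
        out[opaque] = intensity[opaque]   # upper layers overwrite
    return out

# Hypothetical four-layer stack: skin background, sclera, iris, eyelid skin.
h, w = 250, 350
yy, xx = np.mgrid[0:h, 0:w]
fissure = (yy - 125.0)**2 / 60.0**2 + (xx - 175.0)**2 / 140.0**2 < 1.0
skin   = (np.full((h, w), 150.0), np.ones((h, w), bool))
sclera = (np.full((h, w), 220.0), fissure)
iris   = (np.full((h, w), 60.0), (yy - 125.0)**2 + (xx - 175.0)**2 < 45.0**2)
eyelid = (np.full((h, w), 150.0), ~fissure)  # skin occludes the iris outside the fissure

image = composite_layers([skin, sclera, iris, eyelid])
```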
Fig. 2. Multilayered 2D eye region model.
For example, the iris layer (the third layer from the bottom) has a circular region to represent the iris. The eyelid layer (the fourth layer, one above the iris layer) has two curves to represent the upper and lower eyelids. The region between those curves (the palpebral fissure) is transparent, while the region above the upper curve and the region below the lower curve are filled with skin color. When the eyelid layer is superimposed over the iris layer, only the portion of the circular region between the eyelid curves appears in the final eye region image, while the rest is occluded by the skin pixels in the eyelid layer. When the upper curve in the eyelid layer is lowered, corresponding to eyelid closure, a greater portion of the circular region in the iris layer is occluded.

Fig. 1. Diversity in the appearance of eye images. (a) Variance from structural individuality. (b) Variance from motion of a particular eye.

Table 1 shows the eye components represented in the multilayered eye region model along with their control parameters. We call parameters $d_u$, $f$, $d_b$, $d_r$, $r_i$, and $I_{r7}$ the structure parameters (denoted by $s$), which define the static, structural detail of an eye region model; we call the parameters $height$ and $skew$ of the upper eyelid, $height$ of the lower eyelid, and $x$ and $y$ of the iris the time-dependent motion parameters (denoted by $m_t$, $t$: time), which define the dynamic detail of the model. The eye region model defined and constructed by the structure parameters $s$ and the motion parameters $m_t$ is denoted by $T(\mathbf{x}; s, m_t)$, where $\mathbf{x}$ denotes pixel positions in model coordinates. Table 2 and Table 3 show examples of the appearance changes due to different values of $s$ and $m_t$ in the eye region model $T(\mathbf{x}; s, m_t)$.

2.1 Upper Eyelid

The upper eyelid is a skin region that covers the upper area of the palpebral fissure (the eye aperture). It has two descriptive features: 1) a boundary between the upper eyelid and the palpebral fissure and 2) a furrow running nearly in parallel to the boundary, directly above the upper eyelid.

The model represents these features by two polygonal curves (curve1 and curve2) and the region (region1) surrounded by them. Both curve1 and curve2 consist of $N_u$ vertices, denoted by $u_1$ and $u_2$, respectively (Table 1).

2.1.1 Structure of Upper Eyelid

To represent the distance between the boundary and the furrow, parameter $d_u \in [0,1]$ gives the ratio to the predefined maximum distance between curve1 and curve2. When curve1 and curve2 coincide ($d_u = 0$), the upper eyelid appears to be a uniform region, which we refer to as a single eyelid fold. Single eyelid folds are common in East Asians. "Boldness" parameter $f \in [0,1]$ controls both the intensity $I_{r1}$ of region1 and the line width $w_{c2}$ of curve2 simultaneously, by $I_{r1} = I_{r1}^{brightest} - \alpha_1 \cdot f$ and $w_{c2} = \alpha_2 \cdot f + w_{c2}^{thickest}$ ($\alpha_1$, $\alpha_2$: constant). The appearance changes controlled by $d_u$ and $f$ are shown in Table 2.

2.1.2 Motion of Upper Eyelid

When an upper eyelid moves up and down (e.g., in blinking), the boundary between the upper eyelid and the palpebral fissure moves up and down. The model represents this motion by moving the vertices of curve1 ($u_1$). They move between the predefined curve for a completely open eye ($u_1^{top}$) and that for a closed eye ($u_1^{bottom}$), as shown in Fig. 3. Parameter $height \in [0,1]$ specifies the position of curve1 within this range, and thus the $i$th vertex position of curve1 ($u_{1i}$) is defined by parameter $height$ as

$u_{1i} = \sin\left(\frac{\pi}{2} \cdot height\right) \cdot u_{1i}^{top} + \left(1 - \sin\left(\frac{\pi}{2} \cdot height\right)\right) \cdot u_{1i}^{bottom}$,   (1)

where $u_{1i}^{top}$ and $u_{1i}^{bottom}$ are the positions of the $i$th vertices of $u_1^{top}$ and $u_1^{bottom}$, respectively. The sinusoidal term in (1) moves the vertices rapidly when $height$ is small and slowly when $height$ is large, relative to a linear change of $height$. This corresponds to the possibly rapid movement of the upper eyelid as it lowers in motions such as blinking.

The furrow on the upper eyelid also moves together with the boundary. The model represents this motion by moving the vertices of curve2 ($u_2$). The positions of the vertices of curve2 ($u_2$) are defined using parameters $height$ and $d_u$ such that they move in parallel to curve1 ($u_1$) when $height$ is larger than a preset threshold $height^T$, and otherwise move more slowly, keeping the distance between curve1 and curve2 wide.

If $height$ is larger than $height^T$, then

$u_{2i}^x = u_{1i}^x$,   (2)

$u_{2i}^y = u_{1i}^y - \left(\epsilon_1 \cdot \frac{|u_{1i}^x - \bar{u}_1^x|}{|u_{11}^x - \bar{u}_1^x|} + \epsilon_2\right) \cdot d_u$,   (3)

where

$\bar{u}_1^x = \frac{u_{11}^x + u_{1N_u}^x}{2}$.
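A minimal sketch of the interpolation in (1), assuming curve vertices are stored as (N_u, 2) NumPy arrays; the toy templates are illustrative.

```python
import numpy as np

def upper_eyelid_curve(height, u_top, u_bottom):
    """Eq. (1): position curve1 between the fully open and closed
    templates with a sinusoidal blend, so the lid moves fast near
    closure (small height) and slowly when nearly open.

    height:   scalar in [0, 1]
    u_top:    (Nu, 2) vertices for a completely open eye
    u_bottom: (Nu, 2) vertices for a closed eye
    """
    w = np.sin(0.5 * np.pi * height)        # sin((pi/2) * height)
    return w * u_top + (1.0 - w) * u_bottom

# Toy templates with Nu = 8 vertices.
xs = np.linspace(0.0, 100.0, 8)
u_top = np.stack([xs, 40.0 - 30.0 * np.sin(np.pi * xs / 100.0)], axis=1)
u_bottom = np.stack([xs, np.full(8, 40.0)], axis=1)

half_open = upper_eyelid_curve(0.5, u_top, u_bottom)
```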
TABLE 1 Detailed Description of the Eye Region Model
else

$u_{2i}^x = \left(1 - \beta_{d_u,height}\right) \cdot u_{1i}^{x}\big|_{height=height^T} + \beta_{d_u,height} \cdot u_{1i}^{x}\big|_{bottom}$,   (4)

$u_{2i}^y = \left(1 - \beta_{d_u,height}\right) \cdot \tilde{u}_{1i}^{y}\big|_{height=height^T} + \beta_{d_u,height} \cdot \tilde{u}_{1i}^{y}\big|_{bottom}$,   (5)

where

$\beta_{d_u,height} = \left(1 - \frac{height}{height^T}\right) \cdot (1 - d_u)$,

$\tilde{u}_{1i}^y = u_{1i}^y - \left(\epsilon_1 \cdot \frac{|u_{1i}^x - \bar{u}_1^x|}{|u_{11}^x - \bar{u}_1^x|} + \epsilon_2\right) \cdot d_u$,

end if, where $\epsilon_1$ and $\epsilon_2$ are constants.

The boundary also appears skewed horizontally when the eye is not straight to the camera, because it lies on a spherical eyeball. The model represents this by horizontally skewing curve1 using parameter $skew \in [0,1]$. As shown in Fig. 4, the vertices of curve1 ($u_1$) defined by (1) are transformed into the skewed positions ($u_1^{skewed}$) under orthographic projection, where $C$ denotes the center of the eyeball and $\theta$ defines the opening of the eye. The coordinates of $C$ in the $x_{eye}$-$z_{eye}$ plane are

$C^{x_{eye}} = \left(u_{1N_u}^{x_{eye}} + u_{11}^{x_{eye}}\right) / 2$,   (6)

$C^{z_{eye}} = C^{x_{eye}} \cdot \tan\theta$.   (7)

The coordinates $\left(u_{1i}^{x_{eye}}, u_{1i}^{z_{eye}}\right)$ of $u_{1i}$ projected onto the spherical surface should satisfy (8), with $r$ being the radius of the sphere:

$\left(u_{1i}^{x_{eye}} - C^{x_{eye}}\right)^2 + \left(u_{1i}^{z_{eye}} - C^{z_{eye}}\right)^2 = r^2$.   (8)

The $x$ coordinate of the horizontally skewed position of $u_{1i}$ ($u_{1i}^{skewed,x}$) in the $x$-$z$ plane is then obtained as

$u_{1i}^{skewed,x} = u_{1i}^{x_{eye}} \cdot \cos(\lambda \cdot skew) + u_{1i}^{z_{eye}} \cdot \sin(\lambda \cdot skew)$,   (9)

with $\lambda$ a constant that maps $skew$ to a rotation angle. The first two rows of Table 3 show examples of the appearance changes due to parameters $height$ and $skew$.
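The following sketch walks through (6)-(9) for the x coordinates of curve1, under stated assumptions: the sphere radius is chosen so the eye-corner vertices lie on the sphere, and lambda_skew stands in for the angle-scale constant whose symbol was lost; neither choice is claimed to come from the paper.

```python
import numpy as np

def skew_curve1_x(u1_x, theta=np.pi / 6, skew=0.3, lambda_skew=np.pi / 4):
    """Eqs. (6)-(9): horizontally skew the upper-eyelid vertices as
    points on a spherical eyeball seen under orthographic projection.

    u1_x: (Nu,) x coordinates of curve1 in eyeball coordinates.
    """
    # Eyeball center C from the two eye-corner vertices, eqs. (6)-(7).
    cx = 0.5 * (u1_x[0] + u1_x[-1])
    cz = cx * np.tan(theta)
    # Sphere radius chosen so the corner vertices (at depth z = 0)
    # lie on the sphere -- an assumption; the paper presets r.
    r = np.hypot(u1_x[0] - cx, cz)
    # Back-project each vertex onto the sphere: solve eq. (8) for z,
    # taking the branch nearer the camera.
    dz = np.sqrt(np.maximum(r**2 - (u1_x - cx)**2, 0.0))
    u1_z = cz + dz
    # Eq. (9): rotate about the vertical axis by an angle set by skew.
    a = lambda_skew * skew
    return u1_x * np.cos(a) + u1_z * np.sin(a)

skewed_x = skew_curve1_x(np.linspace(-40.0, 40.0, 8))
```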
TABLE 2 Appearance Changes Controlled by Structure Parameters
Fig. 3. The up-down position of curve1.
TABLE 3 Appearance Changes Controlled by Motion Parameters
Fig. 4. The horizontal skew of curve1.
2.2 Lower Eyelid

A lower eyelid is a skin region that covers the lower area of the palpebral fissure. It has four descriptive features:

1. a boundary between the lower eyelid and the palpebral fissure,
2. a bulge below the boundary, which results from the shape of the covered portion of the eye, shortening of the inferior portion of the orbicularis oculi muscle (a sphincter muscle around the eye) along its length, and the effects of gravity and aging,
3. an infraorbital furrow parallel to and below the lower eyelid, running from near the inner corner of the eye and following the cheek bone laterally [9], and
4. a brighter region on the bulge, mainly caused by the reflection of illumination.

As shown in Table 1, the model represents these features by four polygonal curves (curve3, curve4, curve5, and curve6) and two regions (region2, surrounded by curve3 and curve4, and region3, surrounded by curve3 and curve6). Curve3, curve4, and curve6 consist of $N_l$ vertices and are denoted by $l_1$, $l_2$, and $l_4$, respectively. Curve5 is the middle portion of curve4, consisting of $N_f$ vertices denoted by $l_3$.

2.2.1 Structure of Lower Eyelid

Distance ratio parameter $d_b \in [0,1]$ controls the distance between curve3 and curve4. The vertices of curve4 ($l_2$) have predefined positions for both the thinnest bulge ($l_2^{top}$) and the thickest bulge ($l_2^{bottom}$), as shown in Fig. 5. The position of the $j$th vertex of $l_2$ is defined by using parameter $d_b$ as

$l_{2j} = d_b \cdot l_{2j}^{bottom} + (1 - d_b) \cdot l_{2j}^{top}$,   (10)

where $l_{2j}^{top}$ and $l_{2j}^{bottom}$ are the positions of the $j$th vertices of $l_2^{top}$ and $l_2^{bottom}$, respectively.

Distance ratio parameter $d_r \in [0,1]$ controls the distance between curve3 and curve6. The position of the $j$th vertex of $l_4$ is defined by using $l_1$, $l_2$, and parameter $d_r$ as

$l_{4j} = d_r \cdot l_{2j} + (1 - d_r) \cdot l_{1j}$.   (11)

2.2.2 Motion of Lower Eyelid

When the lower eyelid moves up or down (e.g., in eyelid tightening), the boundary between the lower eyelid and the palpebral fissure moves, and the fissure correspondingly changes in area.
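The lower-eyelid structure parameters reduce to linear blends; a short sketch of (10) and (11), with toy (N_l, 2) vertex templates as assumptions:

```python
import numpy as np

def lower_eyelid_structure(d_b, d_r, l1, l2_top, l2_bottom):
    """Eqs. (10)-(11): place the bulge and reflection curves.

    l1:        (Nl, 2) vertices of curve3 (lower-eyelid boundary)
    l2_top:    (Nl, 2) curve4 template for the thinnest bulge
    l2_bottom: (Nl, 2) curve4 template for the thickest bulge
    Returns (l2, l4): curve4 placed by d_b and curve6 placed by d_r.
    """
    l2 = d_b * l2_bottom + (1.0 - d_b) * l2_top   # eq. (10)
    l4 = d_r * l2 + (1.0 - d_r) * l1              # eq. (11)
    return l2, l4

xs = np.linspace(0.0, 100.0, 11)
l1 = np.stack([xs, 60.0 + 8.0 * np.sin(np.pi * xs / 100.0)], axis=1)
l2, l4 = lower_eyelid_structure(0.7, 0.5, l1, l1 + [0, 5], l1 + [0, 15])
```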
Fig. 5. The model for a lower eyelid.
Fig. 6. The model for the outer corner.
The bulge, the infraorbital furrow, and the brighter region on the bulge also move together with the boundary. Our model represents this motion by moving the vertices of curve3, curve5, and curve6. The vertices of curve3 have predefined positions for both the highest position ($l_1^{top}$) and the lowest ($l_1^{bottom}$). Parameter $height \in [0,1]$ gives the position within this range. The position of the $j$th vertex of $l_1$ is obtained using parameter $height$ as

$l_{1j} = height \cdot l_{1j}^{top} + (1 - height) \cdot l_{1j}^{bottom}$,   (12)

where $l_{1j}^{top}$ and $l_{1j}^{bottom}$ are the positions of the $j$th vertices of $l_1^{top}$ and $l_1^{bottom}$, respectively. Likewise, parameter $height$ controls the positions of $l_2$ through $l_2^{top}$ and $l_2^{bottom}$ in (10):

$l_{2j}^{top} = height \cdot l_{2j}^{top,t} + (1 - height) \cdot l_{2j}^{top,b}$,   (13)

$l_{2j}^{bottom} = height \cdot l_{2j}^{bottom,t} + (1 - height) \cdot l_{2j}^{bottom,b}$,   (14)

where $(l_{2j}^{top,t}, l_{2j}^{top,b})$ and $(l_{2j}^{bottom,t}, l_{2j}^{bottom,b})$ are the preset dynamic ranges for $l_{2j}^{top}$ and $l_{2j}^{bottom}$.

Parameter $height$ also controls both the intensity of curve3 and that of curve5 ($I_{c3}$ and $I_{c5}$) by $I_{c3} = I_{c3}^{brightest} - \alpha_3 \cdot height$ and $I_{c5} = I_{c5}^{brightest} - \alpha_4 \cdot height$ ($\alpha_3$, $\alpha_4$: constant). Table 3 shows examples of the appearance changes controlled by parameter $height$.

2.3 Sclera

The sclera is the white portion of the eyeball. We limit it to the region that can be seen in the palpebral fissure, which is surrounded by the upper eyelid and the lower eyelid. Our model represents the sclera by a region (region4) surrounded by curve1 and curve3, which are defined to represent the upper and lower eyelids, as shown in Table 1.

When the upper eyelid and/or the lower eyelid move, the sclera changes its shape. Our model controls the change indirectly, through the motion parameters $height$ and $skew$ of the upper eyelid and $height$ of the lower eyelid, which primarily control the appearance changes of the upper and lower eyelids due to their motions. Parameter $height$ also controls the intensity of region4 by $I_{r4} = \alpha_5 \cdot height + I_{r4}^{darkest}$ ($\alpha_5$: constant).

2.4 Corners

Corners are regions at the medial (close to the midline) and lateral regions of the sclera. They are usually darker than other parts of the sclera due to shadow and the color of the caruncle (a small, red portion of the corner of the eye that contains sebaceous and sweat glands). As shown in Table 1, our model represents the outer corner by a region surrounded by three polygonal curves (curve1, curve3, and curve7) and the inner corner by curve1, curve3, and curve8. Both curve7 and curve8 consist of $N_c$ vertices, denoted by $c_1$ and $c_2$, respectively. Fig. 6 depicts the details of the outer corner model.

When the upper eyelid and/or the lower eyelid move, the shape of the eye corners changes. Our model controls the motion of the upper and the lower boundaries by the eyelid motion parameters $height$ and $skew$, as mentioned. The $x$ coordinates of $c_{12}$ and $c_{13}$ are moved from predefined neutral positions based on parameter $skew$, according to the horizontal proportions $Pc_{12}/PQ$ and $Pc_{13}/PQ$, respectively, and their $y$ coordinates are determined so as to keep the vertical proportions the same.

2.5 Iris

The iris is a circular, colored region on the eyeball. The apparent color of the iris is mainly determined by the reflection of environmental illumination and by the iris texture and patterns, including the pupil (an aperture in the center of the iris). Our model represents the iris by a circular region, region7, as shown in Table 1. Parameter $r_i$ and parameter $I_{r7}$ control the radius and the variable single color of region7, respectively. The color of the iris is represented as the average gray level inside the iris.

The position of the iris center moves when the gaze direction changes. Our model represents the motion by moving the center coordinate $(i_x, i_y)$ of region7. It has predefined positions for gaze left ($i_x^l$), gaze right ($i_x^r$), gaze up ($i_y^u$), and gaze down ($i_y^d$). Parameters $x \in [0,1]$ and $y \in [0,1]$ give the position within these ranges as

$i_x = x \cdot i_x^r + (1 - x) \cdot i_x^l$,   (15)

$i_y = y \cdot i_y^u + (1 - y) \cdot i_y^d$.   (16)

Table 3 includes examples of the appearance changes due to parameters $x$ and $y$.

3 MODEL-BASED EYE IMAGE ANALYSIS

Fig. 7 shows a schematic overview of the whole process of the model-based eye region image analysis system. An input image sequence contains facial behaviors of a subject. Facial behaviors are usually accompanied by spontaneous head motions. The appearance changes of facial images thus comprise both rigid 3D head motions and nonrigid facial actions. Decoupling these two components is realized by recovering the 3D head pose across the image sequence and by accordingly warping the faces to a canonical head pose (frontal and upright), which we refer to as the stabilized images. Stabilized images are intended to include appearance changes due to facial expression only. Eye image analysis proceeds on these stabilized images. For a given stabilized image sequence, the system registers the eye region model to the input in the initial frame and individualizes the model by adjusting the structure parameters $s$ (Table 1). Motion of the eye is then tracked by estimating the motion parameters $m_t$ across the entire image sequence. If the tracking results at any time $t$ are off the correct positions, the model is readjusted; otherwise, we finally obtain the estimated motion together with the structure of the eye.
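A skeletal view of this pipeline; the helper functions are placeholders for the stages named above (head tracking [26], stabilization warping, manual individualization, and the motion tracking of Section 3.3), and the dictionary keys are illustrative labels, not the paper's API.

```python
# Placeholder stages; each would be replaced by the real component.
def track_head(frames):             # cylindrical-model head tracker [26]
    return [None] * len(frames)

def warp_to_frontal(frame, pose):   # stabilization warp
    return frame

def individualize_model(image):     # manual GUI step in the paper
    s = {"d_u": 0.5, "f": 0.7, "d_b": 0.7, "d_r": 0.7, "r_i": 0.5, "I_r7": 0.7}
    m0 = {"height_u": 1.0, "skew": 0.5, "height_l": 0.5, "x": 0.5, "y": 0.5}
    return s, m0

def track_motion(image, s, m_prev):  # gradient-descent update, Section 3.3
    return dict(m_prev)

def analyze_sequence(frames):
    """Sketch of the model-based analysis pipeline (Fig. 7)."""
    poses = track_head(frames)
    stabilized = [warp_to_frontal(f, p) for f, p in zip(frames, poses)]
    s, m0 = individualize_model(stabilized[0])   # s stays fixed afterward
    motions = [m0]
    for image in stabilized[1:]:
        motions.append(track_motion(image, s, motions[-1]))
    return s, motions

s, motions = analyze_sequence(frames=[object()] * 3)
```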
Fig. 8. Automatic recovery of 3D head motion and image stabilization [26]. (a) Frames 1, 10, and 26 from original image sequence. (b) Face tracking in corresponding frames. (c) Stabilized face images. (d) Localized face regions.
Fig. 7. A schematic overview of the model-based eye image analysis system.

3.1 Head Motion Stabilization

We use a head tracker that is based on a 3D cylindrical head model [26]. Manually given the head region, with the pose and feature point locations (e.g., eye corners), in an initial frame, the tracker automatically builds the cylindrical model and recovers the 3D head poses and feature point locations across the rest of the sequence. The initial frame is selected such that it has the most frontal and upright face. The tracker recovers the full 3D rigid motion (three rotations and three translations) of the head. Performance evaluation on both synthetic and real images has demonstrated that it can track rotations as large as 40 degrees of yaw and 75 degrees of pitch within 3.86 degrees of average error.

As shown in Fig. 8, the stabilized face images cancel out most of the effect of 3D head pose and contain only the remaining nonrigid facial expression.

3.2 Individualization of Eye Region Model

The system first registers the eye region model to a stabilized face in an initial frame $t = 0$ by scaling and rotating the model so that both ends of curve1 ($u_1$) of the upper eyelid coincide with the eye corner points in the image. The initial frame is selected as a frame that contains a neutral eye (an open eye with the iris at the center), which may be different from the initial frame used in head tracking. In the current implementation, the individualized structure parameters $s$ are obtained manually by using a graphical user interface and are fixed across the entire sequence. Example results of individualization with respect to each factor of the appearance diversities in Fig. 1 are shown in Table 4.

3.3 Tracking of Eye Motion

The pixel intensity values of both the input eye region and the eye region model are normalized prior to eye motion tracking so that they have the same average and standard deviation. The motion parameters in the initial frame, $m_0$, are manually adjusted when the eye region model is individualized.

With the initial motion parameters $m_0$ and the structure parameters $s$, the system tracks the motion of the eye across the rest of the sequence, starting from $t = 0$, to obtain $m_t$ at all $t$. The system tracks the motion parameters with an extended version of the Lucas-Kanade gradient descent algorithm [27], which allows the searched template (here, the eye region model) to deform while tracking. Starting from the values in the previous frame, the motion parameters $m_t$ at the current frame $t$ are estimated by minimizing the following objective function $D$:

$D = \sum_{\mathbf{x}} \left[T(\mathbf{x}; m_t + \Delta m_t) - I(W(\mathbf{x}; p_t + \Delta p_t))\right]^2$,   (17)

where $I$ is the input eye region image, $W$ is a warp from the coordinate system of the eye region model to that of the eye region image, and $p_t$ is a vector of the warp parameters, which includes only translation in this implementation. The structure parameters $s$ do not appear in $T$ because they are fixed while tracking.
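A condensed sketch of one tracking iteration for (17), in the spirit of Lucas-Kanade [27]; the stacked-Jacobian formulation and the callables are assumptions standing in for the derivation in the Appendix, and the update and stopping test correspond to (18) and the convergence criteria described below.

```python
import numpy as np

def lk_update(T_model, I_warped, J):
    """One Gauss-Newton step for eq. (17).

    T_model:  (N,) model intensities T(x; m_t) at the sample pixels
    I_warped: (N,) input intensities I(W(x; p_t))
    J:        (N, k) Jacobian of the model-minus-image residual
              T - I(W) w.r.t. the stacked parameters [m_t, p_t]
    Returns the increment [delta_m, delta_p] of length k.
    """
    r = I_warped - T_model                  # negated residual
    # Linearized least squares: minimize ||J*delta - r||^2.
    delta, *_ = np.linalg.lstsq(J, r, rcond=None)
    return delta

def track_frame(params, render, warp_sample, jacobian,
                tol=1e-3, max_iters=50):
    """Iterate updates until the increments fall below preset
    thresholds or the iteration limit is reached (Section 3.3)."""
    for _ in range(max_iters):
        delta = lk_update(render(params), warp_sample(params),
                          jacobian(params))
        params = params + delta             # the update of eq. (18)
        if np.all(np.abs(delta) < tol):
            break
    return params
```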
TABLE 4 Example Results of Structure Individualization
$\Delta m_t$ and $\Delta p_t$ are obtained by solving the simultaneous equations derived from the first-order Taylor expansion of (17), as explained in detail in the Appendix, which can be viewed for free at http://computer.org/tpami/archives.htm. $m_t$ and $p_t$ are then updated:

$m_t \leftarrow m_t + \Delta m_t, \qquad p_t \leftarrow p_t + \Delta p_t$.   (18)

The iteration process at a particular frame $t$ converges when the absolute values of $\Delta m_t$ and $\Delta p_t$ become less than preset thresholds or the number of iterations reaches a maximum. The region surrounded by curve1 ($u_1$) and curve3 ($l_1$) of the eyelids is used for the calculation, so that more weight is placed on the structure inside the eye (the palpebral fissure) and other facial components that may appear in the eye region (such as an eyebrow) do not interfere. When parameter $height$ of the upper eyelid is less than a preset threshold, the position of region7 ($x$ and $y$) is not updated, because the iris is then so occluded that its position estimate is unreliable. The warp parameters $p_t$ are likewise not updated when $height$ is less than a preset threshold, because a closed (or almost closed) eye shows only horizontal structure, which reliably constrains only the vertical position of the eye region.

4 EXPERIMENTS

We applied the proposed system to 577 image sequences from two independently collected databases: the Cohn-Kanade AU-Coded Facial Expression Image Database [28] and the Ekman-Hager Facial Action Exemplars [29]. The subjects in these databases are young adults and include both men and women of varied ethnic background. They wear no glasses or other accessories that could occlude their faces. With few exceptions, head motion ranges from none (Ekman-Hager) to small (Cohn-Kanade), and head pose is frontal. Image sequences were recorded using VHS or S-VHS video and digitized into 640 by 480 gray-scale or 16-bit color pixel arrays. Image sequences begin with a neutral or near-neutral expression and end with a target expression (e.g., lower eyelids tightened). In Cohn-Kanade, image sequences are continuous (30 frames per second). In Ekman-Hager, they are discontinuous and include the initial neutral or near-neutral expression and two samples each of low, medium, and high-intensity facial action taken from a longer image sequence.

In the experiments reported here, we empirically chose the following parameter values for the eye model: $N_u = 8$, $N_l = 11$, $N_f = 8$, $\epsilon_1 = 30$, $\epsilon_2 = 40$, $\alpha_1 = 20$, $\alpha_2 = 10$, $\alpha_3 = 80$, $\alpha_4 = 30$, $\alpha_5 = 70$, $I_{r1}^{brightest} = 160$, $I_{c3}^{brightest} = 160$, $I_{c5}^{brightest} = 130$, $I_{r4}^{darkest} = 120$, $w_{c2}^{thickest} = 5$, and $\theta = \pi/6$. The initialization for tracking was done on the first neutral or near-neutral expression frame in each sequence. The system generates the eye region model as a graphic image at a particular resolution. Because the sizes and positions of the graphics objects (e.g., lines) are specified in integers, the resolution and sharpness of the graphic image must be high enough for the model to represent the fine structures of an eye region. In our initial implementation, the resolution was set at 350 by 250 pixels. The system then registered the model to the input eye region by scaling and rotating it, as explained in Section 3.2. We examined the results for diverse static eye structures and for the whole range of appearance changes from neutral to the utmost intensities in dynamic motion.
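Pinned down as configuration, these empirically chosen constants look as follows; the values are from the text, while the container, key names, and the epsilon/alpha labeling (reconstructed above from the extraction) are illustrative.

```python
from math import pi

# Empirically chosen eye-model constants (Section 4).
EYE_MODEL_CONFIG = {
    "N_u": 8,                   # vertices of curve1/curve2 (upper eyelid)
    "N_l": 11,                  # vertices of curve3/curve4/curve6
    "N_f": 8,                   # vertices of curve5
    "eps1": 30, "eps2": 40,     # curve2 offset constants, eqs. (3)-(5)
    "alpha1": 20, "alpha2": 10, # boldness: region1 intensity, curve2 width
    "alpha3": 80, "alpha4": 30, # curve3/curve5 intensity vs. height
    "alpha5": 70,               # region4 (sclera) intensity vs. height
    "I_r1_brightest": 160, "I_c3_brightest": 160, "I_c5_brightest": 130,
    "I_r4_darkest": 120, "w_c2_thickest": 5,
    "theta": pi / 6,            # eyeball opening angle, eq. (7)
    "render_size": (350, 250),  # model raster resolution in pixels
}
```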
TABLE 5 Example Results for a Variety of Upper Eyelids
(a) Single-fold eyelids. (b) Double-fold eyelids. (c) Thick eyelids. (d) Revealing eyelids.
4.1 Cohn-Kanade AU-Coded Facial Expression Image Database

This database was collected by the Carnegie Mellon and University of Pittsburgh group. A large part of it has been publicly released. For this experiment, we used 490 image sequences of facial behaviors from 101 subjects, all but one of which were from the publicly released subset of the database. The subjects are adults ranging from 18 to 50 years old, of both genders (66 females and 35 males) and a variety of ethnicities (86 Caucasians, 12 African Americans, 1 East Asian, and 2 from other groups). Subjects were instructed by an experimenter to perform single AUs and their combinations in an observation room. Their facial behavior was then manually FACS labeled [9]. The image sequences that we used in this experiment began with a neutral face and had out-of-plane motion as large as 19 degrees.

4.2 Ekman-Hager Facial Action Exemplars

This database was provided by Ekman at the Human Interaction Laboratory, University of California, San Francisco; the images were collected by Hager, Methvin, and Irwin. For this experiment, we used 87 image sequences from 18 Caucasian subjects (11 females and 7 males). Some sequences have large lighting changes between frames; for these, we normalized the intensity so as to keep the average intensity constant throughout the image sequence. Each image sequence in this database consists of six to eight frames that were sampled from a longer sequence. Image sequences begin with a neutral expression (or a weak facial action) and end with stronger facial actions.

5 RESULTS AND EVALUATION

We used both qualitative and quantitative approaches to evaluate system performance. Qualitatively, we evaluated the system's ability to represent the upper eyelids, localize and track the iris, represent the infraorbital furrow, and track widening and closing of the eyelids. Successful performance ensures that the system is robust to ethnic and cosmetic differences in eyelid structure (e.g., single versus double fold) and captures features that would be necessary for accurate action unit recognition (direction of gaze, infraorbital furrow motion, and eyelid widening and closing). In the quantitative evaluation, we investigated system performance with respect to the resolution and sharpness of input eye region images, initialization, and the complexity of the eye model.

5.1 Examples

Of the total 577 image sequences with 9,530 frames, the eye region model failed to match well in only five image sequences (92 frames total duration) from two subjects. One of the sequences contained relatively large and rapid head motion (approximately 20 degrees within 0.3 seconds) not otherwise present in either database. This motion caused interlacing distortion in the stabilized image that was not parameterized in the model. The other four error cases, from a second subject, were due to limitations in individualization, as discussed below.

5.1.1 Upper Eyelids

A most likely failure would be that the curve of the upper eyelid model matches the second (upper) curve of a double-fold eyelid in the input when they have similar appearance. As shown in Table 5b, our system was not compromised by such double-fold eyelids. Note that the eye region images shown in the table are after image stabilization; the face itself moves in the original image sequence. (This is true of the subsequent tables through Table 10.)

When an upper eyelid appears thick due to cosmetics, eyelashes, or shadow, a model with a single thin line could mistakenly match at many locations within the area of thickness. Such errors did not occur; by treating the boldness of the upper eyelids as a variable, our system was able to track the correct positions of the upper eyelids, as shown in Table 5c.

Some subjects had double-fold eyelids that appeared single-folded when the face was at rest (i.e., neutral expression). In these cases, the second (hidden) curves were revealed when the eyelids began to widen or narrow, which unfolded the double fold. The boldness parameter absorbed this "revealing effect," and the system was able to track the upper eyelid contour correctly, as shown in Table 5d.
TABLE 6 Example Results for Irises of Different Colors. (a) Bright iris. (b) Dark iris.

TABLE 7 Example Results for Differences in Appearance Below the Eye. (a) Bulge. (b) Bulge with reflection.
5.1.2 Irises

A most likely failure in tracking irises would be for the iris model to match another dark portion in the eye region, such as the shadow around the hollow between the inner corner of the eye and the root of the nose. An especially bright iris could contribute to this type of error. This situation could happen if one were to try to find the location of the iris by finding only a circular region with a fixed dark color (e.g., Tian et al. [22]). Because our method uses the whole eye region as a pattern in matching and includes the color and size of the iris as variables, the system was able to track the positions of irises accurately over a wide range of brightness, as shown in Table 6.

5.1.3 Bulge with Reflection Below the Eye

A most likely failure would be that the curve of the lower eyelid model matches the lower edge of the bulge or the infraorbital furrow. This could occur when the appearance of a bright bulge with the furrow below it is similar to that of the sclera with a lower eyelid curve below it. By considering the bulge, the illumination reflection on the bulge, and the infraorbital furrow in modeling the appearance below the eye, our system tracked lower eyelids accurately, as shown in Table 7.

5.1.4 Motion

Of the 44 AUs defined in FACS [9], six single AUs are defined in the eye region. These include AU 5 (upper lid raiser), AU 6 (cheek raiser and lid compressor), AU 7 (lid tightener, which encompasses AU 44 in the 2002 edition of FACS), AU 43 (eye closure), AU 45 (blink), and AU 46 (wink). Gaze directions are also defined: AU 61 (turn left), AU 62 (turn right), AU 63 (up), and AU 64 (down). Tables 8a, 8b, 8c, 8d, and 8e correspond to AU 5, AU 6+62, AU 6, AU 45 and AU 7, and AU 6+7, respectively, which cover the AUs related to the eye region. The frames shown range from neutral to the maximum intensity of the AUs. A most likely failure due to appearance changes from the motion of an eye would be that tracking of the upper and lower eyelids fails when the distance between them closes, such as in blinking (AU 45). Our system tracked blinking well, as shown in Table 8.

Tracking eye motion by matching a whole eye region increased system robustness relative to individually tracking feature points (such as in [3], [22], [30], [31]) or using a generic eye model. Studies [22], [32] that have used parabolic curves to represent eye shape have been less able to represent skewed eyelid shapes. Our model explicitly parameterizes skewing in the upper eyelid model; accordingly, the system was able to track such skewed upper eyelids in motion, as shown in Tables 8b and 8d.

5.1.5 Failure

Iris localization failed for a Caucasian female who had a bright iris with strong specular reflection and a thick and bold outer eye corner. Fig. 9 shows the error. While the eyelids were correctly tracked, the iris model mistakenly located the iris at the dark eye corner. Failure to correctly model the texture inside the iris appeared to be the source of this error. To solve this problem in future work, we anticipate that recurrently incorporating the appearance of the target eye region into the model during tracking would be effective and, more generally, would improve the ability to accommodate unexpected appearance variation.

5.2 Quantitative Evaluation

To quantitatively evaluate the system's accuracy, we compared the positions of the model points for the upper and lower eyelids and the iris center ($u_1$, $l_1$, and $x$ and $y$ in Table 1, respectively) with ground truth. Ground truth was determined by manually labeling the same number of points around the upper and lower eyelids and at the iris center using a computer mouse. These points were then connected using polygonal curves. We then computed the Euclidean distance from each of the model points to the closest line segment between manually labeled points. If a model point was located horizontally outside of the eye, the line segment from the closest manually labeled endpoint was used. For the iris center, the Euclidean distance to the manually labeled iris center was computed. The Euclidean distances were normalized by dividing them by the width of the eye region. The vector of tracking errors is denoted as $\varepsilon$.
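A sketch of this error metric, assuming model and labeled points are (N, 2) arrays; the point-to-segment projection is standard geometry rather than code from the paper.

```python
import numpy as np

def point_to_polyline(p, verts):
    """Distance from point p to the closest segment of a polyline."""
    best = np.inf
    for a, b in zip(verts[:-1], verts[1:]):
        ab = b - a
        t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
        best = min(best, np.linalg.norm(p - (a + t * ab)))
    return best

def tracking_error(model_pts, labeled_pts, eye_width):
    """Normalized error (Section 5.2): distance of each model point
    to the polyline through the manually labeled points, divided by
    the eye region width."""
    return np.array([point_to_polyline(p, labeled_pts)
                     for p in model_pts]) / eye_width

# Toy usage with 8-point curves and a 100-pixel-wide eye region.
xs = np.linspace(0.0, 100.0, 8)
model = np.stack([xs, 20.0 + np.sin(xs / 15.0)], axis=1)
truth = np.stack([xs, np.full(8, 20.0)], axis=1)
eps = tracking_error(model, truth, eye_width=100.0)
```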
TABLE 8 Example Results for Motions
(a) Upper eyelid raising. (b) Gaze change and cheek raising. (c) Cheek raising. (d) Blinking and eyelid tightening. (e) Cheek raising and eyelid tightening.
5.2.1 Sensitivity to Input Image Size and Sharpness

When the size of the input eye region is small relative to the actual size of the eye, or the input image is not sufficiently sharp, the fine structure of the eye may not be sufficiently visible. Image sharpness here refers to a large gain in the high-frequency components of an image. To evaluate system robustness to input image size and sharpness, we compared tracking error across multiple sizes and sharpness levels of the input eye region images. The sharpness of the input images was sampled by applying a high-pass filter to the image sequences. We selected nine sequences for analysis based on the filter response: the three sequences with the strongest response, the three with the weakest, and three halfway between. To vary image size, we resampled the images into three levels: the original scale, 50 percent scale (0.5 × 0.5), and quarter scale (0.25 × 0.25). Eye motion tracking at the smaller scales used the same structure parameters as at the original scale. Table 9 shows an example of multiple scales of a particular eye region image. The table also shows the computation time for updating the model parameters (Pentium M, 1.6 GHz, 768 MB RAM, Windows XP; averaged over 10 trials). Fig. 10 shows the tracking error plotted against the widths of the image sequences. Tracking error up to about 10 percent of the eye region width may have resulted from error in manual labeling. The most likely cause of small error in manual labeling was ambiguity of the boundary around the palpebral fissure (Fig. 11). We found that an eye region width of about 15 pixels was the margin below which tracking became impaired for the upper eyelid, the lower eyelid, and the iris position. Above this value, performance was relatively robust with respect to both the size and sharpness of the input eye region.
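A sketch of how such a sweep could be scripted; the Laplacian high-pass response and average-pooling downscaler are assumptions, since the paper does not specify its filter or resampler.

```python
import numpy as np

def sharpness_response(img):
    """Mean absolute response of a Laplacian high-pass filter
    (one plausible stand-in for the paper's unspecified filter)."""
    lap = (-4.0 * img
           + np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1))
    return np.abs(lap).mean()

def downscale(img, factor):
    """Average-pool by an integer factor (2 -> 0.5x, 4 -> 0.25x)."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    return img[:h, :w].reshape(h // factor, factor,
                               w // factor, factor).mean(axis=(1, 3))

rng = np.random.default_rng(0)
eye = rng.random((100, 140))          # stand-in eye region image
scales = {1.0: eye, 0.5: downscale(eye, 2), 0.25: downscale(eye, 4)}
for s, img in scales.items():
    print(s, img.shape, round(sharpness_response(img), 3))
```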
Fig. 9. A failure case with a bright and specular iris. The dashed circle indicates the correct, manually labeled position; the solid circle indicates the system's output. The eyelids were correctly tracked, whereas the iris was mistakenly located at the dark eye corner.

5.2.2 Effect of Eye Model Details

The eye region model defines many structural components to represent the diversities of eye structure and motion. To investigate whether all are necessary, we systematically omitted each component and examined the resulting change in tracking error.
TABLE 9 Computation Time for Multiple Image Resolution
Table 10 shows the results of this comparison. When the model for double eyelid folds was omitted, tracking of the upper eyelid was compromised (Table 10a). Omitting the components for the appearance below the eye (Table 10b), or only the bright region on the bulge (Table 10c), had similar effects. To achieve accurate and robust eye motion tracking over diverse eye appearances and motions, all the detailed components of the eye region model proved necessary.

In Table 10a, the tracking error $\varepsilon$ shows that tracking of the other parts of the eye model was also compromised without the model for double eyelid folds (the errors for the upper eyelid curve $u_1$, the lower eyelid curve $l_1$, and the iris center $x$ and $y$ are shown in parentheses). This indicates that the model components support tracking accuracy as a whole, and erroneous individualization of one component affects the tracking accuracy of the other parts.
5.2.3 Sensitivity to Model Initialization

The eye region model is manually initialized in the first frame with respect to both the structure parameters $s$ and the motion parameters $m_t$. We observed that the initialization of the structure parameters (individualization of the model) dominantly affected the tracking results.

Fig. 10. Sensitivity to image resolution. (a) Tracking error for the upper eyelid. (b) Tracking error for the lower eyelid. (c) Tracking error for the iris center.

Fig. 11. An example of ambiguous boundaries around the palpebral fissure.
TABLE 10 Different Levels of Detail of the Model and Their Effects
(a) Double eyelid folds. (b) Bulge, infraorbital furrow, and reflection on the bulge. (c) Reflection on the bulge.
To evaluate the sensitivity of the system to initialization, we individually manipulated each structure parameter in turn while leaving the others fixed. Fig. 12b is the eye region model individualized to the example input eye region image shown in Fig. 12a, and Fig. 12c shows the change in tracking error when parameter $d_u$ was varied from 0.0 to 1.0 while the other parameters were left fixed at $f = 0.7$, $d_b = 0.71$, $d_r = 0.7$, $r_i = 0.5$, and $I_{r7} = 0.7$.
Fig. 12. An example of individualization of the model and sensitivity to parameter changes. (a) Input eye region image. (b) Individualized eye model: $s = \{d_u, f, d_b, d_r, r_i, I_{r7}\} = \{0.5, 0.7, 0.71, 0.7, 0.5, 0.7\}$. (c) Sensitivity to parameter $d_u$. (d) Sensitivity to parameter $f$. (e) Sensitivity to parameter $d_b$. (f) Sensitivity to parameter $d_r$. (g) Sensitivity to parameter $r_i$. (h) Sensitivity to parameter $I_{r7}$.
Figs. 12d, 12e, 12f, 12g, and 12h were similarly obtained. Each of the individualized structure parameters that provided stable tracking also locally minimized the tracking error. Only parameter $d_r$ was sensitive to the initialization in this particular example (the tracking error increased rapidly for a slight change of $d_r$). We also observed that the parameters were intercorrelated. Fig. 13 shows a contour plot of tracking error against changes of an example pair of structure parameters for the same image sequence used in Fig. 12. The error surface is clearly nonlinear, though with a weak linear trend.

Fig. 13. A contour plot of tracking error against an example pair of the structure parameters: the intensity ratio of the iris model $I_{r7}$ versus the distance ratio between eyelid folds $d_u$. The other parameters were left fixed. Brighter regions indicate larger error. White lines are the values individualized in Fig. 12.
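A sketch of this one-at-a-time sensitivity analysis; track_and_score and the quadratic stand-in error are hypothetical, and only the sweep structure mirrors the experiment.

```python
import numpy as np

def sensitivity_sweep(track_and_score, s_init, name, values):
    """Vary one structure parameter while the others stay fixed,
    recording the tracking error at each setting (Fig. 12 style)."""
    errors = []
    for v in values:
        s = dict(s_init, **{name: v})
        errors.append(track_and_score(s))
    return np.array(errors)

# Stand-in for running the tracker on one sequence: a smooth error
# bowl whose minimum sits at the individualized parameter values.
s0 = {"d_u": 0.5, "f": 0.7, "d_b": 0.71, "d_r": 0.7, "r_i": 0.5, "I_r7": 0.7}
fake_score = lambda s: sum((s[k] - s0[k]) ** 2 for k in s0)

grid = np.linspace(0.0, 1.0, 11)
err_du = sensitivity_sweep(fake_score, s0, "d_u", grid)
best = grid[err_du.argmin()]          # recovers d_u = 0.5 here
```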
6 CONCLUSION

The appearance of the eyes varies markedly due to both individual differences in structure and the motion of the eyelids and iris. Structural individuality includes the size and color of the iris; the width, boldness, and number of eyelid folds; the width of the bulge below the eye; and the width of the illumination reflection on the bulge. Eye motion includes the up-down action of the upper and lower eyelids and the 2D movement of the iris. This variation, together with self-occlusion and changes in reflection and in the shape of furrows and bulges, has made robust and precise analysis of the eye region a challenging problem. To meticulously represent the detailed appearance variation in both structural individuality and eye motion, we developed a generative eye region model and evaluated its effectiveness by using it to analyze a large number of face image sequences from two independent databases. The use of the detailed model led to better results than those previously reported. The system achieved precise tracking of the eyes over a variety of eye appearances and motions. Future work includes initialization of the eye region model by automatic registration.
ACKNOWLEDGMENTS

The authors would like to thank Jessie Van Swearingen and Karen L. Schmidt for their insights on the anatomy of the eyes, and Zara Ambadar, Nicki Ridgeway, Camilla Kydland, Lindsey Morris, and Simon Cohn for technical assistance. This work was supported by grant R01 MH51435 from the US National Institute of Mental Health. T. Moriyama and J. Xiao were with Carnegie Mellon University, Pittsburgh, Pennsylvania.

REFERENCES

[1] A. Kapoor, Y. Qi, and R.W. Picard, "Fully Automatic Upper Facial Action Recognition," Proc. IEEE Int'l Workshop Analysis and Modeling of Faces and Gestures, pp. 195-202, Oct. 2003.
[2] T. Moriyama, T. Kanade, J.F. Cohn, J. Xiao, Z. Ambadar, J. Gao, and H. Imamura, "Automatic Recognition of Eye Blinking in Spontaneously Occurring Behavior," Proc. IEEE Int'l Conf. Pattern Recognition, pp. 78-81, Aug. 2002.
[3] Y. Tian, T. Kanade, and J.F. Cohn, "Recognizing Action Units for Facial Expression Analysis," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 97-115, Feb. 2001.
[4] Y. Matsumoto, T. Ogasawara, and A. Zelinsky, "Behavior Recognition Based on Head Pose and Gaze Direction Measurement," Proc. IEEE/RSJ Int'l Conf. Intelligent Robots and Systems, pp. 2127-2132, 2000.
[5] J. Zhu and J. Yang, "Subpixel Eye Gaze Tracking," Proc. IEEE Int'l Conf. Automatic Face and Gesture Recognition, pp. 131-136, May 2002.
[6] J.G. Wang and E. Sung, "Study on Eye Gaze Estimation," IEEE Trans. Systems, Man, and Cybernetics, Part B, vol. 32, no. 3, pp. 332-350, 2002.
[7] K. Fukuda, "Eye Blinks: New Indices for the Detection of Deception," Psychophysiology, vol. 40, no. 3, pp. 239-245, 2001.
[8] R. Gross, J. Shi, and J. Cohn, "Quo Vadis Face Recognition?" Proc. Third Workshop Empirical Evaluation Methods in Computer Vision, Dec. 2001.
[9] Facial Action Coding System, P. Ekman et al., eds. Salt Lake City, Utah: Research Nexus, Network Research Information, 2002.
[10] P. Ekman and E. Rosenberg, What the Face Reveals, second ed. New York: Oxford Univ. Press, 1994.
[11] S.B. Gokturk, J.Y. Bouguet, C. Tomasi, and B. Girod, "Model-Based Face Tracking for View-Independent Facial Expression Recognition," Proc. IEEE Face and Gesture Conf., pp. 272-278, 2002.
[12] M. Pantic and L.J.M. Rothkrantz, "Automatic Analysis of Facial Expressions: The State of the Art," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1424-1445, Dec. 2000.
[13] I. Ravyse, H. Sahli, and J. Cornelis, "Eye Activity Detection and Recognition Using Morphological Scale-Space Decomposition," Proc. IEEE Int'l Conf. Pattern Recognition, vol. 1, pp. 5080-5083, 2000.
[14] S.H. Choi, K.S. Park, M.W. Sung, and K.H. Kim, "Dynamic and Quantitative Evaluation of Eyelid Motion Using Image Analysis," Medical and Biological Eng. and Computing, vol. 41, no. 2, pp. 146-150, 2003.
[15] R. Herpers, M. Michaelis, K.H. Lichtenauer, and G. Sommer, "Edge and Keypoint Detection in Facial Regions," Proc. IEEE Face and Gesture Conf., pp. 212-217, 1996.
[16] H. Chen, Y.Q. Yu, H.Y. Shum, S.C. Zhu, and N.N. Zheng, "Example Based Facial Sketch Generation with Non-Parametric Sampling," Proc. IEEE Int'l Conf. Computer Vision, vol. 2, pp. 433-438, 2001.
[17] S.P. Lee, J.B. Badler, and N.I. Badler, "Eyes Alive," Proc. Int'l Conf. Computer Graphics and Interactive Techniques, pp. 637-644, 2002.
[18] X. Xie, R. Sudhakar, and H. Zhuang, "On Improving Eye Feature Extraction Using Deformable Templates," Pattern Recognition, vol. 27, no. 6, pp. 791-799, June 1994.
[19] J. Deng and F. Lai, "Region-Based Template Deformation and Masking for Eye-Feature Extraction and Description," Pattern Recognition, vol. 30, no. 3, pp. 403-419, Mar. 1997.
[20] G. Chow and X. Li, "Towards a System for Automatic Facial Feature Detection," Pattern Recognition, vol. 26, no. 12, pp. 1739-1755, Dec. 1993.
[21] A. Yuille, D. Cohen, and P. Hallinan, "Feature Extraction from Faces Using Deformable Templates," Int'l J. Computer Vision, vol. 8, no. 2, pp. 99-111, Aug. 1992.
[22] Y. Tian, T. Kanade, and J.F. Cohn, "Eye-State Detection by Local Regional Information," Proc. Int'l Conf. Multimodal User Interface, pp. 143-150, Oct. 2000.
[23] L. Sirovich and M. Kirby, "Low-Dimensional Procedure for the Characterization of Human Faces," J. Optical Soc. of Am., vol. 4, pp. 519-524, 1987.
[24] M.A. Turk and A.P. Pentland, "Face Recognition Using Eigenfaces," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 586-591, 1991.
[25] I. King and L. Xu, "Localized Principal Component Analysis Learning for Face Feature Extraction," Proc. Workshop 3D Computer Vision, pp. 124-128, 1997.
[26] J. Xiao, T. Moriyama, T. Kanade, and J.F. Cohn, "Robust Full-Motion Recovery of Head by Dynamic Templates and Re-Registration Techniques," Int'l J. Imaging Systems and Technology, vol. 13, pp. 85-94, Sept. 2003.
[27] B.D. Lucas and T. Kanade, "An Iterative Image Registration Technique with an Application to Stereo Vision," Proc. Int'l Joint Conf. Artificial Intelligence, pp. 674-679, 1981.
[28] T. Kanade, J.F. Cohn, and Y. Tian, "Comprehensive Database for Facial Expression Analysis," Proc. IEEE Face and Gesture Conf., pp. 46-53, 2000.
[29] P. Ekman, J. Hager, C.H. Methvin, and W. Irwin, "Ekman-Hager Facial Action Exemplars," Human Interaction Laboratory, Univ. of California, San Francisco, unpublished data.
[30] M. Pantic and L.J.M. Rothkrantz, "Expert System for Automatic Analysis of Facial Expression," Image and Vision Computing, vol. 18, no. 11, pp. 881-905, Aug. 2000.
[31] J.J. Lien, T. Kanade, J.F. Cohn, and C. Li, "Detection, Tracking, and Classification of Subtle Changes in Facial Expression," J. Robotics and Autonomous Systems, vol. 31, pp. 131-146, 2000.
[32] A. Blake and A. Yuille, eds., Active Vision, chapter 2, pp. 21-38. Cambridge, Mass.: MIT Press, 1992.
Tsuyoshi Moriyama received the PhD degree in electrical engineering from Keio University, Japan, in 1999. He is an assistant professor in the Department of Information and Computer Science at Keio University. After being a JSPS research fellow at the Institute of Industrial Science at the University of Tokyo, Japan, and a postdoctoral fellow at the Robotics Institute at Carnegie Mellon University, he joined Keio University in 2004. He has worked on many projects in multidisciplinary areas, including analysis/synthesis of emotional speech, automated summarization of movies, and automated facial expression analysis in computer vision. In addition to research activities, he has also dedicated himself to musical activities as a tenor, including performances with the Wagner Society Male Choir of Japan from 1990 to 2000 and the Pittsburgh Camerata from 2001 to 2003. He is a member of the IEEE and of the IEICE of Japan. He received the IEICE Young Investigators Award in 1998.
Takeo Kanade received the PhD degree in electrical engineering from Kyoto University, Japan, in 1974. He is the U.A. and Helen Whitaker University Professor of Computer Science and Robotics at Carnegie Mellon University. After holding a junior faculty position in the Department of Information Science, Kyoto University, he joined Carnegie Mellon University in 1980, where he was the director of the Robotics Institute from 1992 to 2001. Dr. Kanade has worked in multiple areas of robotics: computer vision, multimedia, manipulators, autonomous mobile robots, and sensors. He has written more than 250 technical papers and reports in these areas and holds more than 15 patents. He has been the principal investigator of more than a dozen major vision and robotics projects at Carnegie Mellon. He has been elected to the National Academy of Engineering and to the American Academy of Arts and Sciences. He is a fellow of the IEEE, a fellow of the ACM, a founding fellow of the American Association for Artificial Intelligence (AAAI), and the former and founding editor of the International Journal of Computer Vision. He has received several awards, including the C & C Award, the Joseph Engelberger Award, the Allen Newell Research Excellence Award, the JARA Award, the Marr Prize Award, and the FIT Funai Accomplishment Award. Dr. Kanade has served on government, industry, and university advisory or consultant committees, including the Aeronautics and Space Engineering Board (ASEB) of the National Research Council, NASA's Advanced Technology Advisory Committee, the PITAC Panel for Transforming Healthcare, and the Advisory Board of the Canadian Institute for Advanced Research.
Jing Xiao received the BS degree in electrical engineering from the University of Science and Technology of China in 1996, the MS degree in computer science from the Institute of Automation, Chinese Academy of Science in 1999, and the PhD degree in robotics from the Robotics Institute, Carnegie Mellon University in 2005. His research interests include computer vision, pattern recognition, image processing, human computer interface, computer animation, and related areas. He has authored or coauthored more than 30 publications in these areas. He is a member of the IEEE.
Jeffrey F. Cohn earned the PhD degree in clinical psychology from the University of Massachusetts in Amherst. He is a professor of psychology and psychiatry at the University of Pittsburgh and an adjunct faculty member at the Robotics Institute, Carnegie Mellon University. For the past 20 years, he has conducted investigations in the theory and science of emotion, depression, and nonverbal communication. He has co-led interdisciplinary and interinstitutional efforts to develop advanced methods of automated analysis of facial expression and prosody, and has applied these tools to research in human emotion and emotion disorders, communication, biomedicine, biometrics, and human-computer interaction. He has published more than 120 papers on these topics. His research has been supported by grants from the US National Institute of Mental Health, the US National Institute of Child Health and Human Development, the US National Science Foundation, the US Naval Research Laboratory, and the US Defense Advanced Research Projects Agency. He is a member of the IEEE and the IEEE Computer Society.