Looking while listening

This eye-tracking task is a simplified version of the visual world paradigm, in which every trial presents a pair of familiar images/objects of roughly the same size (for example, a chair and a bath), accompanied by a pre-recorded Dutch sentence that asks the participant to look at one of these images (e.g., 'where is a chair?'). This paradigm, known as "looking while listening", was developed by Anne Fernald (cf. Fernald, Zangl, Portillo & Marchman, 2008). Note that we collect the data using an eye tracker (Tobii TX300, sampling at 300 Hz), which measures gaze direction objectively, as opposed to video recordings of the participant's eye movements.

There are typically two key variables that you can obtain:
1) Reaction time: how quickly participants respond to the verbal instruction. This includes only trials on which the participant is fixating the distracter image at target-word onset; the dependent variable is the latency with which the participant switches to the target image.
2) Accuracy: after target-word onset, the proportion of looking time to the target relative to looking time to both images (e.g., how long they fixate the target 'chair' relative to the distracter 'bath'), or relative to total looking time.

Note that if researchers only report accuracy data, they tend to refer to this paradigm as the intermodal preferential looking paradigm (cf. Golinkoff, Ma, Song & Hirsh-Pasek, 2013). A more complex measure is growth curve analysis (with the proportion of target fixations on the y-axis and time as a continuous variable on the x-axis).
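For concreteness, here is a minimal sketch of how these two measures could be computed from AOI-coded gaze samples. The function name, the AOI coding scheme, and the timestamp format are illustrative assumptions, not the actual YOUth analysis pipeline; only the 300 Hz sample rate and the 3 s target onset come from the set-up described below.

```python
import numpy as np

SAMPLE_RATE = 300          # Hz; Tobii TX300
TARGET_ONSET_S = 3.0       # target-word onset, fixed at 3 s (see Trial Structure)

def trial_measures(aoi, t):
    """aoi: per-sample AOI code ('target', 'distracter', or None);
    t: per-sample timestamps in seconds from picture onset."""
    aoi = np.asarray(aoi, dtype=object)
    t = np.asarray(t, dtype=float)
    post = t >= TARGET_ONSET_S

    # Accuracy: proportion of target looking relative to
    # target + distracter looking after target-word onset.
    on_target = post & (aoi == "target")
    on_distracter = post & (aoi == "distracter")
    looking = on_target.sum() + on_distracter.sum()
    accuracy = on_target.sum() / looking if looking else np.nan

    # Reaction time: defined only for distracter-initial trials, i.e. the
    # child fixates the distracter at target-word onset; RT is the latency
    # of the first target fixation after onset.
    onset_idx = int(np.searchsorted(t, TARGET_ONSET_S))
    rt = np.nan
    if onset_idx < len(aoi) and aoi[onset_idx] == "distracter":
        first_target = np.flatnonzero(on_target)
        if first_target.size:
            rt = t[first_target[0]] - TARGET_ONSET_S
    return accuracy, rt
```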

 

Participants:

Participants (age range: 2 years, 0 months to 4 years, 11 months, 30 days) came to the Child Research Center for half a day to participate in a battery of tasks. The task described here is always the last in our set of four eye-tracking tasks (1. social gaze; 2. gap-overlap; 3. face pop-out; 4. looking while listening).

 

Stimuli:

Visual stimuli: objects were typical, highly frequent items from categories considered familiar to most Dutch infants by the age of 15 months. There were 12 categories, a subset of the 20 categories used in Junge et al. (2012), who presented these categories as examples of familiar items for Dutch monolingual 9-month-olds. We then created six object pairs: each object always appeared with another object that was matched in semantic class (e.g., both food items) and whose label did not share speech sounds with it (e.g., the two labels did not both start with the sound /b/).

Pair | Category 1 | Category 2 | Class | Syllables | Phonetic W1 | Phonetic W2
1 | cookie | banana | food | 2 | /ˈkuki/ | /bəˈnɑːn/
2 | chair | bath | furniture | 1 | /stul/ | /bɑt/
3 | cat (poes) | baby | animate | 1, 2 | /pus/ | /ˈbeːbiː/
4 | dog (hond) | cow (koe) | animate | 1 | /hɔnt/ | /ku/
5 | shoe (schoen) | coat (jas) | clothing | 1 | /sxun/ | /jɑs/
6 | foot (voet) | hand | body parts | 1 | /vut/ | /hɑnt/

 

Each category pair was presented four times: twice with W1 as the target, and twice with W1 as the distracter (and W2 as the target). To avoid too much repetition of visual stimuli (and to keep the experiment interesting for the child), we selected two images per category from the set used in Junge et al. (2012). Thus each picture was presented twice, always paired with the same stimulus, and occurred once as target and once as distracter.

We used Photoshop CS6 to make the two images in a pair appear roughly the same size. Each object had to fall within an AOI (area of interest) of 730 x 820 pixels (see the left image below). Objects appeared on a dark grey background (see, for example, the right image below).

[Images: AOI layout (left) and an example stimulus pair on the dark grey background (right)]

Auditory stimuli:

A female native speaker of Dutch (35 years old; no children) produced the stimuli in a child-friendly manner in a sound-proof booth; recordings were digitized at 44.1 kHz, mono-channel. For each category, we recorded multiple utterances of the type carrier sentence + X (thus target words were recorded in natural contexts, not spliced across different utterances; the speaker read the stimuli in randomized order). Again, to keep the child's interest as high as possible, we varied the carrier sentence. There were three possible carrier sentences ("zie je een X" – 'do you see an X?'; "kijk! een X" – 'look! an X!'; "waar is een X" – 'where is an X?'), counterbalanced across trials. Using Praat, we edited the sound files and set the mean intensity of all waveforms to 75 dB (the maximum value measured in the pilot was 70.1 dB). We then added silence before the carrier sentence to make sure that each target word starts at 3 s.
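As a rough sketch of these two editing steps (performed in Praat for the actual stimuli), the snippet below scales a recording to a 75 dB mean intensity and pads leading silence so the target word starts at exactly 3 s. The use of the soundfile package, the Praat-style dB convention (treating sample values as sound pressure in Pa re 20 µPa), and the measured word-onset argument are all assumptions for illustration.

```python
# Illustrative re-implementation of the Praat editing steps;
# not the script used for the actual stimuli.
import numpy as np
import soundfile as sf

TARGET_DB = 75.0          # target mean intensity
TARGET_ONSET_S = 3.0      # target word must start at 3 s
REF_PRESSURE = 2e-5       # Praat-style reference (20 µPa), samples read as Pa

def edit_stimulus(path_in, path_out, word_onset_s):
    audio, fs = sf.read(path_in)   # mono recording, digitized at 44.1 kHz

    # Step 1: scale the waveform so its RMS intensity equals 75 dB.
    rms_db = 20 * np.log10(np.sqrt(np.mean(audio ** 2)) / REF_PRESSURE)
    audio = audio * 10 ** ((TARGET_DB - rms_db) / 20)

    # Step 2: prepend silence so the target word lands exactly at 3 s.
    pad_samples = int(round((TARGET_ONSET_S - word_onset_s) * fs))
    audio = np.concatenate([np.zeros(pad_samples), audio])
    sf.write(path_out, audio, fs)
```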


Each target word appeared twice in the experiment, always with a different carrier sentence. The mean length of the carrier sentences including target words was 2262.9 ms (range 1788–3129 ms; SD 419 ms). The mean length of the target words was 894 ms (range 618 ms, 'koe_waarIsEen', to 1194 ms, 'banaan_KijkEen'; SD 122 ms).

 

Trial Structure:

Before every trial, a fixation star of 55 x 55 pixels appears in the centre of the screen. The fixation star is on screen for at least 1.5 seconds. If 0.5 seconds of gaze samples are available within the 5 x 5° bounding box around the fixation star, the trial commences. After 3 seconds the trial starts regardless of the available gaze data.

Trial: the paired pictures are presented for 5 s, together with the matched audio file. The audio file is edited such that target-word onset occurs at 3000 ms after audio/picture onset.
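The gaze-contingent trial start could be implemented along the following lines. Here get_gaze_sample() is a hypothetical polling helper that returns the latest gaze position in degrees relative to the fixation star (or None when no data is available), and since the protocol does not specify whether the 0.5 s of gaze must be consecutive, this sketch simply accumulates samples.

```python
import time

SAMPLE_RATE = 300            # Hz; Tobii TX300
NEEDED_S = 0.5               # gaze required inside the bounding box
MIN_STAR_S = 1.5             # star shown for at least this long
TIMEOUT_S = 3.0              # trial starts regardless after this
HALF_BOX_DEG = 2.5           # 5 x 5 degree bounding box around the star

def wait_for_fixation(get_gaze_sample):
    """Returns True if the fixation criterion was met, False on timeout;
    the trial commences in either case."""
    in_box = 0
    start = time.monotonic()
    while True:
        elapsed = time.monotonic() - start
        if elapsed >= TIMEOUT_S:
            return False                      # timed out: start trial anyway
        sample = get_gaze_sample()            # (x_deg, y_deg) or None
        if sample is not None:
            x, y = sample
            if abs(x) <= HALF_BOX_DEG and abs(y) <= HALF_BOX_DEG:
                in_box += 1
        if elapsed >= MIN_STAR_S and in_box >= NEEDED_S * SAMPLE_RATE:
            return True                       # enough gaze inside the box
        time.sleep(1 / SAMPLE_RATE)
```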

 

Design

The experiment consisted of 24 trials (12 categories x 2 target positions, left/right). Each object pair appeared twice, with each image once as distracter and once as target. Trials followed a pseudo-random order that was fixed for every child. We counterbalanced target position across object pairs. There was no direct picture repetition or word repetition across trials, targets appeared no more than twice in a row on the same side, and carrier sentences were likewise repeated no more than twice in a row.
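A small validator for these ordering constraints might look as follows; the field names mirror the columns of the trial table below, and the function is illustrative rather than the script actually used to generate the order.

```python
def max_run(values):
    """Length of the longest run of identical consecutive values."""
    run = best = 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def order_is_valid(trials):
    """trials: list of dicts with 'carrier', 'target', 'pair', 'side'."""
    for a, b in zip(trials, trials[1:]):
        if a["pair"] == b["pair"]:        # no direct picture repetition
            return False
        if a["target"] == b["target"]:    # no direct word repetition
            return False
    # Target side and carrier sentence: no more than twice in a row.
    return (max_run([t["side"] for t in trials]) <= 2 and
            max_run([t["carrier"] for t in trials]) <= 2)
```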

trial | carrier | target | pair (left-right) | picture_token | position_target (1 = left; 2 = right)
1 | Where | banana | cookie-banana | 1 | 2
2 | See | chair | chair-bath | 1 | 1
3 | Look | baby | baby-cat | 1 | 1
4 | Where | dog | cow-dog | 1 | 2
5 | Look | coat | shoe-coat | 1 | 2
6 | See | foot | foot-hand | 1 | 1
7 | Look | cookie | banana-cookie | 2 | 2
8 | See | bath | bath-chair | 2 | 1
9 | Where | cat | cat-baby | 2 | 1
10 | Where | cow | dog-cow | 2 | 2
11 | See | shoe | coat-shoe | 2 | 2
12 | Look | hand | hand-foot | 2 | 1
13 | See | cat | baby-cat | 1 | 2
14 | Look | cow | cow-dog | 1 | 1
15 | Where | chair | bath-chair | 2 | 2
16 | Look | banana | banana-cookie | 2 | 1
17 | See | hand | foot-hand | 1 | 2
18 | Where | bath | chair-bath | 1 | 2
19 | Look | shoe | shoe-coat | 1 | 1
20 | See | cookie | cookie-banana | 1 | 1
21 | Where | foot | hand-foot | 2 | 2
22 | Look | dog | dog-cow | 2 | 1
23 | Where | baby | cat-baby | 2 | 2
24 | See | coat | coat-shoe | 2 | 1


A video attention-getter can be played whenever the child's attention starts to wane. The current trial is skipped from the moment the attention-getter key is pressed.

 

General set-up

Infants sit in a car seat (10-month-olds; R3) approximately 65 cm away from the eye tracker. Testing occurs in a small, bright room (300-350 lux; temperature 18-25 °C) without windows.

The Tobii TX300 eye tracker (Tobii Technology, Stockholm, Sweden) with an integrated 23-inch monitor (1920 x 1080 pixels; 60 Hz refresh rate) was used to record eye movements. The Tobii TX300 ran at 300 Hz and communicated via the Tobii SDK with MATLAB (version R2015b; MathWorks Inc., Natick, MA, USA) and Psychtoolbox (version 3.0.12; Brainard, 1997) running on a MacBook Pro (OS X 10.9).
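For reference, streaming gaze data from a Tobii tracker can be done along these lines with Tobii's current Python SDK (tobii_research); the study itself used the MATLAB/Psychtoolbox chain described above, so this is only an analogous sketch.

```python
import time
import tobii_research as tr

def on_gaze(gaze):
    # Normalized (0-1) display coordinates for the left eye; the right eye
    # and validity codes sit under analogous keys in the same dictionary.
    print(gaze["left_gaze_point_on_display_area"])

tracker = tr.find_all_eyetrackers()[0]           # first connected tracker
tracker.subscribe_to(tr.EYETRACKER_GAZE_DATA, on_gaze, as_dictionary=True)
time.sleep(1.0)                                  # ...run the experiment...
tracker.unsubscribe_from(tr.EYETRACKER_GAZE_DATA, on_gaze)
```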

An operator-controlled calibration was run, which consisted of colored expanding and contracting spirals presented at the four corners and the center of the screen. The spirals were accompanied by a sound. A webcam was used to monitor the participant. When the operator judged the participant to be looking at the spiral, a button was pressed, after which the spiral contracted and that point was calibrated. Details of the calibration stimuli are given in Hessels et al. (2015). The operator judged the calibration output from the Tobii SDK and then decided to accept the calibration or to re-calibrate.
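With the same Python SDK, an operator-controlled calibration could be sketched as below. The helpers show_spiral() and wait_for_operator_keypress() stand in for the Psychtoolbox presentation layer and are hypothetical, as are the five point positions; the actual stimuli and flow are those described above and in Hessels et al. (2015).

```python
import tobii_research as tr

# Four corners plus center, in normalized screen coordinates (assumed).
POINTS = [(0.1, 0.1), (0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.9, 0.9)]

def calibrate(tracker, show_spiral, wait_for_operator_keypress):
    calib = tr.ScreenBasedCalibration(tracker)
    calib.enter_calibration_mode()
    for x, y in POINTS:
        show_spiral(x, y)                 # expanding/contracting spiral + sound
        wait_for_operator_keypress()      # operator confirms the child looks
        calib.collect_data(x, y)          # spiral contracts; point is calibrated
    result = calib.compute_and_apply()    # operator then judges this output
    calib.leave_calibration_mode()
    return result.status == tr.CALIBRATION_STATUS_SUCCESS
```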

Once the child was calibrated, the experimenter closed the curtain that divided the room into two halves and sat in the other half of the room, behind a desk with the stimulus MacBook. The experimenter could also see the child via a closed-circuit camera.

After the calibration, the experiment began.

 

Reference:

Hessels, R. S., Andersson, R., Hooge, I. T. C., Nyström, M., & Kemner, C. (2015). Consequences of eye color, positioning, and head movement for eye-tracking data quality in infant research. Infancy, 20, 601–633.