Absolute pitch information affects on-line melody recognition in non-AP perceivers
Sarah C. Creel (creel@cogsci.ucsd.edu)
University of California, San Diego, Department of Cognitive Science, La Jolla, CA 92093-0515
Melanie A. Tumlin (mtumlin@cogsci.ucsd.edu)
University of California, San Diego, Department of Cognitive Science, La Jolla, CA 92093-0515
Abstract
Perception of absolute pitch (AP) has often been regarded as a qualitatively distinct ability, yet recent work has demonstrated that perceivers unable to label absolute pitches—the hallmark of true AP perception—still possess some knowledge of absolute pitch level. This is sometimes termed “implicit AP.” What distinguishes the two types of AP? In two experiments using a melody-learning paradigm and eye tracking, we explore the pervasiveness and automaticity of implicit AP. We argue here that implicit AP reflects a phylogenetically older encoding of pitch information shared with other species, while “true” AP primarily reflects perception of pitch chroma, which may be unique to humans.
Keywords: absolute pitch, implicit absolute pitch, melody recognition, eye tracking, music perception
Introduction
Do all listeners experience sound, music, in the same way? One major divergence from “normal” musical experience seems to be absolute pitch (AP), sometimes called perfect pitch. It consists of the ability to explicitly label particular pitches without reference to an external standard, and is extremely rare (Takeuchi & Hulse, 1993). Due to its rarity and apparently distinct manner of processing sound, there has been much interest in AP perception, as a developmental phenomenon (Miyazaki & Ogawa, 2006), as a correlate of brain morphology (Keenan, Thangaraj, Halpern, & Schlaug, 2001), and as a potentially genetically-specified trait (Gregersen, Kowalsky, Kohn & Marvin, 2000). However, the exact phylogenetic origins of AP perception remain somewhat mysterious, though it is a curiosity that animals tend to default to processing pitch in absolute terms (e.g. MacDougall-Shackleton & Hulse, 1996).
True AP perception.
Several factors seem to be conducive to acquiring AP perception. One is music education early in life (Takeuchi & Hulse, 1993). However, not all individuals who receive early musical training acquire AP perception, which suggests that other factors must be at work. Another postulated factor is language exposure: Deutsch and colleagues (Deutsch, Henthorn, & Dolson, 2004) have suggested that speakers of tone languages (e.g. Mandarin) are more likely to develop absolute pitch than non-tone-language speakers, because language forces them to attend to pitch. Other researchers have implicated genetic influences, suggesting that an apparently higher likelihood of AP perception in East Asians is likely hereditary (Gregersen et al., 2000). The ultimate outcome of this interaction of learning and biology is the effortless labeling of pitches according to pitch class—C, D, G#, E-flat, and so forth, with no need to hear an additional reference tone. Studies of memory encoding and interference in AP perceivers suggest that this ability is rapid and automatic: possessors can name individual pitches at much lower latencies than non-AP perceivers can calculate them based on a reference tone.
Implicit AP.
Despite the rarity of AP perception, there have been numerous recent reports (Levitin, 1994; Schellenberg & Trehub, 2003) of non-AP possessors demonstrating some knowledge of absolute pitch content in their musical memories. This has been termed implicit AP: listeners cannot label individual pitches in the way that AP perceivers can, but perceive and produce music with some degree of absolute pitch accuracy. Levitin (1994) found that individuals without AP can reproduce the absolute pitch of a popular song relatively accurately. Also, individuals without AP are better than chance at discriminating between correct and pitch-shifted (1-2 semitones) versions of familiar songs (Schellenberg & Trehub, 2003), and infants can learn predictive AP patterns but not relative pitch patterns (Saffran & Griepentrog, 2001). These studies suggest that under some circumstances, listeners may store and recognize musical material in an absolute, rather than relative, form. This converges with numerous other demonstrations that listeners encode other detailed aspects of the musical “surface” in memory, such as timbre (Schellenberg, Iverson & McKinnon, 1999) and articulation (Palmer, Jungers, & Jusczyk, 2001). These studies can be taken more broadly as evidence that listeners store acoustically accurate memories, and can discern whether a new instance does or does not match those memories. On this view, implicit AP perception is one of several consequences of having highly-detailed musical memory.

Nonetheless, there is much that is not understood about implicit AP perception and how it differs from true AP perception. First, how automatic is implicit AP perception—is it something listeners only attend to effortfully during recognition? If implicit AP perception is instead relatively automatic, then effects of AP match to memory should be evident fairly rapidly. Second, how obligatory is implicit AP perception? Is it something that listeners can ignore when in a more relative-pitch processing mode? If AP recognition is obligatory, listeners should experience interference when AP provides bad information for recognition.

In the current pair of experiments, we delve into the nature and pervasiveness of implicit AP perception. Using non-AP perceivers, we ask whether absolute pitch information is an obligatory part of musical recognition, and how rapidly it is computed. For experimental control, we trained listeners to recognize brief (5-note, 1-second) novel melodies as “words” for unfamiliar pictures. After training, we tracked listeners’ eye movements to correct and incorrect pictures as they heard a melody. Importantly, eye movements, which have been used for measuring word recognition for a number of years (e.g. Allopenna, Magnuson, & Tanenhaus, 1998), are a relatively implicit index of recognition. Thus listeners’ eye movements should be minimally susceptible to conscious strategies.

Results from eye tracking as words are spoken have demonstrated that recognition is rapid and incremental (see Allopenna et al., 1998). That is, during a spoken word, listeners are updating a set of guesses as to what word they are hearing. This is reflected in eye movements. If two words share sounds initially, such as mask and mast, a listener will be equally likely to look at either a displayed picture of a mask or one of a mast until the end of hearing “mask” spoken. However, if the two words are dissimilar, such as mask and flute, the listener hearing “mask” will look to the mask around the beginning of mask. The time point where looks to two similarly-named pictures diverge suggests what sound information listeners are able to use in the speech signal to identify words. The rationale in the current studies is similar.
We taught listeners melodies with certain properties, and then examined how rapidly they fixated the correct picture (of two) when the melody “labels” did or did not overlap in absolute pitch. Among the melodies learned (Figure 1), certain pairs of melodies matched each other until the end, with either identical absolute pitch (AP-same; CDEFG: CDEFE) or with absolute pitch level differing by 6 semitones (AP-different; GFAGC; C#BD#C#F#). If listeners can use AP information to recognize melodies, they should look sooner to the correct object on AP mismatch trials than AP match trials.
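The AP-same and AP-different relations can be sketched in code. This is a minimal illustration using MIDI note numbers, not the authors’ materials; the octave placements of the example melodies (e.g., C4 = 60, G4 = 67) are our assumption.

```python
# Illustrative sketch of the stimulus relations, in MIDI note numbers.
def intervals(melody):
    """Successive semitone intervals: the relative-pitch (RP) content."""
    return [b - a for a, b in zip(melody, melody[1:])]

# AP-same pair from the text: CDEFG vs. CDEFE, identical until the final note.
cdefg = [60, 62, 64, 65, 67]
cdefe = [60, 62, 64, 65, 64]
assert cdefg[:-1] == cdefe[:-1]

# AP-different pair from the text: GFAGC vs. C#BD#C#F#, separated by a tritone
# (octaves assumed: G4 F4 A4 G4 C5 vs. C#4 B3 D#4 C#4 F#4).
gfagc = [67, 65, 69, 67, 72]
csbds = [61, 59, 63, 61, 66]
assert intervals(gfagc) == intervals(csbds)              # RP content matches
assert all(a - b == 6 for a, b in zip(gfagc, csbds))     # AP differs by 6 semitones
```

The tritone offset keeps every note of an AP-different pair distinct in absolute terms while leaving the interval pattern untouched.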
Experiment 1
In this experiment, we trained listeners to associate melodies with pictures. We then measured looks to the pictures while listeners heard a melody “label” in real time, to determine what cues listeners used to distinguish paired melodies. Some paired melodies matched in AP content, while the rest only matched in relative pitch terms. Importantly, all paired melodies were discriminable based on their final tone (in both relative and absolute terms), so that AP perception was not necessary to achieve perfect accuracy in the task.
Method
Participants.
N=17 members of the UCSD community, with varied musical backgrounds, received course credit for experimental participation. One participant was excluded for possessing AP perception, and was replaced. The final sample comprised 16 participants without AP perception.
Stimuli.
Participants learned 16 melodies (Figure 1) as labels for 16 black-and-white pictures (examples in Figure 2). Melodies were all drawn from the diatonic major set, and were recorded in BarFly 1.73 software (Taylor, 1997; available at http://barfly.dial.pipex.com/) using the QuickTime instruments flute timbre. Melodies were distributed across 4 pitch ranges: C4-G4, F#4-C#5, C5-G5, F#5-C6. There were 8 pairs of melodies, and each pair began identically and diverged at the last note. The final interval differed in direction between the two members of a pair (one rose, one fell), to make melodies maximally discriminable. The onset of the last note in all melodies was 500 milliseconds (ms).

Figure 1. Sample melodies from Experiment 1. (a) AP-same pair; (b) AP-different pair.

Figure 2. A sample test trial, with examples of two nonsense pictures. The pictures here are labeled with AP-same melodies.

For each pair, all intervals up to the final one were identical (same ratios between subsequent pitches).
However, for half the pairs, the pitches as well as the relative pitch intervals were the same (AP-same), while for the other half, the actual pitches were separated by a tritone and only the intervals were the same (AP-different). The tritone separation was selected to be comparable to Saffran and Griepentrog’s (2001) AP experiment, in which adults failed to learn to distinguish tone groups in a segmentation task. This also served to minimize confusion of the key area from melody to melody, as closely-related pitch areas tend to be parsed according to the preceding context (Bartlett & Dowling, 1980). AP match/mismatch was counterbalanced across melody pairs and participants.

Four different quasirandom melody-to-shape assignments were used to control against spurious cross-modal similarities between particular melodies and particular pictures. Each trial (see Procedure) showed pictures in two of four locations (upper left, upper right, lower left, lower right of screen); one of the two pictures was the target. The other picture was either the picture for the paired melody, or the picture for a particular dissimilar melody. The two types of “other” pictures occurred equally often, and each target appeared equally often in each of the four screen locations. This circumvented potential strategies that learners could use to avoid having to learn the melodies themselves (e.g., when picture X appears in the upper left, it is the target).
Procedure.
During training, participants were instructed that they would see two pictures, would hear a melody, and would be asked to select the picture that went with the melody. After each trial, the correct picture stayed onscreen, providing feedback as to correctness. Correctness was assessed after each 128-trial block. When a participant scored 90% correct in one block, they proceeded to the test phase. Testing was identical to training, except that no feedback was provided.
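The advancement criterion amounts to a simple per-block accuracy check. The sketch below is purely illustrative (it is not the authors’ experiment script), assuming responses are coded 1 for correct and 0 for incorrect:

```python
# Hypothetical sketch of the 90%-per-block training criterion.
def reached_criterion(block_responses, threshold=0.90):
    """True when accuracy over one block (e.g., 128 trials) meets the criterion."""
    return sum(block_responses) / len(block_responses) >= threshold

block = [1] * 120 + [0] * 8       # 120/128 correct = 93.75%
print(reached_criterion(block))   # True: participant proceeds to test
```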
Equipment.
All testing took place in a quiet room. Participants were seated in front of an Eyelink Remote eye tracker (SR Research, Mississauga, ON), as experimental stimuli were presented via headphones on a Mac Mini running OS 10.4 and Matlab 7.6. Matlab software was written by the first author using the PsychToolbox 3 (Brainard, 1997; Pelli, 1997) and Eyelink Toolbox (Cornelissen, Peters & Palmer, 2002). PsychToolbox also provided calibration routines. The eye tracker itself was controlled by a networked PC running Eyelink software in DOS. Data were processed off-line using custom scripts in Python written by the first author.
Results
Accuracy.
During the first three blocks of testing (Figure 3), a small but significant difference in error rates occurred between AP-matched trials and AP-mismatched trials (p=.002). Restricted just to paired trials, the effect did not reach significance (p=.1). This is an important result because it suggests that participants are not strategically using pitch height as a cue to discern between melodies (or if they are, they are not very successful). There was an effect of trial type (unpaired > paired) on error rates, p=.002, indicating that listeners found trials showing pictures with similar melodies to be more difficult.

Figure 3. Accuracy during the first three blocks of training and test, Experiment 1. Error bars are standard errors.
Gaze fixation patterns.
As is done in word recognition tasks, we defined a set of windows over which early effects should be visible, from 200 ms to 1000 ms, and analyzed each 100 ms window for a divergence in looks to the target (the correct object for that melody) or the other object onscreen (Figure 4). For AP-matched trials, the target-other difference did not reach significance until 700-800 ms (p=.007), the first conceivable time point at which listeners should be able to discern these melodies (onset of last note plus the 200 ms delay that it takes to plan and carry out an eye movement; see Hallett, 1986). However, for RP-matched trials, this divergence point was somewhat sooner, at 600-700 ms (p=.0008). This means that eye movements on RP-matched trials must have been planned prior to the point that final-interval information was available (between 400-500 ms).
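The window-by-window divergence test can be sketched as follows. This is a hedged illustration of the general approach (paired comparison of per-participant fixation proportions within each 100 ms window), not the authors’ analysis code; the function names, toy data, and the critical-t shortcut are all our own.

```python
# Hypothetical sketch of a window-based divergence analysis.
from statistics import mean, stdev

def first_divergent_window(target, competitor, windows, t_crit=2.131):
    """Return the first window whose paired t statistic exceeds t_crit.
    target/competitor map a window label to per-participant fixation
    proportions; t_crit=2.131 is the two-tailed .05 cutoff for df=15."""
    for w in windows:
        diffs = [t - c for t, c in zip(target[w], competitor[w])]
        n = len(diffs)
        t_stat = mean(diffs) / (stdev(diffs) / n ** 0.5)
        if t_stat > t_crit:
            return w
    return None

# Toy data: 4 participants, two 100 ms windows
windows = ["600-700", "700-800"]
target = {"600-700": [.50, .52, .49, .51], "700-800": [.70, .72, .69, .71]}
competitor = {"600-700": [.50, .51, .50, .49], "700-800": [.40, .41, .39, .42]}
print(first_divergent_window(target, competitor, windows))  # 700-800
```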
Discussion
AP rapidly and implicitly aids listeners in melody recognition. While we cannot rule out deliberate strategy use, if such strategies were in play, listeners did not seem to benefit: there was no significant reduction of errors for AP-different trials either before or during the test. That is, listeners were not significantly more accurate with AP-different melodies than with AP-same melodies. However, eye movements, which are difficult to consciously control, reflected more rapid recognition when an AP mismatch was present. This result supports the notion that non-AP possessors both represent and use absolute pitch information in recognizing melodies. Further, storage of this information is consistent with a body of work demonstrating a high level of acoustic detail in listeners’ musical representations, rather than representations that abstract over qualities such as musical prosody or absolute pitch content.

Figure 4. Looks to correct (thick lines) and incorrect (thin lines) pictures during test, Experiment 1. Error bars are standard errors. **p<.01

One potential counterexplanation of the above result is that listeners were not using an absolute pitch frame of reference, but a frame of reference relative to the pitch range of the entire set of stimuli (a “relative range” strategy). Recall that four pitch ranges were used in Experiment 1. That is, instead of encoding the absolute pitches of the stimuli, perhaps they encoded the pitch range, for instance, as low, mid-low, mid-high, and high. This is difficult to discriminate from absolute pitch even with a strongly delayed test phase, because as soon as the test phase begins the pitch range is reestablished.

We addressed this issue in Experiment 2 by requiring listeners to use relative pitch information, and looking for interference from absolute pitch processing. We trained participants on melodies at one set of absolute pitch levels (around C4, around F#4, around C5, around F#5) and then tested them at a different pitch level (F#4, C5, F#5, C6). We created a set of melodies where not two but three melodies overlapped until a final note. Two of the melodies were in one pitch range at training (such as F#4), while the third was pitched a tritone below at training (such as around C4). The first test block continued this pattern. The second and third test blocks, however, shifted all melodies up by exactly a tritone. If listeners are encoding pitch relative to the range of the experimental stimuli, then performance after the shift to the new pitch range should be equivalent to performance before the shift. If, instead, listeners are implicitly activating absolute-pitch matches, then trials which had not been AP-same during training should show interference at test (see Figure 5).
Experiment 2
Method
Participants.
N=16 participants from the same pool as Experiment 1 completed the training and test phases.
Stimuli.
There were 18 different melodies consisting of 6 sets of three (Figure 5), distinguished only at the final tone. Two of each set were identical in both RP and AP, while the third melody was a tritone lower and matched only in relative terms. All possible pairings of the melodies in a set of three yielded 1/3 AP-match trials and 2/3 AP-mismatch trials. The onset of the final tone in each melody occurred at 667 ms. Which melody in a triple was the low one was counterbalanced across participants.
Procedure.
Training and testing proceeded similarly to Experiment 1, except that after one 72-trial block of testing, all melodies were shifted up in pitch by 6 semitones. There was a brief break before the shift, during which participants conversed with experimenters. The effect of this shift was to set up the potential for interference from AP memory. That is, if memories of melodies were encoded in AP terms, then certain shifted melodies would now be competing with AP-identical traces of other melodies. In Figure 5b, for instance, if participants are comparing shifted melodies to AP memory traces, then shifted melody C’ is now an AP match to (unshifted) melody A. Thus, interference for C’ trials with A or B objects as competitors was expected to increase after the shift. This could manifest itself in terms of errors, fixation proportions, or both.
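The logic of the shift manipulation can be sketched in MIDI note numbers. The melody names follow the paper’s A/B/C scheme, but the pitches themselves are invented for illustration:

```python
# Illustrative sketch of the Experiment 2 tritone shift (MIDI note numbers).
TRITONE = 6  # semitones

shared = [66, 64, 68, 66]                  # common beginning (around F#4; invented)
A = shared + [71]                          # matches B in AP and RP until the final note
B = shared + [69]
C = [n - TRITONE for n in shared] + [62]   # a tritone lower: RP match only

def shift(melody, semitones=TRITONE):
    """Transpose a melody up by a number of semitones."""
    return [n + semitones for n in melody]

C_shifted = shift(C)
# After the shift, C' begins at the absolute pitch of the stored traces of
# A and B, so AP memory supplies a spurious competitor:
assert C_shifted[:-1] == shared
assert C[:len(shared)] != shared   # before the shift, C was no AP match
```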
Equipment.
This was identical to Experiment 1.

Figure 5. (a) Sample stimuli from Experiment 2. (b) Depiction of post-shift test trials. Gray indicates AP memories and black indicates the (shifted) melody presented on a trial. Circled area shows a new AP competitor.
Results
Accuracy.
We measured accuracy both during and after training. In training, AP pairs showed numerically lower accuracy than the two RP pair types, which did not differ. In the first test block, AP pairs were nonsignificantly less accurate than the two RP pair types combined, which again did not differ (original AP: 85% correct; new AP: 92%; shifted RP: 91%). In post-shift block 1 (Figure 6), there was a decided alteration in performance: while shifted-AP trial error rates and shifted-RP error rates stayed the same, new-AP trial performance declined (p=.005). One explanation might be that these errors occurred primarily in the trials immediately after the shift, during which listeners might be experiencing some confusion before adopting an RP perspective. Discounting this explanation, new-AP trials were still below the unshifted baseline in shifted block 2 (p<.05), which presumably was ample time for recovery from the pitch shift. Note that this is not a general increase in all errors, only the errors for trials with an AP competitor in memory.

Figure 6. Accuracy changes in post-shift test blocks, Experiment 2. Error bars are standard errors. **p<.01, *p<.05
Gaze fixation patterns.
For the first (unshifted) block of test trials, correct looks on RP trials (that is, AP-mismatched trials) reached significance at 800-900 ms (p=.002), while correct looks on AP trials did not reach significance until 1100-1200 ms (p=.0009). This generally resembles the pattern in the first experiment, where AP-mismatched melodies were also recognized sooner. Fixations for the two shifted test blocks in general patterned with error rates, but were extremely noisy, presumably due to increased uncertainty on the part of participants.
Discussion
In the current experiment, we tested whether participants were able to make an AP shift without any cost to recognition, and found that they could not. While the shift to RP processing was overall quite good—performance was well above chance (86%, p < .0001) after all melodies underwent a pitch shift of six semitones—participants were still hindered when a shifted melody occurred at the absolute pitch level of a previously-learned competitor melody, making more errors when a shifted melody overlapped in AP with an unshifted melody. This suggests that listeners were unable to ignore the AP content of the originally-learned stimulus. Such a result is consistent with the notion of obligatory use of acoustically accurate representations.
General discussion
Implicit AP perception—access to accurate absolute pitch information in memory—appears to be rapid and obligatory in non-AP perceivers. In Experiment 1, listeners’ ease of learning was not strongly affected by AP match or mismatch between melodies, yet listeners’ eye movements reflected faster recognition of AP-different melodies. Furthermore, listeners seemed unable to tune out AP information in a context where relative pitch processing would be advantageous (Experiment 2), suggesting that accessing musical memory obligatorily references absolute pitch content. Thus, both fixation latencies (Experiment 1) and pitch-shift errors (Experiment 2) reflect recognition costs associated with AP overlap. All of this implies that absolute pitch content is a necessary and relevant part of musical memory and the recognition of musical material.
Comparison to true AP.
True AP is automatic, obligatory, and involves labeling of pitch chroma. Implicit AP seems to share some of these properties. It is automatic in that listeners use it rapidly for on-line recognition of melodies (Experiment 1), and is obligatory in that listeners cannot ignore AP content in an RP task (Experiment 2). Only labeling seems to be absent in implicit AP.

Recall that one aspect of true AP perception is that listeners identify certain pitches—those related by integer multiples that are powers of 2—as the same pitch class or “chroma.” For instance, 220, 440, and 880 Hz are all perceived as the note A. This is salient enough to AP possessors that they occasionally make “octave errors,” such as identifying an 880 Hz A as a 440 Hz A. There is no evidence that implicit AP contains chroma information. In fact, in Experiment 2, the RP-to-RP shifted trials were such that the melody closer in absolute pitch was correct, while the melody closer in chroma was incorrect. This did not lead to any increase in errors after the pitch shift. Thus, implicit AP may be more about pitch height than about pitch chroma.
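Octave equivalence can be made concrete in a few lines. This sketch assumes the standard A4 = 440 Hz tuning convention; the `chroma` helper and name table are our own illustration, not anything from the paper:

```python
# Minimal sketch of pitch chroma: frequencies related by powers of 2 fold
# onto the same pitch class.
from math import log2

NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def chroma(freq_hz, ref=440.0):
    """Pitch-class name of a frequency (A4 = 440 Hz assumed)."""
    semis = round(12 * log2(freq_hz / ref))  # semitones from A4
    return NAMES[semis % 12]

# 220, 440, and 880 Hz all share the chroma A, despite differing in pitch height:
print([chroma(f) for f in (220.0, 440.0, 880.0)])  # ['A', 'A', 'A']
```

An “octave error” in this framing is a correct chroma paired with the wrong pitch height.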
Origins of implicit AP perception.
One account of this pattern of results is that humans begin with the same pitch-processing abilities animals do—and that what animals possess is essentially implicit AP perception. For instance, animals generally do not display knowledge of chroma (though see Wright et al., 2000 for chroma use in a task tapping short-term memory). Animals instead show normally-distributed response distributions to learned AP cues, without spikes at octave doublings (e.g. Cynx, 1993). Animals also show interference from AP information when relative pitch processing becomes irrelevant (e.g. MacDougall-Shackleton & Hulse, 1996), as did humans in Experiment 2. Whether animals process pitch