🎯 A Distribution of Absolute Pitch Ability as Revealed by Computerized Testing
Music Perception (2009) Vol. 27, Issue 2, pp. 89–101
🎯 Key Finding
Absolute pitch (AP) is not binary; it exists on a continuum. Using a novel computerized test that measures both accuracy and reaction time, Bermudez & Zatorre found a continuous distribution of AP ability across 51 musicians, including a substantial number of intermediate performers (mean absolute deviation of 1–2 semitones) who fall between clear AP possessors and non-possessors. This challenges the claim, made in some earlier work, that AP ability is bimodally distributed, and it shows that scoring method and recruitment strategy heavily influence whether AP appears binary or continuous.
📊 Study Design
Participants
- N=51 musicians (39 females, 12 males)
- 27 self-reported as AP possessors, 24 as non-possessors (NAP)
- Average age: 23.1 years (SE = 0.52)
- Mean age of training onset: 6.1 years (SE = 0.32)
- Mean total training: 16.4 years (SE = 0.63)
- Recruited from music faculties of two Montreal universities
- All gave informed consent; the study received ethics approval from the Montreal Neurological Institute
- 2 participants reclassified based on performance (1 self-reported AP scored 2 SD below AP mean; 1 NAP scored 2 SD above NAP mean)
Stimuli
- 108 trials (36 notes × 3 intensity levels)
- Range: C3 to B5 (3 octaves)
- Based on A = 440 Hz equal temperament
- Each note presented at 3 intensities: −1, −4, and −7 dB (to prevent loudness cues)
- Synthetic multiharmonic tones: fundamental + ~9 harmonics (12 dB amplitude decrease between harmonics)
- Duration: 1 second (50 ms linear onset and offset ramps)
- 16-bit sampling depth
- Presented at ~75 dB SPL via headphones
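The stimulus description above can be sketched in code. This is a minimal illustration, not the authors' synthesis procedure. It assumes a 44.1 kHz sample rate (not stated in this summary), reads "12 dB amplitude decrease between harmonics" as each successive harmonic being 12 dB below the previous one, and treats the −1/−4/−7 dB levels as relative to full scale. The names `note_frequency` and `synth_tone` are hypothetical.

```python
import math

SAMPLE_RATE = 44100  # assumption: the summary does not state a sample rate

def note_frequency(semitones_from_a4):
    """Equal-temperament frequency, tuned to A4 = 440 Hz."""
    return 440.0 * 2.0 ** (semitones_from_a4 / 12.0)

# C3 lies 21 semitones below A4 and B5 lies 14 above: 36 notes in total.
NOTES = list(range(-21, 15))
LEVELS_DB = [-1.0, -4.0, -7.0]

def synth_tone(freq, level_db=-1.0, duration=1.0, ramp=0.05,
               n_harmonics=9, rolloff_db=12.0):
    """Return 16-bit integer samples of a multiharmonic tone."""
    n = int(SAMPLE_RATE * duration)
    ramp_n = int(SAMPLE_RATE * ramp)
    # Fundamental at 0 dB; each successive harmonic rolloff_db lower (assumed reading).
    amps = [10 ** (-rolloff_db * k / 20) for k in range(n_harmonics + 1)]
    norm = sum(amps)               # guarantees the summed waveform stays within [-1, 1]
    gain = 10 ** (level_db / 20)   # level relative to full scale (assumption)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        s = sum(a * math.sin(2 * math.pi * freq * (k + 1) * t)
                for k, a in enumerate(amps)) / norm
        # Linear onset and offset ramps of `ramp` seconds each
        env = min(1.0, i / ramp_n, (n - 1 - i) / ramp_n)
        samples.append(int(round(32767 * gain * env * s)))
    return samples

tone = synth_tone(note_frequency(NOTES[0]), LEVELS_DB[0])  # C3 at -1 dB
```

Crossing `NOTES` with `LEVELS_DB` gives the 36 × 3 = 108 trials listed above.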
🎮 The Computerized Test Interface
Chroma Response (Step 1)
- Circular wheel with 12 positions (all pitch classes equidistant from center)
- Cursor resets to center after each trial (no positional bias)
- All 12 responses equally accessible (unlike piano keyboard)
- No timeout: self-paced (allows measurement of natural response speed)
- Critical innovation: avoids keyboard familiarity confounds
Octave Response (Step 2)
- After selecting chroma, indicate which octave (C to B range)
- Color-coded bands in greyscale
- Emphasized that exact grand staff position not required
- “Simply click anywhere in the color band representing the octave”
- Allows analysis of chroma accuracy and octave accuracy separately
Design Innovations (5 Key Advances)
- Multiharmonic synthetic stimuli: Equally unfamiliar to all participants (unlike piano/violin tones)
- Both chroma and octave judgments collected: Separates pitch class from pitch height
- Precise reaction times: 10 ms resolution (identifies strategy differences)
- Circular response interface: All 12 responses equidistant (no keyboard bias)
- Self-paced: Captures natural response speed (no artificial time pressure)
📈 Results
AP vs NAP Performance
After reclassification of the 2 outliers, all group differences were highly significant: accuracy F(1,49) = 217.78, p < .001; MAD F(1,49) = 221.24, p < .001; RT F(1,49) = 30.59, p < .001.
The Continuum of Ability
- Best performers: MAD ~0 semitones, >95% correct (essentially perfect AP)
- Intermediate performers: MAD 1–2 semitones, 40–60% correct; a substantial group
- Random performers: MAD ~3 semitones (flat response distribution)
- Not clearly bimodal: When considering MAD (not just % correct), the gap between groups is filled by intermediates
- 8 high-performing participants: MAD < 1 semitone, all responded within 6 seconds
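The chance-level MAD of about 3 semitones can be checked by hand. The short derivation below assumes that deviation is measured as the circular distance between the response and target pitch classes (so the largest possible error is a tritone, 6 semitones), and that a random responder picks each of the 12 pitch classes with equal probability. The summary does not spell out the distance definition, so treat this as a plausible reading rather than the paper's stated formula.

```latex
\[
d(r,t) = \min\bigl(|r-t|,\; 12-|r-t|\bigr), \qquad r,t \in \{0,\dots,11\}
\]
\[
\mathbb{E}[d] = \frac{1}{12}\Bigl(0 + 2\,(1+2+3+4+5) + 6\Bigr) = \frac{36}{12} = 3 \ \text{semitones}
\]
```

Under this reading, a MAD of 1–2 semitones sits well below chance, which is why the intermediate group cannot be dismissed as guessing.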
Pitch Class Dependence (White-Key Advantage)
- Diatonic notes (C major) identified more accurately and quickly than non-diatonic
- For accuracy: marginally significant group × key-type interaction: F(1,49) = 3.72, p = .06
- Driven by AP group: white keys significantly more accurate (Tukey HSD)
- For RT: significant interaction F(1,49) = 23.12, p < .001 (AP faster on white keys)
- Pitch class A identified best overall: highest accuracy + fastest RT in AP group
- Replicates Miyazaki 1988, 1989, 1990; Takeuchi & Hulse 1991
- NAP participants also showed A advantage (likely using it as relative reference)
Reaction Time as Key Dimension
- Strong correlation between MAD and log RT: r = .63, p < .0001
- Better performers respond faster (not trading speed for accuracy)
- NAP with low MAD show longer reaction times → suggests alternative strategies (relative pitch calculations)
- Combined index (MAD × logRT) still shows continuum, not bimodality
- RT captures what % correct misses: two participants both at 8.3% correct had vastly different MADs (1.44 vs 2.81) and RTs (4905 vs 9124 ms)
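As a rough illustration of the combined index, the sketch below applies MAD × log RT to the two participants mentioned above. The base-10 logarithm and milliseconds as the RT unit are assumptions, since this summary does not specify either, and `combined_index` is a hypothetical name.

```python
import math

def combined_index(mad, rt_ms):
    """MAD multiplied by log reaction time (base 10 assumed); lower is better."""
    return mad * math.log10(rt_ms)

# The two participants who both scored 8.3% correct
participant_a = combined_index(1.44, 4905)
participant_b = combined_index(2.81, 9124)

print(round(participant_a, 2), round(participant_b, 2))
```

Although their percent-correct scores are identical, the second participant's index comes out roughly twice as large, which is the separation that percent correct alone hides.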
Split-Half Reliability
- MAD: r(49) = .99, p < .001
- Log RT: r(49) = .98, p < .001
- Exceptionally high reliability — test is internally consistent
- Suggests even a shortened version (54 trials) would be sufficiently accurate for screening
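A split-half check of this kind can be sketched as follows. The odd/even split is an assumption (the summary does not say how the 108 trials were divided into two halves of 54), the demo data are made up, and `pearson_r` and `split_half_r` are hypothetical helper names.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_r(per_trial_scores):
    """Correlate each participant's mean score on odd vs. even trials.

    per_trial_scores holds one list of trial-level scores per participant.
    """
    odd = [sum(s[0::2]) / len(s[0::2]) for s in per_trial_scores]
    even = [sum(s[1::2]) / len(s[1::2]) for s in per_trial_scores]
    return pearson_r(odd, even)

# Made-up absolute deviations (semitones) for four hypothetical participants
demo = [
    [0, 0, 0, 1, 0, 0],
    [1, 2, 1, 1, 2, 1],
    [3, 2, 4, 3, 2, 3],
    [5, 3, 2, 4, 3, 3],
]
r = split_half_r(demo)
```

A value of r near 1 means that either half ranks participants almost identically, which is the basis for the suggestion that a 54-trial version could suffice for screening.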
Age of Training Onset
- AP group started training significantly earlier: M = 5.46 years vs NAP M = 6.95 years; t(43) = 2.52, p = .02
- MAD significantly correlated with training onset: r(43) = .46, p = .01
- % correct correlated with onset: r(43) = .44, p = .002
- Log RT correlated with onset: r(42) = .40, p = .007
- Consistent with early-learning theory (Takeuchi & Hulse 1993)
💡 Why Scoring Method Matters
A critical methodological contribution of this paper is demonstrating how scoring strategy creates or destroys the appearance of bimodality:
- Strict % correct: Only counts exact chroma matches → artificially sharpens the gap between AP and non-AP, creating apparent bimodality
- Semitone credit (3/4 point for ±1 semitone): Diminishes distinction between high performers who perform perfectly and those who are consistently close
- Mean Absolute Deviation (MAD): Most informative single measure; captures how close responses are to the target even when they are not exactly correct, running from 0 (perfect) to about 3 (chance-level responding)
- MAD + Reaction Time combined: Best overall descriptor, as it penalizes time-consuming alternative strategies (relative pitch calculations)
Conclusion: The bimodal distribution sometimes reported in AP literature may be an artifact of strict scoring methods, not a genuine feature of the underlying ability distribution.
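To make the contrast between scoring methods concrete, here is a minimal sketch that scores the same responses three ways. It assumes pitch classes are coded 0–11, that deviation is circular distance in semitones, and that semitone credit is reported as a percentage. The function names and demo responses are hypothetical.

```python
def circ_dist(resp, target):
    """Circular distance in semitones between two pitch classes (0-11)."""
    d = abs(resp - target) % 12
    return min(d, 12 - d)

def score(responses, targets):
    """Score one participant with strict, semitone-credit, and MAD methods."""
    devs = [circ_dist(r, t) for r, t in zip(responses, targets)]
    n = len(devs)
    strict = 100.0 * sum(d == 0 for d in devs) / n
    # Full credit for exact matches, 3/4 point for errors of one semitone
    credit = 100.0 * sum(1.0 if d == 0 else 0.75 if d == 1 else 0.0
                         for d in devs) / n
    mad = sum(devs) / n
    return {"pct_correct": strict, "semitone_credit": credit, "mad": mad}

targets = list(range(12))
near = [(t + 1) % 12 for t in targets]  # always one semitone sharp
far = [(t + 6) % 12 for t in targets]   # always a tritone off

near_scores = score(near, targets)
far_scores = score(far, targets)
```

Both hypothetical participants score 0% under strict scoring, yet their MADs are 1 and 6 semitones. A consistent near-miss and a maximally wrong responder are indistinguishable by percent correct, which is the kind of information loss that can make a continuum look bimodal.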
💬 Critical Analysis
Strengths
- Novel computerized interface eliminates keyboard familiarity bias
- Captures both accuracy AND speed (multidimensional assessment)
- Exceptional split-half reliability (r = .99)
- Separates chroma and octave judgments
- Controls for loudness cues (3 intensity levels)
- Uses synthetic stimuli (no instrument-familiarity confound)
- Rigorous statistical analysis with reclassification of outliers
- Directly addresses long-standing controversy about bimodality vs continuum
Limitations
- N=51 (moderate sample; larger would better characterize intermediate zone)
- Self-selected participants (motivated musicians from university programs)
- Montreal recruitment only (cultural/linguistic homogeneity)
- No test-retest reliability data (only split-half)
- No non-musician comparison group
- Synthetic stimuli may underestimate AP for instrument-familiar tones
- No longitudinal component (single session snapshot)
Impact & Legacy
Foundational methodology paper. The computerized test designed here became the basis for subsequent AP research at the Zatorre Lab and influenced the field’s move toward more rigorous, multidimensional AP assessment. Bermudez & Zatorre’s approach directly inspired Mosing et al.’s 2025 systematic review calling for a standardized AP phenotyping task.
The key insight — that AP exists on a continuum with important intermediate levels — has been confirmed repeatedly and is now considered the consensus view in the field.
🔗 Access & Resources
📊 Citation
- DOI: 10.1525/mp.2009.27.2.89
- Journal: Music Perception, Vol. 27, Issue 2, pp. 89–101
- ISSN: 0730-7829 (print), 1533-8312 (electronic)
- Affiliation: Montreal Neurological Institute & BRAMS Laboratory, McGill University