Convergent vocal representations in parrot and human forebrain motor networks

Yang, Zetian; Long, Michael A.

doi:10.1038/s41586-025-08695-8

Article
Published: 19 March 2025

Nature volume 640, pages 427–434 (2025)

Abstract

Cortical networks for the production of spoken language in humans are organized by phonetic features¹,², such as articulatory parameters³,⁴ and vocal pitch⁵,⁶. Previous research has failed to find an equivalent forebrain representation in other species⁷,⁸,⁹,¹⁰,¹¹. To investigate whether this functional organization is unique to humans, here we performed population recordings in the vocal production circuitry of the budgerigar (Melopsittacus undulatus), a small parrot that can generate flexible vocal output¹²,¹³,¹⁴,¹⁵, including mimicked speech sounds¹⁶. Using high-density silicon probes¹⁷, we measured the song-related activity of a forebrain region, the central nucleus of the anterior arcopallium (AAC), which directly projects to brainstem phonatory motor neurons¹⁸,¹⁹,²⁰. We found that AAC neurons form a functional vocal motor map that reflects the spectral properties of ongoing vocalizations. We did not observe this organizing principle in the corresponding forebrain circuitry of the zebra finch, a songbird capable of more limited vocal learning²¹. We further demonstrated that the AAC represents the production of distinct vocal features (for example, harmonic structure and broadband energy). Furthermore, we discovered an orderly representation of vocal pitch at the population level, with single neurons systematically selective for different frequency values. Taken together, we have uncovered a functional representation in a vertebrate brain that displays unprecedented commonalities with speech-related motor cortices in humans. This work therefore establishes the parrot as an important animal model for investigating speech motor control and for developing therapeutic solutions for addressing a range of communication disorders²²,²³.

**Fig. 1: Comparative vocal behaviour and underlying neural activity.**

**Fig. 2: Different neural coding schemes in zebra finch RA and budgerigar AAC.**

**Fig. 3: Distinct mapping of vocal acoustic types in the AAC state space.**

**Fig. 4: Neural representation of vocal pitch in the AAC.**

Data availability

Data used in this study have been uploaded to Zenodo (https://doi.org/10.5281/zenodo.14057061)⁷⁵. The Switchboard Telephone Speech Corpus is accessible through the Linguistic Data Consortium (https://catalog.ldc.upenn.edu/LDC97S62) and existing annotations of the corpus are provided by The Institute for Signal and Information Processing (https://isip.piconepress.com/projects/switchboard/). Source data are provided with this paper.

Code availability

The primary code used in this study is available at Zenodo (https://doi.org/10.5281/zenodo.14057061)⁷⁵.

References

1. Eichert, N., Papp, D., Mars, R. B. & Watkins, K. E. Mapping human laryngeal motor cortex during vocalization. Cereb. Cortex 30, 6254–6269 (2020).
2. Bouchard, K. E., Mesgarani, N., Johnson, K. & Chang, E. F. Functional organization of human sensorimotor cortex for speech articulation. Nature 495, 327–332 (2013).
3. Chartier, J., Anumanchipalli, G. K., Johnson, K. & Chang, E. F. Encoding of articulatory kinematic trajectories in human speech sensorimotor cortex. Neuron 98, 1042–1054 (2018).
4. Mugler, E. M. et al. Differential representation of articulatory gestures and phonemes in precentral and inferior frontal gyri. J. Neurosci. 38, 9803–9813 (2018).
5. Lu, J. et al. Neural control of lexical tone production in human laryngeal motor cortex. Nat. Commun. 14, 6917 (2023).
6. Dichter, B. K., Breshears, J. D., Leonard, M. K. & Chang, E. F. The control of vocal pitch in human laryngeal motor cortex. Cell 174, 21–31 (2018).
7. Leonardo, A. & Fee, M. S. Ensemble coding of vocal control in birdsong. J. Neurosci. 25, 652–661 (2005).
8. Banerjee, A., Chen, F., Druckmann, S. & Long, M. A. Temporal scaling of motor cortical dynamics reveals hierarchical control of vocal production. Nat. Neurosci. 27, 527–535 (2024).
9. Zhao, L. & Wang, X. Frontal cortex activity during the production of diverse social communication calls in marmoset monkeys. Nat. Commun. 14, 6634 (2023).
10. Rose, M. C., Styr, B., Schmid, T. A., Elie, J. E. & Yartsev, M. M. Cortical representation of group social communication in bats. Science 374, eaba9584 (2021).
11. Hage, S. R. & Nieder, A. Single neurons in monkey prefrontal cortex encode volitional initiation of vocalizations. Nat. Commun. 4, 2409 (2013).
12. Farabaugh, S. M., Brown, E. D. & Dooling, R. J. Analysis of warble song of the budgerigar Melopsittacus undulatus. Bioacoustics 4, 111–130 (1992).
13. Zhao, Z. et al. Anterior forebrain pathway in parrots is necessary for producing learned vocalizations with individual signatures. Curr. Biol. 33, 5415–5426 (2023).
14. Hile, A. G., Plummer, T. K. & Striedter, G. F. Male vocal imitation produces call convergence during pair bonding in budgerigars, Melopsittacus undulatus. Anim. Behav. 59, 1209–1218 (2000).
15. Moussaoui, B., Overcashier, S. L., Kohn, G. M., Araya-Salas, M. & Wright, T. F. Evidence for maintenance of key components of vocal learning in ageing budgerigars despite diminished affiliative social interaction. Proc. R. Soc. B 290, 20230365 (2023).
16. Lavenex, P. B. Vocal production mechanisms in the budgerigar (Melopsittacus undulatus): the presence and implications of amplitude modulation. J. Acoust. Soc. Am. 106, 491–505 (1999).
17. Egger, R. et al. Local axonal conduction shapes the spatiotemporal properties of neural sequences. Cell 183, 537–548 (2020).
18. Durand, S. E., Heaton, J. T., Amateau, S. K. & Brauth, S. E. Vocal control pathways through the anterior forebrain of a parrot (Melopsittacus undulatus). J. Comp. Neurol. 377, 179–206 (1997).
19. Paton, J. A., Manogue, K. R. & Nottebohm, F. Bilateral organization of the vocal control pathway in the budgerigar, Melopsittacus undulatus. J. Neurosci. 1, 1279–1288 (1981).
20. Striedter, G. F. The vocal control pathways in budgerigars differ from those in songbirds. J. Comp. Neurol. 343, 35–56 (1994).
21. Castellucci, G. A., Guenther, F. H. & Long, M. A. A theoretical framework for human and nonhuman vocal interaction. Annu. Rev. Neurosci. 45, 295–316 (2022).
22. Utianski, R. L. et al. Prosodic and phonetic subtypes of primary progressive apraxia of speech. Brain Lang. 184, 54–65 (2018).
23. Guenther, F. H. Neural Control of Speech (MIT Press, 2016).
24. Flinker, A. et al. Redefining the role of Broca’s area in speech. Proc. Natl Acad. Sci. USA 112, 2871–2875 (2015).
25. Dronkers, N. F. A new brain region for coordinating speech articulation. Nature 384, 159–161 (1996).
26. Khanna, A. R. et al. Single-neuronal elements of speech production in humans. Nature 626, 603–610 (2024).
27. Zann, R. A. The Zebra Finch: a Synthesis of Field and Laboratory Studies (Oxford Univ. Press, 1996).
28. Farabaugh, S. M., Linzenbold, A. & Dooling, R. J. Vocal plasticity in budgerigars (Melopsittacus undulatus): evidence for social factors in the learning of contact calls. J. Comp. Psychol. 108, 81–92 (1994).
29. Tu, H. W. & Dooling, R. J. Perception of warble song in budgerigars (Melopsittacus undulatus): evidence for special processing. Anim. Cogn. 15, 1151–1159 (2012).
30. Simpson, H. B. & Vicario, D. S. Brain pathways for learned and unlearned vocalizations differ in zebra finches. J. Neurosci. 10, 1541–1556 (1990).
31. Yu, A. C. & Margoliash, D. Temporal hierarchical control of singing in birds. Science 273, 1871–1875 (1996).
32. Heaton, J. T. & Brauth, S. E. Effects of lesions of the central nucleus of the anterior archistriatum on contact call and warble song production in the budgerigar (Melopsittacus undulatus). Neurobiol. Learn. Mem. 73, 207–242 (2000).
33. Plummer, T. K. & Striedter, G. F. Auditory responses in the vocal motor system of budgerigars. J. Neurobiol. 42, 79–94 (2000).
34. Elmaleh, M., Kranz, D., Asensio, A. C., Moll, F. W. & Long, M. A. Sleep replay reveals premotor circuit structure for a skilled behavior. Neuron 109, 3851–3861 (2021).
35. Chi, Z. & Margoliash, D. Temporal precision and temporal drift in brain and behavior of zebra finch song. Neuron 32, 899–910 (2001).
36. Chettih, S. N., Mackevicius, E. L., Hale, S. & Aronov, D. Barcoding of episodic memories in the hippocampus of a food-caching bird. Cell 187, 1922–1935 (2024).
37. Schneidman, E., Bialek, W. & Berry, M. J. 2nd Synergy, redundancy, and independence in population codes. J. Neurosci. 23, 11539–11553 (2003).
38. Mugler, E. M. et al. Direct classification of all American English phonemes using signals from functional speech motor cortex. J. Neural Eng. 11, 035015 (2014).
39. Mann, D. C., Fitch, W. T., Tu, H. W. & Hoeschele, M. Universal principles underlying segmental structures in parrot song and human speech. Sci. Rep. 11, 776 (2021).
40. Manabe, K., Kawashima, T. & Staddon, J. E. Differential vocalization in budgerigars: towards an experimental analysis of naming. J. Exp. Anal. Behav. 63, 111–126 (1995).
41. Manabe, K., Staddon, J. E. R. & Cleaveland, J. M. Control of vocal repertoire by reward in budgerigars (Melopsittacus undulatus). J. Comp. Psychol. 111, 50–62 (1997).
42. Seki, Y. Cockatiels sing human music in synchrony with a playback of the melody. PLoS ONE 16, e0256613 (2021).
43. Brauth, S. E., Heaton, J. T., Shea, S. D., Durand, S. E. & Hall, W. S. Functional anatomy of forebrain vocal control pathways in the budgerigar (Melopsittacus undulatus). Ann. NY Acad. Sci. 807, 368–385 (1997).
44. Moore, B. R. The evolution of learning. Biol. Rev. Camb. Philos. Soc. 79, 301–335 (2004).
45. Fee, M. S., Shraiman, B., Pesaran, B. & Mitra, P. P. The role of nonlinear dynamics of the syrinx in the vocalizations of a songbird. Nature 395, 67–71 (1998).
46. Abdel-Maksoud, F. M., Hussein, M. M., Hamdy, A. & Ibrahim, I. A. Anatomical, histological, and electron microscopic structures of syrinx in male budgerigars (Melopsittacus undulatus). Microsc. Microanal. 26, 1226–1235 (2020).
47. Elemans, C. P. et al. Universal mechanisms of sound production and control in birds and mammals. Nat. Commun. 6, 8978 (2015).
48. Suthers, R. A., Goller, F. & Pytte, C. The neuromuscular control of birdsong. Philos. Trans. R. Soc. Lond. B 354, 927–939 (1999).
49. Wild, J. M. Neural pathways for the control of birdsong production. J. Neurobiol. 33, 653–670 (1997).
50. Manogue, K. R. & Nottebohm, F. Relation of medullary motor nuclei to nerves supplying the vocal tract of the budgerigar (Melopsittacus undulatus). J. Comp. Neurol. 204, 384–391 (1982).
51. Farabaugh, S. M. & Dooling, R. J. in Ecology and Evolution of Acoustic Communication in Birds (eds Kroodsma, D. E. & Miller, E. H.) Ch. 6 (Cornell Univ. Press, 1996).
52. Gaunt, A. S. & Gaunt, S. L. L. Electromyographic studies of the syrinx in parrots (Aves, Psittacidae). Zoomorphology 105, 1–11 (1985).
53. Suthers, R. A., Goller, F. & Wild, J. M. Somatosensory feedback modulates the respiratory motor program of crystallized birdsong. Proc. Natl Acad. Sci. USA 99, 5680–5685 (2002).
54. Sober, S. J., Wohlgemuth, M. J. & Brainard, M. S. Central contributions to acoustic variation in birdsong. J. Neurosci. 28, 10370–10379 (2008).
55. Bottjer, S. W., Miesner, E. A. & Arnold, A. P. Forebrain lesions disrupt development but not maintenance of song in passerine birds. Science 224, 901–903 (1984).
56. Cheung, C., Hamilton, L. S., Johnson, K. & Chang, E. F. The auditory representation of speech sounds in human motor cortex. eLife 5, e12577 (2016).
57. Solomon, S. G. & Lennie, P. The machinery of colour vision. Nat. Rev. Neurosci. 8, 276–286 (2007).
58. Castellucci, G. A., Kovach, C. K., Howard, M. A. 3rd, Greenlee, J. D. W. & Long, M. A. A speech planning network for interactive language use. Nature 602, 117–122 (2022).
59. Long, M. A. et al. Functional segregation of cortical regions underlying speech timing and articulation. Neuron 89, 1187–1193 (2016).
60. Hozhabri et al. Differential behavioral engagement of inhibitory interneuron subtypes in the zebra finch brain. Neuron https://doi.org/10.1016/j.neuron.2024.11.003 (2024).
61. Moll, F. W. et al. Thalamus drives vocal onsets in the zebra finch courtship song. Nature 616, 132–136 (2023).
62. Stringer, C. et al. Rastermap: a discovery method for neural population recordings. Nat. Neurosci. 28, 201–212 (2025).
63. Elmaleh, M., Yang, Z., Ackert-Smith, L. A. & Long, M. A. Uncoordinated sleep replay across hemispheres in the zebra finch. Curr. Biol. 33, 4704–4712 (2023).
64. Fukushima, M. & Margoliash, D. The effects of delayed auditory feedback revealed by bone conduction microphone in adult zebra finches. Sci. Rep. 5, 8800 (2015).
65. Tobin, C., Medina-Garcia, A., Kohn, G. M. & Wright, T. F. Does audience affect the structure of warble song in budgerigars (Melopsittacus undulatus)? Behav. Processes 163, 81–90 (2019).
66. Godfrey, J. J. & Holliman, E. Switchboard-1 Release 2 LDC97S62. Linguistic Data Consortium https://doi.org/10.35111/sw3h-rw02 (1993).
67. Tchernichovski, O., Nottebohm, F., Ho, C. E., Pesaran, B. & Mitra, P. P. A procedure for an automated measurement of song similarity. Anim. Behav. 59, 1167–1176 (2000).
68. Pachitariu, M., Steinmetz, N., Kadir, S., Carandini, M. & Harris, K. D. Kilosort: realtime spike-sorting for extracellular electrophysiology with hundreds of channels. Preprint at bioRxiv https://doi.org/10.1101/061481 (2016).
69. Rossant, C. et al. Spike sorting for large, dense electrode arrays. Nat. Neurosci. 19, 634–641 (2016).
70. Rolls, E. T. & Tovee, M. J. Sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. J. Neurophysiol. 73, 713–726 (1995).
71. Goffinet, J., Brudner, S., Mooney, R. & Pearson, J. Low-dimensional learned feature spaces quantify individual and group differences in vocal repertoires. eLife 10, e67855 (2021).
72. Engelhard, B. et al. Specialized coding of sensory, motor and cognitive variables in VTA dopamine neurons. Nature 570, 509–513 (2019).
73. Ito, B. S., Gao, Y., Kardon, B. & Goldberg, J. H. A collicular map for touch-guided tongue control. Nature 637, 1143–1151 (2025).
74. Chang, L. & Tsao, D. Y. The code for facial identity in the primate brain. Cell 169, 1013–1028 (2017).
75. Yang, Z. & Long, M. A. Data and code for ‘convergent vocal representations in parrot and human forebrain motor networks’. Zenodo https://doi.org/10.5281/zenodo.14057061 (2025).

Acknowledgements

We thank A. Bahle, C. Elemans, A. Fishbein, D. Jin, C. Kemere, M. Schmidt and current members of the Long laboratory for comments on earlier versions of this manuscript; B. Cooper, T. Movshon and D. Sanes for helpful conversations; and A. Paulson for technical assistance. This research was supported by the Simons Collaboration on the Global Brain (M.A.L.).

Author information

Authors and Affiliations

NYU Neuroscience Institute and Department of Otolaryngology, New York University Langone Medical Center, New York, NY, USA
Zetian Yang & Michael A. Long
Center for Neural Science, New York University, New York, NY, USA
Zetian Yang & Michael A. Long

Contributions

Z.Y. and M.A.L. conceived the study and designed the experiments; Z.Y. conducted the research; Z.Y. performed data analyses; Z.Y. and M.A.L. created the figures; Z.Y. wrote the initial draft of the manuscript; Z.Y. and M.A.L. edited and reviewed the final manuscript; M.A.L. acquired funding and supervised the project.

Corresponding author

Correspondence to Michael A. Long.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Todd Troyer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Acoustic representation of vocal repertoires across species.

a-c, Mean pitch, duration, and entropy variance from a human (a), a zebra finch (b), and a budgerigar (c). Data represent the same spoken words and vocal elements analyzed in Fig. 1a–f. d, Overlay of data from panels a through c.

Extended Data Fig. 2 Histology and single unit quality.

a, Histological validation of silicon probe placement in zebra finch RA. Left, sagittal slice illustrating probe position, marked by DiI applied to the probe shank (red). Right, expanded view of the boxed region highlighting RA (white arrows). b, Histological validation of silicon probe placement in budgerigar AAC. Left, brightfield image of a coronal slice showing the position of the silicon probe, marked by DiI applied to the probe shank (red). Right, expanded view of the boxed region indicating dorsal AAC (AACd) and ventral AAC (AACv) by white and yellow arrows, respectively. c, Example single-unit waveforms and autocorrelograms from budgerigar AAC recordings. Mean waveforms (red) are overlaid on single-trial waveform traces, which are represented as gray lines. In the autocorrelogram, orange vertical lines indicate the refractory period (−1 to 1 ms). d, e, Distributions of peak-to-peak amplitude (d) and the percentage of refractory period violations (e) across all AAC single units (n = 220 from 4 birds). Illustrations of the zebra finch in a are reproduced from ref. ⁶⁰, Elsevier.
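
The refractory-violation metric in panel e can be illustrated with a short sketch. The caption does not give the exact formula, so this example assumes the metric is the percentage of interspike intervals shorter than 1 ms; the function name and the spike train are hypothetical.

```python
def refractory_violation_percent(spike_times_s, refractory_s=0.001):
    """Percentage of interspike intervals (ISIs) shorter than the refractory period.

    Assumption of this sketch: a violation is any ISI below 1 ms.
    """
    times = sorted(spike_times_s)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0  # fewer than two spikes: no intervals to evaluate
    violations = sum(1 for isi in isis if isi < refractory_s)
    return 100.0 * violations / len(isis)


# Hypothetical spike train (seconds); one pair of spikes is only 0.5 ms apart.
spikes = [0.0100, 0.0250, 0.0255, 0.0600, 0.1000]
print(refractory_violation_percent(spikes))
```

A well-isolated single unit would be expected to score close to zero on a measure of this kind.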

Extended Data Fig. 3 Vocal motor responses in AAC and RA.

a, Mean firing rates of AAC neurons during vocalization and baseline for each individual budgerigar. Exact p-values displayed within each plot are from two-sided Wilcoxon signed-rank tests (n = 49, 81, 43, 47 neurons for each plot). b, Population-averaged responses aligned to the onset of vocal elements that are preceded by a silent gap of at least 100 ms. c, Population-averaged responses aligned to the offset of vocal elements followed by a silent gap of at least 100 ms. d, Spiking activity from a population of AAC neurons (BG3) during example vocalization, playback, and quiet baseline periods. e,f, Mean firing rates of AAC neurons during vocalization compared with playback (e) and during playback compared with baseline (f). Exact p-values displayed within each plot are from two-sided Wilcoxon signed-rank tests (n = 43, 47 neurons for the BG3 and BG4 plots, respectively). g, Bursting ratios of AAC neurons during vocalization and baseline for each individual budgerigar. h, Distribution of interspike intervals of AAC neurons pooled across budgerigars. i,j, Mean firing rates (i) and bursting ratios (j) of RA neurons during vocalization and baseline for each zebra finch. Illustrations of the zebra finch in i are reproduced from ref. ⁶⁰, Elsevier.

Extended Data Fig. 4 Neural similarity in budgerigar AAC is not driven by vocal reuse of elemental components.

a, Sonogram of an example zebra finch song recorded using the same piezoelectric microphone as used for budgerigar recordings. b, Schematic of the methodology for quantifying correlations between unique and repeated vocal segments. Each zebra finch syllable was decomposed into 20-ms segments using a sliding window with a 10-ms step size. Correlations for repeated segments (vocal reuse) were calculated between corresponding segments across 10 renditions (2 renditions shown for illustration) of each syllable. Correlations for unique segments (vocal similarity) were calculated between segments from different syllables. c, Distributions of correlation values between vocal similarity (orange) and reuse (blue), with 0.6 set as a threshold to distinguish between the two conditions. d, Distribution of correlation values across all vocal segments analyzed in Fig. 2 for each budgerigar, with percentage of correlations above 0.6 indicated on the right, representing potential vocal reuse. e, Correlation between spectral similarity and neural similarity matrices for an example budgerigar (left) and accompanying population data (right), excluding all cases with spectral correlation values exceeding 0.6. ρ denotes Spearman’s correlation value. P-values displayed within each plot are from two-sided permutation tests (n = 216,848, 1,113,918 pairs of vocal segments for each plot, respectively).
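
The segmentation and thresholding described in panels b to d can be sketched as follows. This is a minimal illustration, assuming each 20-ms segment is summarized by a feature vector (for example, a flattened spectrogram slice); the helper names and the toy vectors are hypothetical, and only the window size, step size and 0.6 threshold come from the caption.

```python
import math


def window_starts(duration_ms, win_ms=20, step_ms=10):
    """Start times (ms) of every full window that fits inside one syllable."""
    return list(range(0, duration_ms - win_ms + 1, step_ms))


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


REUSE_THRESHOLD = 0.6  # threshold stated in the caption


def is_potential_reuse(segment_a, segment_b, threshold=REUSE_THRESHOLD):
    """Flag a pair of segments whose spectral correlation exceeds the threshold."""
    return pearson(segment_a, segment_b) > threshold


# A 50-ms syllable yields four overlapping 20-ms windows.
print(window_starts(50))
# Toy feature vectors standing in for the spectra of two segments.
print(is_potential_reuse([1, 2, 3, 4], [2, 4, 6, 8]))
print(is_potential_reuse([1, 2, 3, 4], [4, 1, 3, 2]))
```

Pairs flagged in this way would be the ones excluded in panel e before the spectral and neural similarity matrices are compared.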

Extended Data Fig. 5 Single-neuron representation of vocal acoustic categories.

a, Distribution of fundamental frequencies in budgerigar vocalizations shows a clear minimum at 700 Hz. b-d, Distribution of the selectivity index for low frequency (b), harmonic (c), and noisy (d) vocalizations across the AAC population. Colored bins indicate neurons with a selectivity larger than 0.5 for each index. e, Venn diagram displaying the number of neurons with selectivity indices larger than 0.5 for each of the three categories.

Extended Data Fig. 6 Constructing the neural state space.

a, Schematic illustrating the construction of the neural state space. Population neural responses within 25-ms time windows were mapped to a three-dimensional space using PCA (see Methods). b, Scree plots depicting the eigenvalues and explained variances for the first 10 neural PCs. Each line represents a single budgerigar. c, Harmonic index (indicated by color of each dot) plotted on state space diagram for all 4 budgerigars. Low frequency notes (that is, gray dots) were not included in this analysis.
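
The state-space construction in panel a can be sketched with NumPy. This is an illustration under stated assumptions, not the authors' pipeline: spike counts in consecutive 25-ms bins stand in for the "population neural response", the spike trains are synthetic, and the function names are hypothetical. The Methods define the actual preprocessing.

```python
import numpy as np


def bin_spikes(spike_times_s, n_bins, bin_s=0.025):
    """Spike counts in consecutive bins of width bin_s, starting at t = 0."""
    edges = np.arange(n_bins + 1) * bin_s
    counts, _ = np.histogram(spike_times_s, bins=edges)
    return counts


def pca_project(responses, n_components=3):
    """Project a (time bins x neurons) matrix onto its top principal components."""
    centered = responses - responses.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return centered @ vt[:n_components].T, explained[:n_components]


rng = np.random.default_rng(0)
# Synthetic population: 10 neurons with uniformly scattered spikes over 5 s.
population = np.column_stack([
    bin_spikes(rng.uniform(0.0, 5.0, size=rng.integers(50, 200)), n_bins=200)
    for _ in range(10)
])
states, explained = pca_project(population.astype(float))
print(states.shape)  # one 3D neural state per 25-ms bin
```

The `explained` values correspond to the quantity plotted in a scree plot such as panel b.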

Extended Data Fig. 7 Comparison of vocal repertoires across budgerigars.

a, UMAP projections of latent representations of budgerigar vocalizations. Each plot represents an individual bird. b, Combined UMAP plot displaying data from all budgerigars.

Extended Data Fig. 8 Populational representation of acoustic categories and neural dynamics during vocal production in AAC.

a, Cumulative distribution of Euclidean distances between neural states within (colored line) or across (black line) acoustic categories for each budgerigar. Distance within each category is significantly smaller than between categories (P < 10⁻¹⁰⁰ for all tests). P-values shown within each plot are from two-sided Wilcoxon rank-sum tests (n = 66,349,440, 144,270,591, 671,739,531, 44,391,753, 1,854,597,921 instances of distances [1st plot]; n = 613,278, 7,657,741, 57,829,635, 3,801,903, 101,861,288 instances of distances [2nd plot]; n = 13,512,201, 318,087,253, 270,688,278, 94,772,028, 1,578,197,393 instances of distances [3rd plot]; n = 20,196,190, 86,638,866, 710,136,141, 37,840,650, 1,317,021,524 instances of distances [4th plot]). b, Mean difference between within-category and between-category neural distances derived from the true data (black vertical lines) plotted against the null distribution generated by randomly shifting spike times (see Methods). Exact p-values displayed in each plot are from one-sided permutation tests (n = 1,000 permutations). c, Cumulative distribution of Euclidean distances between neural states within (solid line) or across (dotted line) acoustic categories, based on a subsample of non-overlapping neural states (see Methods). P-values shown within each plot are from two-sided Wilcoxon rank-sum tests (n = 1,435,416 within and 2,902,569 between category instances of distances [1st plot]; n = 119,859 within and 168,561 between category instances of distances [2nd plot]; n = 1,113,510 within and 2,508,576 between category instances of distances [3rd plot]; n = 1,326,700 within and 2,000,210 between category instances of distances [4th plot]). d, Similar to c but with each within-category distance plotted separately as colored lines. P-values shown within each plot are from two-sided Wilcoxon rank-sum tests (n = 106,491, 216,811, 1,038,961, 73,153, 2,902,569 instances of distances [1st plot]; n = 990, 11,628, 101,025, 6,216, 168,561 instances of distances [2nd plot]; n = 21,528, 503,506, 445,096, 143,380, 2,508,576 instances of distances [3rd plot]; n = 30,381, 125,250, 1,113,778, 57,291, 2,000,210 instances of distances [4th plot]). P < 10⁻¹⁰⁰ for all tests. e, Neural trajectories for three example vocal elements spanning two acoustic types. Color indicates time within the element. Grayscale dots are the neural states corresponding to the three acoustic types (see labels in panel at left).
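
The within-category versus between-category comparison in panels a and b can be sketched as follows. This is a minimal illustration with made-up 3D states and labels; the function name is hypothetical, and the rank-sum and permutation tests reported in the caption are not reproduced here.

```python
import math
from itertools import combinations


def within_between_distances(states, labels):
    """Split all pairwise Euclidean distances by whether the pair shares a label."""
    within, between = [], []
    for i, j in combinations(range(len(states)), 2):
        distance = math.dist(states[i], states[j])
        if labels[i] == labels[j]:
            within.append(distance)
        else:
            between.append(distance)
    return within, between


# Hypothetical neural states in a 3D state space, with made-up category labels.
states = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (5.0, 6.0, 5.0)]
labels = ["harmonic", "harmonic", "noisy", "noisy"]

within, between = within_between_distances(states, labels)
# Statistic of the kind shown in panel b: mean within minus mean between distance.
mean_diff = sum(within) / len(within) - sum(between) / len(between)
print(within, mean_diff)
```

A negative `mean_diff` indicates that states sharing an acoustic category sit closer together than states from different categories.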

Extended Data Fig. 9 A neural frequency axis in AAC state space.

a, Pitches (white lines on top) were estimated from vocalizations with harmonic indices that exceeded the median of the distribution (shaded region). b, Distribution of estimated pitches for each individual budgerigar. c, Schematic of the calculation of the neural frequency axis. The axis was computed as the vector between the mean neural responses to low and high pitches in half of the data. The other half of the data was then used to calculate the relationship between pitch and projection onto the axis (see Methods). d, Slope of the relationship between pitch and projections onto the neural frequency axis for each budgerigar. Black vertical lines denote slope from data; gray distributions indicate slopes from pitch-shuffled data. Exact p-values displayed in each plot are derived from one-sided permutation tests (n = 5,000 permutations). e, Population neural responses at different time windows relative to vocalizations were mapped to a two-dimensional state space using PCA. Colored dots represent neural states underlying vocalizations with frequency estimated (color indicates pitch), and gray dots are associated with less harmonic vocalizations whose pitch was difficult to estimate. f, Performance of a linear model in predicting pitch values using scores of the first two PCs for neural responses calculated within a motor time window (−30 to −5 ms), compared to two other time windows with large temporal shifts (−125 to −100 ms and 100 to 125 ms). P-values displayed within the plot are from two-sided Wilcoxon rank-sum tests (n = 14,566 neural states) with Bonferroni correction. Boxplot elements: center line, median; box limits, upper and lower quartiles; whiskers, minimum and maximum values excluding outliers. g, Population-level pitch representation (Fig. 4c) displayed separately for calls and warble syllables. Colored dots denote neural states associated with pitches in calls (left) or warble syllables (right). Gray dots represent neural states for other vocalizations.
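
The split-half frequency-axis calculation in panels c and d can be sketched as follows. Several choices here are assumptions made for illustration: low and high pitches are defined by a median split, the data are synthetic, alternating samples form the two halves, and only 200 permutations are run (the caption reports 5,000). All function names are hypothetical.

```python
import random


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def frequency_axis(states, pitches):
    """Unit vector from the mean low-pitch state to the mean high-pitch state.

    Low and high are defined by a median split (an assumption of this sketch).
    """
    median = sorted(pitches)[len(pitches) // 2]
    low = mean_vector([s for s, p in zip(states, pitches) if p < median])
    high = mean_vector([s for s, p in zip(states, pitches) if p >= median])
    diff = [h - l for h, l in zip(high, low)]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]


def projection_slope(states, pitches, axis):
    """Least-squares slope of the projection onto the axis as a function of pitch."""
    proj = [sum(a * b for a, b in zip(s, axis)) for s in states]
    mp, mq = sum(pitches) / len(pitches), sum(proj) / len(proj)
    num = sum((p - mp) * (q - mq) for p, q in zip(pitches, proj))
    return num / sum((p - mp) ** 2 for p in pitches)


random.seed(0)
# Synthetic data: pitch (Hz) drives the first of three state dimensions.
pitches = [random.uniform(1000, 4000) for _ in range(200)]
states = [[p / 1000 + random.gauss(0, 0.1), random.gauss(0, 0.1), random.gauss(0, 0.1)]
          for p in pitches]

# Fit the axis on one half of the data, then evaluate on the other half.
axis = frequency_axis(states[0::2], pitches[0::2])
test_states, test_pitches = states[1::2], pitches[1::2]
observed = projection_slope(test_states, test_pitches, axis)

# One-sided permutation test: shuffle the pitch labels of the held-out half.
n_perm, shuffled, exceed = 200, list(test_pitches), 0
for _ in range(n_perm):
    random.shuffle(shuffled)
    if projection_slope(test_states, shuffled, axis) >= observed:
        exceed += 1
p_value = (exceed + 1) / (n_perm + 1)
print(observed, p_value)
```

Fitting the axis and measuring the slope on separate halves keeps the slope estimate from being inflated by the same samples that defined the axis.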

Extended Data Fig. 10 Pitch encoding in budgerigar AAC.

a, Schematic of pitch tuning index calculation. b, Pitch tuning in three example neurons from Fig. 4g, presented as binned scatter plots (gray) and averaged firing rates (mean ± SEM) across pitches (green) for all vocal elements (left), only calls (middle), and only warble syllables (right). Blue circles represent baseline firing rates. c, Relative contributions of four acoustic features in GLMs explaining single AAC neuron responses shown for all neurons (left), pitch-tuned neurons (pitch tuning index > 6; middle), and non-tuned neurons (pitch tuning index < 6; right), respectively. Boxplot elements: center line, median; box limits, upper and lower quartiles; whiskers, minimum and maximum values excluding outliers. Exact p-values shown within each plot are from Bonferroni-corrected, two-sided Wilcoxon signed-rank tests (n = 220, 112, 108 neurons for each plot, respectively). d, Relationship between decoded and measured pitches across all calls for each budgerigar, shown as binned scatter plots. Linear decoders trained on warble syllables were used to predict pitch in calls. R values represent Pearson correlation between decoded and measured pitches.
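
The cross-condition decoding in panel d can be sketched with NumPy. This is an illustration only: the caption says "linear decoders" without specifying the fitting procedure, so ordinary least squares is assumed here, and the simulated firing rates, pitch ranges and function names are all hypothetical.

```python
import numpy as np


def fit_linear_decoder(rates, pitch):
    """Least-squares weights (plus intercept) mapping firing rates to pitch."""
    design = np.column_stack([rates, np.ones(len(rates))])
    weights, *_ = np.linalg.lstsq(design, pitch, rcond=None)
    return weights


def decode(rates, weights):
    """Apply previously fitted weights to a new (samples x neurons) matrix."""
    return np.column_stack([rates, np.ones(len(rates))]) @ weights


rng = np.random.default_rng(1)
true_tuning = rng.normal(size=20)  # made-up linear pitch tuning of 20 neurons


def simulate(n_samples, low_hz, high_hz):
    """Synthetic firing rates that depend linearly on pitch, plus noise."""
    pitch = rng.uniform(low_hz, high_hz, size=n_samples)
    noise = rng.normal(scale=0.5, size=(n_samples, 20))
    return np.outer(pitch / 1000.0, true_tuning) + noise, pitch


warble_rates, warble_pitch = simulate(400, 800, 4000)  # training condition
call_rates, call_pitch = simulate(100, 1500, 3500)     # held-out condition

weights = fit_linear_decoder(warble_rates, warble_pitch)
decoded = decode(call_rates, weights)
r = np.corrcoef(decoded, call_pitch)[0, 1]  # Pearson correlation, as in the caption
print(round(float(r), 3))
```

Training on one vocalization type and testing on another checks whether the pitch code generalizes across vocal contexts, which is the point of the comparison in panel d.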

Extended Data Fig. 11 Neurons in zebra finch RA do not appear to encode pitch within the song.

a-c, Burst-associated pitch values for an example RA neuron. Black boxes in a denote vocalization periods associated with bursting events (indicated by red bars in c), with vocal periods shifted by 12 ms relative to burst events. Pitch contour within each period is highlighted in b, with the mean pitch value marked by a green dot and a number next to the contour. d, Pitch values for all bursting events across the RA population. Each dot represents a burst event and its associated pitch. e, Relationship between firing rates and pitch for three example RA neurons, presented as binned scatter plots (gray) and averaged firing rates (mean ± SEM) across pitches (blue). f, Distributions of the range of burst-associated pitch values for observed data (blue) and for randomly sampled pitch values from d. g, Distributions of the variance of burst-associated pitch values for observed data (blue) and for randomly sampled pitch values from d. In f and g, exact p-values shown on top are derived from two-sided Wilcoxon rank-sum tests (n = 116 samples).

Extended Data Fig. 12 Pitch decoding using population activity.

a, Relationship between decoded and measured pitches using all RA neurons for each zebra finch shown as binned scatter plots. b, Relationship between decoded and measured pitches using all AAC neurons for each budgerigar. In a and b, R values represent Pearson correlation between decoded and measured pitches.

Extended Data Fig. 13 Behavioral recording methodology.

a, Photograph of the recording arena for budgerigars, with one budgerigar tethered on each side. b, Example recordings from an omnidirectional ambient microphone (top) and two piezoelectric microphones implanted on two budgerigars (bottom). Piezoelectric microphones enable excellent isolation of individual vocalizations, which are mixed in the ambient microphone recording.

Extended Data Table 1 Information for statistical tests

Supplementary information

Reporting Summary

Peer Review File

Supplementary Video 1

Various perspectives of the neural state space. The neural state space of AAC fully rotated to better display the three-dimensional representation of different acoustic categories. (Modified from Fig. 3f.)

Source data

Source Data Fig. 1

Source Data Fig. 2

Source Data Fig. 3

Source Data Fig. 4

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, Z., Long, M.A. Convergent vocal representations in parrot and human forebrain motor networks. Nature 640, 427–434 (2025). https://doi.org/10.1038/s41586-025-08695-8

Received: 16 July 2024
Accepted: 23 January 2025
Published: 19 March 2025
Issue Date: 10 April 2025
DOI: https://doi.org/10.1038/s41586-025-08695-8

This article is cited by

Bird brains help scientists to unveil the secrets of speech
- Joshua Neunuebel
Nature (2025)