Extended Data Fig. 1: Canary song annotation and sequence statistics. | Nature

From: Hidden neural states underlie canary song syntax

a, Architecture of the syllable segmentation and annotation machine-learning algorithm. (i) A spectrogram is fed into the algorithm as a 2D matrix in segments of 1 s. (ii) Convolutional and max-pooling layers learn local spectral and temporal filters. (iii) A bidirectional recurrent LSTM layer learns temporal sequencing features. (iv) Projection onto syllable classes assigns a probability for each 2.7-ms time bin and syllable. b, After manual proofreading (see Methods), a support vector machine classifier was used to assess the pairwise confusion between all syllable classes of bird 1 (see Methods). The test set confusion matrix (right) and its histogram (left) show that the error exceeded 1% only in rare cases and reached at most 6%. As the higher values occurred only in phrases with tens of syllables, this metric guarantees that most of the syllables in every phrase cannot be confused as belonging to another syllable class. Accordingly, the possibility of making a mistake in identifying a phrase type is negligible. c, Number of phrases per song for the three birds used in this study. d, Song durations for the three birds. e, Mean syllable durations for 85 syllable classes from three birds. Red arrow marks the duration below which all trill types have more than ten repetitions on average. f, Relation between phrase class mean duration (x axis) and standard deviation (y axis). Syllable classes (dots) of three birds are coloured according to bird number. Dashed line marks 450 ms (upper limit for the decay time constant of GCaMP6f). g, Range of mean number of syllables per phrase (y axis) for all syllable types with mean duration shorter than the x-axis value. Red line is the median, light grey marks the 25% and 75% quantiles and dark grey marks the 5% and 95% quantiles (blue line marks the number of syllable types contributing to these statistics). The red arrow matches the arrow in e. h, Cumulative histogram of trill phrase durations.
i, All complex phrase transitions with second-order or higher dependence on song history context (for birds 1 and 2). For each phrase type that precedes a complex transition, the context dependence is visualized by a PST (see Methods). Transition outcome probabilities are marked by pie charts at the centre of each node. The song context (phrase sequence) that leads to the transition is marked by concentric circles, the innermost being the phrase type that preceded the transition. Nodes are connected to indicate the sequences in which they are added in the search for longer Markov chains that describe context dependence (for example, i–iii for first- to third-order Markov chains). Grey arrows indicate additional incoming links that are omitted for simplicity.
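
To make the idea of second-order context dependence in panel i concrete, the following minimal sketch estimates transition outcome probabilities conditioned on the preceding one or two phrase types. It uses plain frequency counts and is not the PST procedure described in the Methods; the function name `transition_probs` and the toy phrase sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def transition_probs(songs, order):
    """Estimate P(next phrase | previous `order` phrases) by counting.

    `songs` is a list of phrase-type sequences. This is a simple
    fixed-order Markov estimate, an assumption made for illustration,
    not the PST fitting used in the paper.
    """
    counts = defaultdict(Counter)
    for song in songs:
        for i in range(order, len(song)):
            context = tuple(song[i - order:i])  # the preceding phrases
            counts[context][song[i]] += 1       # the observed outcome
    return {
        context: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for context, c in counts.items()
    }


# Hypothetical toy data: what follows phrase B depends on what preceded B.
songs = [list("ABC")] * 3 + [list("DBE")] * 3

first = transition_probs(songs, order=1)
second = transition_probs(songs, order=2)

print(first[("B",)])       # outcome after B looks random at first order
print(second[("A", "B")])  # and becomes determined once the context is known
print(second[("D", "B")])
```

In this toy case the first-order estimate for phrase B splits evenly between two outcomes, whereas conditioning on one more preceding phrase resolves the outcome. This corresponds to what a pie chart at a deeper node would show relative to the pie chart at the innermost node.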
