Fig. 7: Reconstructing novel faces from single neurons.

a Responses of 159 neurons (grey circles) in face-patch area AM were recorded while two primates viewed 62 novel faces. A one-to-one match was found between each model latent unit (pink circles) and a corresponding single neuron. Linear regression (blue arrow) was used to decode the response of each individual latent unit from the activations of its matched single neuron. The pre-trained model decoder was then used to reconstruct the novel face. Face image reproduced with permission from Chang et al.6.

b Cosine distances between the real standardised latent unit responses and those decoded from single neurons are significantly smaller for the β-VAE than for the baseline models and the “gold standard” provided by the AAM model (all p < 0.05, one-sided Welch’s t-test; AE p = 0.0195, VAE p = 1.0596e–14, PCA p = 4.0370e–25, ICA p = 1.5758e–12, VGG (PCA) p = 5.0467e–24, Classifier p = 0.0, AAM p = 2.3216e–14, VGG (raw) p = 4.8840e–20). Circles, median cosine distance per model (β-VAE, n = 51; VGG (raw), n = 22; Classifier, n = 64; VAE, Variational AutoEncoder36, n = 50; AE, AutoEncoder35, n = 50; VGG (PCA)32, n = 41; PCA, n = 41; ICA, n = 50; AAM, active appearance model3, n = 21). Boxplot centre is the median, the box extends to the 25th and 75th percentiles, whiskers extend to the most extreme data points not considered outliers, and outliers are plotted individually. Source data are provided as a Source Data file.

c The β-VAE can decode and reconstruct novel faces from 12 matched single neurons. Its reconstructions are better than those from the closest baselines, AE and VAE, which required 30 and 27 neurons for decoding, respectively. The β-VAE instance was chosen for the best disentanglement quality as measured by the UDR score; the AE and VAE instances were chosen for the highest reconstruction accuracy on the training dataset. Face images reproduced with permission from Ma et al.53 and Phillips et al.55.
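The decoding and comparison steps in panels a and b can be sketched as follows. This is an illustrative sketch, not the authors' code: the data are synthetic, and the array shapes, variable names, and use of NumPy/SciPy are assumptions. It shows (i) univariate linear regression decoding each model latent unit from its matched single neuron, (ii) the cosine distance between standardised real and decoded latent responses, and (iii) a one-sided Welch's t-test against a baseline model's distances.

```python
# Hypothetical sketch of the Fig. 7 analysis; all data below are synthetic.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_faces, n_units = 62, 12  # novel faces; matched model-unit/neuron pairs

# Synthetic single-neuron responses and correlated model latent responses.
neurons = rng.normal(size=(n_faces, n_units))
latents = 0.8 * neurons + rng.normal(scale=0.3, size=(n_faces, n_units))

# Panel a: decode each latent unit from its matched neuron via linear regression.
decoded = np.empty_like(latents)
for u in range(n_units):
    slope, intercept = np.polyfit(neurons[:, u], latents[:, u], deg=1)
    decoded[:, u] = slope * neurons[:, u] + intercept

def standardise(x):
    """Zero mean, unit variance per latent unit."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

# Panel b: cosine distance per face between real and decoded standardised latents.
a, b = standardise(latents), standardise(decoded)
cos_dist = 1.0 - (a * b).sum(axis=1) / (
    np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))

# One-sided Welch's t-test against a (synthetic) baseline model's distances,
# asking whether this model's distances are significantly smaller.
baseline_dist = cos_dist + np.abs(rng.normal(scale=0.2, size=n_faces))
t, p = ttest_ind(cos_dist, baseline_dist, equal_var=False, alternative="less")
```

The per-unit (rather than multivariate) regression mirrors the one-to-one matching between model units and neurons described in panel a; `equal_var=False` selects Welch's variant, which does not assume equal variances across models.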