Introduction

Alcohol use disorders (AUD) represent an unmet medical emergency accounting for 3 million deaths worldwide every year. The AUD prevalence in Europe and in the Americas is estimated at 8–12% across genders, with peaks of 17% in the USA and >22% in eastern Europe; similar prevalence is found in developing and westernizing countries [1].

Alcohol dependence has long been recognized as a clinical syndrome [2, 3]. However, notwithstanding the time and resources dedicated to studying this condition, and the fairly advanced biological characterization achieved [4], approved AUD medications show limited and heterogeneous efficacy [5,6,7]. In addition, in spite of high expectations and considerable effort, promising targets have failed to translate into clinical use [8,9,10,11]. These failures brought despondency among professionals working in the field of psychiatric disorders in general, and of addiction in particular, and led major pharmaceutical companies to abandon drug research and development (R&D) in this field, which is perceived as a high business risk [12]. Indeed, psychiatric drugs take the longest and cost the most to develop, and they have the lowest clinical approval success rate [13, 14].

To address this stagnation, AUD experts are now highlighting the need for personalized AUD diagnosis and treatment approaches [15, 16]. Notably, AUD develops in only 10–20% of the subjects (vulnerable users) consuming excessive amounts of alcohol [17]. Moreover, an AUD diagnosis rests on a combination of two to eleven diagnostic criteria; the patient population is therefore a heterogeneous group of individuals who cannot be expected to follow the same disease trajectory or to respond equally to treatments. There is, therefore, an increasing awareness that treatments should be tailored toward subgroups of patients rather than targeting the disease as a whole [6, 15, 16, 18]. While this perspective is widely acknowledged at the clinical level, at the preclinical level new groundwork is needed to rethink the contribution of animal models to drug R&D through the implementation of individual heterogeneity approaches.

AUD preclinical studies have traditionally adopted a group-based approach and focused on primary symptoms, like relapse, consumption, and craving [19]. Conversely, little attention has been paid to the heterogeneity in vulnerability traits of AUD patients [20,21,22,23]. In a classical group-based experiment, different doses of drugs are usually tested on behaviorally homogeneous groups of animals. Here, individual subjects are required to meet two basic inclusion criteria: (i) show voluntary drug self-administration/seeking, and (ii) show a homogeneous level of self-administration/seeking. This approach offers two unquestionable advantages: first, it guarantees the high construct validity of the test by optimizing the conditions for the experimental drug to show its efficacy and by facilitating the sorting of compounds with and without therapeutic potential; second, it is cost-effective, requires a small number of animals, and is relatively quick. On the other hand, group-based approaches are blind to individual variability. Single animals that do not respond to the treatment are often perceived as outliers, lying on the right tail of an otherwise left-shifted distribution, and are thus often excluded from the analysis. However, when preclinically successful compounds enter clinical trials, even trials with stricter inclusion criteria, they are tested on cohorts of patients showing a heterogeneity that was not accounted for at the preclinical level.

Here, we propose that implementing individual variability approaches in preclinical drug screening would help refine the prediction of clinically successful new treatments for AUD. To substantiate our idea, we adopted a proof-of-concept reverse translational pharmacology approach in which two drugs were investigated. One of these drugs (Memantine) has failed to reduce alcohol consumption in clinical settings [24,25,26]. The other drug (Naltrexone) is FDA/EMA approved for AUD and reduces alcohol consumption in patients [27, 28]. These drugs were tested on alcohol self-administration in a population of NIH genetically heterogeneous stock (HS) rats subjected to a multi-symptomatic screening of alcohol consumption and seeking. We chose HS rats because this diverse outbred line was created to reflect the genetic heterogeneity of the human population [29,30,31], and it exhibits greater diversity than other outbred lines [31]. Moreover, HS rats have already been shown to exhibit heterogeneity in addiction-like behaviors [32,33,34,35] and in their response to pharmacological treatments in addiction models [36].

Our primary working hypothesis was that Naltrexone, but not Memantine, would selectively reduce alcohol self-administration in HS rats. Then, because subpopulations of treatment responders and non-responders exist in clinical practice, our secondary hypothesis was that our individual-based approach would enable us to identify subgroups of rats that respond (treatment-responders) and do not respond (non-responders) to treatment. Finally, we ran a retrospective analysis to profile alcohol-seeking behavior in treatment-responder and non-responder rats and to look for features that can be back-translated to the clinic.

Materials and methods

Animals

One hundred NIH-HS male and female rats (n = 50/sex; Wake Forest University, North Carolina, USA) weighed 325–375 g and 175–225 g, respectively, at the beginning of the experimental procedures. Rats were housed 4 per cage according to their sex in a room with a reversed (12:12 h) light/dark cycle and controlled temperature (20–22 °C) and humidity (45–50%). Food (4RF18, Mucedola, Italy) and tap water were provided ad libitum. After a week of acclimation to the new environment, rats were handled 5 min a day for an additional week before the beginning of behavioral screening. All procedures were conducted during the dark phase of the light/dark cycle.

Ethics statement

Procedures were in adherence with the European Community Council Directive for the Care and Use of Laboratory Animals and the National Institutes of Health Guide for the Care and Use of Laboratory Animals; Italian Ministry of Health approval 1D580.47.

Drugs

Alcohol solutions were prepared by diluting 95% v/v alcohol (Carsetti, Italy) with tap water. Saccharin (Sigma-Aldrich) was dissolved in tap water.

Naltrexone hydrochloride (Sigma-Aldrich) was dissolved in saline solution.

Memantine 20 mg coated tablets (Memantina Mylan, Mylan Italia S.r.l.) were purchased from the local pharmacy store and suspended in tap water.

Self-administration apparatus

Operant training and testing were performed in self-administration (SA) stations (Med Associates, St Albans, VT, USA) equipped with two retractable levers located in the front panel of the chamber, and a house light on the opposite wall. Pressing the lever designated as “active” activated, according to the programmed reinforcement schedule, a syringe pump delivering 0.1 ml of solution into a drinking reservoir (volume capacity 0.3 ml) located between the levers. Pressing the other lever, designated as “inactive”, was recorded but had no scheduled consequences. Each SA chamber was enclosed in a sound-attenuating ventilated cubicle. Behavioral sessions were controlled and recorded by a Windows-compatible PC equipped with Med-PC-5 software (Med Associates).

Experimental timeline

The experimental timeline (Fig. 1A) comprised three consecutive phases: (1) screening for alcohol-related behaviors, (2) effect of Memantine and Naltrexone on alcohol self-administration, (3) effect of Memantine and Naltrexone on saccharin SA.

Fig. 1

A Schematic representation of experimental timeline. 3BC 3-bottle choice, ASA alcohol self-administration, PR progressive ratio, Cued Reinst cued reinstatement, MEM memantine, NTX naltrexone, SSA saccharin self-administration. B Effect of Memantine (n = 83) on alcohol self-administration at whole population level. The intermediate and highest dose of Memantine significantly reduced alcohol self-administration. Groups mean ± 95%CI: 0.0 mg/kg, 17.99 ± 1.65; 6 mg/kg, 16.64 ± 2.3; 12.0 mg/kg, 12.27 ± 1.81; 25.0 mg/kg, 6.024 ± 1.445. C Effect of Naltrexone (n = 82) on alcohol self-administration at whole population level. Both doses of Naltrexone significantly reduced alcohol self-administration. Groups mean ± 95%CI: 0.0 mg/kg, 18.87 ± 1.59; 0.3 mg/kg, 13.21 ± 1.61; 1.0 mg/kg, 10.67 ± 1.35. Bars represent the Mean ± 95%CI of number of rewards earned in a 30 min session. Statistical significance: *p < 0.05 and ***p < 0.001 vs vehicle.

Phase 1: Screening of alcohol related behaviors

This phase comprised three consecutive tests: 3-bottle choice (3BC) drinking, motivation, and cued reinstatement of alcohol seeking.

Three-bottle choice alcohol drinking

Alcohol-naïve rats were given ad libitum access to three bottles containing water, 5% and 10% v/v alcohol solutions, respectively. The 3BC screening lasted fifteen days. To acclimate rats to the taste of alcohol, in the first five days the bottles were provided in the common home cages. For the following ten days, rats were housed in single cages to monitor their individual liquid intake. To avoid the development of side preference, the position of the three bottles was changed every day. Bottle weight was recorded every 24 h. At the end of the 3BC screening, rats were returned to common cages with their original cage mates, and alcohol solutions were no longer provided in the home cage.

Motivation for alcohol expressed under progressive ratio contingency

Following 3BC screening, rats were trained to self-administer 10% alcohol (v/v) in 30-min daily sessions under an FR1 schedule of reinforcement. Sessions were run daily, five days a week. Each active lever response resulted in the delivery of 0.1 ml of 10% alcohol solution followed by a 5 s time out (TO) during which further lever presses were not reinforced. The house light was illuminated contingently with the reinforcement delivery and remained on during the TO.

After seventeen sessions of FR1 training, the motivation for alcohol was tested in three consecutive sessions run under a progressive ratio (PR) schedule of reinforcement, in which the number of active lever presses required to obtain a single reward increased in the following order: 1, 2, 3, 4, 6, 8, 10, 12, 16, PR + 4 [21]. The session stopped when more than 30 min had elapsed since the last reward earned. The last ratio completed was defined as the break point (BP) and used as a measure of motivation for alcohol.
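The progressive ratio schedule can be illustrated with a short Python sketch. It is not the Med-PC program used in the study: it assumes that "PR + 4" means the requirement grows by four presses per reward once the ratio of 16 has been reached, and the names `pr_requirements` and `break_point` are hypothetical.

```python
FIXED_STEPS = [1, 2, 3, 4, 6, 8, 10, 12, 16]  # ratios listed in the text
INCREMENT = 4  # assumed meaning of "PR + 4" after the ratio of 16


def pr_requirements(n_steps):
    """Return the response requirement for each of the first n_steps rewards."""
    reqs = FIXED_STEPS[:n_steps]  # slicing copies, so FIXED_STEPS is untouched
    while len(reqs) < n_steps:
        reqs.append(reqs[-1] + INCREMENT)
    return reqs


def break_point(rewards_earned):
    """Last ratio completed in a session, given the number of rewards earned."""
    if rewards_earned == 0:
        return 0
    return pr_requirements(rewards_earned)[-1]


print(pr_requirements(12))  # [1, 2, 3, 4, 6, 8, 10, 12, 16, 20, 24, 28]
print(break_point(5))       # 6
```

Under this reading, a rat that earns five rewards before the 30-min cut-off has a break point of 6.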

Cued reinstatement of alcohol seeking

Following PR tests, rats were subjected to six additional alcohol SA (ASA) baseline sessions under FR1 contingency before entering extinction of alcohol seeking. During daily 30-min extinction sessions both levers were extended, but lever pressing was not reinforced: it produced no alcohol delivery, house light illumination, or pump activation. When responding at the previously active lever dropped on average below ten responses for three consecutive days, the cued reinstatement test began.
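One possible reading of the extinction criterion is sketched below in Python. The text does not say whether the average is taken per rat or across the group, so the sketch simply takes one list of daily average active-lever responses; the function name and the example values are hypothetical.

```python
def first_day_criterion_met(daily_responses, threshold=10, run_length=3):
    """Return the 1-based day on which responding has been below `threshold`
    for `run_length` consecutive days, or None if that never happens."""
    run = 0
    for day, responses in enumerate(daily_responses, start=1):
        run = run + 1 if responses < threshold else 0
        if run == run_length:
            return day
    return None


# Hypothetical daily averages of active-lever presses during extinction.
extinction = [25, 14, 9, 12, 8, 7, 6]
print(first_day_criterion_met(extinction))  # 7
```

In this example the run of sub-threshold days restarts after day 4, so the criterion is met on day 7 and the reinstatement test would follow the next day.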

The cued reinstatement test was run the day after the last extinction session. In this test, the first lever press delivered one reward and illuminated the house light as in a standard ASA session. For the remainder of the session, active lever presses illuminated the house light but did not result in alcohol delivery.

Phase 2: Effect of Memantine and Naltrexone on ASA

At the end of the cued reinstatement test, FR1 self-administration of 10% alcohol was re-baselined and the effect of Memantine and Naltrexone on ASA was evaluated in two separate tests. All rats were subjected to both treatment tests. The drug (either Memantine or Naltrexone) administered in the first test was counterbalanced between rats. A new ASA baseline was established between the first and the second treatment tests.

Group allocation and blinding: Memantine and Naltrexone starting groups were equal in size and balanced in sex prevalence, with no further arbitrary constraints in group allocation. Group allocation, treatment delivery, data collection, and data analysis were performed by independent operators.

Effect of Memantine on ASA

On test days, rats received oral administration of Memantine (0.0, 6.0, 12.0 and 25.0 mg/kg) in a volume of 4 ml/kg, one hour before the SA session [37]. Higher doses of Memantine were excluded because in preliminary studies they completely abolished ASA (Supplementary Fig. S1), preventing the observation of a dose/response relationship. Each rat received all Memantine doses and the vehicle in a within-subject counterbalanced order; size and sex ratio were balanced between the Latin-square subgroups. Test sessions were run every fourth day until each rat had received the whole dose range. On the first day after a test, rats remained in their home cage, while on the second and third days they were subjected to ASA baseline sessions.
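A within-subject counterbalanced design of this kind can be sketched as follows. The text does not state how the Latin square was built, so the cyclic construction, the round-robin assignment of rats to rows, and the choice of day 1 for the first test are all assumptions made for illustration; the function names are hypothetical.

```python
def cyclic_latin_square(conditions):
    """Build an n x n Latin square by rotating the list of conditions."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


def dose_order_for_rat(rat_index, square):
    """Assign rats to rows of the square in round-robin fashion (assumed)."""
    return square[rat_index % len(square)]


MEMANTINE_DOSES = [0.0, 6.0, 12.0, 25.0]  # mg/kg, from the text
square = cyclic_latin_square(MEMANTINE_DOSES)

# One test every fourth day; starting on day 1 is an assumption.
test_days = [1 + 4 * i for i in range(len(MEMANTINE_DOSES))]

for row in square:
    print(dict(zip(test_days, row)))
```

Each dose appears exactly once in every row (rat subgroup) and every column (test day), which is the property that separates dose effects from order effects.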

Effect of Naltrexone on ASA

This test was identical to the Memantine test except that the rats received subcutaneous administration of NTX (0.0, 0.3 or 1.0 mg/kg) in a volume of 1 ml/kg, 30 min before the SA session [38].

Phase 3: Effect of Memantine and Naltrexone on saccharin self-administration (SSA)

At the end of Phase 2, rats were trained to self-administer 0.2% w/v saccharin under an FR1 schedule of reinforcement before the effect of Memantine and Naltrexone on SSA was tested under conditions identical to those described for ASA in Phase 2.

Effect of Memantine on SSA

This test was run under the same conditions described for Memantine in Phase 2.

Effect of Naltrexone on SSA

This test was run under the same conditions described for Naltrexone in Phase 2.

Statistical analysis

Data were analyzed by one-way, two-way, or three-way ANOVA with factors for the respective analysis indicated in conjunction with its results. Dunnett’s or Sidak post-hoc analysis followed ANOVA when appropriate.

Power analyses: the initial sample size was estimated to allow detection of a small Cohen’s f effect size for both treatments using conventional power = 0.8 and α = 0.05 as parameters (Fig. S2). Observed Cohen’s f effect sizes were calculated a posteriori using ANOVA results and conventional power = 0.8 as parameters.

Data analysis structure

Step 1, Inclusion/exclusion criterion: The ultimate goal of this study was to identify and characterize subjects not responding to treatments. However, subjects showing a very low ASA level under the vehicle condition could be misidentified as non-responders because of a floor effect. To prevent such false non-responders, we applied the following inclusion criterion: the number of rewards earned under vehicle treatment had to be higher than the threshold “ASA baseline average minus one standard deviation”.
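The inclusion criterion amounts to a simple threshold check, sketched here in Python. The text does not specify whether the baseline mean and standard deviation are computed per rat or across the population, nor whether the sample or population standard deviation is used; the sketch assumes the caller supplies the relevant baseline values and uses the sample standard deviation. Function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def inclusion_threshold(baseline_rewards):
    """ASA baseline average minus one (sample) standard deviation."""
    return mean(baseline_rewards) - stdev(baseline_rewards)


def is_included(vehicle_rewards, baseline_rewards):
    """A rat is kept if its rewards under vehicle exceed the threshold."""
    return vehicle_rewards > inclusion_threshold(baseline_rewards)


baseline = [18, 20, 22, 16, 24]  # hypothetical baseline rewards per session
print(round(inclusion_threshold(baseline), 2))  # 16.84
print(is_included(17, baseline))  # True
print(is_included(12, baseline))  # False
```

A rat earning only 12 rewards under vehicle would be excluded here, because a further drug-induced drop could not be told apart from a floor effect.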

Step 2: The effect of Memantine and Naltrexone on ASA was analyzed at whole population level.

Step 3: For each drug, rats were allocated into two groups, later identified as Responders and Non-Responders. To this end, we used the difference in rewards between the vehicle and each treatment dose to allocate individual rats into clusters with different sensitivity to drug effects using a k-means approach. The number of clusters k was determined for each drug using cluster silhouette analysis as described in Supplemental Material.
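The general idea of choosing k by silhouette and then clustering rats on their vehicle-minus-dose differences can be sketched in plain Python. This is not the authors' procedure, which is described in their Supplemental Material: the sketch assumes Euclidean distances, unscaled data, and a basic Lloyd's k-means with random initialization, and the per-rat values are invented for illustration.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Basic Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k data points drawn without replacement
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels


def mean_silhouette(points, labels):
    """Average silhouette coefficient over all points."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        def mean_dist(cluster):
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == cluster and j != i]
            return sum(d) / len(d) if d else None
        a = mean_dist(labels[i])  # cohesion: mean distance within own cluster
        if a is None:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        b = min(mean_dist(c) for c in clusters if c != labels[i])  # separation
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical data: (vehicle - low dose, vehicle - high dose) rewards per rat.
reductions = [(10, 12), (11, 13), (9, 11), (12, 12),  # large reductions
              (-1, 2), (0, 3), (1, 1), (-2, 3)]       # little or no reduction

scores = {k: mean_silhouette(reductions, kmeans(reductions, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
labels = kmeans(reductions, best_k)
print(best_k, labels)
```

With two well-separated groups of rats, the mean silhouette is highest at k = 2 and the two clusters correspond to high and low sensitivity to the drug. A production analysis would normally use several random restarts and an established library rather than this hand-written version.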

Step 4: We compared the effect of Memantine and Naltrexone on ASA and SSA between the clusters identified by k-means analyses. Only Naltrexone demonstrated a specific efficacy in reducing ASA, thus all subsequent analyses were retrospectively performed exclusively on the Naltrexone group (see Results and Discussion for the rationale).

Step 5: The difference between the observed and expected prevalence of male and female subjects in the k clusters showing different sensitivity to Naltrexone was tested by Pearson’s χ2 crosstabulation analysis.

Step 6: A factor analysis of 3BC, motivation and cued reinstatement was performed using principal component extraction followed by normalized varimax rotation. Finally, we compared the performance of the k Naltrexone clusters in 3BC, motivation and cued reinstatement tests.

Statistical significance was set at the conventional p < 0.05.

Results

Naltrexone, but not Memantine, selectively reduced ASA in HS rats

One male rat was excluded from drug treatment tests due to health issues. Filtering rats for the inclusion criterion (Fig. S3) left 83 rats (41 males) in the Memantine experiment and 82 rats (40 males) in the Naltrexone experiment. Males and females showed similar responses to treatments (Fig. S4); therefore, they were pooled to analyze the effect of Memantine and Naltrexone on ASA. We first analyzed drug treatments at the population level. Memantine significantly affected ASA [F(3, 246) = 42.6; p < 0.0001; f = 0.262]; specifically, 12 mg/kg (p < 0.05) and 25 mg/kg (p < 0.0001) of Memantine significantly reduced the number of alcohol rewards earned (Fig. 1B). Similarly, Naltrexone significantly reduced ASA [F(2, 162) = 48.84; p < 0.0001; f = 0.232] at both 0.3 mg/kg (p < 0.0001) and 1.0 mg/kg (p < 0.0001) doses (Fig. 1C). Neither drug affected the responses at the inactive control lever (Fig. S5).

Next, we used the difference in rewards between the vehicle and each treatment dose to allocate individual rats into clusters with different sensitivity to drug effects using a k-means approach; based on the cluster silhouette (Fig. S6), k = 2 was applied to Memantine and Naltrexone data separately to allocate rats into clusters MEM1 or MEM2 (Fig. 2A) and NTX1 or NTX2 (Fig. 2B), respectively. Importantly, to validate the robustness of the k-means clusters, we also applied hierarchical clustering for both drugs. Memantine and Naltrexone hierarchical clusters were embedded for 92.8% and 87.8%, respectively, into k-means clusters (Fig. S7), confirming the robustness of the k-means clustering approach adopted here.

Fig. 2: Effect of Memantine and Naltrexone treatment on alcohol self-administration in clusters based on individual effect of the drugs on ASA.

A, B Silhouette plot of K = 2 clustering of individual response to A Memantine and B Naltrexone on alcohol self-administration. Horizontal bars represent individual silhouette coefficient, the vertical dashed line indicates the k = 2 cluster silhouette score. C Memantine reduced alcohol self-administration in cluster MEM1 (n = 35) at all doses tested and in cluster MEM2 (n = 48) only at the highest dose. Groups mean ± 95%CI: MEM1 0.0 mg/kg, 23.89 ± 2.45; MEM1 6 mg/kg, 15.57 ± 4.12; MEM1 12.0 mg/kg, 8.06 ± 2.65; MEM1 25.0 mg/kg, 3.943 ± 1.779; MEM2 0.0 mg/kg, 13.69 ± 1.24; MEM2 6 mg/kg, 17.42 ± 2.71; MEM2 12.0 mg/kg, 15.33 ± 2.11; MEM2 25.0 mg/kg, 7.542 ± 2.082. D Naltrexone reduced alcohol self-administration in cluster NTX1 (n = 47) at both doses and in cluster NTX2 (n = 35) only at the highest dose. Groups mean ± 95%CI: NTX1 0.0 mg/kg, 21.7 ± 2.09; NTX1 0.3 mg/kg, 11.17 ± 1.95; NTX1 1.0 mg/kg, 9.511 ± 1.787; NTX2 0.0 mg/kg, 15.06 ± 1.84; NTX2 0.3 mg/kg, 15.94 ± 2.51; NTX2 1.0 mg/kg, 12.23 ± 2.06. Bars represent the Mean ± 95% CI of number of rewards earned in a 30 min session. Statistical significance: *p < 0.05 and ****p < 0.0001 vs vehicle.

When we compared the effect of Memantine treatment between the two Memantine response clusters, we observed no significant effect of clusters [F(1, 81) = 0.29; p > 0.05] but there was a significant effect of dose [F(3, 243) = 62.8; p < 0.0001] and dose by cluster interaction [F(3, 243) = 26.98; p < 0.0001; f = 0.264]. Dunnett’s post-hoc analysis revealed that all doses of Memantine reduced ASA in cluster MEM1, while only the highest dose was efficacious in cluster MEM2 (Fig. 2C). Similarly, when Naltrexone data were analyzed, we found no effect of cluster [F(1, 80) = 0.06; p > 0.05] but there was an overall effect of dose [F(2, 160) = 56.3; p < 0.0001] and dose by cluster interaction [F(2, 160) = 36.0; p < 0.0001; f = 0.259]. All Naltrexone doses decreased ASA in cluster NTX1 while only the highest dose was efficacious in cluster NTX2 (Fig. 2D).

These results indicated that clusters MEM1 and NTX1 included subjects showing high sensitivity, while MEM2 and NTX2 subjects showed low sensitivity, to Memantine and Naltrexone respectively.

To verify whether the effects observed were specific to alcohol or generalized to natural rewards, the same doses of Memantine and Naltrexone were tested on SSA. A power cut during one test session of Memantine on SSA caused a partial data loss in thirty-four rats. Therefore, this test was analyzed by mixed-effect ANOVA. Mixed-effect two-way ANOVA found an overall effect of dose [F(3, 172) = 77.11; p < 0.0001] but no effect of cluster [F(1, 80) = 0.93; p > 0.05] or dose by cluster interaction [F(3, 172) = 0.37; p > 0.05]. This result indicated that Memantine affected SSA in both MEM1 and MEM2 clusters. However, to check whether the two-way analysis was blind to shifts in dose/response (D/R) curves between the two clusters, we ran secondary analyses on MEM1 and MEM2 data separately. One-way mixed-effect ANOVAs confirmed an overall effect of Memantine on SSA in both MEM1 [F(3, 77) = 33.8; p < 0.0001; f = 0.226] and MEM2 [F(3, 95) = 45.4; p < 0.0001; f = 0.173] clusters, and Dunnett’s post-hoc analyses confirmed that the three doses of Memantine decreased SSA in both MEM1 and MEM2 clusters (Fig. 3A, B). When we analyzed the effect of Naltrexone on SSA, we found an overall effect of dose [F(2, 138) = 17.5; p < 0.0001] and of cluster [F(1, 69) = 5.4; p < 0.05] but no dose by cluster interaction [F(2, 138) = 0.7; p > 0.05]. As for Memantine, we ran secondary analyses on NTX1 and NTX2 data separately. One-way ANOVAs confirmed an overall effect of NTX on SSA in both NTX1 [F(2, 82) = 9.2; p < 0.001; f = 0.29] and NTX2 [F(2, 56) = 9.2; p < 0.001; f = 0.286] clusters. However, in this case Dunnett’s post-hoc analysis revealed that in cluster NTX1 only the highest dose of NTX significantly reduced SSA (Fig. 3C), whereas in cluster NTX2 both doses were efficacious (Fig. 3D).

Fig. 3: Effect of Memantine and Naltrexone treatment on saccharin self-administration in clusters based on individual effect of the drugs on ASA.

A All doses of Memantine reduced saccharin self-administration in MEM1 cluster. Groups mean ± 95%CI: 0.0 mg/kg, 54.18 ± 10.69; 6 mg/kg, 28.17 ± 12.8; 12.0 mg/kg, 17.61 ± 7.22; 25.0 mg/kg, 8.912 ± 3.94. B All doses of Memantine reduced saccharin self-administration in MEM2 cluster. Groups mean ± 95%CI: 0.0 mg/kg, 54.83 ± 8.6; 6 mg/kg, 37.48 ± 11.74; 12.0 mg/kg, 24.76 ± 9.07; 25.0 mg/kg, 11.17 ± 5.076. C Only the highest dose of Naltrexone reduced saccharin self-administration in cluster NTX1. Groups mean ± 95%CI: 0.0 mg/kg, 53.24 ± 9.42; 0.3 mg/kg, 45.93 ± 8.97; 1.0 mg/kg, 36.67 ± 7.1. D Saccharin self-administration was reduced by both Naltrexone doses in cluster NTX2. Groups mean ± 95%CI: 0.0 mg/kg, 70.48 ± 14.11; 0.3 mg/kg, 56.41 ± 11.64; 1.0 mg/kg, 52.07 ± 9.41. Bars represent the Mean ± 95% CI of number of rewards earned in a 30 min session. Statistical significance: **p < 0.01, ***p < 0.001, and ****p < 0.0001 vs vehicle.

Altogether, these results indicated that Naltrexone, but not Memantine, selectively reduced alcohol seeking, specifically in cluster NTX1. To rule out the possibility that Memantine failed to show selectivity toward alcohol because the doses tested were too high, we also tested a lower Memantine dose (2.0 mg/kg), which, however, did not reduce ASA in either cluster MEM1 or cluster MEM2 (Fig. S8), confirming that the drug lacked selective efficacy toward alcohol.

Importantly, both Memantine (Fig. S9) and Naltrexone (Fig. S10) results were confirmed when the analyses were repeated excluding the thirty-four rats affected by the power cut, corroborating their robustness.

These results are in line with the heterogeneous clinical efficacy of Naltrexone and the lack of clinical efficacy of Memantine. Specifically, while Memantine failed to show alcohol-selective efficacy in both Memantine clusters (i.e. neither cluster can be characterized as Memantine responder), the NTX1 and NTX2 clusters corresponded to Naltrexone Responder (NTX-R) and Non-Responder (NTX-NR) patients, respectively, and were renamed accordingly. Further analyses were therefore conducted exclusively on Naltrexone clusters to (i) explore the extent to which the behavioral profile distinguishing NTX-R from NTX-NR also reverse translates from the clinic and (ii) provide novel insights to back-translate to the clinic.

Male and female subjects show different propensity to fall into Naltrexone response clusters

The number of males in the NTX-R cluster was 1.78-fold the number of females; conversely, the number of females in the NTX-NR cluster was 2.5-fold the number of males (Fig. 4A). The observed count in the sex by Naltrexone cluster crosstabulation (Fig. 4B) significantly deviated from the expected count (χ2 = 9.98; p < 0.01), indicating that males were more likely than females to respond to Naltrexone treatment, whereas females were more likely than males to be non-responders.
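The crosstabulation test can be reproduced from the cluster sizes reported in the Fig. 5 caption (NTX-R: 17 females, 30 males; NTX-NR: 25 females, 10 males). The Python sketch below assumes a plain Pearson χ2 without continuity correction, which is what matches the reported statistic; the function name is hypothetical.

```python
import math

# Observed counts from the Fig. 5 caption: cluster -> (females, males).
observed = {"NTX-R": (17, 30), "NTX-NR": (25, 10)}


def pearson_chi2_2x2(table):
    """Pearson chi-square (no continuity correction) and expected counts."""
    rows = list(table.values())
    n = sum(sum(r) for r in rows)
    col_totals = [sum(r[c] for r in rows) for c in range(2)]
    chi2 = 0.0
    expected = {}
    for name, row in table.items():
        # Expected count = row total * column total / grand total.
        exp_row = tuple(sum(row) * col_totals[c] / n for c in range(2))
        expected[name] = exp_row
        chi2 += sum((o - e) ** 2 / e for o, e in zip(row, exp_row))
    return chi2, expected


chi2, expected = pearson_chi2_2x2(observed)
# Upper-tail p-value of a chi-square distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 2))     # 9.98, in line with the reported statistic
print(p_value < 0.01)     # True
```

About 24 females would be expected in NTX-R if sex and cluster were independent, against 17 observed, which is where most of the deviation comes from.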

Fig. 4: Prevalence of male and female rats in NTX-R and NTX-NR clusters.

A Relative (y-axis) and absolute (numbers within bars) frequencies of male and female rats in NTX-R and NTX-NR clusters. B Sex by Naltrexone clusters crosstabulation showing the difference between observed and expected count for each sex by cluster combination.

Alcohol paired cues failed to reinstate alcohol seeking in male NTX-NR

Next, we ran retrospective analyses to compare the performance of male and female NTX-R and NTX-NR rats in three tests of alcohol seeking that were acquired before any treatment: alcohol intake in 3BC drinking, motivation to obtain alcohol in three consecutive PR sessions, and cued reinstatement of alcohol seeking. The three behaviors loaded on separate components (Fig. 5A), indicating that they represented three different subdimensions of alcohol seeking. No differences between Naltrexone response clusters were observed in 3BC drinking (alcohol concentration [F(1, 78) = 0.7, p > 0.05]; sex [F(1, 78) = 10.7, p < 0.01]; cluster [F(1, 78) = 0.02, p > 0.05]; alcohol concentration by sex by cluster [F(1, 78) = 0.0006, p > 0.05]; Fig. 5B) or in break point for alcohol (session [F(2, 156) = 7.6, p < 0.001]; sex [F(1, 78) = 0.5, p > 0.05]; cluster [F(1, 78) = 1.5, p > 0.05]; session by sex by cluster [F(2, 156) = 1.4, p > 0.05]; Fig. 5C).

Fig. 5: Comparison of alcohol drinking, motivation and cued reinstatement between NTX-R (female n = 17, male n = 30) and NTX-NR (female n = 25, male n = 10) clusters.

A Factor analysis using principal component extraction followed by varimax normalized rotation of alcohol drinking in three-bottle choice test (3BC), break point in progressive ratio test (PR) and cued reinstatement test (Cue). B Male and female NTX-R and NTX-NR rats showed similar levels of daily alcohol intake at both 5% and 10% alcohol concentration in three-bottle choice test. Groups mean ± 95%CI: Female alcohol 5%, NTX-R 2.098 ± 0.947, NTX-NR 1.661 ± 0.603; Female alcohol 10%, NTX-R 1.25 ± 0.451, NTX-NR 1.461 ± 0.561; Male alcohol 5%, NTX-R 0.988 ± 0.277, NTX-NR 0.84 ± 0.495; Male alcohol 10%, NTX-R 0.842 ± 0.235, NTX-NR 1.321 ± 1.162. C Male and female NTX-R and NTX-NR rats showed similar levels of motivation expressed by the break point reached under PR contingency over three consecutive PR sessions. Groups mean ± 95%CI: Female session1, NTX-R 8.353 ± 2.596, NTX-NR 8.6 ± 1.39; Female session2, NTX-R 7.294 ± 2.403, NTX-NR 6.04 ± 1.127; Female session3, NTX-R 6.118 ± 2.418, NTX-NR 4.96 ± 0.964; Male session1, NTX-R 8.667 ± 1.722, NTX-NR 6.9 ± 2.79; Male session2, NTX-R 7.7 ± 1.157, NTX-NR 6.8 ± 1.679; Male session3, NTX-R 7.2 ± 1.191, NTX-NR 7.0 ± 1.686. D Alcohol olfactory, taste and visual cues reinstated alcohol seeking in both NTX-R and NTX-NR female rats and in NTX-R male rats but not in NTX-NR male rats. Groups mean ± 95%CI: Female Ext, NTX-R 8.176 ± 2.392, NTX-NR 8.72 ± 2.163; Female Cue, NTX-R 17.47 ± 5.7, NTX-NR 19.36 ± 3.1; Male Ext, NTX-R 7.433 ± 2.375, NTX-NR 12.1 ± 12.86; Male Cue, NTX-R 23.43 ± 6.685, NTX-NR 13.6 ± 6.025. Bars represent the Mean ± 95%CI of, respectively, B) average 24 h alcohol intake, C) break point, and D) number of active lever presses produced in a 30 min session on the last day of extinction (Ext) and on cued reinstatement test (Cue). Statistical significance: *p < 0.05, ***p < 0.001, and ****p < 0.0001 vs Ext same group.

Analysis of cued reinstatement found an overall effect of session (extinction vs cue) [F(1, 78) = 38.8, p < 0.0001], no overall effect of sex [F(1, 78) = 0.1, p > 0.05] and cluster [F(1, 78) = 0.1, p > 0.05], but a significant session by cluster [F(1, 78) = 4.8, p < 0.05] and session by sex by cluster [F(1, 78) = 6.9, p = 0.01; f = 0.385] interaction. Sidak post hoc analysis showed that alcohol paired cues reinstated alcohol seeking in all groups except males belonging to cluster NTX-NR (Fig. 5D). Inactive lever response was not affected by any factor (Fig. S11).

Discussion

In this work we conducted a proof-of-concept study to test the hypothesis that implementing an individual variability approach in a preclinical setting can help predict the clinical efficacy of potential treatments for drug abuse. More specifically, we hypothesized that a drug that showed efficacy in cross-sectional preclinical tests of ASA, but then failed to reduce alcohol consumption in clinical settings, would also fail to show efficacy in a preclinical test of ASA accounting for individual variability and genetic heterogeneity. As a positive control we predicted that, under the same conditions, a drug FDA/EMA approved for the treatment of AUD would confirm its efficacy on ASA. To this end, we chose Memantine as a test drug because it lacks efficacy on alcohol drinking in clinical tests [24,25,26] while showing efficacy in preclinical studies [39,40,41,42]. Naltrexone was chosen as the positive control drug because of its efficacy in reducing alcohol drinking both in clinical practice [27, 28] and in preclinical settings [43, 44]. We chose Memantine over other drugs that failed to reduce alcohol drinking in patients because Memantine is currently prescribed in humans for diseases other than alcohol dependence [45]. Therefore, its failure to reduce alcohol consumption cannot be attributed to a lack of pharmacological activity in humans or to safety issues. Similarly, we tested the two drugs on ASA rather than on alcohol craving and relapse prevention because Memantine showed efficacy in reducing alcohol craving in humans [24, 46, 47], and therefore the lack-of-efficacy assumption of our proof-of-concept study was not met by relapse tests. Finally, while a choice had to be made and we selected Memantine and Naltrexone as the negative and positive control drugs in our test, we recognize that alternative options, both as positive controls in lieu of Naltrexone (e.g., acamprosate [28]) and as negative controls in lieu of Memantine (e.g., quetiapine [10] or levetiracetam [11]), were available and should be the topic of future studies aimed at further validating the generalizability of the hypothesis tested here.

Memantine but not Naltrexone lacked selectivity in reducing alcohol consumption

Our data indicated that Memantine reduced ASA in the 6–25 mg/kg range in the cluster showing higher sensitivity to Memantine (cluster MEM1) and at the highest dose in the cluster showing lower sensitivity to the drug (cluster MEM2). However, in neither case was this effect selective for alcohol, as the same doses also reduced self-administration of the natural reinforcer saccharin. In addition, when we completed the dose/response curve with 2.0 mg/kg of Memantine we found a lack of efficacy toward ASA, demonstrating that the dose range tested was sufficient to fully characterize Memantine’s pharmacological profile. Our results are in contrast with studies adopting a homogeneous group-based approach, in which Memantine showed selectivity toward alcohol over natural rewards [41, 42], thus confirming our hypothesis and not supporting the use of Memantine by itself to treat alcohol drinking. Conversely, as expected, we observed an alcohol-selective effect of the positive control drug Naltrexone. Here, both drug doses reduced ASA in cluster NTX1, with the lowest dose being alcohol selective. In cluster NTX2, by contrast, only the highest dose of Naltrexone reduced ASA, but this dose was not alcohol selective. While we already had an inefficacious dose for cluster NTX2, expanding the Naltrexone dose range to lower doses would have allowed us to find the inefficacious dose of Naltrexone for cluster NTX1 as well. However, this would not provide additional information on Naltrexone selectivity and was therefore beyond the scope of our study.

Altogether, and in the context of the published literature, our reverse translational pharmacology study indicates that preclinical experimental settings accounting for individual variability show a finer sensitivity than group-based studies in predicting clinical outcomes.

To check the robustness and representativeness of the k-means clusters, we also ran a hierarchical clustering, in which the number of clusters was not set a priori. Hierarchical clustering of Memantine efficacy yielded five clusters. Notably, more than 90% of rats fell into two large clusters that corresponded de facto to k-means clusters MEM1 and MEM2. Hierarchical clustering of Naltrexone efficacy yielded seven clusters. In this case, 90% of the population fell into two large clusters and one intermediate-size cluster. Interestingly, one large cluster included exclusively rats that k-means identified as NTX2, the intermediate cluster included exclusively rats that k-means identified as NTX1, and 77.5% of the second large cluster consisted of rats identified as NTX1 by k-means. This provides two important pieces of information: first, hierarchical clustering separated rats into two families of clusters that corresponded to the k = 2 k-means clusters, confirming the robustness and reliability of the k = 2 k-means approach for Naltrexone as well; second, the fact that the k-means cluster NTX1 corresponded to two hierarchical clusters could indicate that NTX1 might be further separated into subgroups.
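The agreement between the two clustering solutions described above can be quantified with a simple cross-tabulation. The following is a minimal sketch in plain Python, not the analysis pipeline used in this study: the function names, the hierarchical cluster labels (H1 to H3), and the toy assignments are all hypothetical and were chosen only to reproduce the qualitative pattern (two pure clusters and one mixed cluster).

```python
from collections import Counter, defaultdict


def cross_tabulate(kmeans_labels, hier_labels):
    """Count rats for each (hierarchical cluster, k-means cluster) pair."""
    table = defaultdict(Counter)
    for k_lab, h_lab in zip(kmeans_labels, hier_labels):
        table[h_lab][k_lab] += 1
    return table


def dominant_share(table):
    """For each hierarchical cluster, return its majority k-means label
    and the fraction of the cluster carrying that label."""
    summary = {}
    for h_lab, counts in table.items():
        k_lab, n = counts.most_common(1)[0]
        summary[h_lab] = (k_lab, n / sum(counts.values()))
    return summary


# Hypothetical assignments for 12 rats (illustrative only, not study data).
kmeans_labels = ["NTX1"] * 6 + ["NTX2"] * 6
hier_labels = ["H1", "H1", "H1", "H2", "H2", "H2",
               "H1", "H3", "H3", "H3", "H3", "H3"]

summary = dominant_share(cross_tabulate(kmeans_labels, hier_labels))
for h_lab in sorted(summary):
    k_lab, share = summary[h_lab]
    print(f"{h_lab}: mostly {k_lab} ({share:.1%})")
```

A share of 100% marks a hierarchical cluster nested entirely within one k-means cluster, whereas a lower share (such as the 77.5% reported above) marks a mixed cluster.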

It is important to note that Memantine and Naltrexone were intended here as tools to prove a concept rather than being the primary focus of the study. In this view, we purposely chose to administer Memantine alone and not in combination with other treatments because in this condition the drug met the clinical lack-of-efficacy assumption of our study. However, it is worth noting that in clinical settings, Memantine has proven efficacious toward alcohol craving [24, 46, 47] and that the combination of Memantine and Naltrexone increases the efficacy of Naltrexone alone [48].

Male NTX-R showed enhanced cued reinstatement of alcohol seeking

In view of the selectivity toward alcohol shown by Naltrexone in the NTX1 and NTX2 clusters, we renamed these clusters NTX-R (Naltrexone responder) and NTX-NR (Naltrexone non-responder), respectively. To further validate the reverse translational efficacy of our individual-based approach, we sought to characterize the AUD-like behavioral features of these two clusters.

Attempts to profile Naltrexone responder and non-responder patients have traditionally been conducted through hypothesis-driven approaches. Clinical studies stratified patient cohorts based on different factors such as genotype [49,50,51], severity of symptomatology [52], preference for sweet tastes [53], alcohol reward/relief seeking [54, 55], and alcohol cue reactivity [56]. Then, the effect of Naltrexone on alcohol drinking outcomes was compared between these a priori-stratified groups. In other words, the research question common to all these clinical studies can be summarized as: do group A and group B differ in their response to Naltrexone? This approach can be easily modelled by cross-sectional group-based animal studies, as it stems from an a priori hypothesis, the grouping factor is a specific behavioral or biological feature, and the response to Naltrexone is the outcome measure upon which the groups are compared. Conversely, here we adopted an individual variability model in which rats were stratified based on their response to Naltrexone (i.e., Naltrexone response was the grouping factor and not the outcome measure), which allowed us to look for Naltrexone response endophenotypes in a hypothesis-free approach. To this purpose, we compared the behavioral performance of NTX-R and NTX-NR in three subdimensions of alcohol dependence: alcohol drinking, motivation to pursue alcohol, and cued alcohol craving. The three subdimensions were modelled by alcohol intake, breakpoint, and cued reinstatement scores, respectively; these three behaviors loaded on three separate principal components, confirming that they represented distinct constructs of alcohol seeking. NTX-R and NTX-NR did not differ in the amount of alcohol consumed or in the breakpoint reached during PR ASA sessions. In contrast, alcohol-associated visual, olfactory and taste cues reinstated alcohol seeking in male NTX-R but not in NTX-NR rats.
These results align with clinical data indicating that reactivity to alcohol-associated cues predicts Naltrexone response. Mann and colleagues [56] median-split their Naltrexone-treated patients into groups with high and low alcohol cue-induced ventral striatum activation and reported a better survival rate in time to first relapse in the high-activation group. Schacht and co-workers [50] genotyped their patients for the A118G SNP of the OPRM1 gene, and Naltrexone selectively reduced the percentage of heavy drinking days in patients with the A/A genotype. The same patients showed a higher ventral striatum activation induced by alcohol cues that was reduced by Naltrexone. Similarly, in an independent work, Naltrexone decreased cortical activation induced by alcohol olfactory and visual cues [57]. Additionally, two meta-analyses reported that alcohol cue reactivity directly correlated with self-reported craving [58, 59]. In rodent operant models, drug craving is typically assessed by the response at the drug-paired lever induced by drug-paired cues or by a drug priming dose in the absence of the reinforcer [60, 61]. Thus, our cued reinstatement data can be interpreted as the ability of alcohol-paired cues to elicit craving in NTX-R but not in NTX-NR rats. This is also in agreement with human data in which alcohol craving has been shown to predict the efficacy of Naltrexone [53, 62,63,64].

Altogether, these studies have proposed that the efficacy of Naltrexone derives from its ability to decrease alcohol craving and that alcohol craving is an endophenotype enabling the prediction of Naltrexone efficacy [50, 53, 56, 57, 62,63,64]. It should be mentioned, however, that in a few studies Naltrexone was effective in patients with a less severe symptomatology [52], and in reward- but not in relief-seekers [54, 55]. Although craving was not analyzed as a predictor of treatment efficacy, in these cases the Naltrexone responder group was the one showing a lower craving rate. However, in these cases craving was scored by the Obsessive Compulsive Drinking Scale, while in the studies discussed above craving was scored through an analogue-assisted scale or the Penn Alcohol Craving Scale.

In summary, our data obtained using Naltrexone response as the grouping factor support the interpretation of clinical studies proposing cue reactivity and craving as predictors of Naltrexone response. Interestingly, the association between cued reinstatement and Naltrexone response was specific to male rats, and the prevalence of male and female rats differed between NTX-R and NTX-NR groups. Perhaps because of the paucity of sex-specific studies on Naltrexone response, these observations do not find correspondence in the clinical literature, and therefore they represent a novel set of information awaiting translation into the clinic.

Novel insights to back-translate into the clinic

χ2 analysis indicated that the NTX-R cluster predominantly consisted of males while the NTX-NR cluster was composed mainly of females. These data indicate that Naltrexone was more likely to selectively prevent alcohol drinking in males than in females. Whether this result correlates with clinical prevalence is presently unclear. To the best of our knowledge, clinical studies that focused on sex differences did not categorize their patient cohorts into Naltrexone responder and non-responder groups [65,66,67], and when groups with different sensitivity to Naltrexone were investigated, the relative frequency of women and men was not reported [49,50,51,52,53,54,55]. Interestingly, the consistency of Naltrexone efficacy on alcohol drinking across studies is stronger in men than in women. While Greenfield and co-workers [66] found no sex difference, Baros and colleagues [65] reported a similar effect size in women and men but a significant difference between placebo and Naltrexone only in men, which they attributed to the smaller size of the women's group. Finally, Kranzler and co-workers [67] reported that Naltrexone was effective in men but not in women. Based on these results, one could speculate that the heterogeneous results observed in women may stem from a higher number of non-responders among women.
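To make the logic of the χ2 test of independence concrete, the sketch below computes the Pearson statistic for a 2 × 2 sex-by-cluster table using only the Python standard library. The counts are hypothetical and are not the data of this study; the function name is illustrative, and the absence of a continuity correction is an assumption, since the exact variant of the test is not specified here.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table
    (no continuity correction; this choice is an assumption)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of sex and cluster.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = sex (male, female),
# columns = cluster (NTX-R, NTX-NR). Not study data.
counts = [[15, 5], [5, 15]]
chi2, p_value = chi_square_2x2(counts)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

With these illustrative counts every expected cell equals 10, so the statistic is 10.0; a small p value would indicate that cluster membership is not independent of sex.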

In addition, in our study the difference between NTX-R and NTX-NR in cue reactivity was specific to males. However, the extent to which the lack of difference in females translates to humans is presently unclear, as the interaction between gender and cue reactivity or craving on Naltrexone response has not been analyzed in clinical studies.

Altogether, our prevalence and cued-reinstatement analyses provide the rationale for dedicated clinical studies or meta-analyses exploring the prevalence of men and women in Naltrexone responder and non-responder groups of patients, and the difference in predictive endophenotypes between the two genders.

Study limitations and future developments

Our results stimulate a number of considerations that would need future attention.

As discussed above, our work provides new insights into sex differences that should be back-translated in dedicated trials.

The individual variability and sex difference in drug response may derive from different genetic factors and consequently from differences in the pharmacokinetics and/or pharmacodynamics of the drug. Addressing this point in future studies may uncover translational genetic and molecular biomarkers to help identify the subpopulation of patients more suitable to receive Naltrexone.

Cue reactivity is a key element that our work highlights as a translational behavioral marker of Naltrexone response, specifically in the male population. However, while humans do not normally go through extinction training, our cue reactivity test was preceded by extinction training in the absence of alcohol cues. Future studies should consider testing cue reactivity in the absence of extinction training and after different abstinence periods.

Finally, it would be important to study the stability of responder and non-responder groups after chronic treatment. Our results and related data bank can serve as the basis for designing future between-subjects chronic treatment studies that would better mimic human treatment conditions.

Conclusions

In conclusion, using a reverse translational approach, we demonstrated that an experimental design accounting for individual variability would have accurately predicted the clinical lack of efficacy of Memantine on alcohol drinking, as well as the presence of Naltrexone responder and non-responder subjects. In a wider perspective, our work advocates for the implementation of individual-based approaches in drug screening prior to entering clinical trials. While the classical group-based experiment maintains its primary importance as an initial step to assess the therapeutic potential and characterize the toxicology of experimental drugs, the individual-based approach would complement the screening of highly promising compounds before they enter clinical trials. If successful, this would address a significant unmet medical need in the development of treatments for psychiatric disorders.

Moreover, this approach would enable the prediction and profiling of drug responder and non-responder patients, thereby enhancing the efficiency and effectiveness of drug development processes. In this regard, we provided evidence that males and females show different propensities to fall into Naltrexone responder and non-responder clusters, and that endophenotypes predicting Naltrexone response in males may not be valid in females. The extent to which these observations translate into the clinic is presently unknown, and we encourage clinical scholars to verify it.