Introduction

A significant number of people are affected by upper limb motor impairments initially after stroke1, and two thirds still have deficits at six months post stroke2. It is critical to restore upper limb motor function to allow full independence and reduce the need for costly long term supportive care. Innovative clinical interventions such as virtual reality (VR) and robotics have been developed to foster neuroplasticity post stroke as they incorporate attention, motivation, goal-oriented tasks, progression, and repetition3. Although there are numerous studies evaluating the use of VR and robotic systems for upper limb therapy post stroke, many utilize small samples and have varied methods and goals4. Heterogeneity in the design of these studies is accompanied by a corresponding heterogeneity in their outcomes, making it difficult to draw definitive conclusions from this literature as a whole4,5,6,7,8.

Other key ingredients to promote neuroplasticity and positive change include (1) the timing of therapy initiation, and (2) the dose of rehabilitation9. Literature review indicates controversy regarding the value of intensive, high dosage therapy in the subacute period post stroke. Two important studies did not show added benefit to an increased dose of therapy in the subacute period10,11. In contrast, other studies suggest that additional VR-based training provided during the early subacute period following a stroke might have a positive effect on measures of motor function and impairment in this early period post stroke12,13. With regard to the optimal time to introduce therapy, there exists a period of unique and heightened neuroplasticity in both animals and humans similar to that seen during development14. This period lasts about three months in humans post ischemic stroke14. Thus, introduction of therapies based on established principles of motor learning during this unique period of plasticity may optimally affect recovery of motor skills. However, few studies have directly evaluated the optimal time to introduce therapies within this three-month window. Recent examples include Dromerick et al., 2021 who showed that 20 extra hours of task specific upper limb therapy introduced at two different time points in the subacute period compared to a similar amount introduced in the chronic phase led to greater functional gains on the Action Research Arm Test (ARAT) when introduced early for both the subacute time periods (greatest benefit at 60–90 days post stroke)15. Similarly, Meyer et al., 2021 demonstrated that the earliest program of intensive arm exercise in combination with a robotic intervention introduced in the subacute period led to better outcomes16.

As a general rule, larger doses of rehabilitation lead to better outcomes than smaller doses17. However, this may not hold in the subacute period as there are conflicting results for higher dosage during this time10,11,12,13. The unique neuroplastic environment occurring in the first three months after stroke may confound this relationship and thus this interaction warrants further investigation12,17. To explore this, the primary aim of this study was to empirically test both the value of additional upper extremity training and the optimal timing for this additional training in the first two months after stroke on affected upper extremity motor function (measured with the ARAT, an activity level outcome measure based on the International Classification of Functioning, Disability and Health)18. To test the question of optimal timing, we evaluated whether adding intensive VR/robotic motor training to a participant’s standard of care rehabilitation at 5–30 days post stroke improved motor function of the hemiparetic upper limb more than initiating the additional VR/robotic motor training 31–60 days post stroke. To test the relative benefits specific to VR/robotic intervention, we compared early VR/robotic training to early dose-matched usual care. To address the question of increased training in the early period post stroke, we compared dose-matched usual care (10 additional hours of standard therapy) initiated 5–30 days post stroke to usual care (no additional therapy) initiated in the same early time frame. We hypothesized that the outcomes would favor the early, increased training in the VR/robotic group when compared to both the delayed VR/robotic training group and the early dose-matched usual care group, and that the early dose-matched usual care group would have better outcomes than the usual care group. To test these three hypotheses, we measured changes in our primary outcome measure, ARAT (a measure of motor function), from pre training to four months since stroke (4M)18.

We conducted two secondary analyses using data collected at pre training, post training, and at three additional timepoints during the first six months since stroke. The first was to identify distinct patterns of recovery among the participants. Importantly, we also investigated whether initial hand movement was associated with any recovery patterns we discovered. Second, we tracked changes in the EuroQol index, a health-related measure of quality of life, over time and investigated whether our primary clinical tests correlated with the index over time. This would provide greater insight about whether functional status of the affected upper limb translates to activities of daily living and mobility tasks.

Methods

Trial design

This was a randomized controlled trial with four parallel arms. The study was approved by the Institutional Review Boards (IRBs) of Kessler Foundation, Rutgers University, and the New Jersey Institute of Technology. All research was performed in accordance with relevant guidelines and regulations set by the IRBs. The trial was posted on 26/06/2018 (registration number: NCT03569059) at https://ClinicalTrials.gov prior to participant recruitment. The detailed study protocol can be found in Merians et al.18.

Participants

Participants were recruited from the Kessler Institute for Rehabilitation – a large inpatient rehabilitation center in New Jersey, USA. The study included participants who: (1) had a diagnosed stroke (ischemic or hemorrhagic) less than 30 days prior to study initiation, (2) were between 30 and 95 years of age, (3) were able to follow instructions, (4) had severe to moderate arm weakness (≥ 10/66 and ≤ 49/66 on the Upper Extremity Fugl-Meyer Assessment – UEFMA19), and (5) had intact cutaneous sensation. Individuals with UEFMA scores < 10 were not included as they would not have had the motor ability to utilize the VR and robotic systems effectively. Potential participants were excluded from the study if they: (1) were not independent prior to the stroke, (2) were too ill to tolerate training, (3) had persistent motor impairment from a prior stroke, (4) had aphasia or spatial neglect precluding their performing the tasks or following task instructions – assessed by their inpatient therapists during their initial evaluations, (5) had ≥ 1 on the NIHSS limb ataxia item, (6) had severe proprioceptive loss, (7) scored ≥ 3 on the Modified Ashworth Scale (for the elbow, wrist, or finger flexors), or (8) had a previous medical history of neurological deficits or orthopedic conditions that limited their affected arm and hand movement.

Randomization and recruitment

Physicians and inpatient therapists identified potential participants. Their charts were then reviewed by the study coordinator and study therapists. Those who met the study criteria were consented by the study coordinator. All study participants provided informed consent prior to study onset. Consent followed all IRB procedures. All study procedures were initiated while the participants were inpatients. All participants were blinded to study questions and comparisons. A randomization table was generated by an independent statistician prior to the study onset. This table was only viewed by one of the principal investigators (SA) throughout the entire study. Study participants were stratified by the initial motor impairment level of their affected arm (initial total UEFMA score). This score was used to stratify participants into two subgroups: (1) severe: UEFMA score 10–19, and (2) moderate: UEFMA score 20–49. Within each of the two subgroups, every four participants formed one block, and each block contained one participant for each intervention group. This stratified block randomization was used to achieve a balanced assignment to one of the four intervention arms20: (1) usual care only (UC), (2) usual care plus an additional 10 h (10 1-h sessions) of usual care (dose-matched usual care, DMUC; initiated 5–30 days post stroke), (3) usual care plus an additional 10 h (10 1-h sessions) of intensive therapy focusing on the hand using robotically facilitated rehabilitation interventions presented in virtual environments initiated early (5–30 days post stroke) (EVR), and (4) usual care plus an additional 10 h (10 1-h sessions) of intensive therapy focusing on the hand using robotically facilitated rehabilitation interventions presented in virtual environments initiated later (31–60 days post stroke) (DVR). Details about our randomization process and sample size determination can be found in Merians et al.18. See Fig. 1 for the flow of study participants.
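The stratified block allocation described above can be illustrated with a minimal Python sketch. It is not the study's randomization code: the function names, the seed, and the number of blocks are hypothetical, and the only details taken from the text are the two UEFMA strata, the block size of four, and the four arm labels.

```python
import random

ARMS = ["UC", "DMUC", "EVR", "DVR"]


def stratum_for(uefma_total):
    """Map an initial total UEFMA score to a stratum, using the ranges in the text."""
    if 10 <= uefma_total <= 19:
        return "severe"
    if 20 <= uefma_total <= 49:
        return "moderate"
    raise ValueError("UEFMA score outside the 10-49 inclusion range")


def build_allocation_table(blocks_per_stratum, seed):
    """Pre-generate permuted blocks of four (one slot per arm) for each stratum."""
    rng = random.Random(seed)  # the fixed seed is an illustrative assumption
    table = {}
    for stratum in ("severe", "moderate"):
        sequence = []
        for _ in range(blocks_per_stratum):
            block = ARMS[:]
            rng.shuffle(block)  # random order of the four arms within a block
            sequence.extend(block)
        table[stratum] = sequence
    return table


def assign(table, counters, uefma_total):
    """Give the next participant the next unused slot in their stratum."""
    stratum = stratum_for(uefma_total)
    arm = table[stratum][counters[stratum]]
    counters[stratum] += 1
    return stratum, arm


table = build_allocation_table(blocks_per_stratum=3, seed=2018)
counters = {"severe": 0, "moderate": 0}
# Four consecutive moderate participants fill exactly one block.
first_block = [assign(table, counters, 25)[1] for _ in range(4)]
```

Because each block holds every arm exactly once, group sizes within a stratum can differ by at most one participant per arm at any point during enrollment.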

Fig. 1. Flow of study participants.

Interventions

UC group: consisted of adaptive and progressive task and impairment based physical and occupational therapy including strengthening, range of motion (ROM), and appropriate modalities for the affected upper extremity as determined by the participants’ therapists initially at Kessler, and then either during home therapy or outpatient therapy.

DMUC group: received usual care plus 10 1-h sessions of adaptive and progressive task and impairment based physical and occupational therapy including strengthening and ROM for the affected upper limb provided by trained study physical or occupational therapists. This was provided initially while the participants were inpatients (5–30 days post stroke). Subjects completed training on an outpatient basis after discharge from inpatient rehabilitation.

EVR group: received usual care therapy plus an extra 10 1-h sessions of intensive upper limb therapy focusing on the hand using robotically facilitated rehabilitation interventions presented in non-immersive virtual environments and initiated 5–30 days post-stroke. The systems utilized included the NJIT RAVR, NJIT Track Glove and the NJIT HoVRS to train the affected upper extremity individually (or bilaterally with the unaffected upper extremity) and included activities such as using individual fingers to play a virtual piano, extending fingers to hit a virtual ball, transporting the arm to eliminate virtual spaceships, using a pinch grasp to move a virtual object onto higher and higher levels, reaching to lift virtual cups onto a haptic shelf placed at a variety of heights, integrating reach and hand movements to manipulate fruit from virtual trees, and integrating reach and forearm pronation and supination to hammer a virtual nail into a piece of wood. Details about the NJIT RAVR and NJIT Track Glove systems can be found in Merians et al.20, and information about the NJIT HoVRS can be found in Qiu et al.21. This training was provided initially while the participants were inpatients at Kessler Institute for Rehabilitation (5–30 days post stroke). Subjects completed training on an outpatient basis after discharge from inpatient rehabilitation.

DVR group: received usual care therapy plus an extra 10 1-h sessions of intensive upper limb therapy focusing on the hand using robotically facilitated rehabilitation interventions presented in virtual environments and initiated 31–60 days post-stroke. The systems utilized and training activities were the same as those used for the EVR group. Some subjects started training as inpatients, but most of the subjects in this group performed all training on an outpatient basis.

Outcomes

Clinical outcomes included the Action Research Arm Test (ARAT – primary motor function-based outcome), the Box and Block Test (BBT – gross manual dexterity), the Upper Extremity Fugl-Meyer Assessment (UEFMA – impairment-based outcome), and the EuroQol Five Dimensions Test (a measure of health-related quality of life). All outcomes were measured at pre (immediately prior to intervention onset), post (immediately post intervention), 1M (one month post intervention), 4M (four months post stroke – our primary endpoint), and 6M (six months post stroke). For the UC group, post was measured approximately two weeks after their pretesting, and 1M was measured one month thereafter. For the DVR group, the first pre was measured within 30 days after the onset of the stroke, and a second pre (pre2) was measured immediately prior to intervention onset. Data was collected at the Kessler Institute for Rehabilitation. Therapists conducting the clinical assessments were not blinded to group allocation.

Action Research Arm Test (ARAT)22. This is a 19-item test with four subscales that measure the following upper extremity functional tasks at the ICF activity level: grasp (6 items), grip (4 items), pinch (6 items), and movement (3 items). Each item is rated on a four-point scale with the following values: 0 = no movement, 1 = the movement task is partially performed, 2 = the movement task is completed but takes abnormally long, and 3 = normal movement. Scores range from 0 to 57 with higher scores indicating better motor function.

Box and Block Test (BBT)23. This is a unilateral assessment of gross manual dexterity. It can be used for a variety of neurological diagnoses including stroke. The participant is asked to move blocks (1” cubes) from one compartment of the box to another compartment of equal size. The score is the maximum number of blocks that can be moved within 60 seconds.

Upper Extremity Fugl-Meyer Assessment (UEFMA)19. This is an impairment based upper extremity measure consisting of 33 movements that test single and multi-joint movement in and out of synergy, digit individuation, speed, dysmetria, ataxia, and reflexes. Each item is rated on a three-point scale: 0 = cannot perform, 1 = performs partially, 2 = performs fully, for a maximum total score of 66. Higher scores indicate less impairment and more isolated motions.

EuroQol-5D-5L24. This is a self-rated health-related quality of life measure. Participants rate themselves on five measures: (1) mobility (ambulation), (2) self-care (bathing and dressing), (3) usual activities (their current function in “work, study, housework, family or leisure activities”), (4) pain/discomfort, and (5) anxiety/depression. The severity of each measure is rated as no problem (score of 1), slight problem (score of 2), moderate problem (score of 3), severe problem (score of 4), and extreme problem/unable to perform (score of 5), with a 1-digit number describing the level selected for each measure. The five measurements are combined into a 5-digit number to describe the health state of the patient. This 5-digit value is converted to an index score which reflects how good or bad a health state is according to the preferences of the general population of a country/region25. We utilized the United States as our reference country. Higher numbers on the index score reflect better quality of life.
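As a small illustration of how the five ratings combine into the 5-digit health state, the sketch below joins the levels in the order listed above. The dictionary keys and the example ratings are hypothetical. The conversion from health state to index score depends on a published country-specific value set and is deliberately not reproduced here.

```python
DIMENSIONS = ("mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression")


def health_state(levels):
    """Join the five 1-5 severity levels into the 5-digit health-state code."""
    digits = []
    for name in DIMENSIONS:
        level = levels[name]
        if level not in (1, 2, 3, 4, 5):
            raise ValueError(f"{name} must be an integer from 1 to 5")
        digits.append(str(level))
    return "".join(digits)


# Hypothetical ratings: slight mobility problem, no self-care problem,
# moderate problem with usual activities, slight pain, no anxiety.
example = {"mobility": 2, "self_care": 1, "usual_activities": 3,
           "pain_discomfort": 2, "anxiety_depression": 1}
profile = health_state(example)
```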

Statistical methods

Data from twenty participants per group was used for analyses. Baseline data, comparison of group differences, correlations, and generalized estimating equations (GEE) were conducted using SPSS (IBM). We utilized an unstructured correlation matrix and a robust estimator for the covariance matrix for the GEE model. Furthermore, growth mixture modeling was conducted using R programming. Any missing scores were replaced with a linear imputation method (the missing value is calculated as the average of the previous and the following values).
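The imputation rule stated above (average of the previous and the following values) can be sketched as follows. This is an illustration rather than the analysis code: the function name and the example scores are hypothetical, and because the text does not say how a missing first, last, or consecutive value would be handled, the sketch simply leaves such gaps unfilled.

```python
def impute_midpoint(scores):
    """Replace an isolated missing score (None) with the mean of its two neighbours."""
    filled = list(scores)
    for i, value in enumerate(scores):
        if value is not None:
            continue
        has_both_sides = 0 < i < len(scores) - 1
        if has_both_sides and scores[i - 1] is not None and scores[i + 1] is not None:
            filled[i] = (scores[i - 1] + scores[i + 1]) / 2
        # Otherwise the gap stays None (assumption: handling not specified).
    return filled


# Hypothetical ARAT totals at pre, post, 1M, 4M, 6M with the 1M visit missing.
arat = [12, 20, None, 34, 36]
completed = impute_midpoint(arat)
```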

Baseline data

One-way ANOVAs were utilized to determine differences in key values at pre between the four intervention groups if the data was normally distributed. Kruskal-Wallis tests were used for non-normally distributed data, and crosstabs were used for categorical data. Normality was evaluated using the Shapiro-Wilk test.

Clinical outcomes across groups – primary aim

To address our primary aim, we used a longitudinal model using GEE15,26 to examine change in our primary outcome measure, ARAT, between pre and 4M. The GEE approach was selected to accommodate the varying number of days between assessment times for different participants. Test time varied among individuals and was treated as a continuous variable. Intervention group (EVR, DVR, DMUC, UC) was treated as a categorical factor. The model included an interaction term between test time and the intervention group. Additionally, to evaluate our three preplanned contrasts for the effect of time (EVR vs. DVR), dose (DMUC vs. UC), and the specific benefits of VR training (EVR vs. DMUC) at 4M for functional recovery (measured via the ARAT – our primary outcome measure), we utilized independent t-tests to compare change scores from pre to 4M between groups for each of the three queries. Based on the physiology/time course of the natural/spontaneous recovery process occurring in the first 3 months post stroke, we could not compare results between the pre intervention timepoint and the other evaluated timepoints (immediately post intervention and 1 month) because these data were collected at different time points in this highly time-sensitive period. The four-month post stroke collection point is the first and most valid comparison possible because it is outside of the three-month post stroke period when a majority of upper extremity spontaneous recovery occurs, and it was taken at the same time point for all four groups. We utilized Cohen’s d to calculate effect size for this specific analysis. Lastly, a GEE model was also employed to analyze the effect of all four intervention groups over all five timepoints for all three clinical outcomes (UEFMA, ARAT, BBT).
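For readers less familiar with these statistics, the sketch below shows one common way to compute an independent-samples t statistic and Cohen's d from two sets of change scores. It assumes the pooled-variance (Student) form of both quantities; the text does not state which variant was used in SPSS, so this is an assumption. The data are hypothetical and are not study data, and the p-value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def compare_change_scores(group_a, group_b):
    """Return (t, df, d) for two independent samples using the pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    diff = mean(group_b) - mean(group_a)
    t = diff / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    d = diff / sqrt(pooled_var)  # Cohen's d: mean difference in pooled-SD units
    return t, df, d


# Hypothetical pre-to-4M ARAT change scores, not study data.
early = [10, 14, 18, 22, 26]
delayed = [18, 22, 26, 30, 34]
t, df, d = compare_change_scores(early, delayed)
```

With equal group sizes n, the two quantities are linked by t = d × sqrt(n / 2), which is why an effect size and a test statistic from the same contrast always move together.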

Recovery patterns as a whole – secondary analysis 1

Growth mixture modeling27 was used to determine whether there was a longitudinal recovery pattern among our participants for the UEFMA, a measure of impairment, and our primary measure of motor function – the ARAT. Mixed-effects growth modeling clustering was conducted using R for all 80 participants to group recovery rates and identify distinct patterns of motor impairment and functional improvement over the six-month period. This approach incorporated both fixed effects, which captured overall recovery trends across the cohort, and random effects, which modeled individual variations in recovery trajectories. By estimating population-level effects and subject-specific deviations, the model used random intercepts to account for individual baseline differences and random slopes to capture the rate of change in recovery. Participants with similar baseline levels and recovery rates were naturally clustered together. The covariance between random effects allowed for grouping individuals with similar recovery patterns. This clustering provided a more detailed understanding of recovery, highlighting distinct trajectories within the data and offering a clearer picture of motor recovery post-stroke.

We also used cross tabs with Fisher’s Exact Test to determine the association between initial hand movements and recovery patterns for both impairment and function.

Correlations between clinical scores and the EuroQol, mean index scores over time - secondary analysis 2

We calculated the mean EuroQol index score for all participants over time. Pearson correlations were performed between the EuroQol index and both the total UEFMA score (a measure of impairment) and the total ARAT score (our primary clinical outcome for motor function) at pre, 1 M and 6 M for all participants. Pearson correlations were used because the data were normally distributed.
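A minimal sketch of the Pearson coefficient is given below, together with the degrees of freedom (n − 2) that accompany it when a correlation is reported in the form r(df). The function name and the example values are hypothetical and do not come from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient and its degrees of freedom (n - 2)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equally long samples with at least 3 pairs")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squares around the sample means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy), n - 2


# Hypothetical values, not study data.
arat_total = [5, 12, 20, 33, 41, 50]
eq_index = [0.31, 0.45, 0.40, 0.62, 0.70, 0.66]
r, df = pearson_r(arat_total, eq_index)
```

With 80 participants contributing a pair of scores at a timepoint, the same calculation gives 78 degrees of freedom.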

Results

Figure 2 presents the CONSORT diagram. We screened 4591 people admitted to the Kessler Institute for Rehabilitation between September 2018 and February 2024 to enroll 100 participants. The study was halted between March 16, 2020, and July 1, 2020, due to the COVID-19 pandemic. Four participants had to be withdrawn from the study during the COVID-19 stoppage. In addition, five participants lost retention sessions (their data was imputed; see Methods), and four had delayed six-month retentions due to the COVID-19 stoppage. Participants were randomized to the four training arms: EVR, DVR, DMUC, and UC. Of the 100 consented participants, 80 completed the study, with 20 per training arm. The original recruitment goal was 21 per group based on a power analysis18. There were substantial interruptions of the study due to the COVID-19 pandemic and the trial was ended prior to reaching full recruitment goals. We ran a simulated analysis of each of our pre-planned comparisons, adding a representative 21st subject (median scores at each timepoint) to each of the four groups. These simulations did not produce results that differed from our analyses of the four twenty-subject groups, leading us to believe that missing our recruitment goals did not have an important impact on our findings.

Fig. 2. CONSORT diagram.

Table 1 shows the key participant characteristics per training group at pre. Mean (95% Confidence Interval - CI) values and outcomes of between-group statistical tests are shown for age, days post stroke, and the three clinical measures. Furthermore, sex and race distributions are shown per group. There were no significant differences at pre between the four groups for these measures, as well as for sex distribution. The breakdown of race well represented the pool of patients admitted to the Kessler Institute for Rehabilitation (based on the demographics of the surrounding counties). The baseline characteristics for the 20 participants who dropped out of the study are similar to those of the four intervention groups (mean (95% CI): age, 69.55 (64.24–74.86); sex, 11 F/9 M; race, 1 Hispanic, 13 Caucasian, 6 Black; days post stroke, 16.2 (12.85–19.55); UEFMA, 32 (26.41–37.59); ARAT, 14.9 (8.93–20.87); BBT, 4.35 (0.74–7.96); see Table 1 to compare).

Table 1 No between group differences were observed at baseline for age, sex distribution, days post stroke, UEFMA, ARAT and BBT1.

Pre to 4M changes in ARAT – primary aim

A GEE analysis was conducted to examine changes in the ARAT from pre to 4M. The main effect of group was not significant (Wald χ²(3) = 7.41, p = 0.060). The analysis did reveal a significant main effect for time (Wald χ²(1) = 261.34, p < 0.001). However, there was no significant interaction between group and time (Wald χ²(3) = 7.10, p = 0.069).

Although the group by time interaction of the GEE analysis was not significant, we ran three pre-planned comparisons to examine our a priori aims, evaluating the effect of time, dose, and VR training at 4M for the ARAT. The mean (95% CI) change in the ARAT score from pre to 4M was 20.6 (15.14–26.06) for the EVR group, 29.1 (23.64–34.56) for the DVR group, 21.8 (15.12–28.38) for the DMUC group, and 26.39 (19.62–32.88) for the UC group. Independent t-tests showed a significant difference (with a medium effect size) in the change in ARAT scores for the effect of time: the DVR group improved more on the ARAT than the EVR group from pre to 4M (t(37) = 2.23, p = 0.032; Cohen’s d = 0.71). There were no significant between-group differences in change from pre to 4M for dosage or VR training (see Table 2; Fig. 3).

Table 2 Comparison of changes in total ARAT score from pre to 4M for the three a priori contrasts.
Fig. 3. Comparison of change in total ARAT score from pre to 4M for the three a priori defined contrasts. Bars represent 95% CIs.

Group comparisons of the recovery patterns for ARAT, UEFMA and BBT until six months post stroke

The GEE analysis showed no between-group differences for any of the three clinical measures over the five time points. The results, however, did show a significant improvement across time for all three clinical measures (Table 3). There were no significant interactions between group and time for any of the clinical measures. See Fig. 4 for total scores versus days since stroke for each of the three clinical outcomes. The bold lines represent the line of regression for each intervention group and the shaded areas are the confidence intervals.

Table 3 Generalized estimating equations (GEE) analysis of changes in UEFMA, ARAT and BBT over the six months post stroke.
Fig. 4. Clinical score trajectories are shown for each study participant (panel a = UEFMA, panel b = ARAT, panel c = BBT). Modeled mean trajectories for each training group are shown as dashed and bold lines. Shaded areas indicate the 95% confidence bounds.

Recovery patterns – secondary analysis 1

All four groups demonstrated notable improvements in affected upper limb impairment and motor function over the course of the study, with 65 of the 80 participants demonstrating substantial change on the ARAT (mean change from pre to 6 M: 31.83 (95% CI, 29.31–34.32)) and 73 of the 80 participants achieving considerable change on the UEFMA (mean change from pre to 6 M: 25 (95% CI, 23.19–26.8)), equal to or greater than the Minimal Clinically Important Difference (MCID) for both outcomes (MCID for the UEFMA28 = 10; MCID for the ARAT29 = 12–17, dominant side versus non-dominant side).

Growth mixture modeling was used to determine whether there was a longitudinal recovery pattern among our participants for the UEFMA, a measure of impairment, and our primary measure of motor function – the ARAT. Figure 5 shows that there were three patterns of change observed over time in impairment (total UEFMA score). Group A (n = 37; red dots) showed the fastest rate of change initially and had the highest UEFMA total score at six months post stroke. Group B (n = 33; green dots) showed a slower rate of change initially compared to group A and had a lower total UEFMA score at six months. Lastly, group C (n = 10; blue dots) showed the overall slowest rate of change over time and the lowest total UEFMA score at six months. Figure 6 shows that there were three similar patterns of change across time for total ARAT scores.

Fig. 5. Total UEFMA scores across time for all 80 participants. Groups were assigned using growth mixture modeling.

Fig. 6. Total ARAT scores across time for all 80 participants. Groups were assigned using growth mixture modeling.

Most participants were in the same recovery group for both measures. The exceptions were nine participants who were in group B for impairment (UEFMA) but in group A for motor function (ARAT), three participants who were in group C for impairment but in group B for function, three participants who were in group B for impairment but in group C for function, and one participant who was in group A for impairment but in group B for function.

Please see Table 4 for 6 month scores and change from pre to 6 month scores for the three recovery groups for both the UEFMA and ARAT.

Table 4 Mean score at 6 M and mean score change from pre to 6 M for three recovery groups for UEFMA and ARAT.

To determine which initial factors were associated with the different recovery patterns for both impairment and motor function, we performed crosstabs using Fisher’s Exact Test. Specifically, we evaluated whether specific hand movements at pre (mass extension task, distal finger grasp, thumb adduction grasp, and thumb to index finger grasp) from the UEFMA were associated with the recovery pattern subjects were grouped into for both the total UEFMA and the total ARAT scores over time. These four movements were used for analysis as they require some degree of corticospinal tract integrity and thus should be associated with greater overall recovery30. A score of 0 at pre (no active movement during the task) on all four hand movement tasks was significantly associated with being in the group with the slowest and smallest recovery (group C) on both clinical outcomes (p < 0.001 for all analyses, standardized residuals > 2). The ability to fully and actively extend the impaired fingers at pre (a score of 2 on the mass extension task) was significantly associated with being in the group with the fastest initial change and highest total score for both clinical outcomes (group A) (p < 0.001, standardized residuals > 2). Full active distal finger grasp and thumb to index grasp were also significantly associated with being in recovery group A for impairment only (p < 0.001, standardized residuals > 2). No pattern was noted between initial hand movement and membership in group B for either impairment or function.

EuroQol index over time and correlation between clinical measures and the EuroQol – secondary analysis 2

EuroQol index scores for the entire group increased over time in parallel with the outcome measures (mean scores: pre, 0.4202 (95% CI, 0.372–0.468); post, 0.539 (95% CI, 0.487–0.591); 1 M, 0.614 (95% CI, 0.561–0.666); 4M, 0.637 (95% CI, 0.589–0.885); 6 M, 0.674 (95% CI, 0.627–0.722)). We performed Pearson correlations between the EuroQol index and both the total UEFMA score (a measure of impairment) and the total ARAT score (our primary clinical outcome for motor function) at pre, one month, and six months post stroke for all participants to determine whether their level of impairment and function translated to their health-related quality of life. We found no correlation at pre for either outcome measure (UEFMA: r(78) = 0.069, p = 0.544; ARAT: r(78) = 0.073, p = 0.522). At one month there were weak correlations (UEFMA: r(78) = 0.310, p = 0.005; ARAT: r(78) = 0.313, p = 0.005). We showed moderate correlations between both clinical measures and the EuroQol index at 6 months (UEFMA: r(78) = 0.472, p < 0.001; ARAT: r(78) = 0.522, p < 0.001).

Adverse events

There were no study related adverse events. There were 14 IRB reviewed non-study-related adverse events including one death, four falls with one injury, two PFO closures, one episode of syncope, one episode of chest pain, one episode of face swelling, one episode of diverticulitis, one episode of face sensory changes, one episode of new seizure onset, and one hospitalization for elevated potassium levels.

Discussion

All four groups demonstrated notable improvements in affected upper limb impairment and function over the course of the study, with 65 of the 80 participants demonstrating substantial change on the ARAT and 73 of the 80 participants achieving considerable change on the UEFMA, equal to or greater than the MCID for both outcomes.

We did not show an overall effect of dose or time in this study between our four intervention groups at six months on all three clinical outcomes. This contrasts with some of our previous pilot work13. All but fourteen of the participants in this study had moderate levels of impairment of the affected upper limb at pre31 (see Table 1 for UEFMA scores at pre). Previous research has shown that higher levels of motor ability initially predict greater recovery of the upper limb post stroke32,33. Improvement is also associated with appreciable spontaneous recovery during the subacute period post stroke11. Thus, our different interventions may not have been sufficient to overcome the spontaneous and expected recovery during this early time to show a long-term effect at six months post stroke11.

With respect to timing, we had hypothesized that the earlier the additional VR intervention was initiated within the subacute period, the greater the benefits for impairment and motor function recovery. We found a significant difference in change from pre to four months testing in motor function (ARAT) between the EVR and DVR groups, with the DVR group achieving a significantly greater amount of change at four months than the EVR group. This is somewhat similar to the Dromerick et al., 2021 study, which demonstrated that their subacute group (extra training initiated 2–3 months post stroke) achieved greater functional outcomes on the ARAT at one year post stroke compared to their acute group (extra training initiated within 30 days post stroke), with both early groups faring better than their chronic group (extra training initiated within 6–7 months post stroke)15. However, in our study, ARAT scores at six months post stroke were not significantly better for the DVR group. Perhaps the key to greater, sustained outcomes is providing a higher volume of training in the subacute period post stroke. The aforementioned study by Dromerick et al., 2021 demonstrated that 20 extra hours (double the amount of our extra VR training) of task specific upper limb therapy introduced at two different time points in the subacute period compared to a similar amount introduced in the chronic phase led to greater motor function gains on the Action Research Arm Test (ARAT) when introduced early for both the subacute time periods (with benefits greatest at 60–90 days post stroke)15. In this study, we could not provide an extra 20 h of training within the time that participants were in the inpatient rehabilitation setting, as this would require 2 h per day in addition to the 3 h of occupational, physical, and speech therapy required in this setting. This would result in fatigue issues and scheduling problems during the inpatient stay.

Importantly, we did not find any adverse effects of providing a higher volume of VR training so early after stroke. This is in contrast to the VECTORS study, which showed less motor recovery (ARAT based) in their early high intensity group10. The difference may be that training in the VECTORS study was initiated on average a week earlier than in this one, and their high intensity participants wore a mitt on their unaffected side for 90% of the day, forcing a much higher use of the affected arm compared to this study.

Looking at dose, the dose matched usual care group (DMUC) received an extra 10 h of training compared to the usual care (UC) group during the same early time frame. This amount was based on the literature available at the time the study protocol was developed34. This amount may not have been enough of an extra training dose to show an effect. Spontaneous recovery during this early period may have been greater than the treatment effect11. It has been noted that ‘time from stroke occurrence is independently associated with spontaneous recovery of impairments and activities, explaining 16–42% of the observed improvements’11,35. The comparable results across our four training groups are similar to those of studies such as the ICARE study, which had an average of sixteen more hours of training for their higher dose groups compared to their usual care group11, and the VECTORS study, which, like this one, provided an extra 10 h of training in their higher dose training group10. Studies that have shown an effect on motor function (measured with the ARAT) in this period, such as the Krakauer et al., 2021 study, utilized a much greater amount of training in their higher dose group (~ 30 h more than their historical usual care group)12,36. Thus, a much higher dose may be required to overcome spontaneous recovery in the subacute period34. Additionally, it is important to consider that the UC group received therapy in one of the top five inpatient rehabilitation centers in the US. This care may be of a higher level and may not reflect ‘usual care’ received at most facilities.

We hypothesized that training in the VR environment, with high intensity, goal-oriented, personalized activities and enriched environments, in addition to assistance from a robotic arm, would result in greater recovery at the impairment and functional levels than the higher dose matched usual care (DMUC) provided during the same time frame. However, we were unable to find convincing evidence of this in the current study. This is similar to the findings of the 2021 study by Krakauer et al.12. Our DMUC group received high repetitions (an average of 178 hand and arm movements per 1 h session) and task-oriented training focused on the individual goals of the participants. This amount of activity is far greater than that observed per session by Lang et al., 200937 and approaches the volume utilized by Waddell et al.38. The two modes of training may have been too similar to each other to show an effect specific to VR/robotic training. This suggests that the key ingredients to foster neuroplasticity and recovery may be high repetition and goal-oriented, personalized activities, which interact synergistically with the unique and heightened neuroplastic environment occurring during this period15. This supports established research indicating that enduring physiological changes in motor systems, as well as behavioral changes, can result from repeated practice of goal-oriented and challenging movements17.

Although we found similar long-term outcomes in impairment and function between the four study arms in this group of primarily moderately impaired individuals, we did observe three different patterns of change initially and at six months for both impairment (total UEFMA score) and motor function (total ARAT score) (see Figs. 5 and 6). These patterns were independent of study group assignment. For both measures, one group (group A) demonstrated a faster rate of change initially with an overall greater total score at six months compared to the other two groups (groups B and C). A second group (group B) showed a less rapid rate of change initially with a lower total score at six months compared to group A. Note, though, that the average total amount of change from pretest to six months post stroke was the same for groups A and B on both the UEFMA and the ARAT.

Lastly, a third group (group C) showed less change over time overall. For impairment, this group of 10 participants varied in age and came from all four treatment arms. All had an initial total UEFMA score at pretest of 17 or lower, reflecting that they had trace to no hand movement at pretesting. The proportional recovery rule states that the amount of spontaneous motor recovery of the paretic upper extremity after ischemic stroke is relatively fixed and accounts for approximately 70% of patients’ maximal potential recovery39. Outliers from this rule were found to have an initial total UEFMA score of less than 17, just like our lowest achieving group40. Also of interest is that all ten had ischemic strokes but only three of the ten received any medical thrombolysis. Thrombolytic treatment is known to improve clinical outcomes41. Four other participants in the study also had an initial UEFMA score of less than 17 but showed a higher level of recovery (2 in group A and 2 in group B on the UEFMA). These individuals were much younger and had some minimal hand movement initially. Two of them had an ischemic stroke and both received tPA. The other two had a hemorrhagic stroke and were not eligible for tPA.
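The proportional recovery rule mentioned above can be illustrated with a short calculation. This is a minimal sketch, assuming the standard 66-point upper extremity Fugl-Meyer motor scale and taking maximal potential recovery to be the difference between the scale maximum and the initial score. The function name and constant are hypothetical and are not taken from the study's analysis.

```python
UEFMA_MAX = 66  # assumed maximum of the upper extremity Fugl-Meyer motor scale


def predicted_spontaneous_change(initial_uefma, proportion=0.70):
    """Predicted UEFMA gain under the proportional recovery rule.

    Maximal potential recovery is taken as the gap between the scale
    maximum and the initial score; the rule predicts that roughly 70%
    of that gap is recovered spontaneously.
    """
    if not 0 <= initial_uefma <= UEFMA_MAX:
        raise ValueError("initial_uefma must lie between 0 and UEFMA_MAX")
    potential = UEFMA_MAX - initial_uefma
    return proportion * potential


# Example: an initial score of 17, the upper bound seen in group C.
gain = predicted_spontaneous_change(17)
print(f"Predicted gain: {gain:.1f} points, predicted final score: {17 + gain:.1f}")
```

Participants who fall well short of this prediction, as group C did, are the outliers from the rule described above.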

The ability to move the hand initially is a key factor. The group with the fastest initial change and the greatest total score at six months in our study (group A), for both the total UEFMA and the total ARAT scores, had the ability to perform higher level hand tasks initially. These tasks require some degree of corticospinal tract (CST) integrity, as the CST is thought to be their neural basis42. This finding adds to the established literature on clinical models to predict upper limb recovery28,43,44.

Lastly, we noted that nine participants who were in group B for recovery based on impairment were in group A for their motor function recovery measured via the ARAT, and three participants who were in group C for impairment were in group B for motor function (ARAT). This indicates that these participants were able to use their affected upper limb at a higher functional level than other participants with similar levels of impairment.

A secondary analysis evaluated the EuroQol scores reported by our participants. We found that the mean EuroQol index scores increased steadily over time, which parallels the increases seen in the clinical scores over time. Thus, in general, functional and impairment changes were reflected in self-rated quality of life scores. Looking at this in more depth via correlations, we found no correlation between ARAT and UEFMA scores and the EuroQol index at pretest, weak but significant correlations at one month post training, and moderate, significant correlations between participants’ total UEFMA and ARAT scores and their EuroQol score at six months. The lack of correlation very early on may be a result of all the participants being in the inpatient setting at that time and receiving assistance with their ADLs and mobility, making it hard for them to determine their true level of independence. At six months post stroke, the affected arm recovery had stabilized in these participants, and they had been living at home long enough to allow for true assessment of their own health-related quality of life. Thus, the six-month scores allow us to more confidently suggest that long term improvements made by our participants in impairment and function of the upper limb were associated with greater independence in several domains of health-related quality of life.

Our VR training equipment is distinct because it incorporates a strong focus on hand retraining. It includes the NJIT RAVR, NJIT Track Glove, and NJIT HoVRS systems which allow for training with a broad set of impairments and functional abilities18, including a unique VR based visual mirror simulation, and adaptive algorithms to drive individual finger movement, ‘which can modify the workspace to increase range of motion and can provide gain modification in order to allow a person with a minimal amount of hand movement to interact successfully with the VR simulations’45.

A limitation of the study is that assessors were not blinded when collecting outcome data. This can lead to observer or detection bias. We believe that this bias was mitigated in this study, as the ARAT, UEFMA, and BBT are reliable and valid outcome measures that have defined and objective scoring scales. In addition, the EuroQol is a self-administered questionnaire. We utilized uncorrected t-tests for our three pre-planned independent t-tests that address our primary aims, which might be considered less rigorous than corrected t-tests. That said, there was a sound physiologic rationale for our focus on the time point that we chose to examine a priori (see Methods).
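To show what correcting the three pre-planned tests would involve, the sketch below applies a Bonferroni correction, one common choice among several. It is an illustration under stated assumptions: the function names are hypothetical, the alpha level of 0.05 is assumed, and the p-values are placeholders, not results from this study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which p-values remain significant after correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Placeholder p-values for three pre-planned tests (NOT study results).
example_p_values = [0.010, 0.030, 0.200]

print(f"Corrected threshold: {bonferroni_threshold(0.05, 3):.4f}")
print(significant_after_bonferroni(example_p_values))
```

With three tests, the per-test threshold drops from 0.05 to roughly 0.0167, so a result that is significant when uncorrected (such as the placeholder 0.030) would no longer be significant after correction.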

Importantly, the findings from this research are relevant to identifying the key ingredients of an effective training program that taps into the unique neuroplasticity occurring within the first three months post stroke. We have also shown that the ability to fully extend the fingers within the first month after stroke significantly increases the probability of achieving higher functioning levels at 6 months. The results of this study support the view that the initial level of hand function may be an important predictor of recovery.

Conclusions

This parallel, randomized controlled trial investigated both additional training and optimal timing for intensive upper limb non-immersive VR/robotic training in the subacute period post stroke. Outcomes for the four study groups at six months post stroke were similar. Participants showed three different patterns of recovery for both upper limb impairment and function over time, which were independent of study group assignment. Importantly, these patterns were associated with initial hand use.

EuroQol quality of life scores increased over time in parallel with the clinical outcomes and were moderately correlated with clinical scores at six months post stroke.