Introduction

Pain is an unpleasant sensory and emotional experience associated with, or resembling that associated with, actual or potential tissue damage1. It has been reported that approximately 20% of patients worldwide suffer from pain2. Critically ill children, compared with other pediatric populations, are particularly vulnerable to pain because of their medical or surgical conditions, compounded by the many medical procedures they must frequently and unavoidably undergo3. Research findings from 66 Chinese medical institutions revealed that pain management in China remains suboptimal, with a relatively high prevalence of pain among hospitalized children of all age groups, especially those who are critically ill4. Pain experienced during hospitalization can result in short-term adverse effects, such as increased rates of delirium and decreased sleep quality. In the long term, it may affect children’s healthcare compliance and other health behaviors, causing significant negative consequences5.

Recognizing pain and accurately assessing its intensity are fundamental to effective pain management in critically ill children. While self-reporting is considered the most reliable method for pain assessment, it is often infeasible for critically ill children who cannot, or will not, verbally communicate their pain to healthcare providers. In such cases, observational measures based on facial expressions and behavioral cues are essential6,7,8,9. Unfortunately, there is still no universal pain assessment tool that can be used for all children. Observational pain assessment tools, in particular, face challenges such as variability among assessors and the time constraints that regular assessments impose on staff.

Recently, with the rapid development of artificial intelligence and computer vision technology, the accuracy of image-based facial expression recognition has continued to improve. Algorithms based on facial image analysis show promise for assessing pain in children within intensive care settings10,11,12,13. High-quality, accessible datasets are essential for training these algorithms, yet such resources remain scarce. Several adult pain datasets exist. The UNBC-McMaster Shoulder Pain Archive documents 129 adults with chronic shoulder pain during movement exercises, featuring video recordings with facial action coding and self-reported pain scores14. The BioVid Heat Pain Database captures physiological signals (ECG, EMG, EDA) and facial videos from 90 healthy adults under controlled heat stimuli, including pain thresholds and subjective ratings15. Similarly, the X-ITE Pain Database from the University of Magdeburg contains comparable physiological measurements from 134 participants across varied pain stimuli (thermal, electrical, and pressure), with comprehensive pain annotations. These resources maintain ethical research access protocols16.

The most notable neonatal pain facial expression datasets include the facial expression of neonatal pain (FENP) dataset, the classification of pain expressions (COPE) dataset, the infants’ pain assessment dataset (IPAD), and the acute pain in neonates dataset (APN-db). The FENP dataset17, introduced by the Nanjing University of Posts and Telecommunications, contains 11,000 facial expression images of 106 Chinese neonates from two children’s hospitals. The images are classified into four levels: severe pain, mild pain, crying, and calmness, with 2,750 images per category. The COPE dataset18 contains 204 images captured from 26 healthy infants aged between 18 h and 3 days undergoing stress- or pain-inducing stimuli. The images are labeled as rest, cry, air stimulus, friction, and pain; the dataset lacks pain intensity ranking and offers limited information because of its small sample size. The IPAD dataset19 originates from 31 neonates of various ethnic backgrounds, including Caucasians, African Americans, and Asians, who were admitted to the neonatal intensive care unit (NICU). It encompasses facial expressions, body movements, and vocalizations observed during medical procedures such as heel-prick blood sampling. The APN-db dataset20 was compiled in NICU and vaccination departments and comprises 213 videos of newborns and infants, aged 0 to 6 months, undergoing clinical procedures that trigger facial expressions of pain. Among the neonatal pain datasets mentioned, only the COPE dataset is publicly accessible.

Despite these valuable contributions to pain assessment research, significant limitations persist in applying existing datasets to the pediatric critical care environment. The challenges include the wide age range, from 1 to 18 years, which results in greater diversity of facial expressions; the presence of multiple medical tubes, which partially obstruct facial images; and the complex, real-world collection environment. With the help of healthcare workers at the Children’s Hospital of Fudan University, we collected data on children’s pain facial expressions in the pediatric intensive care unit (PICU) and cardiac intensive care unit (CICU) of the hospital over thirteen months. We have built the Pain Facial Expressions of Critically Ill Children (PFECIC) dataset, a large dataset of facial expressions of critically ill children experiencing procedural pain, which currently includes 119 videos and 6,698 images. We hypothesize that an algorithm model trained on the PFECIC dataset will demonstrate better accuracy and generalization performance, indicating PFECIC’s superior usability and comprehensiveness.

Methods

This study was approved by the institutional review board of the Children’s Hospital of Fudan University (No. 2023-151), and all methods were carried out in accordance with the Declaration of Helsinki. Informed consent was obtained from a legal guardian for study participation and portrait rights usage authorization. To protect personal privacy, sensitive information such as the names, dates of birth, and diagnoses of the participating children was removed. The ten guiding principles jointly identified by the US Food and Drug Administration (FDA), Health Canada, and the United Kingdom’s Medicines and Healthcare products Regulatory Agency (MHRA) were followed21.

PFECIC dataset construction

Data collection environment

We collected videos and images of children’s facial expressions related to pain experienced while undergoing medical procedures in the PICU and CICU at the Children’s Hospital of Fudan University. Participants were recruited from children admitted to the PICU and CICU. The inclusion criteria were as follows: age between 28 days and 18 years; scheduled to undergo only one of the listed medical procedures under non-emergent conditions; and written informed consent obtained for both study participation and portrait rights usage authorization. Exclusion criteria included children in deep sedation, in prone positions, or with more than one-third of the face covered.

Data collection equipment

The equipment for collecting children’s facial expression videos and images was one Hikvision surveillance camera (1920 × 1080, 60 Hz) with a fixed bracket. The camera was installed directly above the head of the child’s bed, with the lens pointing vertically downward and directly facing the child’s face, as shown in Fig. 1. The shooting angle was limited to 30 degrees, ensuring that the face occupied at least half of the frame. Two staff members from the marketing department were designated as video recorders.

Fig. 1
figure 1

Detailed 3D visualization of facial expression video collection equipment setup. Created using SketchUp Pro 2014 (version 14.0490, https://www.sketchup.com/zh-cn).

Video collection process

The PFECIC dataset contains five facial expression statuses classified according to the ‘facial muscles’ category of the COMFORT behavior scale. To collect the corresponding data, we first needed to identify suitable scenarios in real PICU or CICU environments. Consequently, data on the frequency of painful medical procedures performed in the PICU and CICU from January 1, 2019, to December 31, 2019, were extracted to identify the common painful procedures; pre-COVID-19 data were used so that the findings would reflect routine practice patterns. Additionally, interviews were conducted with ten senior nurses and two advanced practice nurses (APNs) working in the PICU or CICU to enrich the information. Ultimately, the seven most common painful medical procedures were identified: nebulization suction, tracheal suction, surgical debridement/dressing change, peripheral venous catheterization, arterial catheterization, intramuscular/subcutaneous injection, and urinary catheterization.

One senior nurse from the PICU and another from the CICU, both skilled in pain assessment, were appointed as data collection coordinators for their respective departments. They were trained to use maximum variation sampling22 to select patients meeting the inclusion criteria, considering factors such as age, gender, mechanical ventilation, type of procedure, and pain score rated at the bedside. Once a suitable patient was identified, the nurse coordinator called the video recording team, and a team member arrived at the scene within 5 to 10 min to start recording. Recording began two minutes before the bedside nurse or intern started the procedure, continued throughout the procedure, and ended two minutes after its completion23,24,25. During the procedure, the nurse coordinator assessed pain at the bedside using the COMFORT behavior scale. The video collection procedure is depicted in Fig. 2.

Fig. 2
figure 2

The flowchart of the data collection procedure for facial expression videos in children.

Data labels

The captured recordings were pre-processed. Videos in which children moved their bodies during recording such that their faces were obscured or partially obscured by healthcare workers or the children’s limbs, or in which external stimuli interfered, were excluded. Thus, only videos featuring clear and complete facial information were included in the subsequent annotation step.

Six experienced nurses from the hospital’s pain management team were selected to accurately classify the facial expression statuses for each video. Before annotation, they were trained to consistently use the COMFORT behavior scale to evaluate facial expressions (1 point: facial muscles totally relaxed; 2 points: normal facial tone; 3 points: tension evident in some muscles, not sustained; 4 points: tension evident throughout muscles, sustained; 5 points: facial muscles contorted and grimacing). Each video was triple-annotated, with three nurses annotating independently using a custom software annotation tool developed by the study team. The annotator first watched the entire video, pausing at moments that could be scored according to the “facial muscles” category of the COMFORT behavior scale; at these points, the segment was split and then annotated. The tool included basic video player functions such as play, pause, stop, forward, backward, and split. Additionally, after a video segment was split, a dialog box popped up for inputting the rating score. The annotation process is illustrated in Fig. 3. Segment annotations agreed upon by at least two annotators were adopted; in cases where all three annotations differed, a working group discussion was initiated to resolve discrepancies and reach consensus.
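The adjudication rule described above (a segment label is adopted when at least two of the three annotators agree; otherwise the segment goes to a working-group discussion) can be expressed as a short routine. The sketch below is an illustrative reconstruction only, not part of the study’s annotation software, and the function name is hypothetical.

```python
from collections import Counter
from typing import Optional

def adjudicate(scores: list) -> Optional[int]:
    """Return the consensus COMFORT 'facial muscles' score (1-5) for one
    video segment, or None when all three annotators disagree and the
    segment must be escalated to a working-group discussion."""
    assert len(scores) == 3, "each segment is triple-annotated"
    label, votes = Counter(scores).most_common(1)[0]
    return label if votes >= 2 else None

# Two annotators give 4 points and one gives 3, so the consensus is 4;
# three different scores yield None and trigger a discussion.
print(adjudicate([4, 3, 4]))  # 4
print(adjudicate([2, 3, 5]))  # None
```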

Fig. 3
figure 3

Data annotation flowchart.

Afterward, the classified video segments were processed frame by frame by an algorithm engineer, with each frame within a segment being automatically annotated. The initial frame (F1) and its adjacent frame (F2) were selected, and the difference between the two frames was calculated using the frame difference method. If the frame difference was less than a given threshold ε, indicating minimal change in the child’s facial expression, F2 was discarded. The difference between F1 and the third consecutive frame (F3) was then calculated, and this process continued until the frame difference exceeded the threshold ε or the last frame of the segment was reached. If the frame difference exceeded the threshold ε, indicating a significant change in the child’s facial expression, the frame was included as valid image data in the PFECIC dataset.
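A minimal sketch of this frame-difference selection is given below. It assumes OpenCV for frame reading and the mean absolute grayscale difference as the difference measure; the actual implementation, the value of the threshold ε, and whether each kept frame becomes the new reference for subsequent comparisons were not reported, so these details are assumptions.

```python
import cv2
import numpy as np

def select_keyframes(segment_path: str, eps: float = 12.0):
    """Keep a frame only when it differs from the current reference frame
    by more than the threshold eps (mean absolute grayscale difference)."""
    cap = cv2.VideoCapture(segment_path)
    ok, ref = cap.read()
    if not ok:
        return []
    ref_gray = cv2.cvtColor(ref, cv2.COLOR_BGR2GRAY)
    keyframes = [ref]                           # the initial frame F1 is kept
    while True:
        ok, frame = cap.read()
        if not ok:                              # last frame of the segment reached
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        diff = float(np.mean(cv2.absdiff(ref_gray, gray)))
        if diff > eps:                          # noticeable expression change: keep frame
            keyframes.append(frame)
            ref_gray = gray                     # assumed: kept frame becomes the new reference
        # otherwise discard the frame and compare the next one to the same reference
    cap.release()
    return keyframes
```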

Dataset’s clinical validation

We conducted recognition experiments on the PFECIC and COPE datasets separately using an algorithm based on the Swin Transformer26. The PFECIC dataset was divided into training, validation, and test sets in a 7:2:1 ratio, comprising 4949, 1233, and 769 images, respectively. The sample sizes for classification points 1 to 5 in the training set were 7, 1344, 1156, 1067, and 1115, respectively; the validation and test sets were distributed proportionally. Since the PFECIC dataset contains objects other than faces, such as hospital beds and medical equipment, face detection was conducted first, followed by expression recognition. Similarly, the COPE dataset was divided into training, validation, and test sets in a 7:2:1 ratio, with 140 images in the training set, 40 in the validation set, and 24 in the test set. Since the COPE dataset concentrates exclusively on the facial area, face detection was not required.
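As a rough illustration of the two-stage pipeline used for PFECIC (face detection first, then expression recognition), the snippet below crops the largest detected face before it is passed to the classifier. The study does not report which face detector was used, so OpenCV’s bundled Haar cascade serves purely as a stand-in, and the function name and margin are hypothetical.

```python
import cv2

# Stand-in face detector; the detector actually used in the study is not reported.
_face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_largest_face(image_bgr, margin: float = 0.1):
    """Return the largest detected face crop (with a small margin), or None
    if no face is found; the crop is then fed to the expression classifier."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = _face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])      # largest bounding box
    dx, dy = int(w * margin), int(h * margin)
    h_img, w_img = image_bgr.shape[:2]
    x0, y0 = max(x - dx, 0), max(y - dy, 0)
    x1, y1 = min(x + w + dx, w_img), min(y + h + dy, h_img)
    return image_bgr[y0:y1, x0:x1]
```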

The experiments were implemented using the PyTorch 1.11 deep learning framework and conducted on eight GeForce GTX 1080 Ti GPUs. The data were augmented during training to increase the effective dataset size, using random horizontal flipping and random cropping with a crop size of 224. The model training parameters were a batch size of 16, a learning rate of 0.0001, a weight decay of 0.00001, and 150 training epochs. To evaluate the effectiveness of the datasets for training and testing deep learning models based on this algorithm, we measured accuracy, precision, recall, the harmonic mean of precision and recall (F1-score), and the false positive rate (FPR). Detailed information about the algorithm and its evaluation metrics can be found in the Supplement.
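The reported configuration (random horizontal flip, 224 random crop, batch size 16, learning rate 1e-4, weight decay 1e-5, 150 epochs) could be assembled roughly as follows. The study does not state the optimizer, the Swin Transformer implementation, or any resizing step, so the AdamW optimizer, the timm model, and the preliminary resize below are assumptions rather than reported details.

```python
import torch
import timm
from torchvision import transforms

# Training-time augmentation reported in the study: random horizontal flip
# and a 224x224 random crop; the preceding resize is an assumed preprocessing step.
train_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Swin Transformer base with a 5-way head for the five facial expression levels;
# the timm model name and the AdamW optimizer are assumptions.
model = timm.create_model("swin_base_patch4_window7_224",
                          pretrained=True, num_classes=5)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-5)
criterion = torch.nn.CrossEntropyLoss()

BATCH_SIZE, EPOCHS = 16, 150

def train_one_epoch(model, loader, device="cuda"):
    """One pass over the training loader; `loader` yields (B, 3, 224, 224) batches."""
    model.train().to(device)
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```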

Results

Results of the PFECIC dataset

The PFECIC dataset comprises 119 pain expression videos from 53 critically ill Chinese children, recorded across seven major clinical procedures that induce pain in the PICU and CICU of Children’s Hospital of Fudan University. The dataset also includes 6,951 annotated pain expression images, categorized into five facial expression levels (1 to 5 points), with 375, 1,887, 1,624, 1,499, and 1,566 images per category, respectively. Table 1 presents the basic information, while Fig. 4 provides sample images.

Table 1 Basic information of recorded videos (n = 119).
Fig. 4
figure 4

Sample images of five pain facial expression levels in the PFECIC dataset.

Experiment result and comparative analyses

As illustrated in Fig. 5, the pain expression levels in the PFECIC dataset exhibit finer granularity and a more balanced data distribution than those in the COPE dataset.

Fig. 5
figure 5

Confusion matrix-based performance comparison between the two datasets.

Moreover, as shown in Table 2, for the PFECIC dataset the model performs better on facial expressions rated 1, 2, and 5 points than on the other two levels across all five metrics. The ROC curves in Fig. 6 show AUC values ranging from 0.764 to 0.968. Facial expressions rated 3 and 4 points are relatively subtle and therefore more challenging to distinguish.

Table 2 Model performance trained and tested based on the PFECIC dataset.
Fig. 6
figure 6

ROC curves on the PFECIC dataset.

The performance on the COPE dataset is displayed in Table 3, which suggests a significant class imbalance favoring the “non-pain” class. The ROC curve in Fig. 7 exhibits a step-like pattern rather than a smooth curve, which typically indicates evaluation on a small dataset in which each step corresponds to individual test cases. With an AUC of 0.829, the model demonstrates good overall discriminative ability between the pain and non-pain classes despite the class imbalance.

Table 3 The performance results from the model trained on the COPE dataset.
Fig. 7
figure 7

ROC curves on the COPE dataset.

We trained the Swin Transformer_base model on the PFECIC dataset and tested it on the COPE dataset. The results, shown in Table 4, indicate that the accuracy of pain recognition is significantly higher than that of the model trained solely on the COPE dataset. This suggests that increasing the amount of training data can effectively enhance the performance of deep learning models and improve their generalization capability.

Table 4 The performance results from the model trained on the PFECIC and then tested on the COPE dataset.

For a more comprehensive comparison between the PFECIC and COPE datasets, and given that the COPE dataset has only two categories (pain and non-pain), we defined the 1-point status in PFECIC as non-pain and the 2- to 5-point statuses as pain to ensure a fair comparison. The performance metrics of the Swin Transformer_base model trained on the PFECIC and COPE datasets are compared in Fig. 8. The PFECIC dataset shows an improvement over the COPE dataset, with an increase in accuracy of 16.6%, precision of 15.7%, recall of 23%, and F1-score of 24.2%, and a decrease in false positive rate of 30%.
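The binary mapping used for this comparison (1 point mapped to non-pain, 2 to 5 points mapped to pain) and the derived metrics can be reproduced with a few lines of scikit-learn. This is a minimal sketch under the stated mapping, not code from the study; the false positive rate is computed directly from the confusion matrix.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

def binarize(levels):
    """Map the five PFECIC levels to COPE-style binary labels:
    1 point -> 0 (non-pain), 2-5 points -> 1 (pain)."""
    return (np.asarray(levels) >= 2).astype(int)

def binary_metrics(true_levels, pred_levels):
    """Accuracy, precision, recall, F1-score, and false positive rate
    computed on the binarized labels."""
    y_true, y_pred = binarize(true_levels), binarize(pred_levels)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }
```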

Fig. 8
figure 8

Histogram comparing the performance metrics of the two datasets.

Discussion

Principal findings

Pain is a complex phenomenon. Pain-induced facial expressions share certain basic action units and physiological responses, such as raised eyebrows, lowered mouth corners, widened eyes, and an open mouth, with expressions triggered by other factors. Nevertheless, they also exhibit notable differences in specific expression combinations, facial symmetry, and duration. For example, pain-related expressions are typically brief and occur frequently, fluctuating with the intensity of the pain, whereas expressions associated with other emotions, such as happiness or sadness, tend to last longer. Deep learning-based pain facial expression recognition algorithms offer advantages such as high accuracy and strong generalization performance; however, they require large-scale, high-quality datasets for effective training27.

Existing literature shows limitations in facial expression datasets for children’s pain analysis, including small scale, non-standardized construction processes, and inadequate coverage of various age groups28. In this study, we established the PFECIC dataset, which captures the facial expressions of children experiencing real, procedure-related pain and exhibits the typical characteristics of pain-induced expressions. A standardized data collection process, along with pain ratings triple-annotated by trained, experienced senior nurses, ensures the dataset’s rigor, diversity, rationality, and usability.

The comparative analysis experiment utilizes the publicly available COPE dataset18, which comprises data from 26 Caucasian neonates between the ages of 18 h and 3 days. The ROC curve exhibits a step-like pattern, indicating the dataset’s small scale and limited generalization capability. Moreover, the four stimuli used on the neonates are not typical clinical procedures that induce pain in children, making them unrepresentative of typical pediatric pain expressions. In contrast, the confusion matrix results from the PFECIC dataset reveal that mispredictions predominantly occur around values adjacent to the true labels. This suggests that the annotation process for the PFECIC dataset is reasonable and reliable. Consequently, the PFECIC dataset established in this study offers greater usability and accuracy for analyzing pain expressions in critically ill children. Notably, it represents the first comprehensive dataset of pain-related facial expressions in critically ill children, covering an age range of 1 to 18 years. This broad coverage enables the application of artificial intelligence algorithms for pain expression analysis across all pediatric age groups.

Existing pediatric pain datasets primarily focus on acute procedural pain in otherwise healthy neonates and infants, failing to capture the complex and variable pain expressions seen in critically ill children. These datasets typically document responses to brief, standardized painful procedures conducted in controlled settings, which differ significantly from the prolonged, fluctuating pain experiences common in intensive care units. Moreover, most available datasets lack comprehensive pain assessments for children who are sedated or reliant on specialized medical equipment, such as respiratory machines and tracheal tubes29. In contrast, the PFECIC dataset is the first to specifically address pain facial expressions in critically ill children, encompassing both mechanically and non-mechanically ventilated patients. This dataset enables consistent and accurate pain level evaluation, providing valuable insights for physicians to administer analgesics more effectively. Furthermore, PFECIC surpasses all previously mentioned datasets in scale, making it a superior resource for training advanced deep learning algorithms, ultimately enhancing pain recognition performance in critically ill pediatric patients.

Limitations and future research

Although the dataset has achieved satisfactory results, there are still some limitations. First, oxygen tubes, nasogastric tubes, and endotracheal tubes worn by critically ill children can partially obstruct facial images, posing challenges for facial expression-based pain assessment in children. Second, relying solely on facial expression analysis is insufficient for accurately assessing pain in children; recognition algorithms incorporating multimodal data, such as body movements and physiological indicators, are required. Third, statistical parity in the distribution of procedure types was not achieved across demographic subgroups, introducing potential confounding variables. Fourth, dataset annotations were based exclusively on observational assessments, without incorporating self-reported pain metrics for children with verbal communication abilities.

Further research should encompass the development of multimodal recognition algorithms based on the established dataset and the expansion of application scenarios for existing datasets. We are enhancing the PFECIC dataset by adding videos of each child’s face and limb movements, together with physiological indicators. It is also essential to collect data from a broader range of scenarios in the future, including general ward settings, and to incorporate various types of acute and chronic pain. Additionally, integrating validated pediatric self-reported pain scales will provide convergent validity evidence and enhance the database’s ecological relevance. This will be the first multimodal dataset for pediatric pain assessment, laying the groundwork for multimodal monitoring-based pain assessment models.

Conclusion

In this paper, we introduced PFECIC, a novel dataset of pain facial expressions of critically ill children, to advance research on pain facial expression recognition in critically ill patients. The PFECIC dataset comprises 119 videos capturing children’s pain expressions, along with 6,951 pain facial expression images sourced from 53 critically ill Chinese children treated at the Children’s Hospital of Fudan University. In this study, we employed a deep learning-based pain expression analysis algorithm to evaluate the PFECIC dataset. The PFECIC dataset demonstrates superior accuracy, rationality, usability, and comprehensiveness for training algorithm models and can serve as a valuable resource for training and testing pain assessment algorithms for critically ill children.

Declaration