Table 6 Definition of variables included in the POLCOVID metadata file.

From: POLCOVID: a multicenter multiclass chest X-ray database (Poland, 2020–2021)

| Variable Name | Definition |
| --- | --- |
| origin | Name of the dataset. |
| filename | Anonymized unique file name with the following structure: Anonymous_<hospital_id>_<patient_id>_<class_id>.<file_format>. |
| patient_id | Anonymized patient identifier, unique among patients examined in the same medical center, ranging from 1 to the number of patients. |
| hospital | Name of the medical center where the image was created (in Polish). |
| hospital_eng | Name of the medical center where the image was created (translated to English). |
| hospital_id | Unique hospital identifier ranging from 1 to 15. |
| sex | Patient sex. |
| age | Patient age in years. |
| smoke | Smoking status: “Yes” for smokers, “No” for non-smokers. |
| smoke_packyears | Number of pack-years for smokers. |
| class | Diagnosis: “COVID-19” for COVID-19, “PNEUMONIA” for types of pneumonia other than COVID-19-related, and “NORMAL” for the remaining cases. |
| class_id | Class identifier: 1 - normal, 2 - pneumonia, 3 - COVID-19. |
| quality | Image quality category: “Good” - sufficient quality, “Bad” - insufficient quality. The criteria for quality assessment are described in the Technical Validation section. |
| subtype | Subtype label: “C1”, “C2”, “C3” for COVID-19; “P1”, “P2”, “P3” for pneumonia other than COVID-19-related; “N1”, “N2”, “N3” for the remaining cases. |
| set | Set to which the image was assigned in Prazuch et al. (ref. 16): “train” - training set, “hold-out test” - testing set. |
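
The filename structure and the class identifiers defined in the table above can be cross-checked against the other metadata columns. The following is a minimal Python sketch, not part of the POLCOVID distribution: the function names `parse_filename` and `check_row`, the sample row, and the `png` extension are hypothetical, and the sketch assumes that the identifiers embedded in the file name are plain decimal integers.

```python
import re

# Pattern mirroring the documented structure:
# Anonymous_<hospital_id>_<patient_id>_<class_id>.<file_format>
# Assumption: all three identifiers are plain decimal integers.
FILENAME_RE = re.compile(
    r"^Anonymous_(?P<hospital_id>\d+)_(?P<patient_id>\d+)"
    r"_(?P<class_id>\d+)\.(?P<file_format>\w+)$"
)

# class_id -> class label, as defined in the table.
CLASS_LABELS = {1: "NORMAL", 2: "PNEUMONIA", 3: "COVID-19"}


def parse_filename(filename):
    """Split a file name into its documented components (hypothetical helper)."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name structure: {filename!r}")
    return {
        "hospital_id": int(match["hospital_id"]),
        "patient_id": int(match["patient_id"]),
        "class_id": int(match["class_id"]),
        "file_format": match["file_format"],
    }


def check_row(row):
    """Return a list of inconsistencies found in one metadata row."""
    problems = []
    parsed = parse_filename(row["filename"])

    # The identifiers in the file name should match the dedicated columns.
    for key in ("hospital_id", "patient_id", "class_id"):
        if int(row[key]) != parsed[key]:
            problems.append(f"{key} differs between file name and column")

    # hospital_id is documented as ranging from 1 to 15.
    if not 1 <= int(row["hospital_id"]) <= 15:
        problems.append("hospital_id outside 1..15")

    # class and class_id should describe the same diagnosis.
    if CLASS_LABELS.get(int(row["class_id"])) != row["class"]:
        problems.append("class does not match class_id")

    return problems


# Made-up row for illustration only; not taken from the dataset.
row = {
    "filename": "Anonymous_3_12_2.png",
    "hospital_id": "3",
    "patient_id": "12",
    "class_id": "2",
    "class": "PNEUMONIA",
}
print(check_row(row))  # → []
```

Values are converted with `int` because metadata read from a text file typically arrives as strings; if the file is loaded with a library that already infers numeric types, the conversion is harmless.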