Fig. 4
From: A large histological images dataset of gastric cancer with tumour microenvironment annotation for AI

Workflow of data preprocessing and model architecture. (A) The histological slide is digitized into a whole slide image (WSI), which is segmented and tessellated into 224 × 224 pixel patches. (B) ViT model pipeline: each patch image is split into smaller sub-patches that are flattened and linearly projected, followed by feature extraction via a transformer layer with multi-head attention. Predictions for the tissue classes are made using a multi-layer perceptron (MLP). (C) EfficientNet model pipeline: the input image undergoes initial feature extraction via a convolution layer, followed by deep feature extraction using MBConv blocks. The extracted features are then passed through global average pooling, a flatten layer, a dropout layer, and a fully connected (FC) layer to predict the tissue classes.
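
As a rough illustration of the tessellation in panel (A) and the patch splitting in panel (B), the following sketch computes the top-left coordinates of non-overlapping 224 × 224 tiles and the number of sub-patches a ViT would receive from one tile. It is not the authors' code: the function names, the choice to drop partial tiles at the image edges, and the 16-pixel sub-patch size are all assumptions made for the example.

```python
from typing import Iterator, Tuple

TILE = 224  # patch edge length in pixels, taken from the caption


def tile_origins(width: int, height: int, tile: int = TILE) -> Iterator[Tuple[int, int]]:
    """Yield (x, y) top-left corners of non-overlapping tile x tile patches.

    Assumption: partial tiles at the right and bottom edges are dropped.
    """
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            yield x, y


def vit_token_count(image_size: int = TILE, sub_patch: int = 16) -> int:
    """Number of sub-patches for a square input.

    Assumption: sub_patch=16 is a common ViT setting, not stated in the caption.
    """
    if image_size % sub_patch != 0:
        raise ValueError("image_size must be divisible by sub_patch")
    return (image_size // sub_patch) ** 2


# Hypothetical 1000 x 500 pixel region standing in for a segmented tissue area.
origins = list(tile_origins(1000, 500))
tokens = vit_token_count()
```

In practice each coordinate pair would be passed to a slide reader to extract the corresponding patch, and tiles containing mostly background would be filtered out after segmentation.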