Introduction

Social networks consist of community member nodes with stable connections1,2. Individual behavior changes cause communities to evolve, making social networks dynamic entities3,4. These networks contain rich information and diverse patterns5, making knowledge mining increasingly popular, especially for predicting critical evolution events6. This belongs to the research category of social network analysis7,8. Communities experience various events during the evolution process9, including formation, dissolution, growth, shrinkage, survival, merger, and split, as shown in Fig. 1. This is significant across fields: analyzing social networks aids in predicting crime10; tracking infected communities helps anticipate disease outbreaks11; analyzing rumor dissemination patterns helps maintain social stability12; and detecting malicious content safeguards social media networks13.

Fig. 1 Diagram of critical events.

Community evolution prediction involves forecasting future evolution events based on historical features with time-series relationships in the community evolution process. Currently, the framework for community evolution prediction is studied in four steps: dividing the time windows of dynamic networks14,15, detecting the community structure within networks under different time windows16,17, tracking the evolutionary sequences of different communities and identifying critical events18, and predicting the next event based on the tracked information19. As the final step, designing an effective method to predict events is crucial. The challenge lies in extracting characteristic information and designing a reasonable classification model for prediction.

Fig. 2 The relationship between features and critical events.

Mainstream prediction methods rely on historical community state features to predict future events, as shown in Fig. 2. Researchers have introduced various methods to describe the state of communities, including directly calculating structural features and discovering latent ones. When the network state undergoes drastic changes, capturing the complete evolution process becomes challenging, affecting the quality of community state feature descriptions. Unlike historical states, the rules governing community feature changes tend to be more stable. This study therefore comprehensively considers how feature change patterns influence the direction of community evolution.

We hypothesize that each type of evolutionary event corresponds to a specific feature change pattern, and we therefore use changed-feature information (Fig. 2). In our prediction method, we design an approach to obtain a complete sequence of change features. However, this field mainly relies on classical classifiers, which fail to fully capture the relationship between changing feature sequences and event sequences. Drawing inspiration from recurrent neural networks (RNNs)20,21, we design a parallel Long Short-Term Memory (LSTM) model to learn these relationships. The parallel learning mechanism, which includes multi-process concurrency and parameter sharing, aims to reduce time consumption. This model captures the impact of changes in community feature patterns on critical events and fully utilizes time series information.

The prediction method based on feature change patterns fully mines the knowledge contained in the evolutionary sequences. The significance of our work can be summarized as follows:

  • To reveal the feature change patterns within community evolution, this paper proposes a prediction method. It uses more advantageous differential features in time series to characterize these patterns and considers their influence on the evolution direction.

  • To absorb knowledge from feature change patterns, we developed a parallel LSTM model. Recognizing that each community is influenced by the same network environment, we introduced a parallel mechanism into the existing LSTM model to enhance operational efficiency.

  • Compared to mainstream methods for predicting critical events, our method achieves better prediction accuracy on real-world networks when using the same backbones.

  • Compared with other deep learning models, the time cost of our proposed model with a parallel mechanism is lower.

Related work

Several approaches have tackled the challenge of predicting dynamic social network evolution. Dakiche et al.22 partitioned timeframes based on network activity distribution. Li et al.23 proposed a multi-objective optimization-based community detection algorithm. Berahmand et al.24 proposed an efficient attributed graph clustering/community detection algorithm named WSNMF, an innovative extension of SNMF. This method introduces node attribute similarity to compute a weight matrix, effectively bridging the gap for attributed graph clustering. SDAC-DA25 transforms the attribute network into a dual-view network and applies a semi-supervised autoencoder layering approach to each view. The resulting representation layer contains highly clustering-friendly embeddings, which are optimized through a unified end-to-end clustering process to effectively identify clusters. Bródka et al.26 introduced the GED algorithm for tracking critical events. Ilhan et al.27 developed a method for community similarity and evolution tracking. Li et al.28 utilized resistance distance for community evolutionary chain tracking. These studies enhanced network partitioning, community discovery, or tracking methods, all contributing to the final step of predicting future events.

To enhance predictive accuracy, numerous studies have proposed innovative approaches to community state feature extraction. For instance, Gliwa et al.29 introduced group features, while Bródka et al.30 described community states using micro-node and meso-level characteristics. Pavlopoulou et al.31 added temporal features to structural features. Dakiche et al.32 proposed a feature set combining community structure and member influence for event prediction based on feature change rates. Tajeuna et al.33 developed a feature set to assist Cox regression, and Mohammadmosaferi et al.34 introduced AFIF for identifying critical structural features. Ding et al.35 introduced a 48-dimensional feature set for enhanced prediction accuracy.

Some methods employ graph mining techniques to extract community state features. Wang et al.36 developed algorithms for hypergraph construction to mine inter-community features. Revelle et al.37 used a graph neural network (GNN) with an attention mechanism to obtain community representations. Chen et al.38 integrated static features with dynamic information, exploring community structural features through DeepWalk and spectral propagation. Feature selection among a large number of community state features may be a good strategy. For example, Sheikhpour et al.39 proposed a feature selection method formulated in the trace ratio form, integrating hypergraph Laplacian-based semi-supervised discriminant analysis (SDA) and the mixed convex and non-convex \(\ell _{2, p}\)-norm \((0<p\le 1)\) regularization.

Choosing an appropriate classification model is crucial for leveraging feature effectiveness. Saganowski et al.40 found that the random forest classifier, ranked second among 15 classifiers, can classify all events and is widely used in existing research. Dakiche et al.41 validated the effectiveness of customized timeframe partitioning using four traditional machine learning methods. Rajita et al.42 integrated multiple classifiers to improve prediction accuracy over individual models. They also applied GANs43 to stabilize data distribution. These models still rely on historical state characteristics and underutilize temporal information in samples.

The above methods focus entirely on state feature information. In contrast, this article characterizes change patterns by learning the relationship between the sequence of differential features and the sequence of events, and proposes a prediction method that obtains complete sequence information. For a prediction method based on feature change patterns, we need a classification model that learns both variation patterns and time series information. Therefore, we chose the LSTM model and enhanced it with parallel mechanisms and learning strategies for event prediction.

Problem description

Fig. 3 Illustration of the framework of the prediction method based on feature change patterns.

In a dynamic social network, a timestamp \(t\) typically denotes a specific time point. Each timestamp corresponds to a static social network graph, denoted as \(\{V, E\}\), where \(V\) represents the set of all nodes and \(E\) represents the set of all edges. We utilize a series of time-ordered graphs to record the evolution of a dynamic social network \(G\), where \(m\) is the number of timestamps:

$$\begin{aligned} G = \bigcup \limits _{i = 1}^m {{G_{{t_i}}}} = \bigcup \limits _{i = 1}^m {({V_{{t_i}}},{E_{{t_i}}})} \end{aligned}$$
(1)

A framework for community evolution prediction processes the data in the following steps, as shown in Fig. 3a:

  • The timeframe partitioning strategy segments the dynamic network into a series of timeframes \(\{T_1, T_2,\dots , T_\tau \}\). Each timeframe represents a time window comprising several consecutive timestamps, describing the evolution of the social network over that period. Here, \(\tau\) denotes the number of timeframes. The dynamic network is represented as \(\bigcup \limits _{i = 1}^{\tau } g_{T_i}\).

  • The community discovery algorithm is applied to identify the community structure within \(g_{T_i}\), resulting in the community set \(\bigcup \limits _{j}C_{T_i}^{j}\).

  • The community tracking algorithm tracks a series of evolutionary sequences and critical events with time series relationships. For instance, an evolutionary sequence with a length \(L\) of 3 can be represented as \(\{C_{T_{L-2}}^{a}, C_{T_{L-1}}^{b}, C_{T_{L}}^{c}\}\). Historical critical events can be represented as \(\{\textrm{event}_{T_{L-2}}^{a'}, \textrm{event}_{T_{L-1}}^{b'}\}\).

  • The features of the community evolutionary sequence are calculated, and a classification model is subsequently trained to predict future critical events \(\{\textrm{event}_{T_{L}}^{c'}\}\).

The framework aims to predict the critical events that might occur in the \(T_{\tau +1}\) timeframe of the community. Our main focus is on the fourth step of the framework. Accurately capturing community evolution, as well as utilizing classifiers that can effectively extract this information, is crucial for enhancing the accuracy of event predictions.
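As a concrete illustration of the framework's first three steps, the following toy Python sketch uses simplified stand-ins: fixed-size windows for timeframe partitioning, connected components in place of a community discovery algorithm such as CPM, and Jaccard node overlap in place of a tracker such as GED. It is a minimal sketch, not the paper's actual pipeline.

```python
# Toy sketch of steps 1-3 of the framework, with simplified stand-ins.
import networkx as nx

def partition_timeframes(snapshots, window=2):
    # Step 1: merge `window` consecutive timestamp graphs into one timeframe graph g_{T_i}.
    return [nx.compose_all(snapshots[i:i + window])
            for i in range(0, len(snapshots), window)]

def detect_communities(g):
    # Step 2: stand-in community discovery (the paper uses CPM).
    return [set(c) for c in nx.connected_components(g)]

def track(prev, curr, theta=0.3):
    # Step 3: stand-in tracking (the paper uses GED); link communities
    # whose Jaccard node overlap exceeds theta.
    return [(a, b) for a in prev for b in curr
            if len(a & b) / len(a | b) > theta]

snapshots = [nx.gnp_random_graph(30, 0.1, seed=s) for s in range(4)]  # G_{t_1}..G_{t_4}
frames = partition_timeframes(snapshots)
comms = [detect_communities(g) for g in frames]
links = [track(comms[i], comms[i + 1]) for i in range(len(comms) - 1)]
```

Step 4, predicting the next event from the tracked sequences, is the subject of the rest of this paper.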

Methods

Datasets

Our method was applied to five real datasets of different categories and sizes: the communication network Autonomous System (AS) dataset44, the AS-Caida dataset44, the co-authorship network DBLP dataset45, the post network Facebook dataset46, and the question-and-answer network Sx-askubuntu-c2q dataset47. The basic parameters of the networks are shown in Table 1. The fifth through eighth columns give the time span, the number of timestamps, the average node count, and the average edge count, respectively.

Table 1 Network Statistics.

The proposed prediction algorithm based on feature change patterns

As the final step of the evolutionary prediction problem, the approach is outlined in Fig. 3b. After screening and normalizing the community features, an appropriate machine learning method is selected for classification prediction. However, general methods focus only on the influence of past community state information on events, as shown in Fig. 3c. The state features represent the static features of a community at a specific moment, while evolution events represent the process of change in the community from one moment to the next. Associating state features with evolution events clearly overlooks the dynamic nature of these events. Using differential features solves this problem.

We propose a novel prediction algorithm based on feature change patterns, outlined in four steps as depicted in Fig. 3d: First, calculate community features in differential form. Second, design a regression model and training strategies to predict future feature changes. Third, use a linear layer that extracts high-order features to describe feature change patterns. Fourth, utilize a parallel LSTM model to learn these patterns and predict future events. These patterns encompass both long-term trends and short-term variations, and an LSTM model trained with community differential features can capture both simultaneously: the differential features at the current moment represent short-term variations, while historical differential features imply long-term trends. This enables the LSTM model to effectively learn the feature change patterns from the differential features. In the following sections, we detail the prediction algorithm.

To facilitate subsequent discussions, we succinctly represent an evolutionary sequence of length \(L\) as \(\bigcup _{i=1}^{L} C_{T_i}\). The representation of historical critical event sequence is \(\bigcup _{i=1}^{L-1} event_{T_i}\). The goal is to predict the critical event \(event_{T_L}\) when \(C_{T_L}\) evolves into \(C_{T_{L+1}}\).

Community feature extraction

To describe the state features (\(SF\)) of each community, we adopt the 48-dimensional features proposed by Ding35, as shown in Table 2. Ding’s research indicates that these features effectively characterize community attributes and contribute to successful evolutionary analysis. The calculation formulas for the 48-dimensional features are detailed in Appendix A.

Table 2 Community state features.
$$\begin{aligned} SF = [f^{1}, f^{2}, \ldots , f^{48}] \end{aligned}$$
(2)

Fig. 3c shows that traditional prediction strategies concatenate historical features as input and use \(event_{T_L}\) as the label, without adequately addressing the influence of changes in community features on evolutionary events. We examine the changing patterns of these features and establish their relationship with events. To achieve this, we apply a differencing operation to the community’s 48-dimensional feature vectors across adjacent time frames, as detailed in Eq. (3). This process results in a sequence of changing features (\(CF\)), denoted as \(\bigcup _{i=1}^{L-1} CF_{T_i}\).

$$\begin{aligned} CF_{T_{i-1}} = SF_{T_{i}} - SF_{T_{i-1}}, 1 < i \le L \end{aligned}$$
(3)
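As a minimal sketch of Eq. (3), the following snippet differences a hypothetical \((L, 48)\) matrix of state features along the time axis to obtain the \(L-1\) changing-feature vectors.

```python
# Differencing adjacent timeframes (Eq. (3)); `sf` stands in for SF_{T_1}..SF_{T_L}.
import numpy as np

L = 5
sf = np.random.rand(L, 48)      # synthetic 48-dimensional state features
cf = np.diff(sf, axis=0)        # CF_{T_{i-1}} = SF_{T_i} - SF_{T_{i-1}}
assert cf.shape == (L - 1, 48)  # one changing-feature vector per transition
```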

LSTM regression network

The objective is to predict \(event_{T_L}\), but computing the differential feature corresponding to \(event_{T_L}\) is infeasible because the features of \(C_{T_{L+1}}\) are unknown. Therefore, we devise an LSTM regression network to forecast the changing features \(CF_{T_L}\) for each evolutionary sequence. The LSTM regression network is chosen as the learning model for its capability to predict all 48 feature dimensions simultaneously and its efficient memory management for sequential tasks.

The LSTM regression model consists of an LSTM layer with a hidden dimension of 64 and a linear layer with an output dimension of 48. The training strategy of the model is as follows.

Step 1: The sequence of changing features is divided into an input set and a label set for the regression model. The differential feature of the current timeframe is used as the label for the preceding timeframe (Eq. (4)). Then, the regression model is trained. The mean square error (MSE) is used to calculate the loss during model training (Eq. (5)). In this equation, \(CF_{T_i}\) represents the actual value, \(\hat{CF_{T_i}}\) is the predicted value, and \(n\) denotes the number of evolutionary sequences. To minimize the loss function value, the gradient descent method is employed.

$$\begin{aligned} & input = \bigcup _{i=1}^{L-2} CF_{T_i},\; label = \bigcup _{i=2}^{L-1} CF_{T_i} \end{aligned}$$
(4)
$$\begin{aligned} & \textrm{MSE} = \frac{1}{n} \sum _{1}^{n} \frac{\sum _{i=2}^{L-1}(CF_{T_i} - \hat{CF_{T_i}})^{2}}{L-2} \end{aligned}$$
(5)

Step 2: The trained model is used to input all changing feature sequences and predict future feature values (Eq. (6)).

$$\begin{aligned} output(C_{T_L} \rightarrow C_{T_{L+1}}) = CF_{T_L} \end{aligned}$$
(6)

Step 3: Combine historical differential feature sequences to obtain the complete sequence of feature changes, denoted as \(\bigcup _{i=1}^{L} CF_{T_i}\).
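The following PyTorch sketch illustrates the regression network and the three steps above under the stated dimensions (hidden size 64, 48 output features). The synthetic batch, optimizer choice, and epoch count are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of the LSTM regression network and its training strategy (Steps 1-3).
import torch
import torch.nn as nn

class FeatureRegressor(nn.Module):
    def __init__(self, n_feat=48, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_feat, hidden, batch_first=True)  # hidden dim 64
        self.head = nn.Linear(hidden, n_feat)                  # back to 48 features

    def forward(self, x):                  # x: (batch, seq_len, 48)
        out, _ = self.lstm(x)
        return self.head(out)              # predicted CF at every step

model = FeatureRegressor()
optim = torch.optim.Adam(model.parameters())
loss_fn = nn.MSELoss()

cf_seqs = torch.randn(32, 4, 48)           # n=32 sequences of CF_{T_1}..CF_{T_{L-1}}, L=5
inputs, labels = cf_seqs[:, :-1], cf_seqs[:, 1:]   # shift-by-one split, Eq. (4)
for _ in range(10):                        # Step 1: train with MSE loss, Eq. (5)
    optim.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optim.step()

cf_TL = model(cf_seqs)[:, -1]              # Step 2: predicted CF_{T_L}, Eq. (6)
full_seq = torch.cat([cf_seqs, cf_TL.unsqueeze(1)], dim=1)  # Step 3: CF_{T_1}..CF_{T_L}
```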

Extracting high-order features

The 48-dimensional features cover micro, meso, and macro aspects of the network, exhibiting high dimensionality and diverse types. Each community’s differential features evolving from \(T_{i-1}\) to \(T_{i}\) correspond to a specific event. Capturing higher-order features (\(HF\)) from the differentials aids the model in learning these associations in subsequent steps. High-order feature sequences describe more intricate patterns of feature changes, offering greater accuracy than using differential features directly. Extracting high-order features via a linear layer can also reduce their dimensionality.

A linear layer performs a linear transformation on the input (Eq. (7)).

$$\begin{aligned} & HF_{n*o} = CF_{n*s}*W_{s*o} + b \end{aligned}$$
(7)
$$\begin{aligned} & o = \frac{1}{2} * s \end{aligned}$$
(8)

where \(n\) is the number of samples, \(s\) is the number of input neurons (the number of sample features), \(o\) is the number of output neurons (the number of high-order features of the sample), \(W\) is the parameter matrix to be learned by the module, and \(b\) is an \(o\)-dimensional bias vector. The output is a linear combination of the input features. In this experiment, the linear layer halves the feature dimension (Eq. (8)). After processing through this module, the high-order feature sequence \(\bigcup _{i=1}^{L} HF_{T_i}\) is obtained, which describes the feature change patterns during community evolution.
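A minimal PyTorch sketch of Eqs. (7) and (8), assuming a synthetic batch of 48-dimensional differential features:

```python
# Linear layer extracting high-order features and halving the dimension.
import torch
import torch.nn as nn

s = 48                        # input neurons (differential-feature dimension)
o = s // 2                    # output neurons, Eq. (8)
high_order = nn.Linear(s, o)  # learns W (s x o) and bias b (o-dim), Eq. (7)

cf = torch.randn(100, s)      # n=100 differential-feature samples
hf = high_order(cf)           # HF: (100, 24) high-order features
```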

Parallel LSTM prediction model

We utilize the proposed parallel LSTM prediction model to learn the patterns of feature changes. The reasons for choosing this model are as follows: Firstly, leveraging an RNN to learn the correspondence between two sets of sequences aligns with the sequence-to-sequence pattern of RNNs. To mitigate the problem of gradient vanishing in RNNs, the prediction model is designed based on the LSTM model. Secondly, as the scale of social networks continues to expand, processing large-scale data faces the challenge of high computational complexity. Parallel learning models can reduce time consumption, but independent training will increase the number of model parameters. Since different communities are influenced by the same overall network environment during the evolution process, we propose a parallel LSTM prediction model based on parameter sharing.

Algorithm 1 Parallel LSTM Prediction Model.

All feature change sequences \(\bigcup _{i=1}^{L} HF_{T_i}\) constitute the feature set (Feature_Seqs), while all critical event sequences \(\bigcup _{i=1}^{L} event_{T_i}\) constitute the event set (Event_Seqs). Algorithm 1 outlines the flow of the parallel LSTM model to predict critical events.

Each LSTM model consists of an LSTM layer with a hidden dimension of 64 and a linear layer, which transforms the output vectors of the LSTM layer into label vectors (line 5). The learning process of a neural network is essentially the process of adjusting the weight matrix. After each LSTM model is trained, a set of weight parameters, labeled \(W_{all}\), is obtained.

The prediction model adopts two mechanisms, weight sharing and multi-process concurrency, to reduce training time. The training set is evenly divided into five parts. One part is used to train an LSTM model, and the learned parameters \(W_{\mathrm{all}\_1}\) are shared with the remaining four LSTM models. These four models are then trained concurrently via multi-processing on the remaining four parts, starting from the shared parameters. For the final result on the test set, a voting method is employed: each sample is predicted by all five LSTM models, and the result with the highest number of votes is selected as the final result.

When the number of samples is less than 1000, using the parallel mechanism results in extremely imbalanced categories within each small sample set, which is not conducive to model learning. In this case, we introduce a supplementary learning mechanism using a single LSTM model, as indicated by the purple arrow in Fig. 4.

Fig. 4 Parallel LSTM model processing the dataset.
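The condensed PyTorch sketch below illustrates the weight-sharing and voting mechanisms. Multi-process concurrency and the single-model fallback are noted only in comments, and the feature dimension and event-class count are illustrative assumptions.

```python
# Sketch of the parallel prediction step: share weights, then vote.
import copy
import torch
import torch.nn as nn

class EventClassifier(nn.Module):
    def __init__(self, n_feat=24, hidden=64, n_events=6):  # illustrative dims
        super().__init__()
        self.lstm = nn.LSTM(n_feat, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_events)  # LSTM output -> label vector

    def forward(self, x):                  # x: (batch, seq_len, n_feat)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])       # predict the event at T_L

base = EventClassifier()
# ... train `base` on the first of the five training parts ...
# Weight sharing: the four remaining models start from the learned parameters.
models = [base] + [copy.deepcopy(base) for _ in range(4)]
# ... in the paper, models[1:] continue training concurrently via
# multi-processing on the other four parts; with fewer than 1000 samples,
# a single LSTM is used instead (supplementary mechanism) ...

x_test = torch.randn(8, 5, 24)             # 8 test sequences of HF features
votes = torch.stack([m(x_test).argmax(dim=1) for m in models])
prediction = votes.mode(dim=0).values      # majority vote over five models
```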

Setting parameters

The parameter dimensions are shown in Table 3. \(W_{ih}\) and \(b_{ih}\) are the weight matrix and bias between the input layer and the hidden layer, respectively. \(W_{hh}\) and \(b_{hh}\) are the weight matrix and bias within the hidden layer, respectively. \(W_{o}\) and \(b_{o}\) are the weight matrix and bias between the hidden layer and the output layer, respectively. In the subsequent experiments, the number of hidden layers is set to 1 and the number of hidden units is set to 64.

The reasons for setting parameters in this way and more detailed parameter settings can be found in Appendix B.

Table 3 Parameter dimensions of the model.

Experimental results and analysis

This paper proposes a prediction method to characterize the pattern of community feature changes, learn their impact on evolutionary events, and forecast critical events in the next timeframe. In this section, we will analyze the effectiveness of the prediction method from two aspects: the accuracy of the prediction algorithm and the time complexity of the model.

Baselines

To verify the effectiveness of our proposed prediction method, we compared it with seven other recent mainstream community evolution prediction algorithms. Bródka’s algorithm, Dakiche’s algorithm, Tajeuna’s algorithm, TNSEP, MF-PSF, and SATPM use machine-learning-based event prediction methods such as random forests, logistic regression, and probability-based methods. The GNAN algorithm uses a GNN-based prediction method. In addition, we also validated our proposed prediction algorithm on a newer community tracking algorithm (ECDR).

  • Bródka’s algorithm30: This algorithm is based on the results of the GED community tracking algorithm. It uses a classifier to learn the historical state features.

  • Dakiche’s algorithm32: Features are proposed from two aspects: community structure and influential member. The rate of change in these features is used to predict events.

  • Tajeuna’s algorithm33: For this algorithm, eleven-dimensional state features are proposed to describe the community state, and the Cox model is used to predict critical events.

  • Dakiche’s algorithm (TNSEP)22: A new framework is used to identify the timeframe partitioning. The community status is described using ten-dimensional features.

  • Revelle’s algorithm (GNAN)37: This algorithm builds a GNN model, which learns the characteristics of nodes and groups.

  • Li’s algorithm (ECDR)28: In this algorithm, the resistance distance model is used to track the community structure and discover events.

  • Chen’s algorithm (MF-PSF)38: The algorithm combines multivariate feature sets and potential structural features to describe communities.

  • Ding’s algorithm (SATPM)35: The algorithm adaptively divides the timeframe size and proposes 48-dimensional features to describe the community.

Evaluation metrics

The \(\mathrm {F_{measure}}\) is used for each category as a metric, combining precision and recall in a harmonic mean form, where an \(\mathrm {F_{measure}}\) of 1 indicates optimal performance and 0 indicates poor performance.

$$\begin{aligned} F_{measure} = 2 \cdot \frac{{precision * recall}}{{precision + recall}} \end{aligned}$$
(9)

The next three metrics work together to allow a more comprehensive evaluation of the overall performance of a method. Accuracy is the ratio of the number of samples correctly classified by the classifier to the total number of samples. The macro average (Macro avg) is the arithmetic mean of the \(\mathrm {F_{measure}}\) values of all categories, ensuring equal consideration of every category during evaluation. The weighted average (Weighted avg) weights the \(\mathrm {F_{measure}}\) values by the sample size of each category, giving greater influence to categories with more samples. The specific calculation formulas are as follows:

$$\begin{aligned} & \begin{aligned} \mathrm {Accuracy(Acc)} = \frac{\text {Number of correct predictions}}{\text {Total number of forecasts}} \end{aligned} \end{aligned}$$
(10)
$$\begin{aligned} & \textrm{Macro}\, \textrm{avg} = \frac{1}{NT} \sum _{i=1}^{NT} F_{measure}(i) \end{aligned}$$
(11)
$$\begin{aligned} & \textrm{Weighted}\, \textrm{avg} = \frac{1}{N} \sum _{i=1}^{NT} N(i) * F_{measure}(i) \end{aligned}$$
(12)

The number of event types is denoted as \(NT\). \(N\) represents the total number of events, and \(N(i)\) represents the number of events in category \(i\).

Cross-validation is primarily used to evaluate the performance of the model. We employed the five-fold cross-validation method to calculate the average accuracy (\(\mathrm {\overline{Acc}}\)). If the average performance of these five models is good, it indicates that the model has a certain level of generalization ability. The variance reflects the stability of the five outcomes in five-fold cross-validation. \(Acc_i\) represents each individual result. The formula for variance (\(\sigma ^2\)) can be expressed as:

$$\begin{aligned} \sigma ^2 = \frac{1}{5-1} \sum _{i=1}^{5} (Acc_i - \overline{Acc})^2 \end{aligned}$$
(13)
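A short sketch of these metrics and of Eq. (13) using scikit-learn and NumPy; the label arrays and fold accuracies are illustrative.

```python
# Evaluation metrics (Eqs. (9)-(13)) on synthetic labels.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 1, 2, 1, 1, 0, 2]

acc = accuracy_score(y_true, y_pred)                     # Accuracy, Eq. (10)
macro = f1_score(y_true, y_pred, average="macro")        # Macro avg, Eq. (11)
weighted = f1_score(y_true, y_pred, average="weighted")  # Weighted avg, Eq. (12)

fold_accs = np.array([0.81, 0.79, 0.83, 0.80, 0.82])     # five-fold results
mean_acc = fold_accs.mean()                              # average accuracy
variance = fold_accs.var(ddof=1)                         # Eq. (13), 1/(5-1) factor
```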

To demonstrate the improvements of our proposed method clearly and intuitively, we use the magnitude of improvement as the standard for presentation. Performance Improvement (PI) quantifies the degree of enhancement in predictive performance of our proposed algorithm (Prop) over the baseline algorithm (Base). We use the standard deviation (\(\sigma\)) to reflect the volatility of the results predicted by the proposed method. The degree of improvement is quantified by the following formulas:

$$\begin{aligned} & \textrm{PI}\, \textrm{Accuracy} = (\mathrm {\overline{Acc}(Prop)} - \mathrm {\overline{Acc}(Base)})*100 \% \pm \sigma \end{aligned}$$
(14)
$$\begin{aligned} & \textrm{PI}\,\textrm{Macro}\,\textrm{avg} = (\textrm{Macro}\,\mathrm {avg(Prop)} - \textrm{Macro}\,\mathrm {avg(Base)})*100\% \end{aligned}$$
(15)
$$\begin{aligned} & \textrm{PI}\,\textrm{Weighted}\,\textrm{avg} = (\textrm{Weighted}\,\mathrm {avg(Prop)} -\textrm{Weighted}\,\mathrm {avg(Base}))*100\% \end{aligned}$$
(16)

We validated the time efficiency advantage of the model by comparing the actual time spent in the prediction process. Our experiment measured the training time, testing time, and total time for one fold of five-fold cross-validation in seconds.

Community evolution prediction results and analysis

To evaluate the effect of our proposed feature change pattern-based prediction algorithm on the results, we conducted two sets of experiments for validation: First, we evaluated the effectiveness of the characterized feature change patterns. This was verified by comparing the conventional community state feature-based prediction algorithm against the one relying on feature change patterns for event predictions. Second, the performance of the proposed prediction method was determined by contrasting the feature change pattern-based algorithm against other prediction algorithms. Additionally, the efficacy of the parallel mechanism within the parallel LSTM prediction model was also verified.

As we proposed a new method in the fourth step of the social network evolution predictive framework, the algorithm used in the first three steps remained consistent with the comparison algorithms, with changes made only in the fourth step.

Validity analysis of characterized patterns

In the fourth step, the traditional prediction method learned the relationship between historical state features and critical events (SF_E), with inputs and labels shown in Eq. (17). We improved on this by learning the relationship between the change patterns of community features and critical events (CF_E), with inputs and labels shown in Eq. (18).

$$\begin{aligned} & input = \bigcup _{i=1}^{L} SF_{T_i},\; label = \bigcup _{i=1}^{L} event_{T_i} \end{aligned}$$
(17)
$$\begin{aligned} & input = \bigcup _{i=1}^{L-1} CF_{T_i},\; label = \bigcup _{i=1}^{L} event_{T_i} \end{aligned}$$
(18)
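The following snippet contrasts the two setups on a synthetic sequence: SF_E pairs state features with events (Eq. (17)), while CF_E pairs differential features, which have one fewer step, with the same event sequence (Eq. (18)).

```python
# SF_E vs. CF_E training pairs for one evolutionary sequence (synthetic data).
import numpy as np

L = 5
sf = np.random.rand(L, 48)             # SF_{T_1}..SF_{T_L}
events = np.random.randint(0, 6, L)    # event_{T_1}..event_{T_L}

sf_e = (sf, events)                    # Eq. (17): state features -> events
cf_e = (np.diff(sf, axis=0), events)   # Eq. (18): change features -> events
```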

The Pearson correlation coefficient is first used to verify the effectiveness of the change features compared to the state features. The Pearson correlation coefficient is an index for measuring the linear correlation between two variables.

For two vectors \(X=[x_1,...,x_n]\), \(Y=[y_1,...,y_n]\), their Pearson correlation coefficient is

$$\begin{aligned} r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2} \sqrt{\sum (y_i - \bar{y})^2}}, \end{aligned}$$
(19)

where \(\bar{x}\) is the mean of X and \(\bar{y}\) is the mean of Y. Among multiple vectors, the average correlation coefficient of each pair of vectors is regarded as the correlation coefficient of these vectors.
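A minimal NumPy sketch of Eq. (19) applied to a group of vectors, taking the mean pairwise coefficient as described above; the input matrix is illustrative.

```python
# Average pairwise Pearson correlation over a set of feature vectors.
import numpy as np

vectors = np.random.rand(6, 48)       # k=6 feature vectors as rows
r = np.corrcoef(vectors)              # k x k pairwise Pearson matrix
iu = np.triu_indices_from(r, k=1)     # each unordered pair counted once
avg_r = r[iu].mean()                  # correlation coefficient of the group
```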

The community disappears after the dissolution event, so the change features cannot be calculated. Therefore, the correlation coefficient of the change features of the dissolution event cannot be obtained.

Table 4 shows the Pearson correlation coefficients of the 48-dimensional change features and state features in the five datasets, both per event and across all events. Within the same event, the Pearson correlation coefficient of the change features is larger than that of the state features in most cases. Within the same dataset, the Pearson correlation coefficient of the change features or state features across all events is smaller than that within a single event. This indicates that both the change features and the state features can characterize events, but the proposed change features do so better than the state features.

Table 4 Pearson Correlation Index.
Fig. 5 Performance improvement of \(\mathrm {CF^{Alg}\_E}\) prediction results compared to \(\mathrm {SF^{Alg}\_E}\).

In the following, the community features of the comparison algorithm and the 48-dimensional community features are respectively marked as \(\mathrm {SF_{T_i}^{Alg}}\) and \(\mathrm {SF_{T_i}^{48}}\). Then, the differential features calculated respectively are expressed as \(\mathrm {CF_{T_i}^{Alg}}\) and \(\mathrm {CF_{T_i}^{48}}\). To further validate the effectiveness of the feature change patterns, we compared the results of the model using \(\mathrm {CF_{T_i}^{Alg}}\) and \(\mathrm {SF_{T_i}^{Alg}}\) respectively, and the results of the model using \(\mathrm {CF_{T_i}^{48}}\) and \(\mathrm {SF_{T_i}^{48}}\) respectively. Detailed parameter settings for the model are provided in the supplementary Appendix B.

In the GNAN algorithm, we averaged the node features of the same group to construct a state feature vector suitable for our method. Since the ECDR method was designed based on resistance distance, its calculation time increases as the size of the social network increases. Therefore, its application was confined to two comparatively smaller datasets: Facebook and Sx-askubuntu-c2q. Because the ECDR algorithm traces events without a prediction step, its community feature extraction method aligns with the SATPM method.

Fig. 6 Performance improvement of \(\mathrm {CF^{48}\_E}\) prediction results compared to \(\mathrm {SF^{48}\_E}\).

The first set of comparisons illustrates the improvement that feature change patterns bring over traditional methods when utilizing each algorithm’s proposed features to describe community state. The comparison results are shown in Fig. 5. As shown in Fig. 5a, in the AS dataset, the prediction results based on feature change patterns are only slightly superior to those relying solely on state features. This can be attributed to the significant imbalance in event categories. Notably, in the remaining four datasets with relatively balanced event categories, predictions relying on feature change patterns typically outperform those based on state features. Because the TNSEP method allows flexible timeframe selection, its variance is calculated as the average of the prediction results from five-fold cross-validation across three different methods, leading to significant fluctuations.

In the second set of comparisons, all compared algorithms utilize 48-dimensional features to describe communities. The improvements of predictions based on feature change patterns over those based on state features are illustrated in Fig. 6. When applying the ECDR algorithm to the Facebook dataset, predictions based on state features perform better. This is attributed to the relatively few critical events tracked by the ECDR algorithm, resulting in shorter evolutionary sequences and inadequately captured community change information, thereby hindering pattern learning. Apart from this case, despite variations in the quality of the evolution sequences tracked by different algorithms, predictions based on feature change patterns generally excel over those based on state features. More details on the events tracked by the algorithms are in supplementary Appendix C.

Fig. 7 Performance improvement of \(\mathrm {CF^{48}\_E}\) prediction results compared to \(\mathrm {CF^{Alg}\_E}\).

We also compared the results of the model using \(\mathrm {CF_{T_i}^{48}}\) and \(\mathrm {CF_{T_i}^{Alg}}\), respectively. The results are shown in Fig. 7. It is evident from the figure that utilizing 48-dimensional features facilitates more effective mining of feature change patterns, resulting in higher prediction accuracy.

In summary, our proposed prediction method, based on feature change patterns, effectively reveals the rules of evolution, playing an active role in predicting future evolutionary events. For detailed results of the three sets of comparative experiments in this section, refer to Appendix D.

The performance of proposed prediction method

Table 5 Prediction results of different comparison algorithms under the proposed prediction method.

The entire community evolution prediction algorithm presented in this paper comprises: using the SATPM method for timeframe division, utilizing the CPM algorithm for community detection, applying the GED algorithm for community tracking, and implementing the prediction method based on feature change patterns. Initially, the effectiveness of the proposed prediction method was validated, followed by the validation of the entire community evolution prediction algorithm’s effectiveness. Additionally, the efficacy of the parallel mechanism was confirmed.

Table 5 displays the predictive results with/without the prediction method based on feature change patterns (P). Fig. 8 demonstrates the improvement in prediction results before and after the comparison. Since the ECDR algorithm lacks a prediction step, the prediction algorithms before and after align with the SATPM method or the methodology proposed in this article.

In the Facebook dataset, the ECDR and MF-PSF methods showed poor predictive results when using our proposed prediction method. This is attributed to the small number of communities discovered by the ECDR method, as well as the MF-PSF method dividing the dynamic network into a small number of time windows, leading to shorter evolutionary sequences with limited change information. Consequently, the change patterns were challenging for the parallel LSTM model to learn. However, as depicted in Fig. 8d, the proposed prediction method effectively improves accuracy for the Bródka and Dakiche algorithms because of their richer evolutionary sequence information. The timeframe division results of the algorithms are in Appendix C.

Fig. 8 Performance improvement of prediction methods based on feature change patterns across five datasets.

In the AS, AS-Caida, DBLP, and Sx-askubuntu-c2q datasets, the parallel LSTM model learned the patterns and improved results on all four metrics compared to the comparison algorithms. As shown in Figs. 8a–c and e, the parallel LSTM model effectively captured the changing patterns and improved the prediction of future evolutionary events.

In conclusion, our prediction method achieved high accuracy in most datasets, but some results showed lower accuracy. This was primarily due to insufficient information describing changes during evolution. Therefore, optimizing community features is essential to precisely predict future evolution. For prediction results of various evolutionary events, refer to Appendix E.

As shown in Table 5, using our proposed method in the final step achieves generally better results compared to using traditional machine learning algorithms such as Random Forest and Logistic Regression, as well as evolutionary event prediction algorithms based on probability or GNN.

Table 6 Comparison of prediction model results with/without parallel mechanism under three timeframe partition methods.

Next, we investigated the effectiveness of the parallel mechanism. Given that the number of samples is influenced by the timeframe division method, we assessed the efficacy of the parallel mechanism under three timeframe partitioning methods: Disjoint, Overlapping, and SATPM.

Table 6 presents the prediction results with and without parallelism. The disjoint timeframe partition method generally yields fewer critical events and evolutionary sequences. Consequently, applying the parallel mechanism reduces predictive performance due to insufficient samples for each LSTM and category imbalance, which limits pattern learning. Conversely, the parallel mechanism enhances performance under the overlapping and SATPM partition methods. Through the parallel mechanism, each model can learn the data pattern more accurately, thereby mitigating the difficulty of fitting an excess of samples. Furthermore, our voting mechanism integrates the results of the five models to yield a final prediction. As such, the parallel LSTM model decides based on the number of samples and temporarily suspends the parallel mechanism when the sample count is fewer than 1000.

Complexity analysis

The number of parameters, also known as the capacity of the model, is a measure of the space complexity of the model. The computational complexity of each parameter is defined as \(O(1)\). During the training process of a single LSTM in the parallel LSTM model, the complexity of each time step can be measured by \(O(\mathrm {N_{LSTM}})\). Among the comparison algorithms, the GNAN method uses a deep learning model, and the symbols used in calculating its model parameters are consistent with those in its paper37. The numbers of weight and bias parameters for the two models can be calculated as in Eqs. (20) and (21):

$$\begin{aligned} & \begin{aligned}&O(\mathrm {N_{LSTM}})=4[(\text {input}\_\text {size}+\text {hidden}\_\text {size})*\text {hidden}\_\text {size}+ 2*\text {hidden}\_\text {size}] \\&+ \text {hidden}\_\text {size}*\text {output}\_\text {size}+\text {output}\_\text {size} = 23430 \end{aligned} \end{aligned}$$
(20)
$$\begin{aligned} & \begin{aligned}&O(\mathrm {N_{GNAN}}) =H*[g_{i}^{t}*D_{m}+D_{m}*D_{m}*3]+[(x_{u}^{t}+1)*D_{m}+D_{m}]+[g_{i}^{t}*D_{m}+D_{m}]\\&+[g_{i}^{t}*g_{i}^{t}+g_{i}^{t}]+[(D_{m}+D_{m})*E+E]=3998 \end{aligned} \end{aligned}$$
(21)

The LSTM model has significantly more parameters than the GNAN model because the three gates introduce numerous weights and biases. This enables the parallel LSTM model to better extract correlations between the two sequences and learn change patterns, positively impacting future event prediction.
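The following snippet reproduces the count in Eq. (20), assuming the prediction LSTM consumes the 24-dimensional high-order features and that the output size is 6, the value implied by the stated total of 23430.

```python
# Parameter count of a single LSTM in the parallel model, per Eq. (20).
i, h, o = 24, 64, 6                 # input, hidden, and (assumed) output sizes
n_lstm = 4 * ((i + h) * h + 2 * h)  # four gates: W_ih, W_hh plus both bias vectors
n_linear = h * o + o                # output-layer weights W_o and bias b_o
print(n_lstm + n_linear)            # 23430
```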

The time complexity of multi-process concurrency cannot be expressed by a formula, so runtime analysis was conducted. First, to validate the effectiveness of the parallel mechanism, we measured the time consumption of the parallel LSTM prediction model with and without parallel measures. Second, to verify the proposed model’s runtime advantage over other deep learning models, we conducted time statistics on prediction algorithms based on deep learning models.

Time complexity analysis of the model under parallel mechanism

Fig. 9 Time consumed by the parallel LSTM model on the five datasets with/without parallel measures.

We modified the final step of each comparison algorithm to the prediction method proposed in this paper and compared the time consumption of the parallel LSTM prediction model with and without the parallel mechanism. The time consumed by the model is proportional to the number of evolutionary sequences traced. As shown in Figs. 9b–d, to make the histograms clearer, the ranges of the ordinates on the two sides of the pink dotted line differ, but both are in seconds.

As shown in Figs. 9a–e, the parallel mechanism reduced the training time. Since the test time was at the level of milliseconds and microseconds, the parallel mechanism also demonstrated an advantage in the total time consumption of the model. From this, we can conclude that the parallel mechanism effectively reduces time consumption, and the parallel LSTM model has clear advantages in time efficiency.

Time complexity analysis of deep learning models

In general, deep learning models consume more time than traditional classifier models because traditional classifiers do not require extensive training. Therefore, we analyze the time consumption of the deep learning models applied to the social network evolution problem. The prediction times of the parallel LSTM model, the comparison algorithm GNAN, and an ordinary LSTM model are compared. The experimental settings are as follows: the first three steps of the prediction framework are consistent with the algorithm used by GNAN, while the fourth step uses each method's respective prediction approach. The ordinary LSTM model learns the relationship between community state characteristics and critical events; its structure and parameters are consistent with a single LSTM model in the parallel LSTM model.

Fig. 10 Time consumed by different deep learning models.

As seen in Fig. 10a, the parallel LSTM model achieved the shortest training time, indicating that the parallel mechanism played a role in reducing time consumption. However, as shown in Fig. 10b, the parallel LSTM model did not have the shortest time in the test phase, because the adopted voting mechanism slightly increased the testing time. Nevertheless, the time consumption in the test phase was in milliseconds. Fig. 10c shows the total prediction time, where the parallel LSTM model had the lowest total time consumption across the five datasets. This demonstrates that the parallel LSTM model has a great advantage in time consumption compared with other deep learning models.

Conclusion and future work

In this study, we proposed a predictive method that focuses on the changing patterns of features in evolutionary sequences. Experiments showed that the prediction algorithm based on feature change patterns effectively characterized the evolution laws, reducing misjudgment when predicting future critical events. Compared with other deep models, the parallel measures not only improved prediction performance but also effectively reduced time consumption.

We described the changing characteristics of the evolutionary sequence based on the 48-dimensional features. Other information about the differences between communities in adjacent timeframes was not fully mined. In the future, we hope to develop a new perspective on the evolutionary sequence to extract change information, helping predict the direction of evolution.