Figure 4 | Scientific Reports

Figure 4

From: Cross-modal semantic autoencoder with embedding consensus

Figure 4

The process of CSAEC. We map the datasets to an embedding space, learn projections by multi-modal semantic autoencoder and reconstruct original features. \((V\;T)\) is the original data matrix, \({U_i}\) is a low-dimensional consensus vector of embedding consensus \({\varphi ^d}\), W is a low-dimensional embedding matrix, C is the corresponding semantic code. Two encoders \({P_v},{P_t}\) project image and text data into low-dimensional space A, and two decoders reproject A back to high-dimensional data.

Back to article page