Upstreamness and downstreamness in input–output analysis from local and aggregate information

Bartolucci, Silvia; Caccioli, Fabio; Caravelli, Francesco; Vivo, Pierpaolo

doi:10.1038/s41598-025-86380-6

Article
Open access
Published: 21 January 2025

Silvia Bartolucci¹,
Fabio Caccioli^1,2,3,
Francesco Caravelli⁴ &
…
Pierpaolo Vivo⁵

Scientific Reports volume 15, Article number: 2727 (2025)

Abstract

Ranking sectors and countries within global value chains is of paramount importance to estimate risks and forecast growth in large economies. However, this task is often non-trivial due to the lack of complete and accurate information on the flows of money and goods between sectors and countries, which are encoded in input–output (I–O) tables. In this work, we show that an accurate estimation of the role played by sectors and countries in supply chain networks can be achieved without full knowledge of the I–O tables, but only relying on local and aggregate information, e.g., the total intermediate demand per sector. Our method, based on a rank-1 approximation to the I–O table, shows consistently good performance in reconstructing rankings (i.e., upstreamness and downstreamness measures for countries and sectors) when tested on empirical data from the world input–output database. Moreover, we connect the accuracy of our approximate framework with the spectral properties of the I–O tables, which ordinarily exhibit relatively large spectral gaps. Our approach provides a fast and analytically tractable framework to rank constituents of a complex economy without the need of matrix inversions and the knowledge of finer intersectorial details.

Introduction

The introduction of input–output (I–O) analysis as a fundamental tool to analyze the inter-relationship between economic sectors of a country was pioneered by W. Leontief, who proposed the construction of the first I–O tables for the United States for the years 1919 and 1929^1,2. An I–O table summarizes how the products (outputs) of a given industry or economic sector are used as input to other industries or sectors within the same, or different, economies (for instance, in the case of import/export exchanges with other countries)³. Understanding the structure and relevance of industrial sectors and countries within the so-called global value chains (GVCs), encompassing the different stages of the production process across different countries, is of central importance⁴. To achieve this, a number of indicators and measures have been devised that characterize the relative positioning of industries and economic sectors in the economy. These rely on the calculation of the following technical object,

$$\begin{aligned} G(A)=(\mathbbm {1}_N-A)^{-1}\ , \end{aligned}$$

(1)

the so-called Leontief inverse (or resolvent) matrix. Here, $\mathbbm {1}_N$ is the $N\times N$ identity matrix (where N is the number of industrial sectors) and A is a (row) sub-stochastic matrix, which is related in a simple fashion to the original I–O table. A sub-stochastic matrix A is such that its entries are non-negative and $\sum _{j}A_{ij}\le 1$ for each row i. Notably, the upstreamness and downstreamness metrics proposed by Antràs, Chor and collaborators (see Sect. 2 for mathematical definitions) have become widely used and mainstream in recent years^5,6,7. They are meant to represent the average distance of a sector from final demand, and from primary factors of production, respectively. One of the main practical challenges of the I–O analysis lies in the accurate and reliable compilation of inter-sectorial I–O tables from which the matrix A in formula (1) is derived. This issue is particularly felt at firm-level, where often only aggregate information is available⁸.
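As a minimal sketch of Eq. (1), the snippet below checks row sub-stochasticity and computes the Leontief inverse with NumPy. The matrix `A_toy` and the helper names are hypothetical and chosen only for illustration; they are not taken from the paper or from any dataset.

```python
import numpy as np

def is_row_substochastic(A, tol=1e-12):
    """True if A is non-negative and every row sum is at most 1."""
    A = np.asarray(A, dtype=float)
    return bool(np.all(A >= 0) and np.all(A.sum(axis=1) <= 1 + tol))

def leontief_inverse(A):
    """Return G(A) = (I - A)^{-1}, as in Eq. (1)."""
    A = np.asarray(A, dtype=float)
    return np.linalg.inv(np.eye(A.shape[0]) - A)

# Hypothetical 3-sector matrix (illustrative values only).
A_toy = np.array([[0.10, 0.20, 0.05],
                  [0.30, 0.05, 0.10],
                  [0.00, 0.25, 0.15]])
G_toy = leontief_inverse(A_toy)
```

When only the product of the resolvent with a vector is needed, solving the linear system (for instance with `np.linalg.solve`) is usually preferable to forming the inverse explicitly.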

The main contribution of our paper is to show that up-/downstreamness measures and similar resolvent-like metrics can be approximated with high accuracy even when possessing only aggregate and local information about the inter-sectorial dependencies encoded within the I–O table. In this case, the required information only amounts to the row (or column) sums of the matrix A, representing the total intermediate demand per industry (or the total value of all inputs required by each industry).

More specifically, we propose an approach rooted in complexity science that reconstructs the most likely matrix A derived from I–O tables on the basis of limited/aggregated information and uses this surrogate information to compute the Leontief inverse and related indicators (e.g., upstreamness and downstreamness). These indicators can be derived from the available aggregate information quickly, since the procedure does not require a full matrix inversion, and accurately. Moreover, in this work we connect the accuracy of our approximate framework with the spectral properties of the I–O tables.

Related literature

There is a vast literature concerning I–O models and how inaccuracies and noise in I–O tables may affect the determination of the relative ranking of industrial sectors and countries within the economy. One strand focuses on the accuracy of the empirical I–O matrix denoted by $A_{emp}$ with respect to the true matrix $A_{true}$. The main question is about how errors occurring in the compilation of the I–O tables propagate and affect measurements and predictions based on nonlinear functions of $A_{emp}=A_{true}+H$ (for instance, the Leontief Inverse $(\mathbbm {1}_N-A_{emp})^{-1}$), where H encodes the stochastic sources of error. Compiling the entries of the matrix $A_{emp}$ is subject to many issues, for instance the difficulty in sampling and surveying firms and flows of goods with great accuracy^9,10. This has provided the motivation to study stochastic models for the I–O analysis.

Evans¹¹ and Quandt¹² are among the first to look at this problem by constructing random models. Evans¹¹ assumed that the error matrix H had only one non-zero row and that the errors could be propagated on a row-by-row basis. Quandt¹² assumed that the errors $H_{ij}$ on the matrix elements are independent and normally distributed with mean zero, solved the error propagation problem for a small-size system (e.g. $2\times 2$), and determined the confidence intervals on the expected Leontief Inverse. Later, Simonovits¹³ deduced the fundamental inequality $\langle (\mathbbm {1}_N-A_{emp})^{-1}\rangle _H\ge (\mathbbm {1}_N-\langle A_{emp}\rangle _H)^{-1}$, where the average is taken with respect to independent matrix elements of H. This inequality circumvents the problem of inverting the matrix $\mathbbm {1}_N-A_{emp}$, where the non-linearity involved in the Leontief matrix inversion makes it challenging to study how modifications (or inaccurate determinations) of the entries of the matrix $A_{emp}$ would propagate.
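To see why an inequality of this type is plausible, consider as an illustration (not taken from the cited works) the scalar case $N=1$, with $a_{emp}=a_{true}+h$, $\langle h\rangle =0$ and $a_{emp}<1$. Since $x\mapsto (1-x)^{-1}$ is convex for $x<1$, Jensen's inequality gives the first line below, and a second-order expansion in h shows that the excess is controlled by the variance of the error:

$$\begin{aligned} \left\langle \frac{1}{1-a_{emp}}\right\rangle _H&\ge \frac{1}{1-\langle a_{emp}\rangle _H}=\frac{1}{1-a_{true}}\ , \\ \left\langle \frac{1}{1-a_{true}-h}\right\rangle _H&\approx \frac{1}{1-a_{true}}+\frac{\langle h^2\rangle _H}{(1-a_{true})^{3}}\ . \end{aligned}$$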

One of the first comprehensive theoretical studies of stochastic I–O models is due to West¹⁴. His starting point is a random matrix H, of which the expected value and the standard error of all the elements are known, with the aim to provide approximating formulas for the expected value and the standard errors of the Leontief Inverse in terms of these known quantities. Some of the assumptions (for instance, that the errors $H_{ij}$ be independent and normally distributed) are however not realistic or plainly incompatible with the sub-stochasticity constraint, and only lead to a closed-form solution for the mean and variances of the deviations from the “true” matrix under very restrictive choices for the variances of the errors in H.

More recently, this approach has been re-evaluated by Kogelschatz¹⁵, who assumed that the $a_{ij}$ are Beta-distributed and derived estimates for the elements of the Leontief Inverse, and by Kozicka¹⁶, who postulated more realistic distributions for the matrix entries but provided explicit formulae only for small-size systems.

Within the empirical literature, a number of studies have been also undertaken to characterize the regional inter-sectorial dependence of industries and to discuss the challenges of reconstructing regional data from national accounts and surveys¹⁷.

Given the practical difficulties associated with compiling I–O tables, especially at the regional level, earlier scholars devised “shortcut” methods to estimate the Leontief inverse from incomplete or unreliable information, or even foregoing I–O tables altogether. Katz and Burford^18,19 derived a formula under the assumption that the matrix A is uniformly drawn from the set of sub-stochastic matrices, and under the rather questionable technical condition that the covariance between the entries of the matrix and the output multipliers be null. Their work hinges on an earlier formula empirically derived by Drake²⁰. The general approach based on finding “shortcuts” and foregoing a painstaking compilation of I–O tables was criticized on both technical and conceptual grounds^21,22,23,24 before this line of investigation was dropped and even ignored altogether in the subsequent related literature.

The Leontief inverse and the associated indicators have also been looked at through the prism of complexity and network science. Cerina et al.²⁵ analyzed the properties of the (global and regional) network of industries in different economies reconstructing the monetary goods flows (edges) using the I–O matrix. McNerney et al.²⁶ used average national output multipliers to predict future economic growth and price changes. In²⁷, a model for the propagation and amplification of idiosyncratic shocks along the I–O network is provided. In²⁸, a network analysis of the World I–O Data set is undertaken to analyze the temporal interdependence between countries and industrial sectors.

In recent years the interest in I–O models has grown steadily²⁹, also in view of a rather compelling connection to models of complexity and networks^28,30. Moreover, many of these ideas can in principle be extended to more general sector-product spaces, which saw many uses for the study of the connection between complexity measures, productivity and economic growth^31,32,33,34 (see however^35,36 for mathematical issues surrounding the Economic Complexity Index and resolutions thereof).

Another strand of the literature looks at entropic measures of inter-sectorial complexity. Jacquemin and Berry³⁷ introduce an entropy-based measure of corporate diversification, highlighting its additivity across different levels of product or industry aggregation. This metric is shown to better capture nuanced diversification patterns compared to alternatives like the Herfindahl index, particularly when assessing contributions of diversification within and across industry sectors. Their empirical analysis of 460 large U.S. manufacturing corporations demonstrates that diversification into closely related industries, as well as more distant sectors, correlates positively with corporate growth, emphasizing the utility of entropy measures for understanding diversification’s role in economic dynamics. The study³⁸ explores the dynamics of economic growth through a model of export evolution derived from global trade network data. It links economic complexity to the diversity and specialization of national export baskets by employing stochastic differential equations to simulate resource transfer between exports. The authors introduce a novel complexity measure based on Shannon entropy, integrated with specialization metrics, and demonstrate its alignment with GDP per capita and growth trajectories across 223 countries over 21 years. This framework unveils the interplay of cooperative and competitive forces in trade, offering insights into growth potentials via counterfactual analyses. The subsequent work³⁹ expands upon this by refining economic complexity measures using an iterative, entropy-based methodology. Their approach captures the diversity and ubiquity of exports within a bipartite network of countries and products, employing Shannon entropy to estimate the bare diversity of products and sectors. The study introduces intra- and inter-sectorial decomposition, providing nuanced assessments of economic efficiency and specialization. 
The results highlight the advantages of retaining full trade data granularity and demonstrate the utility of these measures in distinguishing national economic structures and developmental pathways.

In the following section, we will focus on the works by Antràs and Chor⁴, Fally et al.⁶ and Miller et al.⁷, where different incarnations of the so-called upstreamness and downstreamness measures were first proposed. An early example of a direct application of those measures to the analysis of empirical data on global value chains can be found in⁴⁰; the measures are now used in multiple contexts^41,42.

Definition of upstreamness and downstreamness

Antràs et al.⁴ considered a closed economy of N industries. For each industrial sector $i= 1, \dots , N$ we indicate the value of gross output with $Y_i$ and the final demand (i.e., the use of the output of an industry as a final good) with $F_i$. Then the following equality holds in I–O tables:

$$\begin{aligned} Y_i= & F_i +Z_i = F_i+\sum _{j=1}^N a_{ij}= \end{aligned}$$

(2)

$$\begin{aligned}= & F_i +\sum _{j=1}^N d_{ij}Y_j \, \end{aligned}$$

(3)

with $Z_i = \sum _{j=1}^N d_{ij}Y_j$ corresponding to the output of industry i used as intermediate input to other industries (intermediate demand) as shown in the scheme in Fig. 1. In Eq. (2), $a_{ij}$ is the total value in monetary units (e.g. US dollars) of i’s output used to produce j’s output, while $\{d_{ij}\}$ in Eq. (3) corresponds to the monetary amount of sector i’s output used to produce one monetary unit’s worth of sector j’s output, and it is related to the matrix A via the relationship $d_{ij}Y_j = a_{ij}$. The final demand, as detailed in Sect. 4, comprises contributions from different factors including, among others, the final consumption expenditure by households and government, and exports.

Iterating the identity Eq. (2) within Eq. (3), one obtains an infinite sequence of contributions, each representing the use of sector i’s output at different levels within the value chain³

$$\begin{aligned} Y_i = F_i + \sum _{j=1}^N d_{ij}F_j + \sum _{j=1}^N \sum _{k=1}^N d_{ik}d_{kj}F_j +\ldots \ . \end{aligned}$$

(4)

We can finally rewrite Eq. (4) as follows

$$\begin{aligned} {\varvec{Y}} = [\mathbbm {1}_N-D]^{-1}{\varvec{F}} \end{aligned}$$

(5)

using $\sum _{k\ge 0} D^k=[\mathbbm {1}_N-D]^{-1}$. In this case, $\mathbbm {1}_N$ is the $N\times N$ identity matrix, $D=(d_{ij})$ is the matrix of the coefficients $d_{ij}$ introduced in Eq. (3) (the monetary amount of sector i’s output used per monetary unit of sector j’s output), and $\varvec{F}$ is the vector of final demands. Antràs et al.⁴ hence proposed the following measure of upstreamness of the i-th industrial sector

$$\begin{aligned} U_{1i}= 1 \cdot \frac{F_i}{Y_i} + 2 \cdot \frac{\sum _{j=1}^N d_{ij}F_j}{Y_i} + 3 \cdot \frac{\sum _{j,k=1}^N d_{ik}d_{kj}F_j}{Y_i} + \ldots = \frac{([\mathbbm {1}_N - D]^{-2}{\varvec{F}})_i}{Y_i} \ , \end{aligned}$$

(6)

where each term contributing to Eq. (4) is weighted by its distance from final use and divided by the output of the sector $Y_i$. The notation $(\cdot )_i$ is used to indicate the i-th component of the vector. By construction, the terms of the sum that are further upstream in the value chain carry larger weight in the calculation of the upstreamness. Inserting Eq. (4) in Eq. (6), we can rewrite the upstreamness as

$$\begin{aligned} {\varvec{U}_1} = [\mathbbm {1}_N-A_U]^{-1}{\varvec{1}}_N \ , \end{aligned}$$

(7)

where

$$\begin{aligned} A_U= Y^{-1}A = \begin{pmatrix} \frac{a_{11}}{Y_1} & \cdots & \frac{a_{1N}}{Y_1} \\ \vdots & \ddots & \vdots \\ \frac{a_{N1}}{Y_N} & \cdots & \frac{a_{NN}}{Y_N} \end{pmatrix}\ \end{aligned}$$

(8)

and $Y =\textrm{diag}(Y_1,\dots ,Y_N)$. The vector ${\varvec{1}}_N$ is a column vector of N ones. The matrix $A_U$ has non-negative elements, and in this convention it is row-substochastic, i.e., $\sum _{j}(A_U)_{ij}\le 1 \ \forall i$. By construction $U_{1i}\ge 1$, and it is precisely equal to 1 if no output of industry i is used as input to other industries, but it is only used to satisfy the final demand.
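The equivalence between Eqs. (6) and (7) can be checked numerically. The sketch below uses a hypothetical 3-sector economy (the arrays `flows` and `F` are made-up numbers, not WIOD data) and assumes the conventions stated above, namely $d_{ij}=a_{ij}/Y_j$ and $(A_U)_{ij}=a_{ij}/Y_i$.

```python
import numpy as np

# Hypothetical 3-sector economy (illustrative numbers only).
flows = np.array([[ 5.0, 20.0, 10.0],   # flows[i, j] = a_ij
                  [15.0,  5.0, 30.0],
                  [ 0.0, 10.0,  5.0]])
F = np.array([65.0, 150.0, 35.0])       # final demand F_i
Y = F + flows.sum(axis=1)               # gross output, Eq. (2)
N = len(Y)
I = np.eye(N)

D = flows / Y[None, :]                  # d_ij = a_ij / Y_j
A_U = flows / Y[:, None]                # (A_U)_ij = a_ij / Y_i, Eq. (8)

# Eq. (5): Y = (I - D)^{-1} F
Y_from_F = np.linalg.solve(I - D, F)

# Eq. (6): U_i = ((I - D)^{-2} F)_i / Y_i
U_eq6 = np.linalg.solve(I - D, Y_from_F) / Y

# Eq. (7): U = (I - A_U)^{-1} 1
U_eq7 = np.linalg.solve(I - A_U, np.ones(N))
```

The two routes agree because $D = Y A_U Y^{-1}$, so the two resolvents are related by a similarity transformation with the diagonal matrix Y.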

Later, Antràs et al.⁵ also established an equivalence between their upstreamness measure and a measure—defined in a recursive fashion—of the “distance” of an industry from the final demand proposed independently by Fally et al.⁶. Fally’s upstreamness $U_2$ is defined as follows:

$$\begin{aligned} U_{2i} = 1 + \sum _{j=1}^N\frac{d_{ij}Y_j}{Y_i}U_{2j} \ . \end{aligned}$$

(9)

The idea is that $\varvec{U}_2$ aggregates information on the extent to which a sector in a given country produces goods that are sold directly to final consumers, or that are sold to other sectors that themselves mainly sell to final consumers. Sectors selling a large share of their output to relatively upstream industries should be therefore considered to be more upstream themselves. Using the fact that $d_{ij}Y_j = a_{ij}$ we obtain

$$\begin{aligned} {\varvec{U}_2}= [\mathbbm {1}_N-A_U]^{-1}{\varvec{1}}_N\ , \end{aligned}$$

(10)

where $A_U$ is defined in Eq. (8) as presented in⁵.
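The recursive definition in Eq. (9) can also be solved by plain fixed-point iteration, which makes the link with Eq. (10) explicit. This is a minimal sketch under the assumption that the spectral radius of $A_U$ is below 1; the function name `fally_upstreamness` and the matrix `A_U_toy` are hypothetical.

```python
import numpy as np

def fally_upstreamness(A_U, tol=1e-12, max_iter=10_000):
    """Iterate U <- 1 + A_U U (Eq. (9)) until the update is below tol."""
    U = np.ones(A_U.shape[0])
    for _ in range(max_iter):
        U_next = 1.0 + A_U @ U
        if np.max(np.abs(U_next - U)) < tol:
            return U_next
        U = U_next
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical row-substochastic matrix (illustrative values only).
A_U_toy = np.array([[0.05, 0.20, 0.10],
                    [0.075, 0.025, 0.15],
                    [0.00, 0.20, 0.10]])

U_iterative = fally_upstreamness(A_U_toy)
U_direct = np.linalg.solve(np.eye(3) - A_U_toy, np.ones(3))  # Eq. (10)
```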

On the input side, there exists an analogous accounting identity stating that sector i’s total input $Y_i$ is equal to the value of its primary inputs (the so-called value added) $V_i$ plus its intermediate input purchased from all other sectors, namely

$$\begin{aligned} Y_i= V_i +Z_i = V_i +\sum _{j=1}^N a_{ji}= V_i +\sum _{j=1}^N d_{ji}Y_j \ , \end{aligned}$$

(11)

and

$$\begin{aligned} {\varvec{Y}}= [\mathbbm {1}_N-D^T]^{-1}{\varvec{V}}\ . \end{aligned}$$

(12)

Similarly to Antràs et al. (cf. Eq. (6)), Miller and Temurshoev⁷ introduced the so-called downstreamness, measuring the “average distance between suppliers of primary inputs and sectors as input purchaser along the input demand supply chain” as follows:

$$\begin{aligned} D_{1i} = 1 \cdot \frac{V_i}{Y_i} + 2\cdot \frac{\sum _{j=1}^N V_j d_{ji}}{Y_i} + 3\cdot \frac{\sum _{j,k=1}^N V_j d_{jk}d_{ki} }{Y_i} + \ldots = \frac{([\mathbbm {1}_N - D^T]^{-2}{\varvec{V}})_i}{Y_i} \ . \end{aligned}$$

(13)

As before, using Eq. (12), we obtain

$$\begin{aligned} {\varvec{D}_1}= [\mathbbm {1}_N-A_D]^{-1}{\varvec{1}}_N \ , \end{aligned}$$

(14)

with

$$\begin{aligned} A_D= (A Y^{-1})^T = \begin{pmatrix} \frac{a_{11}}{Y_1} & \cdots & \frac{a_{N1}}{Y_1} \\ \vdots & \ddots & \vdots \\ \frac{a_{1N}}{Y_N} & \cdots & \frac{a_{NN}}{Y_N} \end{pmatrix}\ . \end{aligned}$$

(15)

The matrix $A_D$ has non-negative elements, and it is row-substochastic, i.e., $\sum _{j}(A_D)_{ij}\le 1 \ \forall i$. Finally, as in the upstreamness case, also for the downstreamness, Fally⁶ introduced an analogous iterative definition of the form

$$\begin{aligned} D_{2i} = 1 + \sum _{j=1}^N d_{ji}D_{2j} \ , \end{aligned}$$

(16)

which can be again mapped with simple manipulations onto Eq. (14) using $Y_i d_{ji}=a_{ji}$.
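A corresponding sketch for the downstreamness builds $A_D$ from Eq. (15), solves Eq. (14) directly, and compares the result with a fixed-point iteration of Eq. (16). The arrays `flows` and `F` are hypothetical illustrative numbers, and the fixed number of iterations is an arbitrary choice that is ample for this small example.

```python
import numpy as np

# Hypothetical 3-sector economy (illustrative numbers only).
flows = np.array([[ 5.0, 20.0, 10.0],   # flows[i, j] = a_ij
                  [15.0,  5.0, 30.0],
                  [ 0.0, 10.0,  5.0]])
F = np.array([65.0, 150.0, 35.0])
Y = F + flows.sum(axis=1)               # gross output
V = Y - flows.sum(axis=0)               # value added, from Eq. (11)
N = len(Y)

A_D = (flows / Y[None, :]).T            # (A_D)_ij = a_ji / Y_i, Eq. (15)

# Eq. (14): D_1 = (I - A_D)^{-1} 1
D1 = np.linalg.solve(np.eye(N) - A_D, np.ones(N))

# Eq. (16) written as D_2 = 1 + A_D D_2 and iterated to convergence
D2 = np.ones(N)
for _ in range(2000):
    D2 = 1.0 + A_D @ D2
```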

Rank-1 approximation with local and aggregate information

In this section, we discuss how to derive an approximation for the upstreamness and downstreamness metrics discussed in Sect. 2. Let us consider the resolvent $G(A)=(\mathbbm {1}_N - A)^{-1}$, where the matrix A stands for $A_U$ or $A_D$ as defined in the previous section. Therefore, A has non-negative entries and is sub-stochastic. Recall that the vectors of upstreamness and downstreamness are defined as ${\varvec{U}}_1 = G(A_U){\varvec{1}}_N$ and ${\varvec{D}}_1= G(A_D){\varvec{1}}_N$, respectively (cf. Eqs. (10) and (14)). We now assume that a detailed and accurate knowledge of all the entries of A is not available. The only available aggregate information is given by the 2N constants $\varvec{r}=(r_1,\ldots ,r_N)$ and $\varvec{c}=(c_1,\ldots ,c_N)$, namely the sums of the N rows and columns of A. This corresponds to knowing only the total intermediate demand per industry and the total value of all inputs required by each industry, respectively. In the following we will analyse the single (row-sum only) and double (row- and column-sum) constraint cases.

For the single-constraint case, the knowledge of row sums of the I–O matrix (total intermediate demand of the associated sector) and of the vector of final demands is sufficient to infer the row sums of the matrix $A_U$. Similarly, the knowledge of column sums of the I–O matrix (total inputs of the associated sector) and of the vector of value added is sufficient to infer the row sums of the matrix $A_D$. For the double-constraint case, the knowledge of row and column sums of the I–O matrix and of the vector of final demands/values added is not sufficient to infer the row and column sums of either matrix $A_U$ or $A_D$; however, this level of knowledge can be approximately achieved by positing that $Y_i\approx \bar{Y}$, where $\bar{Y}$ is the average of the $Y_i$. In the following, we will assume that the row/column sums (single constraint) or row and column sums (double constraints) of the matrices $A_U$ and $A_D$ are known or retrievable from the corresponding row/column sums of the original I–O matrix. This lack of detailed information is actually quite common in supply chain and intrafirm network analysis⁸, which in turn leads to the need for inference and reconstruction methods to fill the gaps.
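The single-constraint statement can be made concrete: since $Y_i=Z_i+F_i$, the i-th row sum of $A_U$ is $Z_i/(Z_i+F_i)$, and analogously for $A_D$ with column totals and value added. The sketch below checks this on a hypothetical 3-sector table; the function name and all numbers are assumptions made for illustration.

```python
import numpy as np

def row_sums_from_aggregates(intermediate_totals, exogenous_totals):
    """Row sums of A_U (or A_D) from per-sector aggregates only.

    intermediate_totals: Z_i, total intermediate demand (for A_U) or
                         total intermediate inputs (for A_D) of sector i.
    exogenous_totals:    final demand F_i (for A_U) or value added V_i (for A_D).
    Gross output is Y_i = Z_i + exogenous_i, so the row sum is Z_i / Y_i.
    """
    Z = np.asarray(intermediate_totals, dtype=float)
    X = np.asarray(exogenous_totals, dtype=float)
    return Z / (Z + X)

# Hypothetical 3-sector flows, used only to check the shortcut.
flows = np.array([[ 5.0, 20.0, 10.0],
                  [15.0,  5.0, 30.0],
                  [ 0.0, 10.0,  5.0]])
F = np.array([65.0, 150.0, 35.0])
Y = F + flows.sum(axis=1)
V = Y - flows.sum(axis=0)

r_U = row_sums_from_aggregates(flows.sum(axis=1), F)   # row sums of A_U
r_D = row_sums_from_aggregates(flows.sum(axis=0), V)   # row sums of A_D
```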

A simple rank-1 approximation ${\hat{A}}$ for the matrix A is

$$\begin{aligned} {\hat{A}}=\frac{1}{N}\varvec{g}\varvec{q}^T= \begin{pmatrix} \frac{g_1 q_1}{N} & \cdots & \frac{g_1 q_N}{N}\\ \vdots & \ddots & \vdots \\ \frac{g_Nq_1}{N} & \cdots & \frac{g_Nq_N}{N} \end{pmatrix}\ , \end{aligned}$$

(17)

where the entries of the column vectors $\varvec{g} = (g_1,\ldots ,g_N)$ and $\varvec{q}=(q_1,\ldots ,q_N)$ are determined imposing the constraint that A and ${\hat{A}}$ share the same row and column sums

$$\begin{aligned} r_i=&\sum _j A_{ij}\equiv \frac{\sum _{k} q_k}{N} g_i={\bar{q}}\ g_i\ , \end{aligned}$$

(18)

$$\begin{aligned} c_j=&\sum _i A_{ij}\equiv \frac{\sum _{k} g_k}{N} q_j={\bar{g}}\ q_j\ . \end{aligned}$$

(19)

This yields eventually the unique matrix

$$\begin{aligned} {\hat{A}} =\frac{1}{mN}\varvec{r}\varvec{c}^T \end{aligned}$$

(20)

with $m=\frac{1}{N} \sum _{ij} A_{ij}=\frac{1}{N}\sum _j c_j=\frac{1}{N}\sum _i r_i$. The rank-1 matrix ${\hat{A}}$ in (20) is the so-called Maximum Entropy reconstructed matrix (see e.g.^45,46) subject to the row and column constraints in (18) and (19) (see also^{47,48,49,50,51} for related works).
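As a quick check of Eq. (20), the sketch below builds the rank-1 matrix from the margins of a hypothetical matrix `A_toy` and lets one confirm that the row and column sums are preserved. The helper name `rank1_from_margins` is an assumption introduced here for illustration.

```python
import numpy as np

def rank1_from_margins(r, c):
    """Rank-1 matrix r c^T / (m N) of Eq. (20), where m N is the grand total."""
    r = np.asarray(r, dtype=float)
    c = np.asarray(c, dtype=float)
    total = r.sum()                      # m N = sum_i r_i = sum_j c_j
    if not np.isclose(total, c.sum()):
        raise ValueError("row and column sums must have the same total")
    return np.outer(r, c) / total

# Hypothetical matrix whose margins are treated as the only known information.
A_toy = np.array([[0.10, 0.20, 0.05],
                  [0.30, 0.05, 0.10],
                  [0.00, 0.25, 0.15]])
r = A_toy.sum(axis=1)
c = A_toy.sum(axis=0)
A_hat = rank1_from_margins(r, c)
```

Note that `A_hat` generally differs from `A_toy` entry by entry (for example, it has no zero entries when all margins are positive); only the margins coincide.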

If the only information we have is about row sums, then the corresponding rank-1 approximation is even simpler

$$\begin{aligned} \hat{A} = \begin{pmatrix} \frac{r_1}{N} & \cdots & \frac{r_1}{N}\\ \vdots & \ddots & \vdots \\ \frac{r_N}{N} & \cdots & \frac{r_N}{N} \end{pmatrix} \ . \end{aligned}$$

(21)

Clearly, ${\hat{A}}$ has a single non-zero, real and positive eigenvalue $\lambda _1=\frac{1}{mN}\sum _j r_j c_j$ (or $\lambda _1=\frac{1}{N}\sum _j r_j$ in the case of only-row constraints) due to the Perron-Frobenius theorem, and $N-1$ zero eigenvalues. We may therefore expect that this approximation will work better the larger the spectral gap (or equivalently the smaller the spectral radius in the bulk) of the original matrix A is^52,53. The spectral gap is defined as $\Gamma =\lambda _1-\Xi$, with $\lambda _1$ real and $<1$ being the Perron-Frobenius eigenvalue. The spectral radius of the bulk, i.e., of the spectrum with $\lambda _1$ excluded, is $\Xi =\max \{|\lambda _2|,\ldots ,|\lambda _{N-1}|\}$. The empirical I–O matrices $A_U,A_D$ typically show a large spectral gap, suggesting that the rank-1 approximation described in this section should be very effective.

As the empirical I–O matrices $A_U,A_D$ are rather small ($N=35$), it is more informative to look at their spectral radius. In Sect. 5, we perform a thorough analysis of the spectra of the I–O matrices at the country level, and we study how the accuracy of our rank-1 formula is related to the spectral radius. We indeed find that there is a clear negative correlation between the two, i.e. the error made using our approximation increases with $\Xi$. This said, even in the worst cases, the relative errors remain fairly negligible, and the formulae work very well across the entire dataset.
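The spectral quantities used here can be computed along the following lines. This is a sketch, not the authors' code: it assumes a non-negative square matrix, takes $\lambda _1$ as the eigenvalue of largest modulus, and takes $\Xi$ as the largest modulus among all remaining eigenvalues; `spectral_summary` and the toy matrices are hypothetical.

```python
import numpy as np

def spectral_summary(A):
    """Return (lambda_1, Xi, Gamma) for a non-negative square matrix A.

    lambda_1: Perron-Frobenius eigenvalue (largest modulus, real).
    Xi:       largest modulus among the remaining eigenvalues.
    Gamma:    spectral gap lambda_1 - Xi.
    """
    moduli = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
    lam1, xi = moduli[0], moduli[1]
    return lam1, xi, lam1 - xi

# Hypothetical sub-stochastic matrix and its row-only rank-1 surrogate, Eq. (21).
A_toy = np.array([[0.10, 0.20, 0.05],
                  [0.30, 0.05, 0.10],
                  [0.00, 0.25, 0.15]])
r = A_toy.sum(axis=1)
A_hat = np.outer(r, np.ones(3)) / 3.0

lam1_toy, xi_toy, gap_toy = spectral_summary(A_toy)
lam1_hat, xi_hat, gap_hat = spectral_summary(A_hat)
```

For the rank-1 surrogate the bulk radius vanishes up to rounding, so its gap equals $\lambda _1=\frac{1}{N}\sum _j r_j$.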

Employing this rank-1 approximation, we can now evaluate the approximate resolvent

$$\begin{aligned} G({\hat{A}})=(\mathbbm {1}_N - {\hat{A}})^{-1}= \mathbbm {1}_N+\frac{{\hat{A}}}{1-\frac{1}{m N}\sum _j r_j c_j}\ , \end{aligned}$$

(22)

using the Sherman-Morrison formula⁵⁴ for the inverse of a rank-1 update of the identity matrix, from which it follows that the upstreamness and downstreamness of the i-th industry are respectively approximated by

$$\begin{aligned} U_{1i}&\approx 1+\frac{r_i}{1-\frac{1}{m N}\sum _j r_j c_j} \end{aligned}$$

(23)

$$\begin{aligned} D_{1i}&\approx 1+\frac{{\tilde{r}}_i}{1-\frac{1}{{\tilde{m}} N}\sum _j {\tilde{r}}_j {\tilde{c}}_j}\ , \end{aligned}$$

(24)

where $r_i,c_i$ and ${\tilde{r}}_i,{\tilde{c}}_i$ represent respectively the sum of rows and columns of $A_U$ and $A_D$. If only the constraints on rows are imposed, the formulae above reduce to

$$\begin{aligned} U_{1i}&\approx 1+\frac{r_i}{1-\frac{1}{N}\sum _j r_j } \end{aligned}$$

(25)

$$\begin{aligned} D_{1i}&\approx 1+\frac{{\tilde{r}}_i}{1-\frac{1}{ N}\sum _j {\tilde{r}}_j }\ . \end{aligned}$$

(26)

The approximate formulae above show that, within our rank-1 approximation, the upstreamness (downstreamness) of sector i is fully determined by the interplay of (i) local and aggregate information, namely of the total intermediate demand per sector (and/or the total value of all inputs required by each sector), and (ii) a suitable average of the total intermediate demand (and/or the total value of all inputs) across all sectors in the economy.
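For completeness, Eq. (22) can be verified by direct multiplication. Writing $\lambda _1=\frac{1}{mN}\sum _j r_j c_j$ and using ${\hat{A}}=\frac{1}{mN}\varvec{r}\varvec{c}^T$ together with $\sum _j c_j=mN$, one finds

$$\begin{aligned} {\hat{A}}^2&=\frac{1}{(mN)^2}\varvec{r}(\varvec{c}^T\varvec{r})\varvec{c}^T=\lambda _1{\hat{A}}\ , \\ (\mathbbm {1}_N-{\hat{A}})\left( \mathbbm {1}_N+\frac{{\hat{A}}}{1-\lambda _1}\right)&=\mathbbm {1}_N-{\hat{A}}+\frac{{\hat{A}}-\lambda _1{\hat{A}}}{1-\lambda _1}=\mathbbm {1}_N\ , \\ {\hat{A}}{\varvec{1}}_N&=\frac{\sum _j c_j}{mN}\varvec{r}=\varvec{r}\ , \end{aligned}$$

so that $G({\hat{A}}){\varvec{1}}_N={\varvec{1}}_N+\varvec{r}/(1-\lambda _1)$, which is Eq. (23) component by component.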

In spite of the seemingly drastic approximation, which neglects a significant amount of finer intersectorial details, we will show that the aggregate information featuring in our rank-1 formulae is sufficient to determine with high accuracy the relative positioning of countries and sectors within the global value chains.

In the next sections, we will then calculate upstreamness and downstreamness measures on I–O tables from the NIOT Dataset (see Sect. 4), comparing the results obtained via our approximation with the full calculation using the original formulae, namely Eq. (10) and (14).
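The comparison between the exact and approximate formulae can be sketched as follows. The code is a minimal illustration on a hypothetical 3-sector matrix `A_toy`, not a reproduction of the analysis on the NIOT data; the function names are assumptions.

```python
import numpy as np

def exact_position(A):
    """Exact measure G(A) 1 = (I - A)^{-1} 1, as in Eqs. (10) and (14)."""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) - A, np.ones(n))

def rank1_double(A):
    """Double-constraint approximation, Eqs. (23)-(24): row and column sums."""
    r, c = A.sum(axis=1), A.sum(axis=0)
    lam = (r @ c) / r.sum()          # (1 / (m N)) sum_j r_j c_j, with m N = sum_i r_i
    return 1.0 + r / (1.0 - lam)

def rank1_single(A):
    """Single-constraint approximation, Eqs. (25)-(26): row sums only."""
    r = A.sum(axis=1)
    return 1.0 + r / (1.0 - r.mean())

# Hypothetical sub-stochastic matrix (illustrative values only).
A_toy = np.array([[0.10, 0.20, 0.05],
                  [0.30, 0.05, 0.10],
                  [0.00, 0.25, 0.15]])
U_exact = exact_position(A_toy)
U_double = rank1_double(A_toy)
U_single = rank1_single(A_toy)
max_rel_err_double = np.max(np.abs(U_double / U_exact - 1.0))
max_rel_err_single = np.max(np.abs(U_single / U_exact - 1.0))
```

Only the approximate functions avoid solving a linear system; `exact_position` is included purely as the benchmark. When the input matrix is itself of rank 1, `rank1_double` reproduces the exact result.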

Dataset

Table 1 Countries and their codes in the NIOT database by WIOD⁴⁴. Luxembourg is not included in our analysis as data present inconsistencies across the years.

The empirical I–O matrices used for the experiments have been constructed using the 2013 release of the National I–O tables (NIOT) by the World I–O Database (WIOD)⁴⁴. The NIOT dataset comprises 39 countries, representing a large fraction of the major world economies, over the years 1995–2011. The list of countries and their codes considered in our empirical analysis is presented in Table 1. The structure of the I–O table of each country is schematically shown in Fig. 1. The intermediate demand for each country is reported for $N=35$ economic sectors in terms of the flow (in millions of US dollars) between sectors. The full list of economic sectors and their codes included in our analysis is summarized in Table 2. The final demand is characterized in terms of (i) final consumption expenditure by households, (ii) final consumption expenditure by non-profit organizations serving households (NPISH), (iii) final consumption expenditure by government, (iv) gross fixed capital formation, (v) changes in inventories and valuables and (vi) exports. In the dataset the changes in inventories and valuables are sometimes negative; these were assumed to contribute to imports. The entries $a_{ij}$ of each row i of the full I–O table are then normalized by the corresponding output $Y_i$ [cf. Eq. (8)]. The normalized intermediate demand sub-matrix is sub-stochastic and represents the matrix $A_U$. The $r_i$ used in the model are simply the sums over the rows of the matrix $A_U$ [or, if the table is instead normalized by columns, of the matrix $A_D$; see Eqs. (8) and (15), respectively].

Table 2 Sectors of the NIOT dataset by WIOD (2013 release) and their sector codes⁴⁴.

Results

In this section, we compare our approximate formulae for upstreamness and downstreamness with single [Eqs. (25) and (26), respectively] and double constraints [Eqs. (23) and (24), respectively] against the measures obtained via direct inversion of the empirical I–O matrix [Eqs. (10) and (14), respectively].

Given the very weak temporal dependence of the empirical upstreamness and downstreamness measures as shown in Fig. 2 (consistent with previous analyses in⁴⁰), in the following we will be able to aggregate together the analyses across all years in a robust way.

In Fig. 3 we plot the empirical average over all sectors (cyan squares) of the upstreamness for 39 countries (listed in Table 1) for all years (1995–2011) versus the approximate value with single (top panel) and double constraints (bottom panel), obtained from Eqs. (25) and (23), respectively. We see that the empirical data (663 data points, i.e., 39 countries $\times$ 17 years) collapse onto the theoretical benchmark (blue dashed line). In the single constraint case, this implies that the average upstreamness coefficient for a country is determined with high accuracy by the knowledge of a single quantity $\bar{z} = 1 - \frac{1}{N}\sum _j{r_j}$, corresponding to one minus the average total intermediate demand. We also show the upstreamness values for each sector in each country across the entire period (red full circles), constituting in total $\sim 23k$ data points (35 sectors $\times$ 39 countries $\times$ 17 years). At the sector level, we observe a similarly good agreement between the exact empirical upstreamness and the approximate values.
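The role of $\bar{z}$ follows from averaging Eq. (25) over the N sectors of a country. Denoting by $\bar{r}=\frac{1}{N}\sum _j r_j$ the average row sum, so that $\bar{z}=1-\bar{r}$, one obtains

$$\begin{aligned} \frac{1}{N}\sum _{i=1}^N U_{1i}\approx \frac{1}{N}\sum _{i=1}^N\left( 1+\frac{r_i}{1-\bar{r}}\right) =1+\frac{\bar{r}}{1-\bar{r}}=\frac{1}{1-\bar{r}}=\frac{1}{\bar{z}}\ . \end{aligned}$$

As a purely illustrative numerical example, an average row sum of $\bar{r}=0.5$ would give $\bar{z}=0.5$ and an approximate average upstreamness of 2.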

There are occasional deviations (including a systematic upward deviation for large values of the empirical upstreamness), whose origin can be traced back to a higher degree of heterogeneity in the A matrix with respect to the “flat” rank-1 model introduced in Eq. (21).

To identify the sectors that are typically less accurately captured by our approximation, we computed a simple indicator, $\langle |\Delta _U^{\textrm{sect}}|\rangle$. This metric represents the average absolute difference between the empirical and approximated upstreamness values for each sector, aggregated across all years and all countries (see Fig. 4). The mining and agricultural sectors, among others, appear to exhibit greater heterogeneity in their input–output relationships with other sectors, as suggested by the higher difference values. This indicates that the structural differences in these sectors across countries may pose challenges for the accuracy of our approximation. Consequently, our method may perform less effectively for countries with economies that rely heavily on these sectors, as their heterogeneity is less well captured in the A-matrix approximation. In contrast, sectors such as housing, public administration, and education display lower values, suggesting more consistent and predictable input–output relationships, making them better suited for our approximation approach.

We have calculated a similar metric for the upstreamness at country-level (see Fig. 5), $\langle |\Delta _U^{\textrm{country}}|\rangle$, averaging absolute differences over the period 1995–2011. The countries that deviate most consistently from our approximation are Spain, Korea, Russia and China.
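Both indicators are averages of absolute differences along different axes of the same data. The sketch below assumes, purely for illustration, that the exact and approximate upstreamness values are stored in arrays of shape (countries, years, sectors) and that the country-level indicator is also averaged over sectors; the array layout, the function name and the synthetic values are assumptions, not details taken from the paper.

```python
import numpy as np

def mean_abs_diff(emp, approx, keep_axis):
    """Average |emp - approx| over every axis except keep_axis."""
    diff = np.abs(np.asarray(emp, dtype=float) - np.asarray(approx, dtype=float))
    other_axes = tuple(ax for ax in range(diff.ndim) if ax != keep_axis)
    return diff.mean(axis=other_axes)

# Synthetic data: 4 countries, 3 years, 5 sectors (assumed layout).
emp = np.full((4, 3, 5), 2.0)
approx = emp.copy()
approx[:, :, 2] += 0.5                  # only sector 2 is poorly approximated

delta_sector = mean_abs_diff(emp, approx, keep_axis=2)    # one value per sector
delta_country = mean_abs_diff(emp, approx, keep_axis=0)   # one value per country
```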

In the following we will also analyze more closely the relation between the error—discrepancy between the actual values of upstreamness (and downstreamness) calculated via direct inversion and those obtained via our approximate formula—and the spectral properties of the empirical I–O matrix A.

In Fig. 6, we repeat a similar analysis for the downstreamness, comparing the values obtained via direct inversion (Eq. (14)) with the approximate values obtained by imposing the single constraint (knowledge of row sums) or the double constraint (knowledge of row and column sums). For this measure too, we observe good agreement between exact and approximate values, both at the sector level (red full circles) and at the aggregate country level (cyan squares).

To assess the accuracy of the approximations, we quantify the correlation between the empirical and approximate measures using Pearson and Spearman correlation coefficients as summarised in Table 3. The results show that the double-constraints approximation provides a visible improvement for countries, with correlations nearly perfect in both upstreamness and downstreamness measures. However, for sectors, the improvement is marginal, as the single-constraint approximation already achieves high correlations.
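The two coefficients can be sketched with NumPy alone, as below. The helper names and the five sample values are illustrative assumptions; the rank helper assumes there are no ties, which a production implementation would need to handle.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(x):
    """Ranks 0..n-1 of the entries of x (assumes no ties)."""
    x = np.asarray(x)
    order = np.argsort(x)
    out = np.empty(len(x), dtype=float)
    out[order] = np.arange(len(x))
    return out

def spearman(x, y):
    """Spearman coefficient, i.e. Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up empirical and approximate values for five sectors.
emp = np.array([1.2, 1.9, 2.4, 3.1, 3.3])
approx = np.array([1.1, 2.0, 2.3, 2.9, 3.6])

rho_pearson = pearson(emp, approx)
rho_spearman = spearman(emp, approx)
print(rho_pearson, rho_spearman)
```

Because the two toy samples are ordered identically, the Spearman coefficient equals one even though the values differ, which is why both coefficients are worth reporting when the goal is a ranking.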

Table 3 Comparison of Spearman and Pearson correlation coefficients between empirical and approximated upstreamness and downstreamness measures, (1) at country or sector level and (2) for the single- or double-constraint approximation.


In the following, we analyze more closely the error made in the estimation of the upstreamness/downstreamness coefficients via our approximate formulae and link it to spectral properties of the underlying I–O matrix A. In particular, we define the following metric for assessing the error⁵²

$$\begin{aligned} \sigma =\left\langle \left| \frac{\mathcal {R}_i^{(\textrm{emp})}}{\mathcal {R}_i^{(\textrm{approx})}}-1\right| \right\rangle \ , \end{aligned}$$

(27)

where $\mathcal {R}_i$ represents either the upstreamness or the downstreamness values, computed via direct inversion ($\mathcal {R}_i^{(\textrm{emp})}$) and via our approximate formula ($\mathcal {R}_i^{(\textrm{approx})}$), respectively. The average $\langle \cdots \rangle$ is calculated over all sectors of a given country. Concerning the spectral properties, as shown in Refs.⁵²,⁵³, the accuracy of the approximation is related to the spectral gap of the matrix A. The matrix A has non-negative entries, therefore it has one real eigenvalue of largest magnitude $\lambda _1$ (the Perron–Frobenius eigenvalue), and its spectral gap is defined as $\Gamma =\lambda _1-\max \{|\lambda _2|,\ldots ,|\lambda _{N-1}|\}$. As the empirical I–O matrices are rather small ($N=35$), it is more informative to look at the spectral radius. We then introduce the spectral radius excluding the Perron–Frobenius eigenvalue $\lambda _1$ as

$$\begin{aligned} \Xi =\max \{|\lambda _2|,\ldots ,|\lambda _{N-1}|\}\ . \end{aligned}$$

(28)

This definition is consistent with the approach used in the case of Gaussian matrices perturbed with a rank-1 matrix that may force an outlier to split off from the circular bulk⁵³,⁵⁵. In Fig. 7, we display the error $\sigma$ of the approximation for all countries in all years as a function of the spectral radius $\Xi$ of the $A_U$ matrix characterizing each country in each year. As expected, the error grows with the spectral radius, as the rank-1 approximation becomes less accurate in reproducing the underlying intersectorial interactions. In Fig. 8, we show the same relationship, labelling the countries for a single year (2011). In the bottom panel, we show the eigenvalue spectra of two selected countries, namely China and Mexico, which display respectively among the maximal and minimal errors in the estimation, to highlight spectral differences in the displacement of eigenvalues in the bulk.

In this analysis, we find a clear negative correlation between the accuracy of the estimation and the spectral radius, i.e., the error made using our approximation increases (equivalently the accuracy of the approximation decreases) with $\Xi$. In general though, even in the worst cases, the relative errors remain fairly small ($\sim 5-6\%$), and the approximation works very well across the entire sample.
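The error $\sigma$ of Eq. (27) and the radius $\Xi$ of Eq. (28) can be illustrated on synthetic matrices, as in the sketch below. It assumes the single-constraint upstreamness formula in the form restated in Eq. (35); the two test matrices (a flat rank-1 matrix and a heterogeneous one with identical row sums) are made up for illustration and are not WIOD data.

```python
import numpy as np

def exact_upstreamness(A):
    """U = (1 - A)^{-1} 1, obtained with a linear solve."""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) - A, np.ones(n))

def approx_upstreamness(A):
    """Row-sum formula 1 + r / (1 - mean(r)), as restated in Eq. (35)."""
    r = A.sum(axis=1)
    return 1.0 + r / (1.0 - r.mean())

def sigma(A):
    """Eq. (27): mean over sectors of |R_emp / R_approx - 1|."""
    ratio = exact_upstreamness(A) / approx_upstreamness(A)
    return float(np.mean(np.abs(ratio - 1.0)))

def xi(A):
    """Eq. (28): largest eigenvalue modulus once the leading one is dropped."""
    moduli = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
    return float(moduli[1])

rng = np.random.default_rng(0)
N = 35
r = rng.uniform(0.2, 0.6, N)              # synthetic row sums, all below 1
flat = np.outer(r, np.ones(N)) / N        # rank-1 matrix with row sums r
W = rng.uniform(0.0, 1.0, (N, N))
W /= W.sum(axis=1, keepdims=True)         # random row-stochastic weights
hetero = r[:, None] * W                   # same row sums, uneven entries

sigma_flat, xi_flat = sigma(flat), xi(flat)
sigma_het, xi_het = sigma(hetero), xi(hetero)
print(sigma_flat, xi_flat, sigma_het, xi_het)
```

Both matrices share the same row sums, so the approximate values coincide; for the flat matrix $\sigma$ and $\Xi$ vanish up to rounding, while the heterogeneous matrix has a non-zero $\Xi$ and a non-zero error.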

Upstreamness under aggregation

In this section, we briefly consider how our approximation performs after the I–O data matrix has been subject to aggregation (consolidation) of different industrial sectors. The effects of aggregation—i.e. the procedure by which the data are looked at and lumped together at different “granularity” level—have been considered in many works (see⁵⁶ for a comprehensive review). Here we consider the axiomatic formulation of aggregation provided in⁵⁷, which is summarized below. Furthermore, our treatment will be confined to the upstreamness, and the row-only rank-1 approximation, as generalizations to the other cases are straightforward.

Consider the definition of upstreamness given in Eq. (7)

$$\begin{aligned} {\varvec{U}_1}= [\mathbbm {1}_N-A_U]^{-1}{\varvec{1}}_N\ . \end{aligned}$$

(29)

To make contact with Ref.⁵⁷, we rewrite (29) as

$$\begin{aligned} {[}{\varvec{U}_1}^T]_N= {\varvec{1}}_N^T[\mathbbm {1}_N-A_U^T]^{-1}\ , \end{aligned}$$

(30)

in terms of row vectors ${\varvec{U}_1}^T$ and ${\varvec{1}}_N^T$, and a column-substochastic $N\times N$ matrix $A_U^T$. The notation $[\ldots ]_N$ indicates that the vector has length N.
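The equivalence of the column form (29) and the row form (30) can be checked numerically, as in the following sketch. The matrix `A_U` is a random row-substochastic stand-in, not an empirical table, and the truncated series is included only as an independent check.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5
r = rng.uniform(0.2, 0.8, N)                          # synthetic row sums, all below 1
W = rng.uniform(0.0, 1.0, (N, N))
A_U = r[:, None] * W / W.sum(axis=1, keepdims=True)   # row-substochastic matrix

ones = np.ones(N)
identity = np.eye(N)

# Eq. (29): column-vector form, solved without forming the inverse.
U_col = np.linalg.solve(identity - A_U, ones)

# Eq. (30): row-vector form with the transposed matrix.
U_row = ones @ np.linalg.inv(identity - A_U.T)

# Truncated series 1 + A 1 + A^2 1 + ... as an independent check.
U_series = np.zeros(N)
term = ones.copy()
for _ in range(200):
    U_series += term
    term = A_U @ term

print(U_col)
```

The three vectors agree to numerical precision, and every entry is at least one because all terms of the series are non-negative.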

Let us assume that we wish to aggregate the N “micro” industrial sectors or commodities into a set of $M<N$ “macro” sectors or commodities. Formally, we can define two matrices, S and T, of size $M\times N$ and $N\times M$ respectively. The $\{0,1\}$ matrix S indicates which micro-sectors should be combined together: $S_{ij}=1$ if micro-sector j is to be included in macro-sector i. Thus, S is a column stochastic matrix with exactly one 1 in every column, and at least one 1 in every row. The matrix T indicates the proportional weights of each micro-sector within its macro-aggregate. The element $T_{ji}\in [0,1]$ represents the weight $w_{ji}$ that micro-sector j carries within macro-sector i (it vanishes when j does not belong to i), and therefore is such that $\sum _j T_{ji}=1$. It follows that T is also column stochastic.

Forming the aggregate $M\times M$ matrix $A_U^\prime =SA_U^T T$ is the most common way in the literature to create a smaller sub-stochastic matrix from the original matrix $A_U$, which retains (at a coarser level of detail) some of the information about industrial sectors and commodities provided by $A_U$. Although other choices of aggregation are possible, it was proven in⁵⁷ that the aggregator $A_U^\prime$ is the only one that satisfies three natural axioms of linearity, value added neutrality, and partitioning; in the following we will therefore confine ourselves to this case (the so-called standard aggregator). It follows from the definition of S and T that $ST=\mathbbm {1}_M$ and that TS is a column stochastic, idempotent matrix of rank M (see⁵⁷ for a proof).

Although in principle any non-negative column-stochastic matrix could play the role of T, in practice it makes most sense to define it as

$$\begin{aligned} T=\textrm{diag}(\varvec{w})S^T [\textrm{diag}(S\varvec{w})]^{-1}\ , \end{aligned}$$

(31)

where $\varvec{w}$ is a vector of N non-negative numbers, and $\textrm{diag}(\varvec{w})$ is the diagonal matrix having the vector entries on the diagonal (in their natural order). According to Charnes and Cooper, “The main justification for this mode of consolidation is that it conforms to the way data would be synthesized ab initio if SAT rather than A were the objective”⁵⁸. To better understand how standard aggregation works, consider as an example a $6\times 6$ matrix $A_U^T$ (whose elements we denote $\alpha _{ij}$ for simplicity, so $\alpha _{ij} = a_{ji}/Y_j$). Let

$$\begin{aligned} S = \begin{pmatrix} 0 & 0 & 1 & 1 & 0 & 0\\ 1 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 1\\ \end{pmatrix} \ , \end{aligned}$$

(32)

and $\varvec{w} = (w_1,w_2,w_3,w_4,w_5,w_6)$. Then

$$\begin{aligned} T = \textrm{diag}(\varvec{w})S^T [\textrm{diag}(S\varvec{w})]^{-1}= \begin{pmatrix} 0 & \frac{w_1}{w_1+w_2} & 0 \\ 0 & \frac{w_2}{w_1+w_2} & 0 \\ \frac{w_3}{w_3+w_4} & 0 & 0 \\ \frac{w_4}{w_3+w_4} & 0 & 0 \\ 0 & 0 & \frac{w_5}{w_5+w_6} \\ 0 & 0 & \frac{w_6}{w_5+w_6} \\ \end{pmatrix} \ , \end{aligned}$$

(33)

and the aggregator becomes

$$\begin{aligned} A_U^\prime =S A_U^T T = \begin{pmatrix} \frac{w_3 (\alpha _{33}+\alpha _{43})+w_4 (\alpha _{34}+\alpha _{44})}{w_3+w_4} & \frac{w_1 (\alpha _{31}+\alpha _{41})+w_2 (\alpha _{32}+\alpha _{42})}{w_1+w_2} & \frac{w_5 (\alpha _{35}+\alpha _{45})+w_6 (\alpha _{36}+\alpha _{46})}{w_5+w_6} \\ \frac{w_3 (\alpha _{13}+\alpha _{23})+w_4 (\alpha _{14}+\alpha _{24})}{w_3+w_4} & \frac{w_1 (\alpha _{11}+\alpha _{21})+w_2 (\alpha _{12}+\alpha _{22})}{w_1+w_2} & \frac{w_5 (\alpha _{15}+\alpha _{25})+w_6 (\alpha _{16}+\alpha _{26})}{w_5+w_6} \\ \frac{w_3 (\alpha _{53}+\alpha _{63})+w_4 (\alpha _{54}+\alpha _{64})}{w_3+w_4} & \frac{w_1 (\alpha _{51}+\alpha _{61})+w_2 (\alpha _{52}+\alpha _{62})}{w_1+w_2} & \frac{w_5 (\alpha _{55}+\alpha _{65})+w_6 (\alpha _{56}+\alpha _{66})}{w_5+w_6} \\ \end{pmatrix}\ . \end{aligned}$$

(34)
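The $6\times 6$ example above can be reproduced numerically. In the sketch below the weights `w` and the entries of `alpha` (standing in for $A_U^T$) are arbitrary illustrative numbers; only S and the formula for T come from Eqs. (31) and (32).

```python
import numpy as np

# Grouping matrix S from Eq. (32): macro-sector 1 = {3, 4}, 2 = {1, 2}, 3 = {5, 6}.
S = np.array([[0, 0, 1, 1, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [0, 0, 0, 0, 1, 1]], dtype=float)
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # illustrative weights

# Eq. (31): T = diag(w) S^T [diag(S w)]^{-1}
T = np.diag(w) @ S.T @ np.linalg.inv(np.diag(S @ w))

rng = np.random.default_rng(2)
alpha = rng.uniform(0.0, 0.1, (6, 6))          # stand-in for A_U^T
A_prime = S @ alpha @ T                        # standard aggregator, Eq. (34)

# Top-left entry of Eq. (34), written out by hand (0-based indices).
a11 = (w[2] * (alpha[2, 2] + alpha[3, 2])
       + w[3] * (alpha[2, 3] + alpha[3, 3])) / (w[2] + w[3])

TS = T @ S
print(np.round(T, 3))
```

With these inputs one can confirm the properties quoted earlier: $ST=\mathbbm {1}_M$, the columns of T sum to one, and TS is idempotent with rank M = 3.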

Now, let us assume that the vector of N upstreamness values in Eq. (30) can be faithfully approximated by our formula in Eq. (25), which can be written as

$$\begin{aligned} {[}\hat{\varvec{U}_1}^T]_N= {\varvec{1}}_N^T+\frac{1}{1-\bar{r}_N}\varvec{r}^T\ , \end{aligned}$$

(35)

where $\varvec{r}$ is the (column) vector of row sums of the matrix $A_U$ (or the column sums of $A_U^T$, $r_j = \sum _{i=1}^N \alpha _{ij}$), and ${\bar{r}}_N$ is their average. Let us further assume that the original data matrix $A_U$ is not known in its entirety (only its row sums are known), but the sectors/commodities in $A_U$ have been aggregated using a known pair of matrices S, T—in other words, we are aware of what sectors/commodities have been lumped together (and with which relative weights) and what their aggregate outputs are, but we do not have more detailed information. We ask whether the knowledge of $\varvec{r}, S$ and T is sufficient to determine $[\hat{\varvec{U}_1}^T]_M$, namely a faithful approximation for the M upstreamness values of the aggregate model. The answer is affirmative.

First, define

$$\begin{aligned} {[}{\varvec{U}_1}^T]_M= {\varvec{1}}_M^T[\mathbbm {1}_M-A_U^\prime ]^{-1}={\varvec{1}}_M^T[\mathbbm {1}_M-SA_U^T T]^{-1}\ , \end{aligned}$$

(36)

the vector of M upstreamness values, obtained using the aggregate matrix $A_U^\prime$ as a source. The Leontief matrix on the r.h.s. of (36) is equal to the aggregate of the Leontief matrix of the so-called companion matrix ${\bar{A}}_U= A_U^T TS$⁵⁷, namely

$$\begin{aligned} {[}\mathbbm {1}_M-SA_U^T T]^{-1} = S[\mathbbm {1}_N-{\bar{A}}_U]^{-1}T\ . \end{aligned}$$

(37)

The proof follows by expanding $[\mathbbm {1}_M-SA_U^T T]^{-1}=\mathbbm {1}_M +SA_U^T T+(SA_U^T T)^2+\ldots$, and using $(SA_U^T T)^n=S(A_U^T TS)^nT$ and $TST=T$.
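Identity (37) and the two ingredients of its proof can be verified numerically. The sketch below reuses the S of Eq. (32) with illustrative weights, and a random matrix `alpha` with column sums below one as a stand-in for $A_U^T$.

```python
import numpy as np

S = np.array([[0, 0, 1, 1, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [0, 0, 0, 0, 1, 1]], dtype=float)
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])              # illustrative weights
T = np.diag(w) @ S.T @ np.linalg.inv(np.diag(S @ w))      # Eq. (31)
M, N = S.shape

rng = np.random.default_rng(3)
alpha = rng.uniform(0.0, 0.15, (N, N))   # stand-in for A_U^T, column sums below 1

# Left-hand side of Eq. (37): Leontief matrix of the aggregate model.
lhs = np.linalg.inv(np.eye(M) - S @ alpha @ T)

# Right-hand side: aggregate of the Leontief matrix of the companion matrix.
companion = alpha @ T @ S
rhs = S @ np.linalg.inv(np.eye(N) - companion) @ T

# The two facts used in the proof, checked for n = 3.
power_ok = np.allclose(np.linalg.matrix_power(S @ alpha @ T, 3),
                       S @ np.linalg.matrix_power(companion, 3) @ T)
tst_ok = np.allclose(T @ S @ T, T)
print(np.allclose(lhs, rhs), power_ok, tst_ok)
```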

Imagine now that the true matrix $A_U^T$ appearing on the l.h.s. of (37) is replaced by its best rank-1 approximation, given by ${\hat{A}}^T$ (see Eq. (21)). Since the rank of the product of two matrices (${\hat{A}}$ and TS) is smaller than or equal to the smaller of the ranks of the two factors, and since TS has rank M (and of course none of the matrices involved is a null matrix), it is easy to deduce that in this case the companion matrix will also be rank-1. Applying the Sherman–Morrison formula on the r.h.s. of (37), we get

$$\begin{aligned} S[\mathbbm {1}_N-{\hat{A}} TS]^{-1}T= \mathbbm {1}_M+\frac{1}{1-\phi (\varvec{r},S,T)}S({\hat{A}} TS)T=\mathbbm {1}_M+\frac{1}{1-\phi (\varvec{r},S,T)}S {\hat{A}} T\ , \end{aligned}$$

(38)

where we used $S\mathbbm {1}_N T=ST=\mathbbm {1}_M$, and

$$\begin{aligned} \phi (\varvec{r},S,T)=\frac{1}{N}\sum _{i,k=1}^N r_i (TS)_{ik}\ . \end{aligned}$$

(39)

Eq. (38) shows how to construct a faithful rank-1 approximation for the upstreamness of the aggregate model starting from the knowledge of row sums of the original model, as well as of the matrices T and S implementing the aggregation.
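A sketch of the resulting recipe is given below. It assumes that the flat model of Eq. (21), once transposed, has entries $r_j/N$ (so that its column sums are the row sums of $A_U$); under that assumption, multiplying Eq. (38) by ${\varvec{1}}_M^T$ from the left gives ${\varvec{1}}_M^T + \varvec{r}^T T/(1-\phi )$. The function name and the sample row sums are hypothetical.

```python
import numpy as np

def aggregated_upstreamness_rank1(r, S, T):
    """Rank-1 estimate of the M aggregate upstreamness values.

    Uses only the row sums r of A_U and the aggregation pair (S, T):
    1_M^T + r^T T / (1 - phi), with phi taken from Eq. (39).
    """
    r = np.asarray(r, dtype=float)
    N = r.size
    phi = float(r @ (T @ S).sum(axis=1)) / N
    return np.ones(S.shape[0]) + (r @ T) / (1.0 - phi)

S = np.array([[0, 0, 1, 1, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [0, 0, 0, 0, 1, 1]], dtype=float)
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])              # illustrative weights
T = np.diag(w) @ S.T @ np.linalg.inv(np.diag(S @ w))      # Eq. (31)
M, N = S.shape

r = np.array([0.30, 0.45, 0.20, 0.55, 0.40, 0.35])        # made-up row sums of A_U
U_hat_M = aggregated_upstreamness_rank1(r, S, T)

# Cross-check: aggregate the flat matrix explicitly and invert it.
A_hat_T = np.outer(np.ones(N), r) / N     # assumed transposed flat model
U_direct = np.ones(M) @ np.linalg.inv(np.eye(M) - S @ A_hat_T @ T)
print(U_hat_M)
```

The closed form and the explicit inversion of the aggregated flat matrix agree, and the function never needs more than $\varvec{r}$, S and T as inputs.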

Summary and outlook

In this paper, we have shown that the upstreamness and downstreamness measures introduced in the context of I–O analysis at both the inter-sectorial and country level can be faithfully recovered from the knowledge of aggregate and local information about the I–O table. In other words, the precise determination of the elements of the I–O matrix does not matter much, as long as their distribution does not deviate significantly from the “homogeneous” (flat) model (described in Eq. (21)), and the total intermediate demand per sector is ordinarily sufficient to provide an accurate estimate of the sector’s multipliers.

Our rank-1 approximation has been successfully tested on National I–O tables obtained from WIOD, where an excellent correlation is obtained between the empirical multipliers and the theoretical formulae (see Figs. 3 and 6). Small deviations from this remarkably robust regularity are readily attributed to stronger heterogeneity in the empirical sectorial data, which would require refinements to the (single or doubly constrained) rank-1 approximation presented here.

Indeed, sparser or more heterogeneous I–O matrices tend to have a larger spectral radius (or equivalently a smaller spectral gap), as demonstrated in Figs. 7 and 8. The quality of our rank-1 approximation is very high across the sectors and countries considered, but may be lower for empirical matrices with larger spectral radii, as more eigenvalues besides the largest (Perron–Frobenius) one start to play an important role.

In the section “Upstreamness under aggregation”, we have also shown that our rank-1 approximation is well-behaved with respect to aggregation of sectorial data: knowing what sectors/commodities are lumped together, and what their aggregate outputs are, is sufficient to determine a faithful approximation for the upstreamness values of the aggregate model, as the rank-1 nature of the approximation is preserved upon aggregation.

In a recent paper⁵⁹, we further employ the rank-1 approximation as a proxy to investigate the “puzzling” correlations observed between upstreamness and downstreamness at aggregate level⁴⁰. More generally, our approach based on a rank-1 approximation demonstrates that local and aggregate information about I–O tables is ordinarily sufficient to determine the upstreamness and downstreamness at sectorial and country level with high accuracy, while at the same time providing analytically tractable formulae (Eq. (14), (7)) that avoid matrix inversions altogether. The rank-1 formulae also prove useful to approximate centrality values of nodes in complex networks⁵²,⁶⁰. As an outlook for future research, it will be interesting to test the accuracy of our formulae on firm-level data, where data availability and sparsity are greater concerns. In spite of the sparser nature of the data, we would expect our approximation to work well, as recently shown in experiments conducted on synthetic data⁵².

Data availability

The datasets analysed during the current study are publicly available at https://www.rug.nl/ggdc/valuechain/wiod/wiod-2013-release?lang=en. The codes written for the analysis will be made available upon request to the corresponding author.

References

Leontief, W. Input–Output Economics (Oxford University Press, Oxford, 1986).
Leontief, W. Quantitative input–output relations in the economic system of the United States. Rev. Econ. Stat. 18, 105–125 (1936).
United Nations Department for Economic and Social Affairs Statistics Division. Handbook of input–output Table Compilation and Analysis (1999).
Antràs, P. & Chor, D. Organizing the global value chain. Econometrica 81(6), 2127–2204 (2012).
Antràs, P., Chor, D., Fally, T. & Hillberry, R. Measuring the upstreamness of production and trade flows. Am. Econ. Rev. Pap. Proc. 102(3), 412–416 (2012).
Fally, T. Production Staging: Measurement and Facts (unpublished). Available at: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=8ff103e6b2573a063bdfcac61ef73550b79467c7 (2012).
Miller, R. E. & Temurshoev, U. Output upstreamness and input downstreamness of industries/countries in world production. Int. Reg. Sci. Rev. 40(5), 443–475 (2017).
Bacilieri, A. & Astudillo-Estevez, P. Reconstructing firm-level input–output networks from partial information. arXiv preprint arXiv:2304.00081 (2023).
Kop Jansen, P. S. M. Analysis of multipliers in stochastic input–output models. Reg. Sci. Urban Econ. 24, 55–74 (1994).
Kop Jansen, P. & Ten Raa, T. The choice of model in the construction of input–output coefficients matrices. Int. Econ. Rev. 31(1), 213–227 (1990).
Evans, W. D. The effect of structural matrix errors on interindustry relations estimates. Econometrica 22(4), 461–480 (1954).
Quandt, R. E. Probabilistic errors in the Leontief system. Naval Res. Logist. Q. 5, 155–170 (1958).
Simonovits, A. A note on the underestimation and overestimation of the leontief Inverse. Econometrica 43, 493–498 (1975).
West, G. R. A stochastic analysis of an input–output model. Econometrica 54(2), 363–374 (1986).
Kogelschatz, H. On the Solution of Stochastic input–output-Models. Discussion Paper Series n. 447, University of Heidelberg (2007).
Kozicka, M. Novel approach to stochastic input–output modeling. RAIRO-Oper. Res. 53, 1155–1169. https://doi.org/10.1051/ro/2018046 (2019).
Sargento, A.L. Introducing input–output analysis at the regional level: Basic notions and specific issues. The Regional Economics Application Laboratory (REAL) https://api.semanticscholar.org/CorpusID:158457048 (2009).
Katz, J. L. & Burford, R. L. Shortcut formulas for output, income and employment multipliers. Ann. Reg. Sci. 19(2), 61–76 (1985).
Burford, R. L. Regional input–output multipliers without a full IO table. Ann. Reg. Sci. 11(3), 21–38 (1977).
Drake, R. L. A short-cut to estimates of regional input–output multipliers: Methodology and evaluation. Int. Reg. Sci. Rev. 1(2), 1–17 (1976).
Phibbs, P. J. & Holsman, A. J. An evaluation of the Burford Katz short cut technique for deriving input–output multipliers. Ann. Reg. Sci. 15(3), 11–19 (1981).
Jensen, R. C. & Hewings, G. J. D. Shortcut ‘input–output’ multipliers: The resurrection problem (a reply). Environ. Plan A 17(11), 1551–1552 (1985).
Jensen, R. C. & Hewings, G. J. D. Shortcut ‘input–output’ multipliers: A requiem. Environ. Plan A 17(6), 747–759 (1985).
Burford, R. L. & Katz, J. L. Shortcut ‘input–output’ multipliers, alive and well: Response to Jensen and Hewings. Environ. Plan A 17(11), 1541–1549 (1985).
Cerina, F., Zhu, Z., Chessa, A. & Riccaboni, M. World input–output network. PLoS ONE 10(7), e0134025. https://doi.org/10.1371/journal.pone.0134025 (2015).
McNerney, J., Savoie, C., Caravelli, F. & Farmer, J. D. How production networks amplify economic growth. PNAS 119(1), e2106031118 (2021) (2018).
Moran, J. & Bouchaud, J.-P. May’s instability in large economies. Phys. Rev. E 100, 032307 (2019).
del Rio-Chanona, R. M., Grujić, J. & Jensen, H. J. Trends of the World input and output network of global trade. PLoS ONE 12(1), e0170817. https://doi.org/10.1371/journal.pone.0170817 (2017).
Carvalho, V. M. Input–output networks: A survey. A report for the European Commission under the CRISIS consortium agreement. https://cordis.europa.eu/docs/projects/cnect/1/288501/080/deliverables/001-CRISISD31InputOutput.pdf (2012).
Acemoglu, D., Carvalho, V., Ozdaglar, A. & Tahbaz-Salehi, A. The network origins of aggregate fluctuations. Econometrica 80, 1977–2016 (2012).
Hidalgo, C., Bailey, K., Barabási, A.-L. & Hausmann, R. The product space conditions the development of nations. Science 317, 482–487 (2007).
Hidalgo, C. & Hausmann, R. The building blocks of economic complexity. PNAS 106, 10570–10575 (2009).
Tacchella, A., Cristelli, M., Caldarelli, G., Gabrielli, A. & Pietronero, L. A new metrics for countries fitness and products complexity. Sci. Rep. 2, 723 (2012).
Caldarelli, G. et al. A network analysis of countries’ export flows: Firm grounds for the building blocks of the economy. PLoS ONE 7(10), e47278. https://doi.org/10.1371/journal.pone.0047278 (2012).
Morrison, G. et al. On economic complexity and the fitness of nations. Sci. Rep. 7(1), 15332 (2017).
Servedio, V. D. P. et al. A new and stable estimation method of country economic fitness and product complexity. Entropy 20(10), 783 (2018).
Jacquemin, A. P. & Berry, C. H. Entropy measure of diversification and corporate growth. J. Ind. Econ. XXVII, 359 (1979).
Teza, G. et al. Growth dynamics and complexity of national economies in the global trade network. Sci. Rep. 8, 15230 (2018).
Teza, G., Caraglio, M. & Stella, A. L. Entropic measure unveils country competitiveness and product specialization in the World trade web. Sci. Rep. 11(1), 10189 (2021).
Antràs, P. & Chor, D. On the measurement of upstreamness and downstreamness in global value chains. In Working Paper 24185. http://www.nber.org/papers/w24185 (2018).
López, L. A., Arce, G. & Osorio, P. Foreign multinationals affiliates and countries’ carbon upstreamness. How could these firms support the fulfilment of emissions reduction targets?. J. Environ. Manage. 326, 116714 (2023).
Caraiani, P., Dutescu, A., Hoinaru, R. & Stănilă, G. O. Production network structure and the impact of the monetary policy shocks: Evidence from the OECD. Econ. Lett. 193, 109271 (2020).
Suganuma, K. Upstreamness in the global value chain: Manufacturing and services. Monetary Econ. Stud. 34, p. 39-66. https://EconPapers.repec.org/RePEc:ime:imemes:v:34:y:2016:p:39-66 (2016).
Timmer, M. P., Dietzenbacher, E., Los, B., Stehrer, R. & de Vries, G. J. An illustrated user guide to the world input–output database: The case of global automotive production. Rev. Int. Econ. 23, 575–605 (2015).
Squartini, T., Caldarelli, G., Cimini, G., Gabrielli, A. & Garlaschelli, D. Reconstruction methods for networks: The case of economic and financial systems. Phys. Rep. 757, 1–47 (2018).
Cimini, G., Squartini, T., Garlaschelli, D. & Gabrielli, A. Systemic risk analysis on reconstructed economic and financial networks. Sci. Rep. 5, 15758 (2015).
Bianconi, G. Mean field solution of the Ising model on a Barabási-Albert network. Phys. Lett. A 303, 166–168 (2002).
Dorogovtsev, S. N., Goltsev, A. V. & Mendes, J. F. F. Critical phenomena in complex networks. Rev. Mod. Phys. 80, 1275 (2008).
Park, J. & Newman, M. E. J. The statistical mechanics of networks. Phys. Rev. E 70, 066117 (2004).
Caldarelli, G., Capocci, A., De Los Rios, P. & Muñoz, M. A. Scale-free networks from varying vertex intrinsic fitness. Phys. Rev. Lett. 89, 258702 (2002).
Thibeault, V., Allard, A. & Desrosiers, P. The low-rank hypothesis of complex systems. Nat. Phys. 20, 294–302. https://doi.org/10.1038/s41567-023-02303-0 (2024).
Bartolucci, S., Caccioli, F., Caravelli, F. & Vivo, P. Ranking influential nodes in networks from aggregate local information. Phys. Rev. Res. 5, 033123 (2023). https://doi.org/10.1103/PhysRevResearch.5.033123.
Bartolucci, S., Caccioli, F., Caravelli, F. & Vivo, P. “Spectrally gapped” random walks on networks: A mean first passage time formula. SciPost Phys. 11(5), 088. https://doi.org/10.21468/SciPostPhys.11.5.088 (2021).
Sherman, J. & Morrison, W. J. Adjustment of an inverse matrix corresponding to a change in one element of a given matrix. Ann. Math. Stat. 21(1), 124–127 (1950).
Mosam, F., Vidaurre, D. & De Giuli, E. Breakdown of random matrix universality in Markov models. Phys. Rev. E 104(2), 024305 (2021).
Kymn, K. O. Aggregation in input–output models: A comprehensive review, 1946–71. Econ. Syst. Res. 2(1), 65–93 (1990).
Howe, E. C. & Johnson, C. R. Linear aggregation of input–output models. SIAM J. Matrix Anal. Appl. 10(1), 65–79 (1989).
Charnes, A. & Cooper, W. W. Management Models and Industrial Applications of Linear Programming Vol. I (Wiley, New York, 1961).
Bartolucci, S., Caccioli, F., Caravelli, F. & Vivo, P. Correlation between upstreamness and downstreamness in random global value chains. arXiv preprint arXiv:2303.06603 (2023).
Bartolucci, S., Caccioli, F., Caravelli, F. & Vivo, P. Distribution of centrality measures on undirected random networks via the cavity method. Proc. Natl. Acad. Sci. 121(40), e2403682121. https://doi.org/10.1073/pnas.2403682121 (2024).


Acknowledgements

We gratefully acknowledge insightful conversations with J. D. Farmer, F. Lafond, L. P. Garcia-Pinto and J. McNerney. The work of F. Caravelli was carried out under the auspices of the NNSA of the U.S. DoE at LANL under Contract No. DE-AC52-06NA25396. F. Caravelli was also financed via DOE LDRD grant 20240245ER. P.V. acknowledges support from UKRI Future Leaders Fellowship Scheme (No. MR/X023028/1).

Author information

Authors and Affiliations

Department of Computer Science, University College London, 66-72 Gower Street, London, WC1E 6EA, UK
Silvia Bartolucci & Fabio Caccioli
Systemic Risk Centre, London School of Economics and Political Sciences, London, WC2A 2AE, UK
Fabio Caccioli
London Mathematical Laboratory, 8 Margravine Gardens, London, WC 8RH, UK
Fabio Caccioli
T-Division (Center for Nonlinear Studies and T4), Los Alamos National Laboratory, Los Alamos, NM, 87545, USA
Francesco Caravelli
Department of Mathematics, King’s College London, Strand, London, WC2R 2LS, UK
Pierpaolo Vivo


Contributions

All authors conceived the idea, performed the numerical tests, wrote and revised the manuscript.

Corresponding author

Correspondence to Pierpaolo Vivo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Bartolucci, S., Caccioli, F., Caravelli, F. et al. Upstreamness and downstreamness in input–output analysis from local and aggregate information. Sci Rep 15, 2727 (2025). https://doi.org/10.1038/s41598-025-86380-6


Received: 16 February 2024
Accepted: 10 January 2025
Published: 21 January 2025
DOI: https://doi.org/10.1038/s41598-025-86380-6
