Figure 3

These diagrams showcase the construction of activations from previous activations and pixels for the RNN and the MDRNN. Note that the zero vector is used when previous activations are not available, e.g. \(h_{-1}=0\), \(h_{-1,j}=0\) and \(h_{i,-1}=0\). The grids on the right demonstrate the linking of those activations. In the RNN, an arrow from \(h_i\) to \(h_j\) indicates that \(h_i\) was used in the construction of \(h_j\), i.e. \(h_j=\tanh(AI_i+Bh_i)\). Note that the distance that activations need to travel before being used as context for nearby pixels differs significantly between the two approaches. For example, \(h_2\) must pass through 6 recurrent blocks before being used as context for \(h_8\) in the RNN, whilst \(h_{0,2}\) only requires 2 steps before being available as context in the MDRNN.
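The two constructions in the figure can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the RNN recurrence \(h_j=\tanh(AI_i+Bh_i)\) is taken from the caption, while the MDRNN combining rule (summing the activations from above and from the left, with a second recurrent matrix `C`) is an assumed standard form not stated here; the weight matrices and the 3x3 grid size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                         # hidden size (illustrative)
A = rng.normal(size=(d, 1)) * 0.1   # input weight
B = rng.normal(size=(d, d)) * 0.1   # recurrent weight

# 1-D RNN over a 3x3 image flattened to 9 pixels:
# h_j = tanh(A I_{j-1} + B h_{j-1}), with h_{-1} = 0 (zero vector).
I = rng.normal(size=(9, 1))
h = np.zeros(d)
rnn_h = []
for j in range(9):
    h = np.tanh(A @ I[j] + B @ h)
    rnn_h.append(h)

# MDRNN over the same image kept as a 3x3 grid: each activation
# combines the activations above and to the left, using the zero
# vector when they are unavailable (h_{-1,j} = h_{i,-1} = 0).
# Assumed rule: h_{i,j} = tanh(A I_{i,j} + B h_{i-1,j} + C h_{i,j-1}).
C = rng.normal(size=(d, d)) * 0.1
img = I.reshape(3, 3, 1)
H = np.zeros((3, 3, d))
for i in range(3):
    for j in range(3):
        up = H[i - 1, j] if i > 0 else np.zeros(d)
        left = H[i, j - 1] if j > 0 else np.zeros(d)
        H[i, j] = np.tanh(A @ img[i, j] + B @ up + C @ left)
```

In the flattened RNN, information from pixel 2 reaches pixel 8 only after six applications of the recurrence, whereas in the grid layout an activation reaches its lower-right neighbourhood in as few steps as the Manhattan distance between the cells.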