Is using causal convolution / padding equivalent to shifting the outputs back?

1.1k views Asked by At

I'm having some trouble understanding the purpose of causal convolutions. Suppose I'm doing time-series classification using a convolutional network with 1 layer and kernel=2, stride=1, dilation=0. Isn't it the same thing as shifting my output back by 1?

For larger networks, it would be a little more involved to take into account the parameters of all the layers to get the resulting receptive field to do a proper output shift. To me it seems, if there is some leak, you could always account for the leak by shifting the output back.

For example, if at time step $t_2$, a non-causal CNN sees $x_0, x_1, x_2, x_3, x_4$, then you'd use the target associated with $t_4$, i.e. $y_4$

Edit: I've seen diagrams for causal CNNs where all the arrows a right-aligned. I get that it's meant to illustrate that $y_t$ aligns to $x_t$, but couldn't you just as easily draw them like this:

Non_Causal CNN Right-aligned

1

There are 1 answers

4
jhso On

The point of causal convolutions is not to see 'future' data. This is important in real time sequential analysis because we won't have access to new information before it happens, however we typically do in training (due to having the whole training sequence). Therefore, causal convolutions begin t-k//2 and end at t (where t = current timestep and k = kernel size), rather than a typical convolution which starts at t-k//2 and end at t+k//2. This can be imagined as a 1-sided kernel, where instead of having the target pixel/sample be in the centre of the kernel, it's now the rightmost (going from L-R) part of the kernel.

Using your example, if the top orange dot in the following picture is t_n, then t_n has a receptive field stemming from t_n-4 to t_n due to it having a kernel size of 2 and 4 layers.

enter image description here

Compare that to a noncausal convolution (ignore the dilated convolution on the right), where the receptive field stems from t_n-3 to t_n+3 due to it being a double-sided kernel:

enter image description here