De-noising data

Junk Charts 2013-06-12

One of the most important steps in analyzing data is to remove noise. First, we have to identify where the noise is; then we find ways to reduce it, which has the effect of surfacing the signal.

The labor force participation rate data, discussed here and here, can be decomposed into two components, known as the trend and the residuals. (See right.) The residuals are the raw data minus the trend; in other words, they are what remains of the data after removing the trend.
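The subtraction described above can be sketched in a few lines of Python. This is only an illustration: the post does not say which method produced the trend, so the centered moving average used here, the `decompose` function, its `window` parameter, and the sample numbers are all assumptions rather than the actual analysis.

```python
import numpy as np

def decompose(series, window=5):
    """Split a series into a moving-average trend and residuals.

    The moving average is an assumed stand-in for whatever trend
    estimator was actually used; `window` must be a positive odd integer.
    """
    x = np.asarray(series, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    # Repeat the edge values so the trend has the same length as the data.
    padded = np.pad(x, half, mode="edge")
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")
    # Residuals are the raw data minus the trend.
    residuals = x - trend
    return trend, residuals

# Made-up numbers for illustration, not the real participation rates.
rate = [66.0, 66.2, 66.1, 65.9, 65.7, 65.8, 65.4, 65.2, 65.3, 65.0]
trend, residuals = decompose(rate, window=5)
```

By construction, `trend + residuals` reproduces the raw data exactly, so no information is lost; the decomposition only separates the slow movement from what is left over.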

If the purpose of the analysis is to describe the evolution of the labor force participation rate over time, then the trend is the signal we're after.

My purpose is the opposite. I want to remove the trend in order to surface correlations that are unrelated to time evolution. Thus, the residuals are where the signal is.

Another way to think about the residuals (bottom chart) is that positive values imply the actual data was above trend while negative values imply the actual data was below trend.

***

After decomposing the miles-driven data in the same way, I obtain two sets of residuals, one for labor force participation and one for miles driven. These were plotted against each other in a scatter plot in the last post.

The lack of correlation is also obvious in the plot below. The periods when one series of residuals went above trend were not well correlated with the periods when the other series was above trend (or below trend).

[Figure: the two series of residuals plotted over time]
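One way to put a number on what the plot shows is to compute the correlation between the two residual series, along with the share of periods in which both sit on the same side of their trends. The sketch below is a rough illustration, not the calculation behind the post: the function names and the residual values are made up.

```python
import numpy as np

def residual_correlation(res_a, res_b):
    """Pearson correlation between two equal-length residual series."""
    a = np.asarray(res_a, dtype=float)
    b = np.asarray(res_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("residual series must have the same length")
    return float(np.corrcoef(a, b)[0, 1])

def same_side_share(res_a, res_b):
    """Share of periods where both series are above (or both below) trend."""
    a = np.asarray(res_a, dtype=float)
    b = np.asarray(res_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("residual series must have the same length")
    return float(np.mean(np.sign(a) == np.sign(b)))

# Hypothetical residuals, invented for illustration only.
labor_res = [0.2, -0.1, 0.3, -0.2, 0.1, -0.3]
miles_res = [0.1, 0.2, -0.1, -0.2, 0.3, -0.1]

r = residual_correlation(labor_res, miles_res)
share = same_side_share(labor_res, miles_res)
```

A correlation near zero, together with a same-side share near one half, would match the visual impression that one series being above trend says little about where the other one is.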