So here is some great news: an awesome book chapter about time series analysis!
And here is the link I used: http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc41.htm
I’m going to go through my notes on the content here. The best part of the introduction is one of the applications of time series analysis:
- Fit a model and proceed to forecasting, monitoring or even feedback and feedforward control.
But I didn’t know what feedback or feedforward control were, and the best explanation I could find was on this Q&A. Based on what’s mentioned there, this is what I’ve understood:
- Feedback control: Given some data points, I try to forecast the time series, then compare the forecast with the actual data as it arrives to measure the error and improve the adjustments.
- Feedforward control: I got really lost on this one, so I’ll research further.
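To make the feedback idea a bit more concrete, here is a minimal sketch of how I read it: forecast, wait for the actual value, measure the error, and nudge the next forecast by a fraction of that error. The `gain` parameter and the function name are my own assumptions for illustration, not something the chapter or the Q&A defines.

```python
def feedback_forecast(series, gain=0.5):
    """Forecast each point, then correct the forecast by a fraction of the error.

    `gain` is a hypothetical tuning knob (0 = ignore errors, 1 = jump to the
    latest actual value); it is not taken from the chapter.
    """
    forecast = series[0]  # assumption: start from the first observation
    history = []
    for actual in series[1:]:
        error = actual - forecast  # compare the forecast with the new actual value
        history.append((forecast, actual, error))
        forecast = forecast + gain * error  # feed the error back into the next forecast
    return history


for forecast, actual, error in feedback_forecast([20, 25, 33, 15, 8, 13, 18]):
    print(f"forecast={forecast:.2f} actual={actual} error={error:.2f}")
```

With a small `gain` the forecast reacts slowly to new data, and with a `gain` close to 1 it chases the latest value.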
So I moved forward in the chapter, and it already mentions a simple technique to reveal the underlying trend, seasonal and cyclic components: smoothing. There are two groups of smoothing methods:
- Averaging Methods
- Exponential Smoothing Methods (why exponential?)
The chapter starts with the first group and gives the example of the mean as a predictor, but it’s easy to see that the mean is a poor predictor for a time series with a trend, so I won’t go further on that here. What comes next seemed more interesting.
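As a quick aside on the mean as a predictor, here is a minimal sketch of why it struggles when the data has a trend. It uses the steadily rising list from the moving-average example in this post; the variable names are just mine.

```python
from statistics import mean

# The rising series used in the moving-average example of this post.
x = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

overall_mean = mean(x)  # one number used as the prediction for every point
errors = [point - overall_mean for point in x]
mse = mean(error ** 2 for error in errors)

print("prediction:", overall_mean)
print("errors:", errors)  # negative at the start, positive at the end
print("MSE:", mse)
```

The errors run from negative to positive instead of scattering around zero, which is the trend that a single mean cannot follow.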
Single Moving Average
The process is simple: take small subsets of the data, each made of successive data points, and average each subset. So imagine the following data points:
x = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
and we select a window of m=3. I take the first three data points and average them, then use that average as the prediction for the point at the end of the “time window”. For example:
(3 + 4 + 5) / 3 = 12 / 3 = 4,
so the error would be 5 - 4 = 1. The next one would then be:
(4 + 5 + 6) / 3 = 15 / 3 = 5, so the error is 6 - 5 = 1.
So I built a Python script to calculate that. Let’s see how it works; this is the code:
    from statistics import mean

    data = [20, 25, 33, 15, 8, 13, 18]
    errors = []
    average_list = []
    time_window = 3

    for point in data:
        average_list.append(point)
        if len(average_list) == time_window:
            print(average_list)
            # error: newest point minus the mean of the current window
            errors.append(point - mean(average_list))
            # drop the oldest point so the window slides forward
            average_list.pop(0)

    # baseline: squared errors when the overall mean of the data is the predictor
    errors_mean = [(x - mean(data)) ** 2 for x in data]

    print("MSE: " + str(mean([error ** 2 for error in errors])))
    print("MSE by mean would be: " + str(mean(errors_mean)))

You can check the results for yourself: on this data the moving average gives a lower MSE than the plain mean (the smaller the better; if you don’t know what MSE is, here is a link).
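To tie the script back to the hand calculation, here is a sketch that wraps the same sliding window in a function and runs it on the x list from earlier with m = 3. The function name `moving_average_errors` and its return shape are just my choices for illustration.

```python
from statistics import mean


def moving_average_errors(data, window=3):
    """Return (window_values, window_mean, error) for each full window.

    The error is the last point in the window minus the window mean,
    matching the hand calculation in the post. The function name and
    return shape are illustrative, not from the chapter.
    """
    results = []
    for end in range(window, len(data) + 1):
        values = data[end - window:end]
        avg = mean(values)
        results.append((values, avg, values[-1] - avg))
    return results


x = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
rows = moving_average_errors(x, window=3)
for values, avg, error in rows:
    print(values, "mean:", avg, "error:", error)

print("MSE:", mean(error ** 2 for _, _, error in rows))
```

Because x rises by exactly 1 at every step, each window ends one unit above its own mean, so every error is 1 and the MSE is 1.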
So that’s my contribution for today. I’ll try to explore more of the book and share new stuff tomorrow. See you all.