What Is Time Series Analysis?

Megaputer Intelligence · Published in Geek Culture · 5 min read · Mar 29, 2021

Many types of models are built from individual samples of data. We may have conducted a study, collected vital signs from different patients, and are now fitting a predictive model. These patients are unique, independent cases that collectively make up a trend. However, many kinds of systems are not collections of individual data points; they are a sequence of observations of one thing over time. The fluctuations in the stock market, global temperatures, oil sales, the activity of solar flares, and so on are not baskets of different observations but sequences of the same object through time. We would therefore like to model how this object changes over time. When we use time and past observations of a variable to model its behavior and make forecasts, this is called time series analysis.

As the name implies, time series models are used to predict what will happen in the future. The goal is to learn patterns from the past and use those for forecasting. These models are used in many applications such as predicting the weather and predicting economic activity for investors. There are many kinds of time series analysis, some more complicated than others, but today we will examine a relatively straightforward yet powerful one: ARIMA. ARIMA, or Autoregressive Integrated Moving Average, is built out of several components. Let’s tackle the Autoregressive portion first.

Autoregression

The concept of autoregression is simple: what happens today is strongly correlated with what happened yesterday, mildly correlated with what happened last week, weakly correlated with what happened last month, and so on. Essentially, we would like to use past values of our variable to predict future ones. That is, the value at time t is modeled as a constant, c, plus a weighted sum of the past p values, plus some Gaussian noise (error):

y_t = c + φ_1·y_{t−1} + φ_2·y_{t−2} + … + φ_p·y_{t−p} + ε_t

This equation may look familiar. It is just a linear regression using past values of our variable as the input variables. This is where the “auto” part of autoregression comes from. We are regressing a variable on itself. And we can easily fit this model using Ordinary Least Squares; we just need to select the parameter p, which is how far back we want to look.
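As a sketch of that idea (the function names and the toy data here are illustrative, not from the article), an AR(p) model can be fit by Ordinary Least Squares using nothing but NumPy:

```python
import numpy as np

def fit_ar(y, p):
    """Fit an AR(p) model by Ordinary Least Squares.

    Builds a design matrix of lagged values and solves for the
    intercept c and the coefficients phi_1 .. phi_p.
    """
    y = np.asarray(y, dtype=float)
    # Row for time t holds [1, y[t-1], ..., y[t-p]]
    X = np.column_stack(
        [np.ones(len(y) - p)]
        + [y[p - k:len(y) - k] for k in range(1, p + 1)]
    )
    coefs, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coefs  # [c, phi_1, ..., phi_p]

def ar_one_step(y, coefs):
    """One-step-ahead forecast from the last p observations."""
    p = len(coefs) - 1
    return coefs[0] + np.dot(coefs[1:], y[-1:-p - 1:-1])
```

On data simulated from a known AR(1) process, the recovered intercept and coefficient land close to the true values, which is exactly the "regressing a variable on itself" picture.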

Moving Averages

Autoregression is fairly intuitive. Moving averages, however, are a bit trickier to understand. In a moving average model, we don’t use the actual values of our past observations, but rather the errors they accrued. More formally, a moving average model is another type of linear regression, but instead of regressing on past observations of the data, we regress on their past error terms:

y_t = μ + ε_t + θ_1·ε_{t−1} + … + θ_q·ε_{t−q}

Moving averages are mathematically simple but tricky because it is hard to form helpful analogies for them. One example is a person trying to throw a stone as far as possible. The first throw has some deviation, some “noise” or “error,” from their typical throw. The second throw may have a deviation related to the deviation of the first: maybe a good first throw gives them confidence and they are likely to continue that success, or maybe a poor throw dejects them and makes another poor throw more likely. In a moving average model, there is momentum in these deviations.
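That momentum can be sketched by simulating a first-order moving average process (the name `simulate_ma1` and the parameter values are invented here for illustration):

```python
import numpy as np

def simulate_ma1(mu, theta, n, seed=0):
    """Simulate an MA(1) process: y_t = mu + e_t + theta * e_{t-1}.

    Each observation is the mean plus today's random shock plus a
    weighted copy of yesterday's shock -- the momentum in the errors.
    """
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n + 1)  # e[0] is the shock before the series begins
    return mu + e[1:] + theta * e[:-1]

y = simulate_ma1(mu=10.0, theta=0.8, n=5000)
# For an MA(1) process, theory says the lag-1 autocorrelation
# equals theta / (1 + theta**2), and lags beyond 1 are uncorrelated.
```

Only the previous shock carries over, so the correlation dies out abruptly after one lag, which is a signature analysts look for when choosing the MA order.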

Integrated

We can combine the two models described above to form an ARMA (Autoregressive Moving Average) model, or even an ARIMA. The details of the “integrated” portion of ARIMA are too technical to cover in depth here. In short, a series follows an ARIMA model if the differences between successive observations (or, for a continuous time series, the derivative) follow an ARMA model. Practically, it means we can choose whether to model the data directly, or to model its rate of change and then accumulate (integrate) that change back into the original scale.
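A tiny sketch (with made-up data) shows the round trip: differencing removes a trend, and cumulatively summing the differences recovers the original series:

```python
import numpy as np

# A series with a linear trend is not stationary, but its first
# difference is: differencing removes the trend entirely.
t = np.arange(10, dtype=float)
y = 3.0 * t + 5.0            # grows steadily over time
dy = np.diff(y)              # the series of changes: constant here

# "Integrating" (cumulatively summing) the differences, starting
# from the first observation, reconstructs the original series.
recovered = np.concatenate(([y[0]], y[0] + np.cumsum(dy)))
```

This is why the "I" in ARIMA is cheap to undo: forecasts made on the differenced scale can always be accumulated back to the original scale.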

Seasonality

ARIMA models are good but they often fall short when data exhibits seasonal behavior. For example, consider temperatures over a period of years.

There is a strong cyclical pattern, as summer is hotter than winter. The data may be strongly correlated with the immediate past, but it is also strongly correlated at some fixed periodic interval. Temperatures today depend on what they were last week, very little on what they were six months ago, and moderately on what they were a year ago. We can include autoregressive and moving average terms that, instead of looking back several time steps, look back several periods of the seasonal structure. ARIMA combined with this seasonal model is called Seasonal ARIMA, or SARIMA.
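The seasonal counterpart of differencing can be sketched on a toy temperature series (the numbers below are invented for illustration): instead of subtracting yesterday's value, subtract the value one full season back.

```python
import numpy as np

period = 12                          # e.g. monthly data with a yearly cycle
months = np.arange(10 * period)
# Toy temperatures: a yearly sine cycle plus a slow warming trend
y = 20 + 10 * np.sin(2 * np.pi * months / period) + 0.01 * months

# Seasonal differencing compares each point with the same point one
# season earlier (this month vs. the same month last year).
seasonal_diff = y[period:] - y[:-period]
# The cycle cancels exactly; only the steady year-over-year change
# (0.01 per month * 12 months) remains.
```

SARIMA models apply autoregressive, moving average, and differencing terms at this seasonal lag alongside the ordinary short-lag terms.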

Using ARIMA Models

Training an ARIMA model is fast and easy; the hard part is choosing the hyperparameters, namely how far back to look and whether there is seasonality. As with traditional linear regression, there are statistical tools to help make that decision. Autocorrelation and partial autocorrelation correlograms help visualize where to place these cutoffs, and when fitting a model we can examine the distribution of our errors and compare measures like AIC and BIC.

Once we have selected a model, we can use it to make forecasts and, through statistical calculations or simulation, attach a band of likelihood to them. Like any regression, our predictions are almost never exactly right: it is unlikely we get exactly the right price of a stock or the exact temperature. However, we can construct confidence intervals, regions within which we are 95% (or some other level) sure the future value will fall. All of this is achieved simply by looking at past values of our desired variable, and time series models can be made more powerful still by including independent input variables in the regression.
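As a rough illustration of the correlogram idea (the helper name `sample_acf` is invented here), the sample autocorrelation function behind those plots can be computed directly:

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelation function, the basis of a correlogram.

    acf[k] measures how strongly the series correlates with a copy
    of itself shifted back k steps; spikes at a fixed lag suggest
    seasonality with that period.
    """
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = (d * d).sum()
    return np.array([1.0] + [(d[k:] * d[:-k]).sum() / denom
                             for k in range(1, max_lag + 1)])
```

On a series that repeats every 4 steps, the autocorrelation at lag 4 is near 1 while nearby lags are much weaker, which is the visual cue analysts use to pick the seasonal period and the lookback cutoffs.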

ARIMA models are another example of an excellent application of statistics in that they are:

  • Formalized
  • Easy to understand
  • Predictable
  • Interpretable
  • Effective in practice

Megaputer Intelligence is a data and text analysis firm that specializes in natural language processing.