Fit ARIMA(p,d,q) models to time series data. Estimate parameters, evaluate model fit with AIC/BIC, generate forecasts, and export Python statsmodels code.
import pandas as pd
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller
import matplotlib.pyplot as plt
# Data
data = pd.Series([100, 102, 105, 103, 107, 110, 108, 112, 115, 113, 117, 120, 118, 122, 125, 123, 127, 130, 128, 132, 135, 133, 137, 140])
# ADF test for stationarity
adf_result = adfuller(data)
print(f"ADF Statistic: {adf_result[0]:.4f}")
print(f"p-value: {adf_result[1]:.4f}")
print(f"Stationary: {'Yes' if adf_result[1] < 0.05 else 'No (consider differencing)'}")
# Fit ARIMA(1,1,0)
model = ARIMA(data, order=(1, 1, 0))
fitted = model.fit()
print("\nARIMA(1,1,0) Results:")
print(fitted.summary())
# Forecast: a single get_forecast call supplies both point forecasts and 95% intervals
forecast_res = fitted.get_forecast(steps=6)
forecast = forecast_res.predicted_mean
conf_int = forecast_res.conf_int()
print("\nForecast (6 periods):")
for i, (fc, lo, hi) in enumerate(zip(forecast, conf_int.iloc[:, 0], conf_int.iloc[:, 1])):
    print(f"  t+{i+1}: {fc:.2f} [{lo:.2f}, {hi:.2f}]")
# Diagnostics
print(f"\nAIC: {fitted.aic:.2f}")
print(f"BIC: {fitted.bic:.2f}")
# Plot
fig, axes = plt.subplots(2, 1, figsize=(12, 8))
# Forecast plot
axes[0].plot(data.index, data, 'ko-', label='Actual', alpha=0.6, markersize=4)
# Skip the first fitted value: with d=1 it is a start-up prediction, not a real fit
axes[0].plot(data.index[1:], fitted.fittedvalues.iloc[1:], 'b-', label='Fitted', linewidth=2)
fc_idx = range(len(data), len(data) + 6)
axes[0].plot(fc_idx, forecast, 'r--', label='Forecast', linewidth=2)
axes[0].fill_between(fc_idx, conf_int.iloc[:, 0], conf_int.iloc[:, 1], alpha=0.2, color='red')
axes[0].legend()
axes[0].set_title('ARIMA(1,1,0) Forecast')
axes[0].grid(True, alpha=0.3)
# Residuals
# Skip the first residual: with d=1 it reflects the start-up value, not model error
axes[1].plot(fitted.resid.iloc[1:], 'g-', alpha=0.7)
axes[1].axhline(y=0, color='k', linestyle='--')
axes[1].set_title('Residuals')
axes[1].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Auto ARIMA (requires pmdarima)
# from pmdarima import auto_arima
# auto_model = auto_arima(data, seasonal=False, trace=True)
# print(auto_model.summary())

In ARIMA(p,d,q), p is the AR order (number of autoregressive lags), d is the differencing order (how many times the series is differenced to reach stationarity), and q is the MA order (number of moving-average lags). ARIMA(1,1,0) therefore means one AR lag, one round of differencing, and no MA terms. Use ACF/PACF plots of the differenced series to identify p and q; use the ADF test to choose d.
To choose d, run the Augmented Dickey-Fuller (ADF) test. Its null hypothesis is that the series has a unit root, so a p-value above 0.05 means non-stationarity cannot be rejected: apply first differencing (d=1) and test the differenced series again. Most economic and business time series need d=1; d=2 is rarely needed.
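AIC and BIC both trade off model fit against complexity. In general, AIC = 2k − 2·ln(L) and BIC = k·ln(n) − 2·ln(L), where L is the maximized likelihood, k the number of estimated parameters, and n the number of observations. For Gaussian errors these reduce, up to an additive constant, to AIC = n·ln(σ²) + 2k and BIC = n·ln(σ²) + k·ln(n), with σ² the estimated residual variance. BIC penalizes complexity more heavily whenever ln(n) > 2 (n ≥ 8), so it prefers simpler models. Use AIC when the goal is prediction accuracy and BIC when the goal is identifying the true model. Lower values indicate a better model, but only compare models fitted to the same data.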
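A short worked example may make the different penalties concrete. It uses the likelihood-based forms AIC = 2k − 2·ln(L) and BIC = k·ln(n) − 2·ln(L); the log-likelihoods and parameter counts below are made-up numbers for illustration, not results from any fitted model.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion from a maximized log-likelihood."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion from a maximized log-likelihood."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical candidates fitted to the same n = 24 observations
n = 24
simple = {"llf": -50.0, "k": 2}   # fewer parameters, slightly worse fit
richer = {"llf": -47.5, "k": 4}   # more parameters, slightly better fit

aic_simple, aic_richer = aic(simple["llf"], simple["k"]), aic(richer["llf"], richer["k"])
bic_simple, bic_richer = bic(simple["llf"], simple["k"], n), bic(richer["llf"], richer["k"], n)

print(f"AIC  simple={aic_simple:.2f}  richer={aic_richer:.2f}")
print(f"BIC  simple={bic_simple:.2f}  richer={bic_richer:.2f}")
print("AIC prefers:", "simple" if aic_simple < aic_richer else "richer")
print("BIC prefers:", "simple" if bic_simple < bic_richer else "richer")
```

With these numbers the two criteria disagree: the extra two parameters cost 4 points under AIC but about 6.4 points under BIC, which is enough to flip the ranking.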
auto_arima (from pmdarima) searches over ARIMA(p,d,q) orders and selects the one with the lowest information criterion (AIC by default). It also chooses d using stationarity tests and can fit seasonal models. It is useful for quick modeling without manual ACF/PACF analysis.
Use SARIMA (Seasonal ARIMA) when your data has seasonal patterns. SARIMA adds seasonal AR, differencing, and MA parameters: SARIMA(p,d,q)(P,D,Q)m where m is the seasonal period. Use when ACF shows spikes at seasonal lags (e.g., lag 12 for monthly data).