使用sktime进行时间序列预测zz - 风生水起

使用sktime预测

在预测中，我们对利用过去的数据进行对未来进行预测很感兴趣。sktime提供了常用的统计预测算法和用于建立复合机器学习模型的工具。

更多细节，请看我们关于用sktime进行预测的论文，其中我们更详细地讨论了 forecasting API，并使用它来复制和扩展M4研究。

准备工作

[2]:
from warnings import simplefilter

import numpy as np
import pandas as pd

from sktime.datasets import load_airline
from sktime.forecasting.arima import ARIMA, AutoARIMA
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.compose import (
    EnsembleForecaster,
    ReducedRegressionForecaster,
    TransformedTargetForecaster,
)
from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.forecasting.model_selection import (
    ForecastingGridSearchCV,
    SlidingWindowSplitter,
    temporal_train_test_split,
)
from sktime.forecasting.naive import NaiveForecaster
from sktime.forecasting.theta import ThetaForecaster
from sktime.forecasting.trend import PolynomialTrendForecaster
from sktime.performance_metrics.forecasting import sMAPE, smape_loss
from sktime.transformations.series.detrend import Deseasonalizer, Detrender
from sktime.utils.plotting import plot_series

simplefilter("ignore", FutureWarning)
%matplotlib inline

数据

首先，我们使用Box-Jenkins单变量航空数据集，该数据集显示了1949-1960年每月的国际航空乘客数量。

[3]:
y = load_airline()
plot_series(y);

一个时间序列由一系列时间点-数值对组成，其中数值代表我们观察到的数值，时间点代表我们观察到该数值的时间点。

我们将时间序列表示为pd.Series，其中索引代表时间点。sktime支持pandas的integer, period 和timestamp。在这个例子中，我们有一个period index。

[4]:
y.index
[4]:
PeriodIndex(['1949-01', '1949-02', '1949-03', '1949-04', '1949-05', '1949-06',
             '1949-07', '1949-08', '1949-09', '1949-10',
             ...
             '1960-03', '1960-04', '1960-05', '1960-06', '1960-07', '1960-08',
             '1960-09', '1960-10', '1960-11', '1960-12'],
            dtype='period[M]', name='Period', length=144, freq='M')

明确预测任务

接下来我们将定义一个预测任务。

我们将尝试预测最近3年的数据，使用前几年的数据作为训练数据。系列中的每一个点代表一个月，所以我们应该保留最后36个点作为测试数据，并使用36步的超前预测范围来评估预测性能。
我们将使用sMAPE（对称平均绝对误差百分比）来量化我们预测的准确性。较低的sMAPE意味着较高的准确性。

我们可以对数据进行如下拆分。

[5]:
y_train, y_test = temporal_train_test_split(y, test_size=36)
plot_series(y_train, y_test, labels=["y_train", "y_test"])
print(y_train.shape[0], y_test.shape[0])
108 36

当我们进行预测时，我们需要指定预测范围，并将其传递给我们的预测算法。

相对预测范围

One of the simplest ways is to define a np.array with the steps ahead that you want to predict relative to the end of the training series.

（这句不知咋翻译好）

[6]:
fh = np.arange(len(y_test)) + 1
fh
[6]:
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
       35, 36])

所以这里我们感兴趣的是从第一步到第三十六步的预测。当然你也可以你使用其他的预测范围。例如，如果只预测前面的第二步和第五步，你可以写：

import numpy as np
fh = np.array([2, 5])  # 2nd and 5th step ahead

绝对预测范围

另外，我们也可以使用我们想要预测的绝对时间点来指定预测范围。为了做到这一点，我们需要使用sktime的ForecastingHorizon类。这样，我们就可以简单地从测试集的时间点中创建预测范围。

[7]:
fh = ForecastingHorizon(y_test.index, is_relative=False)
fh
[7]:
ForecastingHorizon(['1958-01', '1958-02', '1958-03', '1958-04', '1958-05', '1958-06',
             '1958-07', '1958-08', '1958-09', '1958-10', '1958-11', '1958-12',
             '1959-01', '1959-02', '1959-03', '1959-04', '1959-05', '1959-06',
             '1959-07', '1959-08', '1959-09', '1959-10', '1959-11', '1959-12',
             '1960-01', '1960-02', '1960-03', '1960-04', '1960-05', '1960-06',
             '1960-07', '1960-08', '1960-09', '1960-10', '1960-11', '1960-12'],
            dtype='period[M]', name='Period', freq='M', is_relative=False)

进行预测

像在scikit-learn中一样，为了进行预测，我们需要先指定（或建立）一个模型，然后将其拟合到训练数据中，最后调用predict来生成给定预测范围的预测。

sktime附带了几种预测算法（或叫forecasters）和建立综合模型的工具。所有forecasters都有一个共同的界面。forecasters根据单一系列数据进行训练，并对所提供的预测范围进行预测。

先来两个naïve预测策略，可以作为比较复杂方法的参考。

预测最后的数值

[8]:
# using sktime
forecaster = NaiveForecaster(strategy="last")
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_pred, y_test)
[8]:
0.23195770387951434

预测同季最后的数值

[9]:
forecaster = NaiveForecaster(strategy="last", sp=12)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_pred, y_test)
[9]:
0.145427686270316

为什么不直接用scikit-learn？

你可能会有疑问，为什么我们不干脆用scikit-learn来做预测呢？预测说到底不就是一个回归问题吗？

原则上，是的。但是 scikit-learn 并不是为解决预测任务而设计的，所以要小心陷阱!

[10]:
from sklearn.model_selection import train_test_split

y_train, y_test = train_test_split(y)p
plot_series(y_train.sort_index(), y_test.sort_index(), labels=["y_train", "y_test"]);

这就导致了

你用来训练机器学习算法的数据恰好有你想要预测的信息。

但是train_test_split(y, shuffle=False)是可以的，这就是sktime中temporal_train_test_split(y)的作用。

[11]:
y_train, y_test = temporal_train_test_split(y)
plot_series(y_train, y_test, labels=["y_train", "y_test"]);

为了使用scikit-learn，我们必须首先将数据转换为所需的表格格式，然后拟合regressor ，最后生成预测。

关键思想：精简

预测通常是通过回归来解决的。这种方法有时被称为还原法，因为我们将预测任务还原为更简单但相关的表格回归任务。这样就可以对预测问题应用任何回归算法。

精简为回归的工作原理如下。我们首先需要将数据转化为所需的表格格式。我们可以通过将训练序列切割成固定长度的窗口，并将它们叠加在一起来实现。我们的目标变量由每个窗口的后续观测值组成。

我们可以写一些代码来做到这一点，例如在M4比赛中。

[12]:
# slightly modified code from the M4 competition
def split_into_train_test(data, in_num, fh):
    """
    Splits the series into train and test sets.

    Each step takes multiple points as inputs
    :param data: an individual TS
    :param fh: number of out of sample points
    :param in_num: number of input points for the forecast
    :return:
    """
    train, test = data[:-fh], data[-(fh + in_num) :]
    x_train, y_train = train[:-1], np.roll(train, -in_num)[:-in_num]
    x_test, y_test = test[:-1], np.roll(test, -in_num)[:-in_num]
    #     x_test, y_test = train[-in_num:], np.roll(test, -in_num)[:-in_num]

    # reshape input to be [samples, time steps, features]
    # (N-NF samples, 1 time step, 1 feature)
    x_train = np.reshape(x_train, (-1, 1))
    x_test = np.reshape(x_test, (-1, 1))
    temp_test = np.roll(x_test, -1)
    temp_train = np.roll(x_train, -1)
    for x in range(1, in_num):
        x_train = np.concatenate((x_train[:-1], temp_train[:-1]), 1)
        x_test = np.concatenate((x_test[:-1], temp_test[:-1]), 1)
        temp_test = np.roll(temp_test, -1)[:-1]
        temp_train = np.roll(temp_train, -1)[:-1]

    return x_train, y_train, x_test, y_test
[13]:
# here we split the time index, rather than the actual values,
# to show how we split the windows
feature_window, target_window, _, _ = split_into_train_test(
    np.arange(len(y)), 10, len(fh)
)

为了更好地理解事先的数据转换，我们可以看看如何将训练序列分割成窗口。这里我们展示了以整数指数表示的生成窗口。

[14]:
feature_window[:5, :]
[14]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13]])
[15]:
target_window[:5]
[15]:
array([10, 11, 12, 13, 14])
[16]:
# now we can split the actual values of the time series
x_train, y_train, x_test, y_test = split_into_train_test(y.values, 10, len(fh))
print(x_train.shape, y_train.shape)
(98, 10) (98,)
[17]:
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor()
model.fit(x_train, y_train)
[17]:
RandomForestRegressor()

这里有哪些潜在的隐患？

这需要大量的手写代码，而这些代码往往容易出错，不模块化，也不可调。还需要注意的是，这些步骤涉及到一些隐含的超参数。将时间序列切成窗口的方式（如窗口长度） _ 生成预测的方式（递归策略、直接策略、其他混合策略） _ 递归策略是指将时间序列切成窗口的方式。

陷阱三：给定一个拟合回归算法，我们如何生成预测？

[18]:
print(x_test.shape, y_test.shape)

# add back time index to y_test
y_test = pd.Series(y_test, index=y.index[-len(fh) :])
(36, 10) (36,)
[19]:
y_pred = model.predict(x_test)
smape_loss(pd.Series(y_pred, index=y_test.index), y_test)
[19]:
0.11455911283150787

但这里的问题是什么？

实际上，我们并不进行多步预测，直到第36步。取而代之的是，我们总是使用最新的数据进行36个单步前的预测。但这是另一种学习任务的解决方案!

为了解决这个问题，我们可以像M4比赛中一样，写一些代码来进行递归。

[20]:
# slightly modified code from the M4 study
predictions = []
last_window = x_train[-1, :].reshape(1, -1)  # make it into 2d array

last_prediction = model.predict(last_window)[0]  # take value from array

for i in range(len(fh)):
    # append prediction
    predictions.append(last_prediction)

    # update last window using previously predicted value
    last_window[0] = np.roll(last_window[0], -1)
    last_window[0, (len(last_window[0]) - 1)] = last_prediction

    # predict next step ahead
    last_prediction = model.predict(last_window)[0]

y_pred_rec = pd.Series(predictions, index=y_test.index)
smape_loss(y_pred_rec, y_test)
[20]:
0.15670668827071418

使用sktime预测

sktime为这种方法提供了一个meta-estimator，即：

模块化，并且兼容scikit-learn，因此我们可以很容易地应用任何scikit-learn回归器来解决我们的预测问题。
可调整，允许我们调整超参数，如窗口长度或策略，以生成预测。
自适应，即它将scikit-learn的估计器界面调整为预测器界面，确保我们能够调整并正确评估我们的模型。

[21]:
y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=36)
print(y_train.shape[0], y_test.shape[0])
108 36
[22]:
from sklearn.neighbors import KNeighborsRegressor

regressor = KNeighborsRegressor(n_neighbors=1)
forecaster = ReducedRegressionForecaster(
    regressor=regressor, window_length=12, strategy="recursive"
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[22]:
0.14008272913734346

sktime有许多统计预测算法，基于statsmodels的实现。例如，为了使用带有加法趋势成分和乘法季节性的指数平滑算法，我们可以写如下：

请注意，由于这是月度数据，季节性周期(sp)或每年的周期数为12。

[23]:
forecaster = ExponentialSmoothing(trend="add", seasonal="multiplicative", sp=12)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[23]:
0.05108252343492944

状态空间模型的指数平滑也可以类似于R中的ets函数自动进行。

[24]:
from sktime.forecasting.ets import AutoETS

forecaster = AutoETS(auto=True, sp=12, n_jobs=-1)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[24]:
0.06317467074033545

另一种常见的模型是ARIMA模型。在 sktime中，我们与 pmdarima https://github.com/alkaline-ml/pmdarima`__接口，这是一个自动选择最佳ARIMA模型的软件包。这是因为要在许多可能的模型参数上进行搜索，所以可能要花一点时间。

[25]:
forecaster = AutoARIMA(sp=12, suppress_warnings=True)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[25]:
0.04117062367656992

也可以手动配置单个ARIMA模型。

[26]:
forecaster = ARIMA(
    order=(1, 1, 0), seasonal_order=(0, 1, 0, 12), suppress_warnings=True
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[26]:
0.04257105737228371

BATS和TBATS是另外两种时间序列预测算法，通过封装`tbatshttps://github.com/intive-DataScience/tbats__，包含在sktime中。

[27]:
from sktime.forecasting.bats import BATS

forecaster = BATS(sp=12, use_trend=True, use_box_cox=False)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[27]:
0.08689500756325415

[28]:
from sktime.forecasting.tbats import TBATS

forecaster = TBATS(sp=12, use_trend=True, use_box_cox=False)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[28]:
0.08493353477049964

sktime还提供了Facebook的`fbprophethttps://github.com/facebook/prophet__的接口。请注意，fbprophet与时间戳类型为pd.DatetimeIndex的数据密切相关，所以我们必须先转换索引类型。

[30]:
# Convert index to pd.DatetimeIndex
z = y.copy()
z = z.to_timestamp(freq="M")
z_train, z_test = temporal_train_test_split(z, test_size=36)
[32]:
from sktime.forecasting.fbprophet import Prophet

forecaster = Prophet(
    seasonality_mode="multiplicative",
    n_changepoints=int(len(y_train) / 12),
    add_country_holidays={"country_name": "Germany"},
    yearly_seasonality=True,
)
forecaster.fit(z_train)
y_pred = forecaster.predict(fh.to_relative(cutoff=y_train.index[-1]))

y_pred.index = y_test.index
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
INFO:fbprophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
[32]:
0.06939056917256975

构建组合模型

sktime为预测的复合模型构建提供了一个模块化的API。

与scikit-learn一样，sktime提供了一个meta-forecaster，用于组合多种预测算法。例如，我们可以将指数平滑的不同变体组合如下。

[ ]:
forecaster = EnsembleForecaster(
    [
        ("ses", ExponentialSmoothing(seasonal="multiplicative", sp=12)),
        (
            "holt",
            ExponentialSmoothing(
                trend="add", damped_trend=False, seasonal="multiplicative", sp=12
            ),
        ),
        (
            "damped",
            ExponentialSmoothing(
                trend="add", damped_trend=True, seasonal="multiplicative", sp=12
            ),
        ),
    ]
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)

调优

在 ReducedRegressionForecaster 中，window_length 和 strategy 参数都是我们可能想要优化的超参数。

[31]:
forecaster = ReducedRegressionForecaster(
    regressor=regressor, window_length=15, strategy="recursive"
)
param_grid = {"window_length": [5, 10, 15]}

#  we fit the forecaster on the initial window,
# and then use temporal cross-validation to find the optimal parameter
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.5))
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=param_grid)
gscv.fit(y_train)
y_pred = gscv.predict(fh)
[32]:
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[32]:
0.14187443909112035

[33]:
gscv.best_params_
[33]:
{'window_length': 15}

使用scikit-learn的GridSearchCV，除了调整window_length，我们还可以调整从scikit-learn导入的regressors 。

[34]:
from sklearn.model_selection import GridSearchCV

# tuning the 'n_estimator' hyperparameter of RandomForestRegressor from scikit-learn
regressor_param_grid = {"n_estimators": [100, 200, 300]}
forecaster_param_grid = {"window_length": [5, 10, 15, 20, 25]}

# create a tunnable regressor with GridSearchCV
regressor = GridSearchCV(RandomForestRegressor(), param_grid=regressor_param_grid)
forecaster = ReducedRegressionForecaster(
    regressor, window_length=15, strategy="recursive"
)

cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.5))
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=forecaster_param_grid)

gscv.fit(y_train)
y_pred = gscv.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[34]:
0.12834791719456862

[35]:
print(gscv.best_params_, gscv.best_forecaster_.regressor_.best_params_)
{'window_length': 25} {'n_estimators': 200}

在调优过程中，我们可以使用ForecastingGridSearchCV的scoring参数来获取某个特定指标的性能。

[36]:
gscv = ForecastingGridSearchCV(
    forecaster, cv=cv, param_grid=forecaster_param_grid, scoring=sMAPE()
)
gscv.fit(y_train)
pd.DataFrame(gscv.cv_results_)
[36]:

| | mean_fit_time | mean_score_time | param_window_length | params | mean_test_sMAPE | rank_test_sMAPE | | ---: | ---: | ---: | ---: | ---: | ---: | ---: | | 0 | 5.004688 | 1.640830 | 5 | {'window_length': 5} | 0.296896 | 5 | | 1 | 4.795189 | 1.559630 | 10 | {'window_length': 10} | 0.269926 | 4 | | 2 | 4.777340 | 1.652045 | 15 | {'window_length': 15} | 0.245826 | 3 | | 3 | 4.634498 | 1.150868 | 20 | {'window_length': 20} | 0.242409 | 2 | | 4 | 4.768382 | 1.578212 | 25 | {'window_length': 25} | 0.237839 | 1 |

请注意，到目前为止，上面的还原方法没有考虑任何季节性或趋势，但我们可以很容易地指定一个pipeline ，它首先对数据进行detrends。

sktime提供了一个通用的detrender，一个使用任何预测器并返回预测器预测值的样本内残差的变换器。例如，为了去除时间序列的线性趋势，我们可以写：

[37]:
# liner detrending
forecaster = PolynomialTrendForecaster(degree=1)
transformer = Detrender(forecaster=forecaster)
yt = transformer.fit_transform(y_train)

# internally, the Detrender uses the in-sample predictions
# of the PolynomialTrendForecaster
forecaster = PolynomialTrendForecaster(degree=1)
fh_ins = -np.arange(len(y_train))  # in-sample forecasting horizon
y_pred = forecaster.fit(y_train).predict(fh=fh_ins)

plot_series(y_train, y_pred, yt, labels=["y_train", "fitted linear trend", "residuals"]);

让我们在pipeline中使用去季节化的同时，也使用detrender。需要注意的是，在预测中，当我们在拟合前应用数据变换时，我们需要对预测值进行逆向变换。为此，我们提供了以下pipeline 类。

[38]:
forecaster = TransformedTargetForecaster(
    [
        ("deseasonalise", Deseasonalizer(model="multiplicative", sp=12)),
        ("detrend", Detrender(forecaster=PolynomialTrendForecaster(degree=1))),
        (
            "forecast",
            ReducedRegressionForecaster(
                regressor=regressor, window_length=12, strategy="recursive"
            ),
        ),
    ]
)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test, y_pred)
[38]:
0.05448013755454164

当然，我们可以再尝试优化pipeline各组件的超参数。

下面我们讨论预测的另外两个方面：online learning，我们要随着新数据的到来动态更新预测；预测区间，让我们可以量化预测的不确定性。

Online Forecasting

对于模型评估，我们有时想要评估多个预测，使用测试数据的滑动窗口进行时间交叉验证。为此，我们可以利用online_forecasting模块中的预测器，它使用复合预测器PredictionWeightedEnsemble来跟踪每个预测器积累的损失，并创建一个由最 "准确 "的预测器的预测加权的预测。

请注意，预测任务发生了变化：我们进行35次预测，因为我们需要第一次预测来帮助更新权重，我们不提前36步预测。

[39]:
from sklearn.metrics import mean_squared_error

from sktime.forecasting.online_learning import (
    NormalHedgeEnsemble,
    OnlineEnsembleForecaster,
)

首先，我们需要初始化一个PredictionWeightedEnsembler，它将跟踪每个forecaster 积累的损失，并定义我们想要使用的损失函数。

[40]:
hedge_expert = NormalHedgeEnsemble(n_estimators=3, loss_func=mean_squared_error)

然后我们可以通过定义各个forecaster 并指定我们使用的PredictionWeightedEnsembler来创建forecaster 。然后通过拟合我们的forecaster ，并用update_predict函数进行更新和预测，我们得到。

[41]:
forecaster = OnlineEnsembleForecaster(
    [
        ("ses", ExponentialSmoothing(seasonal="multiplicative", sp=12)),
        (
            "holt",
            ExponentialSmoothing(
                trend="add", damped_trend=False, seasonal="multiplicative", sp=12
            ),
        ),
        (
            "damped",
            ExponentialSmoothing(
                trend="add", damped_trend=True, seasonal="multiplicative", sp=12
            ),
        ),
    ],
    ensemble_algorithm=hedge_expert,
)

forecaster.fit(y_train)
y_pred = forecaster.update_predict(y_test)
plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
smape_loss(y_test[1:], y_pred)
[41]:
0.04998488843486813

对于单次更新，您可以使用update方法。

预测区间

到目前为止，我们只研究了点预测。在很多情况下，我们也对预测区间感兴趣。sktime的接口支持预测区间，但我们还没有为所有算法实现它们。

在这里，我们使用Theta预测算法。

[42]:
forecaster = ThetaForecaster(sp=12)
forecaster.fit(y_train)
alpha = 0.05  # 95% prediction intervals
y_pred, pred_ints = forecaster.predict(fh, return_pred_int=True, alpha=alpha)
smape_loss(y_test, y_pred)
[42]:
0.08661467699983212
[43]:
fig, ax = plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
ax.fill_between(
    ax.get_lines()[-1].get_xdata(),
    pred_ints["lower"],
    pred_ints["upper"],
    alpha=0.2,
    color=ax.get_lines()[-1].get_c(),
    label=f"{1 - alpha}% prediction intervals",
)
ax.legend();

总结

正如我们所看到的，为了进行预测，我们需要首先指定（或建立）一个模型，然后将其与训练数据相适应，最后调用predict来生成给定预测范围的预测。

sktime自带多种预测算法（或forecasters）和复合模型构建工具。所有的预测器都有一个共同的界面。预测器在单一数据系列上进行训练，并对所提供的预测范围进行预测。
sktime有许多统计预测算法，基于statsmodels中的实现。例如，为了使用带有加法趋势成分和乘法季节性的指数平滑算法，我们可以写出以下内容：

使用sktime预测

准备工作

数据

明确预测任务

相对预测范围

绝对预测范围

进行预测

预测最后的数值

预测同季最后的数值

为什么不直接用scikit-learn？

关键思想：精简

陷阱三：给定一个拟合回归算法，我们如何生成预测？

使用sktime预测

构建组合模型

调优

Online Forecasting

预测区间

总结

更多资料