Learn AI Today 04: Time Series Multi-Step Forecasting
LEARN AI TODAY
This is the 4th story in the Learn AI Today series! If you have not already, make sure to check the previous story.
What you will learn in this story:
- Create a chaotic time series
- Split the series into sequences to feed to a model
- Define and train a 1D convolutional neural network for time series forecasting
- Use the fastai2 Datasets and Learner classes
1. Create a Chaotic Time Series
A chaotic system is one in which a very small change in the initial conditions can result in a completely different outcome. In such systems you cannot predict the future exactly, even if you know the deterministic equations that describe it. Why? Because you would need an infinitely precise computer and infinitely precise initial conditions. In reality, that's fundamentally impossible. This is why weather forecasts are only accurate over a short range. They have been improving over time as dynamical and statistical models improve and computational power increases. However, it is not possible to predict the daily weather accurately in the long term, since it is a chaotic system.
To generate a chaotic time series I will use the Mackey-Glass equation, with the parameters described here. The Python code looks like this:
N = 20000
b = 0.1
c = 0.2
tau = 17

# Initial condition: the first 18 samples of the series
y = [0.9697, 0.9699, 0.9794, 1.0003, 1.0319, 1.0703, 1.1076, 1.1352, 1.1485,
     1.1482, 1.1383, 1.1234, 1.1072, 1.0928, 1.0820, 1.0756, 1.0739, 1.0759]

# Iterate the discretized Mackey-Glass recursion
for n in range(17, N+99):
    y.append(y[n] - b*y[n] + c*y[n-tau]/(1+y[n-tau]**10))

y = y[100:]  # discard the first 100 samples as a transient
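For reference, the update inside the loop is a simple Euler discretization (with a time step of 1) of the Mackey-Glass delay differential equation, written here with the same names as the code variables:

dy/dt = c * y(t - tau) / (1 + y(t - tau)^10) - b * y(t)

With b = 0.1, c = 0.2 and tau = 17, the equation is in its well-known chaotic regime.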
The result for the first 500 samples is the following:
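To reproduce that plot yourself, a minimal matplotlib sketch (using the y list computed above) could be:

import matplotlib.pyplot as plt

plt.figure(figsize=(10, 3))
plt.plot(y[:500])
plt.title('Mackey-Glass time series, first 500 samples')
plt.show()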
Notice that the time series has a clear pattern, but with many variations that cannot be reliably predicted over a long period of time.
2. Split the Series into Sequences to Feed to a Model
The data generated with the code above is a very long sequence. To train a model, I will slice the data into smaller sequences of 500 elements each. The input data will be the first 100 elements and the target data (the future to forecast) the remaining 400 elements. Furthermore, the first 2/3 of the data will be used as training data and the last 1/3 as validation data.
Note: When working with time series, it is often a good idea to select the last portion of the series for validation, particularly if you want to predict the future. Otherwise, the model may learn auto-correlations and produce a misleadingly good result that will "break" in practice when applied to the future.
import torch
from tqdm import tqdm

def create_sequences(yin, input_seq_size, output_seq_size):
    xout, yout = [], []
    for ii in tqdm(range(yin.shape[0]-input_seq_size-output_seq_size)):
        # Input window: input_seq_size elements, reshaped to [1, channels, length]
        xout.append(yin[ii:ii+input_seq_size, ...].view(1, 1, -1))
        # Target window: the next output_seq_size elements
        yout.append(yin[ii+input_seq_size:ii+input_seq_size+output_seq_size, ...].view(1, 1, -1))
    xout = torch.cat(xout, dim=0)
    yout = torch.cat(yout, dim=0)
    return xout, yout.squeeze()
The code above simply runs over the long series and creates sequences of inputs and targets as described above, reshaped into the format needed by the model. For image data and the usual 2D convolutional neural networks, the tensors are organized with the shape [batch-size, channels, rows, columns]. Here, for 1D convolutional neural networks, it's the same but dropping the last dimension: [batch-size, channels, length]. And indeed it is possible to have an input of multiple time series, represented as several channels.
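To make the shapes concrete, here is a hedged usage sketch. The names trn_len, pred_len, x, y, train_ind and valid_ind match the ones used by the training code later in the story, but the exact way the notebook builds them is my assumption based on the text:

y_full = torch.tensor(y, dtype=torch.float32)        # the Mackey-Glass series from section 1
trn_len, pred_len = 100, 400
x, y = create_sequences(y_full, trn_len, pred_len)   # note: reassigns y to the target tensor
print(x.shape, y.shape)                              # [n_seqs, 1, 100] and [n_seqs, 400]

# First 2/3 of the sequences for training, last 1/3 for validation
n_seqs = len(x)
train_ind = list(range(0, 2*n_seqs//3))
valid_ind = list(range(2*n_seqs//3, n_seqs))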
3. Define and Train a Model
The model I defined for this experiment is a 1D convolutional neural network with 3 convolutional layers, each followed by a ReLU activation and Batch Normalization, and finally average pooling and two linear layers. The rationale is similar to 2D convolutional layers: each successive layer can capture patterns in the data that are more and more complex. Batch Normalization is useful to make the training faster and more stable, even in very deep neural networks. You can watch this 10-min video where Andrew Ng explains Why Does Batch Norm Work.
import torch.nn as nn
import torch.nn.functional as F

class TimeSeriesModel(nn.Module):
    def __init__(self, input_size, output_size):
        super(TimeSeriesModel, self).__init__()
        # Three 1D convolutional layers, each followed by Batch Normalization
        self.conv1 = nn.Conv1d(input_size, 64, kernel_size=7, stride=2)
        self.conv1_bn = nn.BatchNorm1d(64)
        self.conv2 = nn.Conv1d(64, 128, kernel_size=5, stride=2)
        self.conv2_bn = nn.BatchNorm1d(128)
        self.conv3 = nn.Conv1d(128, 256, kernel_size=3, stride=2)
        self.conv3_bn = nn.BatchNorm1d(256)
        self.drop = nn.Dropout(0.5)
        # Average pooling to a fixed length of 10, then two linear layers
        self.pool = nn.AdaptiveAvgPool1d(10)
        self.linear = nn.Linear(10*256, output_size)
        self.linear_bn = nn.BatchNorm1d(output_size)
        self.out = nn.Linear(output_size, output_size)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.conv1_bn(x)
        x = F.relu(self.conv2(x))
        x = self.conv2_bn(x)
        x = F.relu(self.conv3(x))
        x = self.conv3_bn(x)
        x = self.pool(x)
        x = x.view(-1, 10*256)   # flatten to [batch, 2560]
        x = F.relu(self.linear(x))
        x = self.drop(self.linear_bn(x))
        return self.out(x)
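As a quick sanity check (my addition, not part of the original notebook), you can pass a dummy batch through the model and confirm that the output length matches the forecast horizon:

dummy = torch.randn(2, 1, 100)                       # [batch, channels, input length]
m = TimeSeriesModel(input_size=1, output_size=400)
print(m(dummy).shape)                                # torch.Size([2, 400])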
Also notice that just before the last linear layer I used dropout (the self.drop call in the forward method). Dropout, defined here with a probability of 0.5, is a usually successful method of regularization. It simply drops some of the elements in the feature map at random, making the data slightly different each time. Think about an image, for example: dropout is like removing some pixels of the image. Most likely you can still identify the object in the image, and so the model has to learn to do the same, resulting in a more robust model.
Note: During inference, dropout is not active. This is one of the reasons you need to call model.eval() before running inference; otherwise, dropout will still be applied.
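As a minimal sketch of what that looks like in plain PyTorch (the fastai Learner used below handles this detail for you):

m = TimeSeriesModel(1, 400)
m.eval()                   # disables dropout and uses the running BatchNorm statistics
with torch.no_grad():      # also skip gradient tracking during inference
    preds = m(torch.randn(8, 1, 100))
m.train()                  # switch back before resuming training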
Using the fastai Datasets class, it is easy to create the dataloaders, and the fastai Learner then combines the dataloaders with the model. The Learner class handles training and predictions.
from fastai.basics import *   # fastai2 in the original story; current fastai also works

# Items are just indices; one lambda selects the input, the other the target
data = Datasets(list(range(len(x))), [lambda i: x[i], lambda i: y[i]],
                splits=[train_ind, valid_ind])
dls = data.dataloaders(bs=64)

model = TimeSeriesModel(1, pred_len)
learn = Learner(dls, model, loss_func=nn.MSELoss())
learn.fit_one_cycle(20, lr_max=3e-2)   # fastai v2 names this parameter lr_max
In the code above, the Datasets are created by giving a list of items and one function to define the inputs and another to define the targets. I defined the sequences and targets before as tensors x and y (full code here), therefore I just need to select elements from those tensors. You can read more about the fastai Datasets in the documentation here.

As you can see in the last line, I train the model for 20 epochs using a one-cycle learning rate schedule (the learning rate goes up until reaching lr_max and then gradually decays). One more nice feature that fastai provides!
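If you want to peek at what the dataloaders produce before training, a small check (my addition) is:

xb, yb = dls.one_batch()
print(xb.shape, yb.shape)   # expected: torch.Size([64, 1, 100]) and torch.Size([64, 400])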
During training, the following table is printed with the results, where you can see that the train and validation losses gradually drop as the training progresses.
There are several reasons why the train loss can be higher than the validation loss, as seen in the table above. One possible reason, in this case, is the use of Dropout itself, since it is only applied during training. You can try to run the code without the dropout and check if that's the case!
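One quick way to run that experiment without editing the class (a sketch; nn.Dropout reads its probability attribute on every forward call) is to zero the dropout probability before training:

model.drop.p = 0.0   # effectively disables dropout; set it back to 0.5 to restore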
Now that the model is trained, let's compute the predictions for the validation set. The get_preds method of the Learner can be used as follows: ye_valid, y_valid = learn.get_preds(). Notice that I don't need to call model.eval(), as fastai's get_preds already takes care of such details. Nevertheless, it is something to keep in mind.
With the results in hand, it’s time to visualize the predictions! This is the most fun part. The following code creates an image with the visualization of 12 validation sequences.
import numpy as np
import matplotlib.pyplot as plt

fig, axes = plt.subplots(ncols=4, nrows=3, figsize=(12,6), dpi=150)
for i, ax in enumerate(axes.flat):
    # Pick a random validation sequence to plot
    plot_idx = np.random.choice(np.arange(0, len(ye_valid)))
    # Concatenate the input window with the true/predicted future
    true = np.concatenate([x_valid.numpy()[plot_idx,-1,:].reshape(-1),
                           y_valid.numpy()[plot_idx,:].reshape(-1)])
    pred = np.concatenate([x_valid.numpy()[plot_idx,-1,:].reshape(-1),
                           ye_valid[plot_idx,:].reshape(-1)])
    ax.plot(pred, color='red', label='preds')
    ax.plot(true, color='green', label='true')
    # Vertical line marking where the forecast starts
    ax.vlines(trn_len-1, np.min(true), np.max(true), color='black')
    if i == 0: ax.legend()
fig.tight_layout();
As you can see in the image above, the model predictions (red) start by closely following the targets (green), but the performance degrades over time. In fact, if I plot the Mean Square Error (MSE) along the time axis, this is the result:
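That per-step error curve can be computed directly from the get_preds outputs. A short sketch (my addition):

mse_t = ((ye_valid - y_valid)**2).mean(dim=0)   # MSE at each of the 400 forecast steps
plt.plot(mse_t)
plt.xlabel('forecast step')
plt.ylabel('MSE')
plt.show()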
This is expected due to the chaotic nature of the data. Can this result be further improved? Probably yes. I tried just a few different options when preparing this story; the real learning is in looking at the code and making your own observations and experiments.
Homework
I can show you a thousand examples, but you will learn the most if you make one or two experiments by yourself! The complete code for this story is available in this notebook.
- Try to change the model, hyper-parameters, optimization function, or the size of the input and output sequences, and see how it affects the results.
- Apply the model to another time series for a problem you may be interested in. How are the results?
And as always, if you create interesting notebooks with nice animations as a result of your experiments, go ahead and share them on GitHub, Kaggle or write a Medium story!
Final remarks
This ends the fourth story in the Learn AI Today series!
Please consider joining my mailing list in this link so that you won’t miss any of my upcoming stories!
I will also be listing the new stories at learn-ai-today.com — the page I created for this learning journey — and at this GitHub repository!
And in case you missed it before, this is the link for the Kaggle notebook with the code for this story!
Feel free to give me some feedback in the comments. What did you find most useful or what could be explained better? Let me know!
You can read more about my Deep Learning journey on the following stories!
Thanks for reading! Have a great day!
Originally published at https://towardsdatascience.com/learn-ai-today-04-time-series-multi-step-forecasting-6eb48bbcc724