PyTorch Study Notes: RuntimeError: one of the variables needed for gradient computation has been modified by

Published: 2023-02-01 14:30

Error message:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [784, 512]], which is output 0 of TBackward, is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Traceback (most recent call last):
  File "E:\Anaconda\envs\torch_c_13\lib\site-packages\IPython\core\interactiveshell.py", line 3343, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in 
    runfile('E:/Code/AEs by PyTorch/AEsingle_train_test_temp.py', wdir='E:/Code/AEs by PyTorch')
  File "E:\SoftWare\PyCharm\PyCharm 2021.2.3\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "E:\SoftWare\PyCharm\PyCharm 2021.2.3\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "E:/Code/AEs by PyTorch/AEsingle_train_test_temp.py", line 205, in 
    train_ae_x_h2 = AEtrain(AEmodel2, train_ae_x_h1, 10, "AEmodel2")
  File "E:/Code/AEs by PyTorch/AEsingle_train_test_temp.py", line 95, in AEtrain
    loss.backward()
  File "E:\Anaconda\envs\torch_c_13\lib\site-packages\torch\tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "E:\Anaconda\envs\torch_c_13\lib\site-packages\torch\autograd\__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [784, 512]], which is output 0 of TBackward, is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
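The hint at the end of the message can be followed roughly as shown here. This is a minimal sketch on a toy tensor, not the AEtrain code from the traceback; the names w, x and y are made up for illustration. The forward pass is run inside the anomaly-detection context so that PyTorch can record where each operation was created.

```python
import torch

# Hypothetical toy reproduction; w, x and y are illustrative names only.
w = torch.randn(3, 3, requires_grad=True)
x = torch.randn(3)

message = None
with torch.autograd.set_detect_anomaly(True):
    y = torch.sigmoid(w @ x)  # sigmoid saves its output y for the backward pass
    y.add_(1.0)               # in-place edit changes the saved tensor
    try:
        y.sum().backward()
    except RuntimeError as err:
        message = str(err)

print(message)
```

With anomaly detection enabled, PyTorch additionally emits a warning containing the traceback of the forward operation whose gradient could not be computed, which usually points close to the offending line. Anomaly detection slows training down, so it is best switched off again once the problem is found.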

Solution:

The traceback shows that the error is raised at loss.backward(). After reading several blog posts, I found the following fix:

optimizer.zero_grad()
loss.backward(retain_graph=True)
optimizer.step()

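Setting the retain_graph argument of loss.backward() to True resolved the error in my case. Note that retain_graph=True only keeps the graph alive; it does not undo an in-place modification, so if the error persists, the in-place operation itself has to be found and removed.

What does retain_graph=True do? The official documentation defines it as: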
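One common way to get this message for a transposed weight matrix (the TBackward in the error) is to call optimizer.step() while another backward pass still needs the old weights. Whether that is what happened in the script above cannot be confirmed from the traceback alone. The following is a sketch under that assumption, with hypothetical layers lin1 and lin2 and two made-up losses.

```python
import torch
from torch import nn


def run(step_between_backwards):
    # Hypothetical two-layer setup; not the author's autoencoder.
    torch.manual_seed(0)
    lin1, lin2 = nn.Linear(4, 3), nn.Linear(3, 2)
    params = list(lin1.parameters()) + list(lin2.parameters())
    opt = torch.optim.SGD(params, lr=0.1)
    x = torch.randn(8, 4)

    h = lin1(x)
    loss_a = lin2(h).pow(2).mean()
    loss_b = lin2(h).abs().mean()

    opt.zero_grad()
    loss_a.backward(retain_graph=True)  # both losses share the graph of h
    if step_between_backwards:
        opt.step()  # updates the weights in place; loss_b's graph saved the old ones
    loss_b.backward()
    opt.step()


def fails_with_inplace_error(step_between_backwards):
    try:
        run(step_between_backwards)
        return False
    except RuntimeError as err:
        return "inplace operation" in str(err)


print(fails_with_inplace_error(True))
print(fails_with_inplace_error(False))
```

In this sketch retain_graph=True is needed only because both losses share the graph of h. The in-place error goes away when every backward() call finishes before optimizer.step() runs.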

retain_graph (bool, optional) – If False, the graph used to compute the grad will be freed. Note that in nearly all cases setting this option to True is not needed and often can be worked around in a much more efficient way. Defaults to the value of create_graph.

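Roughly: if set to False, the graph used to compute the gradients, including the intermediate buffers saved for the backward pass, is freed once backward() finishes. In normal use the parameter is left at its default, which is the value of create_graph (that is, False), so the memory is released as early as possible.

References:

This problem can also appear in other versions of PyTorch; see the answers in the following thread: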
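The effect of retain_graph can be seen on a tiny example. This is a minimal sketch, assuming a toy loss built from a single tensor x; it is not taken from the original script.

```python
import torch

x = torch.ones(2, requires_grad=True)
loss = (x * x).sum()  # the multiplication saves its inputs for backward

loss.backward(retain_graph=True)  # saved buffers are kept
first = x.grad.clone()

loss.backward()                   # works once more, then frees the graph
second = x.grad.clone()           # gradients accumulate across calls

try:
    loss.backward()               # the graph has already been freed
    freed_error = None
except RuntimeError as err:
    freed_error = str(err)

print(first, second)
print(freed_error)
```

Because gradients accumulate in x.grad, the second call doubles the stored gradient. This is why optimizer.zero_grad() is normally called before each backward pass.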

求pytorch大神解答，问题出在哪里 - 虎扑社区 (hupu.com)

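According to the post linked below, the error can also be triggered by an assignment of the form a = a.xxx, and rewriting it as b = a.xxx may fix it. More generally, the error concerns in-place operations such as a += 1 or a.add_(1); replacing them with out-of-place versions such as b = a + 1 is the usual remedy.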

【PyTorch】RuntimeError: one of the variables needed for gradient computation has been modified by an_ncc1995的博客-CSDN博客
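The difference between an in-place update and an out-of-place one can be sketched like this. The functions forward_inplace and forward_out_of_place are hypothetical names chosen for illustration, and exp is used only because it saves its output for the backward pass.

```python
import torch


def forward_inplace(w, x):
    a = torch.exp(w * x)  # exp saves its output for backward
    a += 1                # in-place: overwrites the saved tensor
    return a.sum()


def forward_out_of_place(w, x):
    a = torch.exp(w * x)
    b = a + 1             # out-of-place: a is left untouched
    return b.sum()


w = torch.tensor([0.5, -0.5], requires_grad=True)
x = torch.tensor([1.0, 2.0])

try:
    forward_inplace(w, x).backward()
    inplace_failed = False
except RuntimeError:
    inplace_failed = True

forward_out_of_place(w, x).backward()
print(inplace_failed, w.grad)
```

Not every in-place operation fails: the error appears only when the overwritten tensor is one that autograd saved for computing a gradient.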

On the role of the retain_graph parameter in PyTorch:

Pytorch中retain_graph参数的作用 - Oldpan的个人博客