PyTorch on Google Colaboratory GPU - Illegal memory access
I am using PyTorch 0.4.0 on Google Colaboratory (NVIDIA-SMI 396.44, Driver Version: 396.44).
When I run my code outside any function, I can send PyTorch tensors and the model to the GPU:
...
model.cuda()
data_tensor = data_tensor.cuda()
...
My CNN model then trains successfully, reaching 98% accuracy.
But when I put the same code in a function,
def main(...):
    ...
    model.cuda()
    data_tensor = data_tensor.cuda()
    ...

if __name__ == "__main__":
    main(...)
I get the following error:
cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20
UPDATE (18/11/21):
It turned out that whether the code is inside a function is irrelevant. Usually I first get a CUDNN_STATUS_EXECUTION_FAILED error, and on the next run a cuda runtime error (77), as shown below. Sometimes, though, it works a few times before failing.
CUDNN_STATUS_EXECUTION_FAILED (first try):
RuntimeError Traceback (most recent call last)
<ipython-input-27-53476e08e017> in <module>()
1 main('mnist', 'to', 'ndd', Xd=16, epo=5, bs=100, tXn=-1, vXn=300,
----> 2 lr=0.05, suf="s1", n_class=10, cuda=True)
<ipython-input-23-918584456207> in main(ds, framework, format, Xd, epo, bs, tXn, vXn, lr, suf, n_class, cuda)
12 opt = torch.optim.SGD(net.parameters(), lr)
13
---> 14 train(net, opt, Xd, epo, bs, cuda, tXn, tX, tT, vX, vT,lr)
15
<ipython-input-26-6b574a9e8af6> in train(model, optimizer, Xd, epo, bs, cuda, Xn, tX, tT, vX, vT, lr)
26 #t = t.cuda()
27 optimizer.zero_grad()
---> 28 z = model(x)
29 bat_loss = criterion(z, t)
30 bat_loss.backward()
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)
<ipython-input-22-b4bc2e0b39b8> in forward(self, X)
10 H0 = torch.zeros(self.n_H, X.size(0), self.Wh)
11 C0 = torch.zeros(self.n_H, X.size(0), self.Wh)
---> 12 O, (Hn, Cn), = self.lstm1(X, (H0, C0))
13 O = self.linear1(O[:, -1, :])
14 return O
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py in forward(self, input, hx)
190 flat_weight=flat_weight
191 )
--> 192 output, hidden = func(input, self.all_weights, hx, batch_sizes)
193 if is_packed:
194 output = PackedSequence(output, batch_sizes)
/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py in forward(input, *fargs, **fkwargs)
321 func = decorator(func)
322
--> 323 return func(input, *fargs, **fkwargs)
324
325 return forward
/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py in forward(input, weight, hx, batch_sizes)
285 batch_first, dropout, train, bool(bidirectional),
286 list(batch_sizes.data) if variable_length else (),
--> 287 dropout_ts)
288
289 if cx is not None:
RuntimeError: CUDNN_STATUS_EXECUTION_FAILED
cuda runtime error (77) (other tries):
RuntimeError Traceback (most recent call last)
<ipython-input-28-53476e08e017> in <module>()
1 main('mnist', 'to', 'ndd', Xd=16, epo=5, bs=100, tXn=-1, vXn=300,
----> 2 lr=0.05, suf="s1", n_class=10, cuda=True)
<ipython-input-23-918584456207> in main(ds, framework, format, Xd, epo, bs, tXn, vXn, lr, suf, n_class, cuda)
12 opt = torch.optim.SGD(net.parameters(), lr)
13
---> 14 train(net, opt, Xd, epo, bs, cuda, tXn, tX, tT, vX, vT,lr)
15
<ipython-input-26-6b574a9e8af6> in train(model, optimizer, Xd, epo, bs, cuda, Xn, tX, tT, vX, vT, lr)
4 if cuda and torch.cuda.is_available():
5 print("tX type (before):", tX.type())
----> 6 model.cuda()
7 tX = tX.cuda()
8 tT = tT.cuda()
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in cuda(self, device)
247 Module: self
248 """
--> 249 return self._apply(lambda t: t.cuda(device))
250
251 def cpu(self):
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
174 def _apply(self, fn):
175 for module in self.children():
--> 176 module._apply(fn)
177
178 for param in self._parameters.values():
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py in _apply(self, fn)
109
110 def _apply(self, fn):
--> 111 ret = super(RNNBase, self)._apply(fn)
112 self.flatten_parameters()
113 return ret
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
180 # Tensors stored in modules are graph leaves, and we don't
181 # want to create copy nodes, so we have to unpack the data.
--> 182 param.data = fn(param.data)
183 if param._grad is not None:
184 param._grad.data = fn(param._grad.data)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in <lambda>(t)
247 Module: self
248 """
--> 249 return self._apply(lambda t: t.cuda(device))
250
251 def cpu(self):
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20
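One detail in the first traceback may be worth checking, although nothing in this thread confirms it is the cause: `forward` builds `H0` and `C0` with `torch.zeros(...)` and no `device` argument, so by default they live on the CPU while the model has been moved to the GPU. A minimal sketch of a device-consistent `forward` follows. The class name, layer sizes, and input shape are made up for illustration; only `lstm1`, `linear1`, `H0`, `C0`, and the `O[:, -1, :]` indexing come from the trace.

```python
import torch
import torch.nn as nn


class LSTMClassifier(nn.Module):
    """Hypothetical stand-in for the model in the traceback (sizes are assumed)."""

    def __init__(self, n_in=16, n_hidden=32, n_layers=1, n_class=10):
        super().__init__()
        self.n_layers = n_layers
        self.n_hidden = n_hidden
        self.lstm1 = nn.LSTM(n_in, n_hidden, n_layers, batch_first=True)
        self.linear1 = nn.Linear(n_hidden, n_class)

    def forward(self, X):
        # Allocate the initial states on the same device and dtype as the input,
        # rather than relying on torch.zeros(...) defaulting to the CPU.
        H0 = torch.zeros(self.n_layers, X.size(0), self.n_hidden,
                         device=X.device, dtype=X.dtype)
        C0 = torch.zeros_like(H0)
        O, (Hn, Cn) = self.lstm1(X, (H0, C0))
        # Classify from the output at the last time step.
        return self.linear1(O[:, -1, :])


# Fall back to the CPU when no GPU is present, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = LSTMClassifier().to(device)
x = torch.randn(4, 28, 16, device=device)  # (batch, seq_len, features), assumed sizes
z = model(x)
print(z.shape)  # torch.Size([4, 10])
```

If a kernel does hit an illegal memory access, the CUDA context of that process generally stays in an error state, so later calls such as `model.cuda()` can keep failing with error 77 until the runtime is restarted. That would be consistent with the second traceback, but it is an interpretation rather than something verified here.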
gpu pytorch google-colaboratory
Can you share a self-contained notebook that reproduces the problem?
– Bob Smith
Nov 19 at 17:51
Bob, I updated my post. To be honest, when I captured the trace this morning, it worked once and then failed again, which is strange. The trace shows a failure when moving the model to the GPU, but I also saw it fail when moving PyTorch tensors.
– u2gilles
Nov 20 at 1:21
Can you share a complete, self-contained notebook? It will significantly simplify diagnosis.
– Bob Smith
Nov 20 at 1:27
Bob, here is a link to a standalone ipynb file that reproduces the problem (without mounting Google Drive): drive.google.com/open?id=1enYkRsAuotTGsoce93XP2gAIuK6-i9ub
– u2gilles
Nov 21 at 4:25
I see the same problem on Colab, but for me any attempt to move a tensor to the GPU results in this error. After looking online, I'm still unsure what is wrong. Maybe Colab GPUs are being quirky?
– Superman
Nov 25 at 18:41
edited Nov 21 at 15:50
asked Nov 19 at 17:46
u2gilles
1 Answer
accepted
It now works with PyTorch 1.0, installed with:
!pip3 install https://download.pytorch.org/whl/cu80/torch-1.0.0-cp36-cp36m-linux_x86_64.whl
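After reinstalling, it can help to confirm which build the notebook actually picked up, since a runtime that has already imported `torch` usually needs a restart before the new version is loaded. The following is a rough sketch, not part of the original answer; `parse_version` and `torch_report` are hypothetical helper names, and the report is simply `None` when `torch` is not installed.

```python
import re


def parse_version(text):
    """Turn a version string such as '1.0.0' or '0.4.0+cu80' into a tuple of ints."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component that has no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def torch_report():
    """Summarize the installed PyTorch build, or return None if it is missing."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "version": torch.__version__,
        "at_least_1_0": parse_version(torch.__version__) >= (1, 0),
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


print(torch_report())
```

If `at_least_1_0` is still `False` after the install, the old version is probably still loaded in the running process.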
answered 2 days ago
u2gilles