Hello,
I am trying to use the predict functionality.
When I try to slice the predictions and convert them to NumPy, I get an OOM error.
for preds, i_batch, batch in pred_model.iter_predict(test_data_iter, num_batch=3):
    print(len(preds))
    print(preds[0].shape)
    print(preds[0].context)
    a = preds[0][1][:10]
    y = a.as_in_context(mx.cpu())
    print(y)
Output:
1
(1000, 2997129)
gpu(0)
---------------------------------------------------------------------------
MXNetError Traceback (most recent call last)
<timed exec> in <module>()
~/mxnet/incubator-mxnet/python/mxnet/ndarray/ndarray.py in wait_to_read(self)
1714 0.0893700122833252
1715 """
-> 1716 check_call(_LIB.MXNDArrayWaitToRead(self.handle))
1717
1718 @property
~/mxnet/incubator-mxnet/python/mxnet/base.py in check_call(ret)
147 """
148 if ret != 0:
--> 149 raise MXNetError(py_str(_LIB.MXGetLastError()))
150
151
MXNetError: [23:54:07] src/operator/tensor/./../mxnet_op.h:576: Check failed: err == cudaSuccess (2 vs. 0) Name: mxnet_generic_kernel ErrStr:out of memory
Stack trace returned 10 entries:
[bt] (0) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::StackTrace()+0x4a) [0x7f5c82d3377a]
[bt] (1) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x21) [0x7f5c82d33d81]
[bt] (2) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(void mxnet::op::mxnet_op::Kernel<mxnet::op::mxnet_op::set_to_int<0>, mshadow::gpu>::Launch<int*>(mshadow::Stream<mshadow::gpu>*, int, int*)+0x16d) [0x7f5c85ce724d]
[bt] (3) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(void mxnet::op::SparseEmbeddingOpForwardRspImpl<mshadow::gpu>(mxnet::OpContext const&, mxnet::TBlob const&, mxnet::NDArray const&, mxnet::OpReqType, mxnet::TBlob const&)+0x1907) [0x7f5c868823c7]
[bt] (4) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(void mxnet::op::SparseEmbeddingOpForwardEx<mshadow::gpu>(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::NDArray, std::allocator<mxnet::NDArray> > const&)+0x82d) [0x7f5c86aafdcd]
[bt] (5) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(+0x366a766) [0x7f5c85761766]
[bt] (6) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(mxnet::engine::ThreadedEngine::ExecuteOprBlock(mxnet::RunContext, mxnet::engine::OprBlock*)+0x589) [0x7f5c856ea369]
[bt] (7) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(void mxnet::engine::ThreadedEnginePerDevice::GPUWorker<(dmlc::ConcurrentQueueType)0>(mxnet::Context, bool, mxnet::engine::ThreadedEnginePerDevice::ThreadWorkerBlock<(dmlc::ConcurrentQueueType)0>*, std::shared_ptr<dmlc::ManualEvent> const&)+0xeb) [0x7f5c856faf7b]
[bt] (8) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(std::_Function_handler<void (std::shared_ptr<dmlc::ManualEvent>), mxnet::engine::ThreadedEnginePerDevice::PushToExecute(mxnet::engine::OprBlock*, bool)::{lambda()#3}::operator()() const::{lambda(std::shared_ptr<dmlc::ManualEvent>)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<dmlc::ManualEvent>)+0x46) [0x7f5c856fb1c6]
[bt] (9) /home/ec2-user/mxnet/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(std::thread::_Impl<std::_Bind_simple<std::function<void (std::shared_ptr<dmlc::ManualEvent>)> (std::shared_ptr<dmlc::ManualEvent>)> >::_M_run()+0x44) [0x7f5c856f7dd4]
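For reference, here is the behavior I was expecting, sketched with NumPy standing in for the GPU NDArray (the array shape is shrunk so the sketch runs anywhere; in my actual run it is (1000, 2997129)). Slicing a small window before converting should only need to materialize the small slice, not the whole array:

```python
import numpy as np

# Stand-in for the large prediction array returned by iter_predict;
# kept small here so the sketch is runnable.
preds0 = np.arange(1000 * 300, dtype=np.float32).reshape(1000, 300)

# Take row 1, first 10 elements -- my expectation is that only this
# small slice needs to be materialized on the host.
a = preds0[1][:10]

print(a.shape)   # (10,)
print(a.nbytes)  # 40 bytes, not the size of the full array
```

With the real MXNet NDArray, though, the `as_in_context(mx.cpu())` call above still triggers the GPU-side forward pass (execution is lazy until `wait_to_read`), which is where the OOM in the traceback occurs.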