At Magnet we’re trying out MXNet via the Clojure bindings. You can read more in: Clojure MXNet for musculoskeletal disease diagnosis.
One technical issue we have found is that our training run reports memory leaks. The leaked objects are created inside the Scala API, which we don’t control, so we can’t call dispose on them.
Here is an example of a leak trace:
WARN org.apache.mxnet.WarnIfNotDisposed: LEAK: An instance of class org.apache.mxnet.NDArray was not disposed. Creation point of this resource was:
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.mxnet.WarnIfNotDisposed$class.$init$(WarnIfNotDisposed.scala:52)
org.apache.mxnet.NDArray.<init>(NDArray.scala:549)
org.apache.mxnet.NDArray$$anonfun$genericNDArrayFunctionInvoke$4$$anonfun$6.apply(NDArray.scala:100)
org.apache.mxnet.NDArray$$anonfun$genericNDArrayFunctionInvoke$4$$anonfun$6.apply(NDArray.scala:100)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
scala.collection.AbstractTraversable.map(Traversable.scala:104)
org.apache.mxnet.NDArray$$anonfun$genericNDArrayFunctionInvoke$4.apply(NDArray.scala:100)
org.apache.mxnet.NDArray$$anonfun$genericNDArrayFunctionInvoke$4.apply(NDArray.scala:99)
scala.Option.getOrElse(Option.scala:121)
org.apache.mxnet.NDArray$.genericNDArrayFunctionInvoke(NDArray.scala:99)
org.apache.mxnet.NDArray$.crop(NDArray.scala:33)
org.apache.mxnet.module.DataParallelExecutorGroup$$anonfun$loadGeneralMulti$2$$anonfun$apply$2.apply(DataParallelExecutorGroup.scala:52)
org.apache.mxnet.module.DataParallelExecutorGroup$$anonfun$loadGeneralMulti$2$$anonfun$apply$2.apply(DataParallelExecutorGroup.scala:35)
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
org.apache.mxnet.module.DataParallelExecutorGroup$$anonfun$loadGeneralMulti$2.apply(DataParallelExecutorGroup.scala:35)
org.apache.mxnet.module.DataParallelExecutorGroup$$anonfun$loadGeneralMulti$2.apply(DataParallelExecutorGroup.scala:34)
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
scala.collection.Iterator$class.foreach(Iterator.scala:893)
scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
scala.collection.AbstractIterable.foreach(Iterable.scala:54)
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
org.apache.mxnet.module.DataParallelExecutorGroup$.loadGeneralMulti(DataParallelExecutorGroup.scala:34)
org.apache.mxnet.module.DataParallelExecutorGroup$.org$apache$mxnet$module$DataParallelExecutorGroup$$loadData(DataParallelExecutorGroup.scala:72)
org.apache.mxnet.module.DataParallelExecutorGroup.forward(DataParallelExecutorGroup.scala:486)
org.apache.mxnet.module.Module.forward(Module.scala:447)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28)
org.apache.clojure_mxnet.module$forward.invokeStatic(module.clj:186)
org.apache.clojure_mxnet.module$forward.invoke(module.clj:179)
org.apache.clojure_mxnet.module$forward.invokeStatic(module.clj:192)
org.apache.clojure_mxnet.module$forward.invoke(module.clj:179)
org.apache.clojure_mxnet.module$fit$fn__1538.invoke(module.clj:567)
org.apache.clojure_mxnet.module$fit.invokeStatic(module.clj:564)
org.apache.clojure_mxnet.module$fit.invoke(module.clj:535)
xenon.train$fit_epoch.invokeStatic(train.clj:60)
xenon.train$fit_epoch.invoke(train.clj:56)
xenon.train$fit$iter__2325__2329$fn__2330$fn__2331.invoke(train.clj:71)
xenon.train$fit$iter__2325__2329$fn__2330.invoke(train.clj:71)
clojure.lang.LazySeq.sval(LazySeq.java:40)
clojure.lang.LazySeq.seq(LazySeq.java:49)
clojure.lang.RT.seq(RT.java:528)
clojure.core$seq__5124.invokeStatic(core.clj:137)
clojure.core$dorun.invokeStatic(core.clj:3125)
clojure.core$dorun.invoke(core.clj:3125)
xenon.train$fit.invokeStatic(train.clj:71)
xenon.train$fit.invoke(train.clj:69)
xenon.train$fine_tune_BANG_.invokeStatic(train.clj:80)
xenon.train$fine_tune_BANG_.invoke(train.clj:73)
xenon.train$fine_tune_BANG_.invokeStatic(train.clj:75)
xenon.train$fine_tune_BANG_.invoke(train.clj:73)
xenon.core$_main.invokeStatic(core.clj:6)
xenon.core$_main.doInvoke(core.clj:5)
It seems that the NDArray object leak is also present in other code (Clojure or Scala). You can see this in the presentation at the virtual meetup, where the example code also leaked some NDArray objects.
Is this related to the issues with FeedForward.scala, and will it be solved by the new auto collector?
Any hints/pointers on this topic are highly appreciated.
Iván