MXNet and Spark / Symbol API


#1

Hi folks,

I’m new to MXNet. I’m interested in the Symbol API to reproduce a computation DAG we have, and I’d like to use this Symbol in a Spark processing pipeline.
So far, so good, but I face the dreaded “task not serializable” issue, because Symbol objects are not serializable.
Still, I need to send symbols to the workers, since it’s on Symbol objects that we bind data…

What’s the right way to achieve this? What am I missing?

(So far, I haven’t looked much at the dedicated mxnet-spark package. I don’t see how symbols could be used there. But maybe I’m missing the point.)

PS: BTW, I’m using Spark 2.3, and I’m in Scala.

Thanks !
Mathieu


#2

I found my way: serialize the Symbol to its JSON form on the driver, and ship that String to the workers with a Spark broadcast variable.
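
For anyone landing here later, a minimal sketch of that pattern. The Symbol itself isn’t `Serializable`, but its JSON representation is a plain `String`, which Spark can broadcast; each worker rebuilds the Symbol from the JSON before binding data. The toy graph (`c = a + b`) and the object/app names are just placeholders, not from the original post:

```scala
import org.apache.mxnet.Symbol
import org.apache.spark.sql.SparkSession

object SymbolBroadcastSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("symbol-broadcast").getOrCreate()
    val sc = spark.sparkContext

    // Build the computation DAG on the driver (toy example: c = a + b).
    val a = Symbol.Variable("a")
    val b = Symbol.Variable("b")
    val c = a + b

    // Symbol is not Serializable, but its JSON form is a plain String,
    // which can be broadcast to all executors.
    val symbolJson: String = c.toJson
    val bcJson = sc.broadcast(symbolJson)

    val argNames = sc.parallelize(Seq(0, 1), 2).mapPartitions { _ =>
      // Rebuild the Symbol once per partition from the broadcast JSON;
      // this is where you would bind data and run the graph.
      val sym = Symbol.loadJson(bcJson.value)
      Iterator(sym.listArguments().mkString(","))
    }.collect()

    argNames.foreach(println)
    spark.stop()
  }
}
```

Rebuilding once per partition (rather than per record) keeps the `loadJson` cost off the hot path; the same idea applies when binding an executor for inference.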


#3

cloudpickle works for me (using Ray).


#4

You can also have a look at this blog post if that helps: https://aws.amazon.com/fr/blogs/machine-learning/distributed-inference-using-apache-mxnet-and-apache-spark-on-amazon-emr/