Hi, I am trying to run the provided distributed training example on 2 different nodes with the following internal IP addresses:
I created a hosts file with these IPs, and SSH works without any issue from one machine to the other. When I launch the code:
python ../../tools/launch.py -n 2 --launcher ssh -H hosts python train_mnist.py --network lenet --kv-store dist_device_sync
It prints the following password prompts all at once:
firstname.lastname@example.org's password: email@example.com's password: firstname.lastname@example.org's password: email@example.com's password:
Both machines use the same admin password, but no matter what I try, it just prompts "Permission denied, please try again."
Is there any way I can get debug messages about what is really happening behind the scenes?
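To make the question concrete, here is a rough sketch of the kind of check I have in mind. It assumes the hosts file lists one address per line, and the function names (`read_hosts`, `ssh_check_command`, `check_hosts`) are made up for illustration; they are not part of launch.py. It runs `ssh -v` against each host with `BatchMode=yes`, so ssh fails immediately with verbose output instead of asking for a password.

```python
import subprocess
import sys


def read_hosts(path):
    """Return the non-empty lines of a hosts file (assumed: one host per line)."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def ssh_check_command(host):
    # -v prints the authentication steps ssh goes through.
    # BatchMode=yes disables password prompts, so a missing or rejected key
    # shows up as an immediate failure rather than a prompt.
    return ["ssh", "-v", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def check_hosts(path):
    """Try a non-interactive login to every host and report the failures."""
    results = {}
    for host in read_hosts(path):
        proc = subprocess.run(ssh_check_command(host), capture_output=True, text=True)
        results[host] = proc.returncode
        if proc.returncode != 0:
            print(f"--- {host} failed (exit code {proc.returncode}) ---")
            print(proc.stderr)  # ssh writes its -v debug output to stderr
    return results


if __name__ == "__main__":
    # Defaults to the file name used in my launch command above.
    check_hosts(sys.argv[1] if len(sys.argv) > 1 else "hosts")
```

If a host fails here, the verbose output should at least show which authentication methods were tried before the password prompt would have appeared.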