Distributed TensorFlow on Raspberry Pi’s Hadoop 3 Cluster


  1. Raspbian OS Upgrade
sudo apt-get install pdsh
export PDSH_RCMD_TYPE=ssh
pdsh -R ssh -w pi@192.168.0.[7,8,9,13,17] 'YOUR_SHELL_COMMAND'
alias runshell="pdsh -R ssh -w pi@192.168.0.[7,8,9,13,17]"
runshell 'sudo apt-get install virtualenv'
sudo vim /etc/dphys-swapfile  # Update
sudo /etc/init.d/dphys-swapfile restart
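
The swap file edit can also be scripted instead of done by hand in vim. The following is a minimal sketch, assuming the change being made is to the CONF_SWAPSIZE line (the original does not say which setting is updated). The helper name set_swapsize is hypothetical.

```shell
# set_swapsize FILE SIZE_MB
# Rewrites the CONF_SWAPSIZE line (commented out or not) in a
# dphys-swapfile config file. Assumes GNU sed, as shipped with Raspbian.
set_swapsize() {
    sed -i "s/^#\{0,1\}CONF_SWAPSIZE=.*/CONF_SWAPSIZE=$2/" "$1"
}
```

From a root shell on a node this could be called as set_swapsize /etc/dphys-swapfile 1024, where 1024 is only an illustrative size, followed by the restart command.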

Do the job!

1. Install Hadoop 3.1.1 in your Raspberry Pi cluster

pi@master:~ $ jps
18912 NodeManager
2947 NameNode
3091 DataNode
31577 Jps
3257 SecondaryNameNode
18799 ResourceManager
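
To check the jps listing without reading it by eye, something like the sketch below may help. It assumes the five daemons shown in the listing above are the ones expected on this node (worker-only nodes would normally run fewer). The function name missing_daemons is hypothetical.

```shell
# Reads jps output on stdin and prints each expected Hadoop daemon
# that does not appear in it. Prints nothing if all are present.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <class name>"; compare the whole second field
        # so that SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Usage on the master would be: jps | missing_daemons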
[Screenshots: HDFS overview, DataNode status, YARN cluster page, YARN nodes]

2. Install TensorFlow

runshell 'virtualenv -p python3 ~/p3 && source ~/p3/bin/activate && pip3 install tensorflow'

3. Get a copy of TonY jar

4. Kick off your distributed TensorFlow job!

pi@master:~/tf $ ls
src tony-cli-0.1.0-all.jar tony.xml
pi@master:~/tf $ cat tony.xml
pi@master:~/tf $ ls src/
pi@master:~/tf $ CLASSPATH=$(${HADOOP_HDFS_HOME}/bin/hadoop classpath --glob):/home/pi/tf/:/home/pi/tf/* java com.linkedin.tony.cli.ClusterSubmitter --src_dir src --executes src/distributed.py --python_binary_path /home/pi/p3/bin/python
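
The contents of tony.xml are not shown in the listing above. As a rough sketch only, a minimal file might set just the number of worker and parameter-server tasks. The property names follow TonY's tony.<task>.instances naming. The counts passed in are placeholders, and the helper write_tony_xml is hypothetical, not part of TonY.

```shell
# write_tony_xml FILE WORKERS PS
# Writes a minimal, illustrative TonY configuration to FILE.
write_tony_xml() {
    cat > "$1" <<EOF
<configuration>
  <property>
    <name>tony.worker.instances</name>
    <value>$2</value>
  </property>
  <property>
    <name>tony.ps.instances</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}
```

For example, write_tony_xml tony.xml 2 1 would request two workers and one parameter server. The right numbers depend on how many nodes and how much memory the cluster has.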

5. Check your job on the ResourceManager web page

pi@slave-3:~ $ source p3/bin/activate
(p3) pi@slave-3:~ $ tensorboard --logdir /tmp/mnist/1
TensorBoard 1.9.0 at http://slave-3:6006 (Press CTRL+C to quit)


  1. My pip is screwed up after the Raspbian Stretch upgrade:
pi@master:/usr/local/bin $ pip
Traceback (most recent call last):
  File "/usr/bin/pip", line 9, in <module>
    from pip import main
ImportError: cannot import name main
To fix it, edit the pip launcher script:
sudo vim /usr/bin/pip
Change
from pip import main
if __name__ == '__main__':
    sys.exit(main())
to
from pip import __main__
if __name__ == '__main__':
    sys.exit(__main__._main())




