with Ansible, Prometheus and Grafana
Disclaimer: I’m a newbie to Ansible, Prometheus and Grafana.
Grafana dashboard for the cluster:
Easy cluster management with Ansible:
My Pi cluster config management started with:
alias runshell="pdsh -R ssh -w email@example.com.[7,8,6,13,17]"
It worked at the beginning, when I set up the cluster as a fairly homogeneous Hadoop cluster (most nodes were data nodes or node managers). However, as I got more serious about running genuinely useful services on the cluster (rather than running a Hadoop/k8s cluster for its own sake), I started running into cluster management challenges.
In my current cluster, I have 6 nodes with different roles:
img0: self-hosted docker image repository.
monitor0: Prometheus & Grafana dashboard.
vpn0: vpn service.
ws0: web service.
ansible0: ansible controller.
proxy0: reverse web proxy.
A few obvious challenges:
- Running pdsh as root is not easy.
- No easy way to track past commands running in the cluster.
- No easy way to converge states for the whole cluster.
- No easy way to split the cluster into different roles and manage separately.
In short, I needed a cluster config management tool. A few choices came up: Chef, Puppet, CFEngine, Salt.
Salt & CFEngine were a NO for me, since even professional system admins struggle to understand & use them, and they are not the most fashionable ones nowadays anyway.
I was thinking of using Puppet, since Facebook seems to use it, until I found that it is written in Ruby (same for Chef), a language I got frustrated with a long time ago when I built my first web site with Ruby on Rails. Isn’t there a Python, Go or even Java version of a modern infra config management system? Then I found Ansible, which is written in Python and seems to be very popular as well!
1. Installing Ansible is very easy:
$ sudo apt install ansible
2. Create an inventory file
pi@ansible0:~/ansible $ cat /etc/ansible/hosts
[pis]
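To split the cluster by role, an inventory can define more than one group. Below is a minimal sketch in Ansible's YAML inventory format, not my actual file. It assumes the node names listed earlier resolve from the controller; the group names `monitoring` and `web` and the file name `inventory.yml` are made up for illustration.

```yaml
# inventory.yml (hypothetical file name)
all:
  children:
    pis:                # every node in the cluster
      hosts:
        img0:
        monitor0:
        vpn0:
        ws0:
        ansible0:
        proxy0:
    monitoring:         # hypothetical group: Prometheus & Grafana node
      hosts:
        monitor0:
    web:                # hypothetical group: web service and reverse proxy
      hosts:
        ws0:
        proxy0:
```

With a file like this, something like `ansible -i inventory.yml web -a "/bin/echo hello"` should only touch the hosts in the `web` group.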
3. Run your first command
ansible all -a "/bin/echo hello"
If you get any error, make sure you have run ssh-copy-id to copy your public key to all the hosts.
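The same kind of smoke test can also be written as a playbook, which makes it easy to run a command as root. This is only a sketch: the file name `hello.yml` is hypothetical, and it assumes the `pi` user can sudo without a password on every host.

```yaml
# hello.yml (hypothetical file name)
- name: Smoke test for the whole cluster
  hosts: all
  gather_facts: false
  tasks:
    - name: Check SSH connectivity and Python on each host
      ping:

    - name: Run a command as root
      become: true            # escalate with sudo on the remote host
      command: /usr/bin/whoami
      register: whoami_result
      changed_when: false     # read-only command, never report a change

    - name: Show which user the command ran as
      debug:
        var: whoami_result.stdout
```

Run it with `ansible-playbook hello.yml`.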
4. Explore the documentation. For one use case, I wanted to clean up my stale /etc/hosts file, which contained a lot of invalid hosts, and update it with the hosts described in the inventory file.
There are two tasks: 1) remove any entry containing 192.168.0.* from /etc/hosts, and 2) add new host mappings from the inventory to /etc/hosts. The second task is more complicated; you can check https://github.com/oliverhu/ansible_config/blob/master/hostname.yml.
The first one is pretty self-explanatory.
- name: keep 10 lines of /etc/hosts file
- name: update etc/hosts
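For reference, here is a minimal sketch of how the two tasks could look with the lineinfile module. It is not the exact content of my playbook (see the linked hostname.yml for that). It assumes facts are gathered for every host in the same run, so that `ansible_default_ipv4` is available, and that the inventory names are the names you want in /etc/hosts.

```yaml
- name: Refresh /etc/hosts from the inventory
  hosts: all
  become: true
  gather_facts: true          # needed for ansible_default_ipv4 below
  tasks:
    - name: Remove every stale 192.168.0.* entry
      lineinfile:
        path: /etc/hosts
        regexp: '^192\.168\.0\.'
        state: absent         # deletes all lines matching the regexp

    - name: Add one entry per inventory host
      lineinfile:
        path: /etc/hosts
        line: "{{ hostvars[item].ansible_default_ipv4.address }} {{ item }}"
        state: present
      loop: "{{ groups['all'] }}"
```

Running it a second time removes and re-adds the same lines, so the file ends up in the same state.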
5. A harder one: set up Prometheus and Grafana for the cluster with Ansible: https://github.com/oliverhu/picluster-ansible. After running that repo, you will get the nice monitoring UI presented at the beginning of the post.
It is far more efficient to use a proper infra management tool to orchestrate your cluster, and Ansible seems to be a good one. Folks might ask:
Why not just use Kubernetes & Docker?
- Unfortunately, the Kubernetes & Docker daemons would overwhelm your Raspberry Pis with their ~600MB memory footprint; remember you only have ~1GB in total.
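To give a flavor of what such a setup involves, here is a small sketch that installs the Prometheus node exporter on every host. It is not taken from the repo above and may differ from what the repo does. It assumes a Debian-based OS where the `prometheus-node-exporter` package and a service of the same name are available.

```yaml
- name: Install the Prometheus node exporter on every Pi
  hosts: all
  become: true
  tasks:
    - name: Install node exporter from the distribution repository
      apt:
        name: prometheus-node-exporter
        state: present
        update_cache: yes
        force_apt_get: yes

    - name: Make sure the exporter is running and starts on boot
      service:
        name: prometheus-node-exporter
        state: started
        enabled: yes
```

The exporter listens on port 9100 by default, which is where Prometheus would scrape it.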
1. If your apt gets stuck resolving IPv6 addresses:
sudo nano /etc/sysctl.conf
Append the following lines to turn off IPv6:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Run sudo sysctl -p for the changes to take effect, or just reboot.
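Since this is a per-host edit, it is also a natural fit for Ansible. The sketch below applies the same three settings with the sysctl module. It assumes the module is available on the controller; in newer Ansible releases it ships in the `ansible.posix` collection.

```yaml
- name: Disable IPv6 on all hosts
  hosts: all
  become: true
  tasks:
    - name: Persist and apply the IPv6 sysctl settings
      sysctl:
        name: "{{ item }}"
        value: "1"
        state: present
        sysctl_set: yes       # also set the value on the running kernel
        reload: yes           # reload sysctl settings after editing the file
      loop:
        - net.ipv6.conf.all.disable_ipv6
        - net.ipv6.conf.default.disable_ipv6
        - net.ipv6.conf.lo.disable_ipv6
```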
2. Force apt-get instead of aptitude for upgrading packages.
For some reason, aptitude takes 2GB of memory (eating all my physical memory & swap space).
- name: Playbook for upgrading the RPis
  hosts: raspberry_pi
  user: pi
  gather_facts: no
  tasks:
    - name: Update and upgrade apt packages
      become: true
      apt:
        upgrade: yes
        update_cache: yes
        force_apt_get: yes  # This line