DNS-based service discovery for Mesos
How does it work?
From time to time mesos-dns queries the mesos-master (so frameworks do not need to update it) and retrieves data about running tasks so it can create the appropriate DNS entries.
Any docker container that we run on marathon will be reachable by name via an A record, and any ephemeral port assigned by marathon will be visible via an SRV record.
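To make the naming concrete, here is a small sketch of how the record names are put together. It assumes the default "mesos" domain and the "marathon" framework name; the "nginx" task is just a placeholder.

```shell
# Build the DNS names Mesos-DNS is expected to serve for a task.
# Assumption: default domain "mesos" and framework "marathon".

# usage: a_record_name TASK [FRAMEWORK] [DOMAIN]
a_record_name() {
  printf '%s.%s.%s\n' "$1" "${2:-marathon}" "${3:-mesos}"
}

# usage: srv_record_name TASK [PROTOCOL] [FRAMEWORK] [DOMAIN]
srv_record_name() {
  printf '_%s._%s.%s.%s\n' "$1" "${2:-tcp}" "${3:-marathon}" "${4:-mesos}"
}

a_record_name nginx      # nginx.marathon.mesos
srv_record_name nginx    # _nginx._tcp.marathon.mesos
```

The A record gives you the IP of the slave running the task, while the SRV record also carries the port.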
Mesos-DNS is as simple as it can get. Forget about extra features:
- application management
- health checks
- events about joining/leaving cluster
Health checks are not needed, as you can run it on marathon, which makes mesos-dns fault tolerant.
Mesos-DNS is stateless, so no replication is needed if you want more than one instance.
There are of course things to consider:
- by default DNS uses the UDP protocol
- mesos-dns might be vulnerable to DNS attacks
I am going to cover other service discovery tools like consul, serf, etcd, etc. in my next blog posts.
Service discovery? Service discovery!
Service discovery allows for detection of applications on a network. Implementations vary: mesos-dns uses DNS, while serf, for example, uses a gossip protocol.
Quick example: we've got an application which needs to be load balanced.
I would say the old school style would be to connect one instance of the LB to one instance of the application.
Well, we don’t want static IPs and ports in our configuration. What we want is the good way, using technology to our advantage!
With marathon you don’t have an option. You never know which slave will accept the job and which ephemeral port will be given to the application, so without service discovery our mesos infrastructure might be useless.
Service discovery helps with:
- scaling applications
- replacing nodes
- handling crashed apps
Seems like a good idea to use it then, doesn’t it?
Using service discovery for this exact example might be a bit of an exaggeration, but keep in mind that it’s just an example :)
To build mesos-dns you need to have Go and godep configured and ready to use.
Too much hassle? Dependency hell? Neva! So I thought creating an automated docker image would be a great idea. Here it is.
In my repo there is config.json
MESOS_IP and RESOLVER fields are replaced with sed on start.
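A rough sketch of that substitution step is shown here. The template is hypothetical: the field names follow the Mesos-DNS JSON config as far as I know it, the values are placeholders, and the function names are made up for this example.

```shell
# Write a hypothetical config template with MESOS_IP and RESOLVER placeholders.
# usage: write_template FILE
write_template() {
  cat > "$1" <<'EOF'
{
  "masters": ["MESOS_IP:5050"],
  "refreshSeconds": 60,
  "ttl": 60,
  "domain": "mesos",
  "port": 53,
  "resolvers": ["RESOLVER"],
  "timeout": 5
}
EOF
}

# Replace both placeholders with sed and print the rendered config.
# usage: render_config TEMPLATE MESOS_IP RESOLVER
render_config() {
  sed -e "s/MESOS_IP/$2/g" -e "s/RESOLVER/$3/g" "$1"
}
```

For example, `render_config config.json 10.0.0.1 8.8.8.8 > /tmp/config.json` would produce a config pointing at a master on 10.0.0.1 with 8.8.8.8 as the fallback resolver (both addresses are placeholders).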
To start mesos-dns, simply execute the command below, replacing the environment variables with your configuration:
Now you can check if mesos-dns is up and running.
If all is good, try to resolve a name to check your new service discovery in the mesos cluster.
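A quick check could look something like this. It queries the mesos-dns server directly with dig and fails when the answer comes back empty. The function name, the server IP and the "nginx" app are placeholders of mine.

```shell
# Query a DNS server directly and fail if the answer is empty.
# usage: check_record SERVER NAME [TYPE]
check_record() {
  answer=$(dig +short "@$1" "$2" "${3:-A}")
  [ -n "$answer" ] || return 1
  printf '%s\n' "$answer"
}

# Example calls (not run here; 10.0.0.1 and "nginx" are placeholders):
#   check_record 10.0.0.1 nginx.marathon.mesos A
#   check_record 10.0.0.1 _nginx._tcp.marathon.mesos SRV
```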
Now the fun part begins. We need to think what our service discovery can do for us!
Let’s configure the mesos-slaves so they’ll query mesos-dns, and provide the logic for creating and updating app configuration so it can use service discovery.
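One way to point a slave at mesos-dns is to put it first in /etc/resolv.conf. The sketch below does that while keeping the existing entries. The function name is mine, and your distro may manage resolv.conf through another tool, so treat it as an assumption.

```shell
# Put a nameserver first in a resolv.conf-style file, keeping existing entries.
# usage: prepend_nameserver DNS_IP [FILE]
prepend_nameserver() {
  file="${2:-/etc/resolv.conf}"
  # Do nothing if this server is already listed.
  grep -qxF "nameserver $1" "$file" && return 0
  tmp=$(mktemp)
  { printf 'nameserver %s\n' "$1"; cat "$file"; } > "$tmp"
  cat "$tmp" > "$file"   # overwrite in place to keep ownership and mode
  rm -f "$tmp"
}
```

Calling `prepend_nameserver 10.0.0.1` on each slave (with the IP of your mesos-dns host) makes lookups go to mesos-dns first; anything outside its domain is forwarded to the configured resolvers.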
A record - Cassandra
To start cassandra you need a correct configuration in the cassandra.yaml file. The most interesting part for us is the section where we need to specify seeds. This database has its own gossip protocol implemented, but how can this gossip protocol get information from mesos-dns? Right now it simply can’t.
To achieve visibility of cassandra nodes within the cluster we need to provide them with information about seeds. For starters we need a script to gather the correct configuration from the network using mesos-dns.
A DNS A record will suffice because cassandra needs to bind on default ports.
The Marathon job is given an env:
And the startup script on cassandra adds SEED to the yaml configuration with this bash command:
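As a rough sketch of that idea (not necessarily the exact command from my repo): resolve the A record, join the IPs with commas and rewrite the seeds line. I assume here that SEED holds a DNS name; the function names and the yaml path in the usage comment are placeholders.

```shell
# Turn A-record answers (one IP per line on stdin) into a comma-separated list.
join_ips() {
  paste -sd, -
}

# Rewrite the "- seeds:" line of a cassandra.yaml-style file, keeping indentation.
# usage: set_seeds "IP1,IP2" FILE
set_seeds() {
  tmp=$(mktemp)
  sed "s/- seeds:.*/- seeds: \"$1\"/" "$2" > "$tmp" && cat "$tmp" > "$2"
  rm -f "$tmp"
}

# Example (not run here; assumes SEED is a name such as cassandra.marathon.mesos):
#   set_seeds "$(dig +short "$SEED" A | join_ips)" /etc/cassandra/cassandra.yaml
```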
If you want to see the whole repo, here it is.
SRV record - nginx proxy
Nginx, aaaah, my favourite http server. I’m not going to get into details of how to configure it, I’m just interested in one thing: upstream.
I have prepared an empty upstream file:
A bit off topic about haproxy: there it’s a bit easier to use bash for adding and removing nodes in the configuration, because you can prepare the conf file in a way that appending a host at the end will be correct. Here we have curly braces, and we need to take care of that.
My starting script:
Dig will get our SRV records as a list, awk will put the fields in the right order and sed will push them into the upstream file. We got it! :)
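Here is a minimal sketch of the dig and awk part. It is a variation on the approach above: instead of pushing lines into an existing file with sed, it prints the whole upstream block, which sidesteps the curly braces problem. The function names and the "backend" upstream name are hypothetical.

```shell
# Convert `dig +short NAME SRV` lines ("priority weight port target.")
# read from stdin into nginx "server host:port;" lines.
srv_to_servers() {
  awk '{ sub(/\.$/, "", $4); printf "    server %s:%s;\n", $4, $3 }'
}

# Print a complete upstream block built from SRV lines on stdin.
# usage: ... | render_upstream [NAME]
render_upstream() {
  printf 'upstream %s {\n' "${1:-backend}"
  srv_to_servers
  printf '}\n'
}

# Example (not run here; "app" is a placeholder task name):
#   dig +short _app._tcp.marathon.mesos SRV | render_upstream app > upstream.conf
```

The awk step drops the trailing dot from the target and swaps the order, because SRV answers put the port before the host while nginx wants host:port.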
Mesos-DNS is a quick way to start with service discovery in a mesos cluster.
We need to remember that it’s just an alpha version, but nevertheless it works like a charm.
What do I like about it?
There is just one thing that I don’t know how to overcome. We need cron or a loop with a delay in a script to continuously check if DNS records have changed. That is the case with nginx. Using cron we need to wait up to a minute for changes to propagate.
What if we can’t wait?
Mesos-dns will never tell the application that it needs to rebuild its config because of a cluster change. That’s how DNS works. For some apps it will be fine, but for some it just won’t work.
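If a shorter delay than cron allows is good enough, the loop variant could look roughly like this. It is still polling, so it only narrows the window. The function name, the paths, the task name and the 10 second delay are placeholders.

```shell
# Replace TARGET with CANDIDATE only when the content differs.
# Returns 0 if TARGET was updated, 1 if nothing changed.
# usage: update_if_changed CANDIDATE TARGET
update_if_changed() {
  if [ -f "$2" ] && cmp -s "$1" "$2"; then
    return 1
  fi
  cp "$1" "$2"
}

# Polling loop sketch (not run here):
#   while true; do
#     dig +short _app._tcp.marathon.mesos SRV | sort > /tmp/upstream.new
#     update_if_changed /tmp/upstream.new /tmp/upstream.cur && nginx -s reload
#     sleep 10
#   done
```

Sorting the answers before comparing avoids reloading nginx just because the DNS server returned the same records in a different order.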
So is mesos-dns the best? That question is open.
Bye bye, see you real soon!