etcd CoreOS cluster on Azure

The basics – what is etcd?

etcd is an open-source, distributed key-value store.

It was created by CoreOS to provide data storage and service discovery for their distributed applications, such as fleet, flannel and locksmithd.

etcd exposes a REST-based API that applications use to communicate with it.
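As a rough illustration of that API, the sketch below wraps two calls to the etcd v2 keys endpoint with curl. It assumes an etcd member is listening for clients on http://127.0.0.1:2379; the ETCD_ENDPOINT variable, the function names etcd_put and etcd_get, and the key name are made up for this example.

```shell
#!/usr/bin/env bash
# Assumption: an etcd member is reachable at this address; adjust for your cluster.
ETCD_ENDPOINT="${ETCD_ENDPOINT:-http://127.0.0.1:2379}"

# Store a value under a key with the v2 keys API (HTTP PUT).
# $1 = key, $2 = value
etcd_put() {
  curl -s -X PUT "${ETCD_ENDPOINT}/v2/keys/$1" --data-urlencode "value=$2"
}

# Read a key back (HTTP GET); etcd answers with a JSON document.
# $1 = key
etcd_get() {
  curl -s "${ETCD_ENDPOINT}/v2/keys/$1"
}

# Example usage (hypothetical key name):
#   etcd_put message "hello from Azure"
#   etcd_get message
```

Because it is plain HTTP, any language with an HTTP client can talk to etcd the same way.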

etcd runs as a daemon on every cluster member. Clusters should be created with an odd number of nodes (3, 5, 7 and so on) so that a quorum, meaning a strict majority (more than 50%) of members, remains available when nodes fail.
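The arithmetic behind the odd-number recommendation is easy to check. This small sketch (the function names quorum and fault_tolerance are my own) prints the majority needed and the number of failures tolerated for each cluster size. Note that going from 3 to 4 members raises the quorum without tolerating any extra failures.

```shell
#!/usr/bin/env bash
# Quorum is a strict majority of members: floor(n / 2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that can fail while a quorum still remains.
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5 6 7; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```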

A cluster runs with a single leader, with the other members following it. If the leader fails, the remaining nodes elect a new one through a consensus algorithm (Raft).

What’s it used for?

Apart from the CoreOS use of etcd, who else uses it?

The big one is Kubernetes, which handles management, orchestration and service discovery of containers across a cluster of worker nodes. etcd is used to store cluster state and configuration data and is core to the operation of Kubernetes.

Apache Mesos and Docker Swarm can also optionally use etcd as their cluster KV stores.

etcd on Azure?

I decided to use the Azure CLI to deploy a three node etcd cluster.

I could have also used PowerShell on Windows or authored an ARM template, but it was interesting to use the Azure CLI, as I tend to use OS X and a Linux desktop more than Windows these days.

The final script sets up:
- An Azure resource group
- An Azure Storage account
- A virtual network with a single subnet
- Three virtual NICs with static IPs
- Three public IPs for connecting in via SSH later if required
- Finally, three CoreOS Linux VMs, each of which uses a cloud-config.yaml file to automate the configuration of the etcd cluster and systemd units. I also add my public SSH key to allow remote SSH logins if required.
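To give a feel for what such a cloud-config file contains, here is a minimal sketch that generates one per member. It is a guess at the general shape only, not the files used for this deployment: it assumes the etcd2 section of CoreOS cloud-config, and the member names, private IPs, cluster token and SSH key are all placeholders, as is the function name make_cloud_config.

```shell
#!/usr/bin/env bash
# Hypothetical member names and private IPs; NOT the values from the original script.
CLUSTER="etcd-01=http://10.0.0.4:2380,etcd-02=http://10.0.0.5:2380,etcd-03=http://10.0.0.6:2380"

# Print a cloud-config for one member.
# $1 = member name, $2 = its private IP, $3 = SSH public key
make_cloud_config() {
  cat <<EOF
#cloud-config
coreos:
  etcd2:
    name: $1
    initial-cluster: $CLUSTER
    initial-cluster-state: new
    initial-cluster-token: etcd-azure-demo
    initial-advertise-peer-urls: http://$2:2380
    listen-peer-urls: http://$2:2380
    advertise-client-urls: http://$2:2379
    listen-client-urls: http://0.0.0.0:2379
  units:
    - name: etcd2.service
      command: start
ssh_authorized_keys:
  - $3
EOF
}

# Example usage (placeholder key), producing the file passed to --custom-data:
#   make_cloud_config etcd-01 10.0.0.4 "ssh-rsa AAAA...your-key" > etcd-01-cloud-config.yaml
```

Listing every member in initial-cluster is what lets the three nodes find each other on first boot without an external discovery service, which is why static private IPs are handy here.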

The script can be git cloned from

Alternatively, here it is:

#Tested on MacOS X 10.11.3 and Ubuntu 15.10 with Azure CLI tools installed
#Use azure login to get a token against your Azure subscription before running script
#Variables to change
location="australiasoutheast" #Azure DC Location of choice
resourcegroup="etcd-cluster" #Azure Resource Group name
vnetname="etcd-vnet" #Virtual Network Name
vnetaddr="" #Virtual Network CIDR
subnetname="etcd-subnet" #Subnet name
subnetaddr="" #Subnet CIDR
availgroup="etcd-avail-group" #Availability group for VMs
storageacname="etcdstorage01" #Storage Account for the VMs
networksecgroup="etcd-net-sec" #Network Sec Group for VMs
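#VM name and private IP for each (values are not shown here; fill them in before running)
vm01_name=""
vm02_name=""
vm03_name=""
vm_static_IP1=""
vm_static_IP2=""
vm_static_IP3=""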
coreos_image="coreos:coreos:Stable:835.9.0" #CoreOS image to use
vm_size="Basic_A0" #VM sizes can be listed by using azure vm sizes --location YourAzureDCLocationOfChoice
#Azure CLI------------------------------------------------------------------------------------
#Azure Resource Group setup
azure config mode arm
azure group create --location $location $resourcegroup
#Azure Storage account setup
azure storage account create --location $location --resource-group $resourcegroup --type lrs $storageacname
#Azure Network Security Group setup
azure network nsg create --resource-group $resourcegroup --location $location --name $networksecgroup
#Virtual network and subnet setup
azure network vnet create --location $location --address-prefixes $vnetaddr --resource-group $resourcegroup --name $vnetname
azure network vnet subnet create --address-prefix $subnetaddr --resource-group $resourcegroup --vnet-name $vnetname --name $subnetname --network-security-group-name $networksecgroup
#Public IP for VMs (can create SSH inbound rules to access these if required)
azure network public-ip create --resource-group $resourcegroup --location $location --name "$vm01_name"-pub-ip
azure network public-ip create --resource-group $resourcegroup --location $location --name "$vm02_name"-pub-ip
azure network public-ip create --resource-group $resourcegroup --location $location --name "$vm03_name"-pub-ip
#Virtual Nics with private IPs for CoreOS VMs
azure network nic create --resource-group $resourcegroup --subnet-vnet-name $vnetname --subnet-name $subnetname --location $location --name "$vm01_name"-priv-nic --private-ip-address $vm_static_IP1 --network-security-group-name $networksecgroup --public-ip-name "$vm01_name"-pub-ip
azure network nic create --resource-group $resourcegroup --subnet-vnet-name $vnetname --subnet-name $subnetname --location $location --name "$vm02_name"-priv-nic --private-ip-address $vm_static_IP2 --network-security-group-name $networksecgroup --public-ip-name "$vm02_name"-pub-ip
azure network nic create --resource-group $resourcegroup --subnet-vnet-name $vnetname --subnet-name $subnetname --location $location --name "$vm03_name"-priv-nic --private-ip-address $vm_static_IP3 --network-security-group-name $networksecgroup --public-ip-name "$vm03_name"-pub-ip
#Create 3 CoreOS-Stable VMs, fly in cloud-config file, also provide ssh public key to connect with in future
azure vm create --custom-data=etcd-01-cloud-config.yaml --admin-username core --name $vm01_name --vm-size $vm_size --resource-group $resourcegroup --vnet-subnet-name $subnetname --os-type linux --availset-name $availgroup --location $location --image-urn $coreos_image --nic-names "$vm01_name"-priv-nic --storage-account-name $storageacname
azure vm create --custom-data=etcd-02-cloud-config.yaml --admin-username core --name $vm02_name --vm-size $vm_size --resource-group $resourcegroup --vnet-subnet-name $subnetname --os-type linux --availset-name $availgroup --location $location --image-urn $coreos_image --nic-names "$vm02_name"-priv-nic --storage-account-name $storageacname
azure vm create --custom-data=etcd-03-cloud-config.yaml --admin-username core --name $vm03_name --vm-size $vm_size --resource-group $resourcegroup --vnet-subnet-name $subnetname --os-type linux --availset-name $availgroup --location $location --image-urn $coreos_image --nic-names "$vm03_name"-priv-nic --storage-account-name $storageacname
#Azure CLI------------------------------------------------------------------------------------

Generally it takes around 10 minutes to deploy all the components (sometimes a storage account can take 15 minutes by itself; bizarre but repeatable!). Once it’s done, your resource group should look like this.

[Screenshot, 2016-03-26: contents of the resource group]

At this point, I created an SSH inbound rule for the public IP of one of the VMs and tested that my etcd cluster was all good.
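For anyone wanting to repeat that check from the command line, the sketch below shows one way it could be done. It assumes the same Azure CLI in ARM mode as the script and etcdctl being available on the CoreOS VMs; the rule name, the priority and the function names allow_ssh and check_cluster are my own choices, and the IP in the usage note is a placeholder.

```shell
#!/usr/bin/env bash
# Allow inbound SSH (TCP 22) through the network security group.
# $1 = resource group, $2 = network security group name
allow_ssh() {
  azure network nsg rule create --resource-group "$1" --nsg-name "$2" \
    --name allow-ssh --protocol Tcp --direction Inbound --access Allow \
    --priority 1000 --destination-port-range 22
}

# Ask one member, over SSH, for the health of the whole cluster.
# $1 = public IP of any one VM
check_cluster() {
  ssh "core@$1" "etcdctl cluster-health && etcdctl member list"
}

# Example usage (placeholder IP, not a real address):
#   allow_ssh etcd-cluster etcd-net-sec
#   check_cluster 203.0.113.10
```

A healthy three-node cluster should report every member as healthy and list exactly three members.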

[Screenshot, 2016-03-26: etcd cluster check over SSH]

I can then insert some key-value pairs for testing and start thinking about deploying Kubernetes. All in another blog post 🙂