Configuring HAProxy With Ansible Roles on EC2 Instances

Poojan Mehta

Published in

The Startup

8 min read · Oct 13, 2020

In this article, I will demonstrate how to configure and manage the HAProxy load balancer for web servers running on AWS.

→Prerequisites:

Foundational knowledge of AWS cloud and some common yet important services of the provider.
Basic understanding of Ansible and playbooks. My previous article, for reference, will help you run playbooks on a local system and get familiar with the tool:

LINUX AUTOMATION WITH ANSIBLE

This article is a demonstration of Automation using ANSIBLE

medium.com

How to provision an EC2 instance using Ansible. The article below is a reference for integrating with the public cloud through the concept of dynamic inventory:

Setting Up Ansible for EC2 With Dynamic Inventory

Provision OS on top of AWS cloud using Ansible and make the environment more agile using dynamic inventory.

medium.com

~ Problem Statement:

🔅 Provision EC2 instances through Ansible.

🔅 Retrieve the IP addresses of the instances using the dynamic inventory concept.

🔅 Configure the web servers through an Ansible role.

🔅 Configure the load balancer through an Ansible role.

🔅 The target nodes of the load balancer should auto-update as per the status of the web servers.

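As Ansible is built on top of Python, a Python Software Development Kit (SDK) is required to configure AWS services. The package is an object-oriented API named boto3. Note that the legacy ec2 module and the ec2.py inventory script used in this article rely on the older boto package, so install it too if Ansible reports it missing.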

pip3 install boto3   # assuming python3 is installed

→STEP:1)

The local system contacts AWS and runs the tasks from the playbook.
User authentication is done by providing ACCESS_KEY and SECRET_KEY in an encrypted file called a vault.

- hosts: localhost
  gather_facts: yes
  vars_files:
      - secret.yml

ansible-vault create file_name.yml

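Put all the sensitive data inside this file and pass the values as variables in the playbook. The file name must match the one listed under vars_files (secret.yml above), and the tasks below read the keys from the variables myuser and mypass.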

→Add tasks to this playbook that provision 3 instances for the web servers and 1 instance for the load balancer.

→In today’s dynamic environment, it’s never a good call to run a website on a single web server, as losing that server may mean losing clients, client data, and the reputation of the product.

  tasks:
    - name: Provision OS in AWS for webserver
      ec2:
        key_name: "keytask"                  # key pair used to connect to the instance
        instance_type: "t2.micro"
        image: "ami-0ebc1ac48dfd14136"
        count: 3
        wait: yes
        vpc_subnet_id: "subnet-bedaded6"     # subnet ID for ap-south-1a
        region: "ap-south-1"                 # Asia Pacific (Mumbai) region
        state: present                       # "present" launches new instances
        assign_public_ip: yes
        instance_tags:
          Name: "webserver"
        group_id: "sg-0844f1e8ad419e348"     # pre-created security group
        aws_access_key: "{{ myuser }}"       # access key from the vault
        aws_secret_key: "{{ mypass }}"       # secret key from the vault

    - name: Provision OS in AWS for loadBalancer
      ec2:
        key_name: "keytask"
        instance_type: "t2.micro"
        image: "ami-0ebc1ac48dfd14136"
        count: 1
        wait: yes
        vpc_subnet_id: "subnet-bedaded6"
        region: "ap-south-1"
        state: present
        assign_public_ip: yes
        instance_tags:
          Name: "loadBalancer"
        group_id: "sg-08ebf6fbae80de33a"
        aws_access_key: "{{ myuser }}"
        aws_secret_key: "{{ mypass }}"

Only my current system and the IP of the load balancer instance are allowed to interact with the webserver instances. I will discuss this in a later part of the article👇.

→The instance tags will divide the instances into groups and can be used separately after they launch.

→Don’t forget to add the key to the ansible.cfg file as private_key_file for future login to these systems.

Run the playbook with

 ansible-playbook file_name.yml --ask-vault-pass

→STEP:2)

Step 2 is to retrieve the IP addresses of the instances with the dynamic inventory.
Run ./ec2.py --list to display the details retrieved from AWS. (Assuming you’ve gone through my prerequisite article)🙌
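
To make the output of the inventory script less abstract, here is a minimal Python sketch that pulls the hosts of one group out of JSON shaped like the --list output. The sample data, its IP addresses, and the helper name hosts_in_group are hypothetical; only the group name tag_Name_webserver comes from the playbook used later in this article.

```python
import json


def hosts_in_group(inventory_json, group):
    """Return the host list of one group from inventory-script JSON."""
    inventory = json.loads(inventory_json)
    entry = inventory.get(group, [])
    # Inventory scripts may emit a bare list or a dict with a "hosts" key.
    if isinstance(entry, dict):
        entry = entry.get("hosts", [])
    return list(entry)


# Hypothetical sample shaped like `./ec2.py --list` output (made-up IPs).
SAMPLE = """
{
  "tag_Name_webserver": ["203.0.113.11", "203.0.113.12", "203.0.113.13"],
  "tag_Name_loadBalancer": {"hosts": ["203.0.113.20"]},
  "_meta": {"hostvars": {}}
}
"""

print(hosts_in_group(SAMPLE, "tag_Name_webserver"))
print(hosts_in_group(SAMPLE, "tag_Name_loadBalancer"))
```

With real output you could pipe ./ec2.py --list into a script like this to confirm that the expected tag groups exist before running the roles.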

→The next step is to configure the webserver using Ansible ROLE

What is Ansible ROLE ?🤔

As we develop larger and more complex playbooks, we often discover opportunities to reuse the same code. But larger playbooks may contain many imported files and handlers, and copying the whole content is not a good call!
An Ansible Role provides a way to manage the play and makes the code easier to reuse. We can bundle the play in a standard directory structure, so copying the role is as simple as copying files and directories.
A role makes the playbook highly scalable📈, easily sharable, and better managed.

→STEP:3)

This step configures the web server using a role.
First, create an Ansible role with ansible-galaxy init <role_name>, for example ansible-galaxy init webserver.

/etc/ansible/roles is one of the default locations Ansible searches for roles.

If you keep your roles elsewhere, set that directory as roles_path in the ansible.cfg file.

Now, write all configuration tasks inside corresponding files to complete the setup.

Ansible Galaxy is a centralized community repository where a huge number of pre-created roles are available to download and use freely. Many time-tested roles are available there.

# tasks file for webserver
- name: Install httpd, git and php on the remote system
  package:
    name:
      - httpd
      - git
      - php
    state: present

- name: Clone code from GitHub
  git:
    repo: 'https://username:password@github.com/poojan1812/Ansible.git'
    dest: "/var/www/html/"

- name: Start the httpd service
  service:
    name: "httpd"
    state: restarted

Write the above code in the tasks/main.yml file in the current role directory. Consider this as the most basic example without any imported files and handlers.

With this, all of our web servers will be configured and the same code will be deployed in the document root of the webserver.

If you are curious, some questions are obvious!😃😃
→In which pattern do client requests reach the website?
→With so many web servers running, which publicly accessible IP should be provided to the end user?
→Will all end users connect to the same system and choke its resources before requests go to the other systems?
→Is it a good idea to give all end users direct access to the server systems?

One simple solution for all such questions is
LOAD BALANCER⚖️

What is a Load Balancer?

It’s pretty clear from the name itself: it balances the load!
A load balancer is a program that runs as an intermediate system between the client and the servers and distributes the web traffic among them.

Types of load balancer:

Software (ex: HAProxy)
Hardware
Managed (ex: AWS ELB)

The load balancer is the system that receives requests from the user, forwards them to the backend servers, and relays the responses back. With this, only the IP address of the load balancer needs to be given to the user to access the site. The load balancer distributes the traffic evenly, so no single server is overloaded. Public access to the server systems can also be restricted for better security.

→For this reason, now my servers are only accessible by the load balancer. (p.s: security group configuration for the server)

→STEP:4)

As an example, I have taken HAProxy (High Availability Proxy), a software load balancer. HAProxy is free, open-source software that provides high availability and load balancing for TCP and HTTP applications: it acts as the front-end server and forwards requests to multiple backend servers.

By default, HAProxy uses the round-robin algorithm, which selects the backend servers turn by turn and distributes traffic evenly.
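
The turn-by-turn selection can be pictured with a few lines of Python. This is only an illustration of the round-robin idea, not HAProxy's actual implementation; the function name round_robin and the server names web1, web2, and web3 are made up for the example.

```python
from itertools import cycle


def round_robin(backends, n_requests):
    """Assign each incoming request to the next backend in turn."""
    picker = cycle(backends)  # endlessly repeats web1, web2, web3, web1, ...
    return [next(picker) for _ in range(n_requests)]


# Five requests spread across three hypothetical backends.
print(round_robin(["web1", "web2", "web3"], 5))
# → ['web1', 'web2', 'web3', 'web1', 'web2']
```

Each server receives every third request, which is why three consecutive page loads later in this article land on three different web servers.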

→ Mention the port number you need to expose to the public next to the bind keyword. This allocates a listener port (8080 here) on the system.

→All the backend servers must be registered in the HAProxy configuration file so that load is balanced across them. We could add them manually, but that is not a good idea for a larger environment. As Ansible supports Python Jinja2 templating, use a loop that prints the IP address of each server from Ansible’s built-in groups variable. Also mention the port of the httpd service (80 by default).
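
Here is a rough Python sketch of the text such a loop could produce once Ansible has filled in the web server addresses. It imitates the rendering in plain Python so it runs without Ansible. The section names main and app, the server labels app1, app2 and so on, the check option, and the sample IPs are assumptions for illustration, not the exact template used in this article.

```python
# A Jinja2 loop along these lines would do the same job inside haproxy.cfg:
#   {% for host in groups['tag_Name_webserver'] %}
#       server app{{ loop.index }} {{ host }}:80 check
#   {% endfor %}


def render_haproxy_sections(web_hosts, listen_port=8080, backend_port=80):
    """Build frontend/backend sections with one server line per web host."""
    lines = [
        "frontend main",
        f"    bind *:{listen_port}",      # public listener port
        "    default_backend app",
        "",
        "backend app",
        "    balance roundrobin",
    ]
    for index, host in enumerate(web_hosts, start=1):
        # One backend entry per host, pointing at the httpd port.
        lines.append(f"    server app{index} {host}:{backend_port} check")
    return "\n".join(lines)


# Hypothetical addresses standing in for groups['tag_Name_webserver'].
print(render_haproxy_sections(["203.0.113.11", "203.0.113.12", "203.0.113.13"]))
```

Because the server lines come from the inventory group, re-running the playbook after web servers are added or removed regenerates the list without any manual editing.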

Here are the tasks to download and configure HAProxy in the remote ec2 system and start the services.

# tasks file for loadBalancer
- name: Install the haproxy package
  package:
    name:
      - haproxy
    state: present

- name: Copy the config file to the remote system
  template:
    src: "haproxy.cfg"
    dest: "/etc/haproxy/haproxy.cfg"

- name: Start the haproxy service
  service:
    name: "haproxy"
    state: restarted

To copy the configuration file to the remote system I’ve used the template module of Ansible. It copies the file, but not statically: it parses Jinja2 syntax first and writes the file with the evaluated values. Place the configuration file inside the loadBalancer/templates/ directory of the role.

→Both roles are configured. Now, create a single playbook that runs both roles and configures the hosts accordingly.

- hosts: tag_Name_webserver
  become: yes
  remote_user: ec2-user
  roles:
    - webserver

- hosts: tag_Name_loadBalancer
  become: yes
  remote_user: ec2-user
  roles:
    - loadBalancer

BINGO!! EVERYTHING CONFIGURED PROPERLY!

FINAL OUTPUT:

On 3 different requests, the load balancer diverts to a different server and balances the load.

→A highly scalable📈 and agentless structure is ready😇. Now even if some of my backend servers go down, the load balancer detects it automatically and keeps working seamlessly.
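
One way to picture that detection is a periodic TCP connection test against each backend, which is roughly what HAProxy does for servers declared with the check option (assuming the configuration enables it). The Python sketch below is only an illustration; the function names and the sample addresses are hypothetical, and the demo uses a stand-in probe so that nothing touches the network.

```python
import socket


def is_backend_up(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_backends(hosts, port=80, probe=is_backend_up):
    """Keep only the hosts that the probe reports as reachable."""
    return [host for host in hosts if probe(host, port)]


# Demo with a fake probe: pretend the second server is down.
down = {"203.0.113.12"}
print(healthy_backends(
    ["203.0.113.11", "203.0.113.12", "203.0.113.13"],
    probe=lambda host, port: host not in down,
))
```

Traffic would then be shared only among the servers that pass the test, and a failed server would rejoin the pool once it answers again.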

Never believe anything without proof🤨 :