Overcoming the NGINX active health check limitation in the free edition

Vikash Kumar
3 min read · Nov 23, 2020

Nginx, as we all know, comes in a free community edition and a paid one called NGINX Plus. NGINX Plus adds several features over the free version to make it more attractive and better value for money. One such feature is active health checks for backend servers, and its absence is keenly felt in the free community version. However, given the rich integration ecosystem of Nginx, it is quite possible to plug in other tools to compensate for this missing feature and keep reaping the benefits of Nginx at no cost. This post describes how we integrated Consul with Nginx to get the active health checks that the community edition lacks.

Why are active health checks important?

First of all, let’s understand why we need active health checks. When used as a reverse proxy, Nginx also load-balances traffic across a group of upstream servers. Whenever one of the upstream servers becomes unhealthy, Nginx removes it from the pool of servers serving traffic. Nginx has two ways to determine whether a server is healthy at any given point in time: active and passive health checks. In an active health check, Nginx queries a user-provided API endpoint on each upstream server at regular intervals and uses the response code or response body to decide whether the check passed. This is supported only in the paid version, NGINX Plus.
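
To make the idea concrete, here is a minimal sketch of what an active probe does, written in Python rather than as Nginx configuration. The function names, the default expected status of 200 and the optional body match are illustrative assumptions, not the NGINX Plus implementation.

```python
import urllib.error
import urllib.request


def evaluate(status, body, expected_status=200, expected_text=None):
    """Decide pass/fail from the response code and, optionally, the body."""
    if status != expected_status:
        return False
    return expected_text is None or expected_text in body


def probe(url, timeout=2.0, expected_status=200, expected_text=None):
    """Query a health endpoint once and report whether the check passed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate(resp.status, body, expected_status, expected_text)
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx; judge on the status code alone.
        return evaluate(err.code, "", expected_status, expected_text)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure or timeout: the server is unhealthy.
        return False
```

The key point is that the probe is synthetic traffic: a failed check costs nothing to a real user, which is exactly what passive checks cannot offer.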

In passive mode, Nginx monitors the responses that the upstream servers return to live requests, and when the number of errors exceeds a user-configured threshold, it removes that server from rotation. After this, Nginx keeps sending a few requests to the unhealthy server at regular intervals to check whether it has recovered. So if your server has developed a serious fault that persists for a long time, every request that Nginx forwards to it to determine its current health will fail. That is a poor experience for the end user, and we wanted to fix this behaviour without paying the high cost of NGINX Plus.
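
The cost of passive checks can be illustrated with a small simulation. This is a simplified model, not Nginx's actual code: `max_fails` and `fail_timeout` are real parameters of the `server` directive (defaults 1 and 10s), but the class, the one-request-per-second traffic and the chosen values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PassiveServerState:
    """Simplified model of passive failure tracking for one upstream server."""
    max_fails: int = 1
    fail_timeout: float = 10.0
    fails: int = 0
    checked: float = 0.0  # time of the most recent failure

    def is_available(self, now):
        # Usable while under the threshold, or once fail_timeout has elapsed
        # since the last failure (Nginx then retries with a live request).
        return self.fails < self.max_fails or now - self.checked > self.fail_timeout

    def record(self, now, ok):
        if ok:
            self.fails = 0
        else:
            self.fails += 1
            self.checked = now


def failed_user_requests(state, seconds):
    """Send one request per second to a permanently broken server."""
    failed = 0
    for now in range(seconds):
        if state.is_available(now):
            state.record(now, ok=False)  # a real user request fails here
            failed += 1
    return failed


print(failed_user_requests(PassiveServerState(max_fails=3, fail_timeout=10), 60))
```

Under these assumptions, three requests fail before the server is taken out, and one more real request fails each time the timeout expires, for as long as the fault lasts.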

Leveraging Consul for active health checks along with service discovery

Consul is already a well-known tool in the DevOps community, valued for its strength in dynamic service discovery, which suits immutable, distributed environments. We planned to leverage Consul's built-in health checks and service discovery to discover the dynamic group of healthy upstream servers for Nginx. Here is the high-level sequence we followed to achieve our objective:

  1. We wrote a Python script that generates the dynamic list of upstream servers from Consul's health checks.
  2. This list was saved in an intermediate file, and a corresponding upstream.conf was generated for use in the Nginx configuration.
  3. The main Nginx config references the latest upstream.conf, and we trigger an Nginx reload so that the changes take effect.
  4. We wrote a wrapper script that invokes the Python script and compares its output, the latest list of healthy upstream servers, with the previous intermediate file. If nothing has changed, no Nginx reload is needed. This wrapper script was scheduled as a cron job to run every 5 minutes.
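
The discovery and change-detection steps above can be sketched roughly as follows. This is not the original script: the Consul address, the service name `backend`, the state file path and all function names are assumptions. The `/v1/health/service/<name>?passing=true` endpoint is Consul's standard health API, which returns only instances whose checks are passing.

```python
import json
import subprocess
import urllib.request
from pathlib import Path

CONSUL_URL = "http://127.0.0.1:8500"             # assumed: local Consul agent
SERVICE_NAME = "backend"                          # hypothetical service name
STATE_FILE = Path("/var/tmp/healthy_upstreams")   # hypothetical intermediate file


def parse_healthy_servers(entries):
    """Turn a Consul health API response into sorted 'host:port' strings."""
    servers = set()
    for entry in entries:
        service = entry["Service"]
        # Service.Address may be empty; fall back to the node's address.
        host = service.get("Address") or entry["Node"]["Address"]
        servers.add(f"{host}:{service['Port']}")
    return sorted(servers)


def fetch_healthy_servers(consul_url=CONSUL_URL, service=SERVICE_NAME):
    """Ask Consul for the instances whose health checks are all passing."""
    url = f"{consul_url}/v1/health/service/{service}?passing=true"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_healthy_servers(json.load(resp))


def sync(servers, state_file, apply_change):
    """Call apply_change(servers) only if the list differs from the saved one."""
    if not servers:
        # Assumption: keep the last known list instead of an empty upstream,
        # which Nginx would reject at reload time.
        return False
    previous = state_file.read_text().splitlines() if state_file.exists() else []
    if servers == previous:
        return False
    apply_change(servers)
    # Saved only after a successful apply, so a failed reload is retried.
    state_file.write_text("\n".join(servers) + "\n")
    return True


def reload_nginx(servers):
    """Validate the config, then signal the running Nginx to reload it."""
    # upstream.conf would be regenerated from `servers` before this point.
    subprocess.run(["nginx", "-t"], check=True)
    subprocess.run(["nginx", "-s", "reload"], check=True)


def main():
    sync(fetch_healthy_servers(), STATE_FILE, reload_nginx)
```

A cron schedule of `*/5 * * * *` calling `main()` would match the five-minute interval described above. Sorting the list keeps the comparison stable when Consul returns the same servers in a different order.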

Sample nginx configuration
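
As a rough illustration, the generated upstream.conf and the way the main config references it could look like the output of the sketch below. The upstream name `backend`, the server addresses and the path `/etc/nginx/upstreams/upstream.conf` are illustrative assumptions.

```python
import os
from pathlib import Path


def render_upstream(name, servers):
    """Render an Nginx upstream block with one `server` line per address."""
    lines = [f"upstream {name} {{"]
    lines += [f"    server {server};" for server in servers]
    lines.append("}")
    return "\n".join(lines) + "\n"


def write_atomically(path, text):
    """Write via a temporary file so Nginx never reads a half-written config."""
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(text)
    os.replace(tmp, path)


# Illustrative fragment of the main config (http context): it includes the
# generated file and proxies to the upstream group defined there.
MAIN_CONFIG_FRAGMENT = """\
include /etc/nginx/upstreams/upstream.conf;

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
"""

print(render_upstream("backend", ["10.0.0.1:8080", "10.0.0.2:8080"]))
```

Keeping the generated file outside a directory that the main config already includes with a wildcard avoids defining the same upstream twice.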

In this way, we were able to keep using the free community version of Nginx while leveraging Consul to get the active health check functionality that is otherwise available only in the paid NGINX Plus.

Vikash Kumar

Seasoned DevOps and SRE leader, cloud and infra solution architect