
Handling Network Retries

In a distributed environment, network retries are an essential mechanism for handling temporary communication failures between components. When using Docker containers, network retries help ensure the reliability and availability of containerized applications.

Network retries involve attempting a failed network operation multiple times until a successful response is received. This can be achieved through various approaches, including:

Retry within the application code: This involves adding retry logic within the application itself to handle transient network failures. The application can implement retries using libraries such as retrying (Python) or Resilience4j (Java).
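
As an illustration, the same idea can be sketched directly in shell. The retry helper below is hypothetical, not part of any standard tooling: it runs a command up to a given number of times, doubling the delay after each failure.

```shell
# Hypothetical retry helper: run a command up to $1 times,
# doubling the delay after each failure (exponential backoff).
retry() {
  local max_attempts=$1; shift
  local delay=1
  local attempt=1

  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Command failed after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "Attempt ${attempt}/${max_attempts} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Usage (assumed endpoint): retry 5 curl -fsS http://localhost:8081/health
```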

External coordination services: A coordination and service-discovery tool such as Apache ZooKeeper or HashiCorp Consul can be used to track container health across multiple containers. These services can detect when a container is down and steer traffic to healthy containers.
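
For example, Consul tracks health via registered checks. The following service definition is a sketch; the service name, port, and /health endpoint are assumptions:

```json
{
  "service": {
    "name": "web",
    "port": 8081,
    "check": {
      "http": "http://localhost:8081/health",
      "interval": "10s",
      "timeout": "2s"
    }
  }
}
```

When the check fails, Consul marks the service instance unhealthy, and DNS or API queries stop returning it.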

Load balancers: A load balancer can implement network retries by redirecting traffic to healthy containers when an instance fails. It can also monitor container health and perform automatic failover to healthy containers.
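
With Nginx as the load balancer, passive health checks and per-request retries can be configured roughly as follows; the upstream names and ports are placeholders:

```nginx
upstream web_backend {
    # Take a container out of rotation for 30s after 3 failed requests.
    server web1:8081 max_fails=3 fail_timeout=30s;
    server web2:8081 max_fails=3 fail_timeout=30s;
    server web3:8081 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    location / {
        proxy_pass http://web_backend;
        # Retry the request on the next upstream server on connection
        # errors, timeouts, and selected 5xx responses.
        proxy_next_upstream error timeout http_502 http_503;
    }
}
```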

In Docker containers, network retries can be particularly useful when deploying applications that consist of multiple container instances. For example, if a containerized application has three instances of a web server, and one of them fails, network retries can help to ensure that requests are redirected to the other two instances until the failed instance is restored. This can improve the overall reliability and availability of the application.
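A sketch of this three-instance setup as a Compose file (the service name, image, and port are placeholders, and deploy.replicas requires swarm mode):

```yaml
services:
  web:
    image: my-web-server:latest
    deploy:
      replicas: 3
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8081"]
      interval: 30s
      timeout: 5s
      retries: 3
```

Docker restarts or reschedules instances whose health check keeps failing, while the healthy replicas continue serving traffic.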

In summary, network retries are an essential mechanism to ensure the reliability of containerized applications. Docker containers can implement network retries through various approaches, including retrying within the application code, external retry services, and load balancers.

Bash+curl implementation

is_online() {
  # Poll a URL until it returns HTTP 200, or give up.
  # Arguments:
  #   $1 - url (default: http://localhost:8081)
  #   $2 - max_attempts (default: 30)
  local attempt_counter=0
  local url=${1:-"http://localhost:8081"}
  local max_attempts=${2:-30}

  until [ "$(curl -s -w '%{http_code}' -o /dev/null "${url}")" -eq 200 ]
  do
    if [ "${attempt_counter}" -eq "${max_attempts}" ]; then
      printf "\n ---> (is_online helper)> Max attempts reached \n\n"
      exit 1
    fi

    attempt_counter=$((attempt_counter + 1))
    printf "\n ---> (is_online helper)> Wait for %s up [%s/%s]:\n" "${url}" "${attempt_counter}" "${max_attempts}"
    sleep 30
  done
}

is_online "${URL}" "${RETRIES}"

Curl-only one-liner

Check flags with https://explainshell.com (note that --retry-all-errors requires curl 7.71 or newer)

curl --retry-all-errors --retry 30 --connect-timeout 5 --retry-delay 30 --show-error -sL http://localhost:8081
