LVS NAT + Keepalived HOWTO

By Adam Fletcher (C) 2002, released under GPL


Installing, testing and running a Keepalived-based HA LVS/NAT setup

1. keepalived - what is it?

From Alexandre Cassen, author of keepalived:

"The main goal of the keepalived project is to add a strong & robust keepalive facility to the Linux Virtual Server project. This project is written in C with multilayer TCP/IP stack checks. Keepalived implements a framework based on three family checks : Layer3, Layer4 & Layer5. This framework gives the daemon the ability of checking a LVS server pool states. When one of the server of the LVS server pool is down, keepalived informs the linux kernel via a setsockopt call to remove this server entrie from the LVS topology. In addition keepalived implements a VRRPv2 stack to handle director failover. So in short keepalived is a userspace daemon for LVS cluster nodes healthchecks and LVS directors failover.

"keepalived is a project started to create a full-featured virtual router for Linux, which includes load balancing through Linux Virtual Server, failover via VRRP and health checks to monitor real servers. Essentially, it is a single package for doing what is typically done in Linux via lvs+mon+fake+heartbeat. With keepalived an administrator can quickly build a redundant load balancing solution without the hassle of using numerous packages and custom scripts."

keepalived will:
keepalived is available from www.keepalived.org

Software used in this example:
Optional software:

2. Plan your network!

Draw out a logical diagram of your network, either by hand or with a tool like xfig or Visio. Planning your network saves hassle and time later! Make a list of the IP addresses you are going to use, any external router IPs you may need, the IP addresses of the machines you are going to load balance, and other related information.

3. Configuring your kernel

Configure your kernel for LVS according to the directions on www.linuxvirtualserver.org. Be sure to enable full NAT and IP forwarding.
After patching your kernel for the latest LVS and installing the new kernel and rebooting, you should turn on IP forwarding. Many Linux distributions allow you to do this through the system configuration editor (YaST2 on SuSE, linuxconf on Red Hat, for example), or you can do this in your keepalived startup scripts (we'll give some examples of this later).  For now, just

echo "1" > /proc/sys/net/ipv4/ip_forward

as root.
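
As a hedged sketch of this step, the small shell function below writes 1 to the forwarding control file and reads it back to confirm the kernel accepted it. The function name enable_ip_forward and its optional path argument are assumptions made up for this example, not part of keepalived or LVS. The argument exists only so you can try the function against a scratch file before running it as root.

```shell
# enable_ip_forward [control-file]
# Hypothetical helper: turns on IPv4 forwarding and verifies the result.
# The control file defaults to the real kernel path; pass another path
# to experiment without root.
enable_ip_forward() {
    ctl="${1:-/proc/sys/net/ipv4/ip_forward}"
    echo 1 > "$ctl" || return 1
    # Read the value back; succeed only if it is now 1.
    [ "$(cat "$ctl")" = "1" ]
}
```

Run as root, for example: enable_ip_forward && echo "forwarding is on". The setting does not survive a reboot, so it still needs to go into your system configuration or a startup script.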

4. Building ipvsadm (optional)

ipvsadm is a tool available from www.linuxvirtualserver.org that allows you to set up virtual servers by hand. It is also a useful debugging and status tool, so I recommend building it.
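
To give a feel for what "by hand" means, here is a minimal sketch that prints the ipvsadm commands for a one-realserver NAT service. The function name lvs_nat_cmds is an assumption made up for this example, and the addresses in the last line are the ones used in the simple network later in this HOWTO. The function only prints the commands, so you can review them before running anything as root.

```shell
# lvs_nat_cmds VIP RIP PORT
# Hypothetical helper: prints (does not run) the ipvsadm commands for a
# TCP virtual service with one realserver behind NAT.
lvs_nat_cmds() {
    vip="$1"; rip="$2"; port="$3"
    # -A adds a virtual service, -t marks it TCP, -s rr selects round-robin
    echo "ipvsadm -A -t $vip:$port -s rr"
    # -a adds a realserver to that service, -m selects masquerading (NAT)
    echo "ipvsadm -a -t $vip:$port -r $rip:$port -m"
    # -L lists the table, -n keeps addresses and ports numeric
    echo "ipvsadm -L -n"
}

lvs_nat_cmds 192.168.1.11 10.20.40.10 22
```

Pipe the output to sh as root if you want to apply it. keepalived builds the same kind of table for you from its configuration file, so this is mainly useful for testing and debugging.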

5. Building keepalived

The quick version:

example-01:~ # tar xzvf keepalived-0.7.1.tar.gz; cd keepalived-0.7.1; ./configure; make; make install

Keepalived is very simple to build - grab the latest package from www.keepalived.org, untar it, then run ./configure, make, and make install.
For more information, read the INSTALL file shipped with keepalived.

If you have any trouble, or keepalived says it is not installing support for something you expected (such as SSL health checks), verify that the missing library or header file is in the location keepalived expects. For instance, the location of LVS's header file has changed in recent releases, so keepalived may not find the header in older versions of LVS.

6. Setting up keepalived: a simple network: 1 load balancer/virtual router, 1 real server on port 22 (ssh).

Now that we have keepalived built and installed, let's set up this network:

Client (on the internet somewhere) --> load balancer --> realserver

Load balancer IPs:
Realserver:

Our first step is to configure keepalived. The typical location for its configuration file is /etc/keepalived/keepalived.conf

Note that keepalived, as of this writing, does not report errors in the configuration file! This means a mistake in the config file may be difficult to notice. Try starting keepalived with the -d option, which dumps the configuration it read to syslog.

-- cut here --
! This is a comment
! Configuration File for keepalived

global_defs {
   ! this is who emails will go to on alerts
   notification_email {
        admins@example.com
    fakepager@example.com
    ! add a few more email addresses here if you would like
   }
   notification_email_from admins@example.com

   ! I use the local machine to relay mail
   smtp_server 127.0.0.1
   smtp_connect_timeout 30

   ! each load balancer should have a different ID
   ! this will be used in SMTP alerts, so you should make
   ! each router easily identifiable
   lvs_id LVS_EXAMPLE_01
}

! vrrp_sync_groups make sure that several router instances
! stay together on a failure - a good example of this is
! that the external interface on one router fails and the backup server
! takes over, you want the internal interface on the failed server
! to failover as well, otherwise nothing will work.
! you can have as many vrrp_sync_group blocks as you want.
vrrp_sync_group VG1 {
   group {
      VI_1
      VI_GATEWAY
   }
}

! each interface needs at least one vrrp_instance
! each vrrp_instance is a group of VIPs that are logically grouped
! together
! you can have as many vrrp_instances as you want

vrrp_instance VI_1 {
        state MASTER
        interface eth0
     
        lvs_sync_daemon_inteface eth0

    ! each virtual router id must be unique per instance name!
        virtual_router_id 51

    ! MASTER and BACKUP state are determined by the priority
    ! even if you specify MASTER as the state, the state will
    ! be voted on by priority (so if your state is MASTER but your
    ! priority is lower than the router with BACKUP, you will lose
    ! the MASTER state)
    ! I make it a habit to set priorities at least 50 points apart
    ! note that a lower number is lesser priority - lower gets less vote
        priority 150

    ! how often should we vote, in seconds?
        advert_int 1

    ! send an alert when this instance changes state from MASTER to BACKUP
        smtp_alert

    ! this authentication is for syncing between failover servers
    ! keepalived supports PASS, which is simple password
    ! authentication
    ! or AH, which is the IPSec authentication header.
    ! I don't use AH
    ! yet as many people have reported problems with it
        authentication {
                auth_type PASS
                auth_pass example
        }

    ! these are the IP addresses that keepalived will set up on this
    ! machine. Later in the config we will specify which real
    ! servers are behind these IPs
    ! without this block, keepalived will not set up and take down
    ! any IP addresses
     
        virtual_ipaddress {
                192.168.1.11
        ! and more if you want them
        }
}

! now I setup the instance that the real servers will use as a default
! gateway
! most of the config is the same as above, but on a different interface

vrrp_instance VI_GATEWAY {
        state MASTER
        interface eth1
        lvs_sync_daemon_inteface eth1
        virtual_router_id 52
        priority 150
        advert_int 1
        smtp_alert
        authentication {
                auth_type PASS
                auth_pass example
        }
        virtual_ipaddress {
                10.20.40.1
        }
}

! now we set up more information about our virtual server
! we are just setting up one for now, listening on port 22 for ssh
! requests.

! notice we do not setup a virtual_server block for the 10.20.40.1
! address in the VI_GATEWAY instance. That's because we are doing NAT
! on that IP, and nothing else.

virtual_server 192.168.1.11 22 {
    delay_loop 6

    ! use round-robin as a load balancing algorithm
    lb_algo rr

    ! we are doing NAT
    lb_kind NAT
    nat_mask 255.255.255.0

    protocol TCP

    ! there can be as many real_server blocks as you need

    real_server 10.20.40.10 22 {

    ! if we used weighted round-robin or a similar lb algo,
    ! we include the weight of this server

        weight 1

    ! here is a health checker for this server.
    ! we could use a custom script here (see the keepalived docs)
    ! but we will just make sure we can do a vanilla tcp connect()
    ! on port 22
    ! if it fails, we will pull this realserver out of the pool
    ! and send email about the removal
        TCP_CHECK {
                connect_timeout 3
                connect_port 22
        }
    }
}

! that's all

-- cut here --
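
Because keepalived does not report configuration errors, a crude pre-flight check can catch at least one class of mistake before you start the daemon: unbalanced braces. The sketch below is only that, a sketch. The function name check_braces is made up for this example, and it does not validate keywords or values. It strips comments first so that a brace inside a comment is not counted.

```shell
# check_braces [config-file]
# Hypothetical helper: succeeds if "{" and "}" occur equally often
# outside of comments in a keepalived config file.
check_braces() {
    conf="${1:-/etc/keepalived/keepalived.conf}"
    # Strip comments ("!" or "#" to end of line) so braces inside them are ignored.
    body=$(sed 's/[!#].*//' "$conf") || return 2
    open=$(printf '%s' "$body" | tr -cd '{' | wc -c)
    close=$(printf '%s' "$body" | tr -cd '}' | wc -c)
    if [ "$open" -eq "$close" ]; then
        echo "braces balanced: $((open)) pairs"
    else
        echo "brace mismatch: $((open)) opening, $((close)) closing" >&2
        return 1
    fi
}
```

For example: check_braces /etc/keepalived/keepalived.conf && keepalived -d. A balanced file can still be wrong in other ways, so the -d dump remains the real check.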

When you start keepalived with the -d flag, you should see this in /var/log/messages (or equivalent):

Sep 12 14:13:11 example-01 Keepalived: ------< Global definitions >------
Sep 12 14:13:11 example-01 Keepalived:  LVS ID = LVS_EXAMPLE_01
Sep 12 14:13:11 example-01 Keepalived:  Smtp server = 127.0.0.1
Sep 12 14:13:11 example-01 Keepalived:  Smtp server connection timeout = 100
Sep 12 14:13:11 example-01 Keepalived:  Email notification from = admins@example.com, fakepager@example.com
Sep 12 14:13:11 example-01 Keepalived:  Email notification = admins@example.com
Sep 12 14:13:11 example-01 Keepalived: ------< SSL definitions >------
Sep 12 14:13:11 example-01 Keepalived:  Using autogen SSL context
Sep 12 14:13:11 example-01 Keepalived: ------< VRRP Topology >------
Sep 12 14:13:11 example-01 Keepalived:  VRRP Instance = VI_1
Sep 12 14:13:11 example-01 Keepalived:    Want State = MASTER
Sep 12 14:13:11 example-01 Keepalived:    Runing on device = eth0
Sep 12 14:13:11 example-01 Keepalived:    Virtual Router ID = 51
Sep 12 14:13:11 example-01 Keepalived:    Priority = 150
Sep 12 14:13:11 example-01 Keepalived:    Advert interval = 1sec
Sep 12 14:13:11 example-01 Keepalived:    Preempt Active
Sep 12 14:13:11 example-01 Keepalived:    Authentication type = SIMPLE_PASSWORD
Sep 12 14:13:11 example-01 Keepalived:    Password = example
Sep 12 14:13:11 example-01 Keepalived:    VIP count = 1
Sep 12 14:13:11 example-01 Keepalived:      VIP1 = 192.168.1.11/32
Sep 12 14:13:11 example-01 Keepalived:  VRRP Instance = VI_GATEWAY
Sep 12 14:13:11 example-01 Keepalived:    Want State = MASTER
Sep 12 14:13:11 example-01 Keepalived:    Runing on device = eth1
Sep 12 14:13:11 example-01 Keepalived:    Virtual Router ID = 52
Sep 12 14:13:11 example-01 Keepalived:    Priority = 150
Sep 12 14:13:11 example-01 Keepalived:    Advert interval = 1sec
Sep 12 14:13:11 example-01 Keepalived:    Preempt Active
Sep 12 14:13:11 example-01 Keepalived:    Authentication type = SIMPLE_PASSWORD
Sep 12 14:13:11 example-01 Keepalived:    Password = example
Sep 12 14:13:11 example-01 Keepalived:    VIP count = 1
Sep 12 14:13:11 example-01 Keepalived:      VIP1 = 10.20.40.1/32
Sep 12 14:13:11 example-01 Keepalived: ------< VRRP Sync groups >------
Sep 12 14:13:11 example-01 Keepalived:  VRRP Sync Group = VG1, MASTER
Sep 12 14:13:11 example-01 Keepalived:    monitor = VI_1
Sep 12 14:13:11 example-01 Keepalived:    monitor = VI_GATEWAY
Sep 12 14:13:11 example-01 Keepalived: ------< LVS Topology >------
Sep 12 14:13:11 example-01 Keepalived:  System is compiled with LVS v1.0.4
Sep 12 14:13:11 example-01 Keepalived:  VIP = 192.168.1.11, VPORT = 22
Sep 12 14:13:11 example-01 Keepalived:    delay_loop = 10, lb_algo = rr
Sep 12 14:13:11 example-01 Keepalived:    protocol = TCP
Sep 12 14:13:11 example-01 Keepalived:    lb_kind = NAT
Sep 12 14:13:11 example-01 Keepalived:    RIP = 10.20.40.10, RPORT = 22, WEIGHT = 1
Sep 12 14:13:11 example-01 Keepalived: ------< Health checkers >------
Sep 12 14:13:11 example-01 Keepalived:  10.20.40.10:22
Sep 12 14:13:11 example-01 Keepalived:    Keepalive method = TCP_CHECK
Sep 12 14:13:11 example-01 Keepalived:    Connection timeout = 10
Sep 12 14:13:11 example-01 Keepalived:    Connection port = 22

Let's see what ipvsadm has to say about this, after keepalived starts up:

example-01:~ # ipvsadm
IP Virtual Server version 1.0.4 (size=65536)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.1.11:ssh rr
  -> 10.20.40.10:ssh            Masq    1      0          0
example-01:~ #

And finally, we should see the new IP addresses in our IP address list:

example-01:~ # ip addr list
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:81:21:bb:1c brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.9/24 brd 192.168.1.254 scope global eth0
    inet 192.168.1.11/32 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:81:21:bb:1d brd ff:ff:ff:ff:ff:ff
    inet 10.20.40.2/24 brd 10.20.40.255 scope global eth1
    inet 10.20.40.1/32 scope global eth1
example-01:~ #

ipvsadm, ip addr list, and starting keepalived with the -d option are good ways to verify your config is working.

7. Failover

With our basic config from above, we can easily move to a failover situation. All you have to do is set up keepalived on another box, copy over the keepalived.conf, change the lvs_id, lower the priorities by 50 points, change the states to BACKUP, and run keepalived. You'll see in the logs on the backup server that the server accepts its BACKUP state, and if you unplug the network cable(s) from the MASTER server, the BACKUP server takes over the MASTER state.

For the example, use the config file from the simple example above on the MASTER machine. On the BACKUP machine, use this config file:

-- cut here --
! This is a comment
! Configuration File for keepalived


global_defs {
   ! this is who emails will go to on alerts
   notification_email {
        admins@example.com
    fakepager@example.com
    ! add a few more email addresses here if you would like
   }
   notification_email_from admins@example.com

   ! I use the local machine to relay mail
   smtp_server 127.0.0.1
   smtp_connect_timeout 30

   ! each load balancer should have a different ID
   ! this will be used in SMTP alerts, so you should make
   ! each router easily identifiable

   ! this is router 2
   lvs_id LVS_EXAMPLE_02
}

! vrrp_sync_groups make sure that several router instances
! stay together on a failure - a good example of this is
! that the external interface on one router fails and the backup server
! takes over, you want the internal interface on the failed server
! to failover as well, otherwise nothing will work.
! you can have as many vrrp_sync_group blocks as you want.
vrrp_sync_group VG1 {
   group {
      VI_1
      VI_GATEWAY
   }
}

! each interface needs at least one vrrp_instance
! each vrrp_instance is a group of VIPs that are logically grouped
! together
! you can have as many vrrp_instances as you want

vrrp_instance VI_1 {
        ! we are the failover
        state BACKUP
        interface eth0
     
        lvs_sync_daemon_inteface eth0

    ! each virtual router id must be unique per instance name!
        ! instance names are the same on MASTER and BACKUP, so the
        ! virtual router_id is the same as VI_1 on the MASTER
        virtual_router_id 51

    ! MASTER and BACKUP state are determined by the priority
    ! even if you specify MASTER as the state, the state will
    ! be voted on by priority (so if your state is MASTER but your
    ! priority is lower than the router with BACKUP, you will lose
    ! the MASTER state)
    ! I make it a habit to set priorities at least 50 points apart
    ! note that a lower number is lesser priority -
    ! lower gets less vote
        priority 100

    ! how often should we vote, in seconds?
        advert_int 1

    ! send an alert when this instance changes state from
    ! MASTER to BACKUP
        smtp_alert

    ! this authentication is for syncing between failover servers
    ! keepalived supports PASS, which is simple
    ! password authentication
    ! or AH, which is the ipsec authentication header.
    ! I don't use AH
    ! yet as many people have reported problems with it
        authentication {
                auth_type PASS
                auth_pass example
        }


        
        virtual_ipaddress {
                192.168.1.11
        ! and more if you want them
        }
}

! now I setup the instance that the real servers will use as a default
! gateway
! most of the config is the same as above, but on a different interface

vrrp_instance VI_GATEWAY {
        state BACKUP
        interface eth1
        lvs_sync_daemon_inteface eth1
        virtual_router_id 52
        priority 100
        advert_int 1
        smtp_alert
        authentication {
                auth_type PASS
                auth_pass example
        }
        virtual_ipaddress {
                10.20.40.1
        }
}


! now we set up more information about our virtual server
! we are just setting up one for now, listening on port 22 for ssh
! requests.

! notice we do not setup a virtual_server block for the 10.20.40.1
! address in the VI_GATEWAY instance. That's because we are doing NAT
! on that IP, and nothing else.

virtual_server 192.168.1.11 22 {
    delay_loop 6

    ! use round-robin as a load balancing algorithm
    lb_algo rr

    ! we are doing NAT
    lb_kind NAT
    nat_mask 255.255.255.0


    protocol TCP

    ! there can be as many real_server blocks as you need

    real_server 10.20.40.10 22 {

    ! if we used weighted round-robin or a similar lb algo,
    ! we include the weight of this server

        weight 1

    ! here is a health checker for this server.
    ! we could use a custom script here (see the keepalived docs)
    ! but we will just make sure we can do a vanilla tcp connect()
    ! on port 22
    ! if it fails, we will pull this realserver out of the pool
    ! and send email about the removal
        TCP_CHECK {
                connect_timeout 3
                connect_port 22
        }
    }
}

! that's all

-- cut here --

Notice how little is different between the MASTER and BACKUP config file - just the lvs_id directive, the priorities, and the state directive. That's it, that's all. Make sure these are different but everything else is the same.
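
Since only three directives differ, the BACKUP file can be derived mechanically from the MASTER file. The sketch below does this with sed. The function name make_backup_conf is made up for this example, and the literal values it rewrites (LVS_EXAMPLE_01, priority 150) are the ones from the example configs above, so adjust them if yours differ. Comment lines are left alone because they start with "!".

```shell
# make_backup_conf MASTER-CONFIG
# Hypothetical helper: prints a BACKUP variant of the example MASTER config.
make_backup_conf() {
    # Each expression only matches a directive at the start of a line
    # (after optional indentation), so comments that merely mention
    # "state" or "priority" are not touched.
    sed -e 's/^\([[:space:]]*state[[:space:]]\{1,\}\)MASTER/\1BACKUP/' \
        -e 's/^\([[:space:]]*priority[[:space:]]\{1,\}\)150/\1100/' \
        -e 's/^\([[:space:]]*lvs_id[[:space:]]\{1,\}\)LVS_EXAMPLE_01/\1LVS_EXAMPLE_02/' \
        "$1"
}
```

For example: make_backup_conf keepalived.conf > keepalived.conf.backup, then compare the two files with diff before copying the result to the BACKUP machine.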

Once you start up keepalived on the MASTER and the BACKUP, you should be able to kill keepalived on the MASTER server and watch the BACKUP take over in the logs on the BACKUP server.

If you run ip addr list on the backup server, you won't see the VIPs until the backup server takes over the MASTER state.
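
A quick way to watch for the takeover is to test whether a given VIP currently appears in the address list. The sketch below reads ip addr list output on standard input. The function name has_vip is an assumption made up for this example.

```shell
# has_vip ADDRESS
# Hypothetical helper: reads "ip addr list" output on stdin and succeeds
# if ADDRESS is configured on some interface.
has_vip() {
    # -F treats the pattern as a fixed string, so the dots in the address
    # are not regex wildcards; the trailing "/" stops 192.168.1.1 from
    # matching 192.168.1.11.
    grep -qF "inet $1/"
}

# usage: ip addr list | has_vip 192.168.1.11 && echo "this box holds the VIP"
```

On the BACKUP machine the usage line should print nothing until the MASTER fails, and print the message once the BACKUP has taken over.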

8. Setting up keepalived: a more complicated network: 2 VIPs (1 http/https, 1 ssh) with 2 real servers in each.

Load balancer IPs:
Realserver 1 (http, https):
Realserver 2 (http, https):
Realserver 3 (ssh):
Realserver 4 (ssh):

A few oddities occur with this setup. In particular, you'll want to learn to use the "genhash" command that comes with keepalived to generate MD5 sums for the HTTP_GET and SSL_GET service checks. Also, you'll want to set up persistence on https. Persistence lets a client always connect to the same realserver, which matters if you have something like a shopping cart whose state is maintained on the realserver.


genhash is simple to use. Let's say we have a test.html on our web servers, and use that for service checks.

example-01:~ # genhash -s 192.168.1.11 -p 80 -u /test.html
-----------------------[    HTTP Header Buffer    ]-----------------------
0000  48 54 54 50 2f 31 2e 31 - 20 32 30 30 20 4f 4b 0d   HTTP/1.1 200 OK.
0010  0a 44 61 74 65 3a 20 54 - 68 75 2c 20 31 32 20 53   .Date: Thu, 12 S
0020  65 70 20 32 30 30 32 20 - 31 39 3a 34 31 3a 35 39   ep 2002 19:41:59
0030  20 47 4d 54 0d 0a 53 65 - 72 76 65 72 3a 20 41 70    GMT..Server: Ap
0040  61 63 68 65 2f 32 2e 30 - 2e 33 39 20 28 55 6e 69   ache/2.0.39 (Uni
0050  78 29 20 6d 6f 64 5f 73 - 73 6c 2f 32 2e 30 2e 33   x) mod_ssl/2.0.3
0060  39 20 4f 70 65 6e 53 53 - 4c 2f 30 2e 39 2e 36 20   9 OpenSSL/0.9.6
0070  50 48 50 2f 34 2e 32 2e - 31 0d 0a 4c 61 73 74 2d   PHP/4.2.1..Last-
0080  4d 6f 64 69 66 69 65 64 - 3a 20 54 75 65 2c 20 30   Modified: Tue, 0
0090  33 20 53 65 70 20 32 30 - 30 32 20 31 37 3a 34 31   3 Sep 2002 17:41
00a0  3a 31 31 20 47 4d 54 0d - 0a 45 54 61 67 3a 20 22   :11 GMT..ETag: "
00b0  31 65 35 35 66 2d 34 32 - 2d 64 33 36 63 33 62 63   1e55f-42-d36c3bc
00c0  30 22 0d 0a 41 63 63 65 - 70 74 2d 52 61 6e 67 65   0"..Accept-Range
00d0  73 3a 20 62 79 74 65 73 - 0d 0a 43 6f 6e 74 65 6e   s: bytes..Conten
00e0  74 2d 4c 65 6e 67 74 68 - 3a 20 36 36 0d 0a 43 6f   t-Length: 66..Co
00f0  6e 6e 65 63 74 69 6f 6e - 3a 20 63 6c 6f 73 65 0d   nnection: close.
0100  0a 43 6f 6e 74 65 6e 74 - 2d 54 79 70 65 3a 20 74   .Content-Type: t
0110  65 78 74 2f 68 74 6d 6c - 3b 20 63 68 61 72 73 65   ext/html; charse
0120  74 3d 49 53 4f 2d 38 38 - 35 39 2d 31 0d 0a 0d 0a   t=ISO-8859-1....
-----------------------[ HTTP Header Ascii Buffer ]-----------------------
HTTP/1.1 200 OK
Date: Thu, 12 Sep 2002 19:41:59 GMT
Server: Apache/2.0.39 (Unix) mod_ssl/2.0.39 OpenSSL/0.9.6 PHP/4.2.1
Last-Modified: Tue, 03 Sep 2002 17:41:11 GMT
ETag: "1e55f-42-d36c3bc0"
Accept-Ranges: bytes
Content-Length: 66
Connection: close
Content-Type: text/html; charset=ISO-8859-1


-----------------------[       HTML Buffer        ]-----------------------
0000  3c 48 54 4d 4c 3e 0a 3c - 42 4f 44 59 3e 0a 54 68   <HTML>.<BODY>.Th
0010  69 73 20 69 73 20 61 20 - 74 65 73 74 20 70 61 67   is is a test pag
0020  65 20 66 6f 72 20 6d 6f - 6e 69 74 6f 72 69 6e 67   e for monitoring
0030  2e 0a 3c 2f 42 4f 44 59 - 3e 0a 3c 2f 48 54 4d 4c   ..</BODY>.</HTML
0040  3e 0a                   -                           >.
-----------------------[    HTML MD5 resulting    ]-----------------------
0000  42 28 34 d1 d2 b9 72 ee - e9 e5 b8 75 e4 bd 8c 33   B(4...r....u...3
-----------------------[ HTML MD5 final resulting ]-----------------------
422834d1d2b972eee9e5b875e4bd8c33

example-01:~#

That very last string is what you need to keep track of, as you will use it in your service check setup below.
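
If you would rather not copy the digest by eye, a small filter can pull it out of the genhash output. This is a sketch that assumes the output looks like the sample above, with the digest alone on a line of 32 lowercase hex characters. The function name extract_digest is made up for this example.

```shell
# extract_digest
# Hypothetical helper: reads genhash output on stdin and prints only the
# final 32-character hex digest line.
extract_digest() {
    # The hex dump lines contain spaces and offsets, so only the final
    # digest line consists of exactly 32 hex characters.
    grep -E '^[0-9a-f]{32}$' | tail -n 1
}

# usage: genhash -s 192.168.1.11 -p 80 -u /test.html | extract_digest
```

The printed value is what goes after the digest keyword in the url block of the service check.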

Now for the config file:

-- cut here --
! This is a comment
! Configuration File for keepalived


global_defs {
   ! this is who emails will go to on alerts
   notification_email {
        admins@example.com
    fakepager@example.com
    ! add a few more email addresses here if you would like
   }
   notification_email_from admins@example.com

   ! I use the local machine to relay mail
   smtp_server 127.0.0.1
   smtp_connect_timeout 30

   ! each load balancer should have a different ID
   ! this will be used in SMTP alerts, so you should make
   ! each router easily identifiable
   lvs_id LVS_EXAMPLE_01
}

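! vrrp_sync_groups make sure that several router instances
! stay together on a failure - a good example of this is
! that the external interface on one router fails and the backup server
! takes over, you want the internal interface on the failed server
! to failover as well, otherwise nothing will work.
! you can have as many vrrp_sync_group blocks as you want.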
vrrp_sync_group VG1 {
   group {
      VI_1
      VI_GATEWAY
   }
}

! each interface needs at least one vrrp_instance
! each vrrp_instance is a group of VIPs that are logically grouped
! together
! you can have as many vrrp_instances as you want

vrrp_instance VI_1 {
        state MASTER
        interface eth0    
        lvs_sync_daemon_inteface eth0
    ! each virtual router id must be unique per instance name!
        virtual_router_id 51
        priority 150
    ! how often should we vote, in seconds?
        advert_int 1
        smtp_alert      
        authentication {
                auth_type PASS
                auth_pass example
        }
        virtual_ipaddress {
                192.168.1.11
                192.168.1.12
                ! and more if you want them
        }
}

! now I setup the instance that the real servers will use as a default
! gateway
! most of the config is the same as above, but on a different interface

vrrp_instance VI_GATEWAY {
        state MASTER
        interface eth1
        lvs_sync_daemon_inteface eth1
        virtual_router_id 52
        priority 150
        advert_int 1
        smtp_alert
        authentication {
                auth_type PASS
                auth_pass example
        }
        virtual_ipaddress {
                10.20.40.1
        }
}

! now we set up more information about our virtual servers
! we are setting up three: one on port 22 for ssh requests, and
! one each on ports 80 and 443 for http and https requests.

! notice we do not setup a virtual_server block for the 10.20.40.1
! address in the VI_GATEWAY instance. That's because we are doing NAT
! on that IP, and nothing else.


virtual_server 192.168.1.12 22 {
    delay_loop 6

    ! use round-robin as a load balancing algorithm
    lb_algo rr

    ! we are doing NAT
    lb_kind NAT
    nat_mask 255.255.255.0

    protocol TCP

    ! there can be as many real_server blocks as you need

    real_server 10.20.40.20 22 {

    ! if we used weighted round-robin or a similar lb algo,
    ! we include the weight of this server

        weight 1

    ! here is a health checker for this server.
    ! we could use a custom script here (see the keepalived docs)
    ! but we will just make sure we can do a vanilla tcp connect()
    ! on port 22
    ! if it fails, we will pull this realserver out of the pool
    ! and send email about the removal
        TCP_CHECK {
                connect_timeout 3
                connect_port 22
        }
    }
    real_server 10.20.40.21 22 {

    ! if we used weighted round-robin or a similar lb algo,
    ! we include the weight of this server

        weight 1
        TCP_CHECK {
                connect_timeout 3
                connect_port 22
        }
    }
}

virtual_server 192.168.1.11 80 {
    delay_loop 10
    lb_algo rr
    lb_kind NAT
    nat_mask 255.255.255.0
    protocol TCP

! use this to specify which host keepalived asks for during an HTTP GET
    virtualhost www.example.com

    real_server 10.20.40.10 80 {
        weight 1
        HTTP_GET {

        ! for the path, don't include the host if you use
        ! a virtualhost
                url {
                        path /test.html
            
            ! the results from genhash go here
                        digest 422834d1d2b972eee9e5b875e4bd8c33
                }
                connect_timeout 10
                connect_port 80
                ! keepalived will retry this many times before a failure
                ! is marked
                nb_get_retry 3
                ! each retry will occur after this delay
                delay_before_retry 10
        }
    }
    real_server 10.20.40.11 80 {
        weight 1
        HTTP_GET {
                url {
                        path /test.html
                        digest 422834d1d2b972eee9e5b875e4bd8c33
                }
                connect_timeout 10
                nb_get_retry 3
                delay_before_retry 10
                connect_port 80
        }
    }
}

virtual_server 192.168.1.11 443 {
    delay_loop 10
    lb_algo rr
    lb_kind NAT
    nat_mask 255.255.255.0
    protocol TCP
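    virtualhost www.example.com

    ! persistence keeps a client on the same realserver;
    ! the timeout is in seconds
    persistence_timeout 360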
    real_server 10.20.40.10 443 {
        weight 1
        SSL_GET {
                url {
                        path /test.html
                        digest 422834d1d2b972eee9e5b875e4bd8c33
                }
                connect_timeout 10
                connect_port 443
                nb_get_retry 3
                delay_before_retry 10
        }
    }
    real_server 10.20.40.11 443 {
        weight 1
        SSL_GET {
                url {
                        path /test.html
                        digest 422834d1d2b972eee9e5b875e4bd8c33
                }
                connect_timeout 10
                nb_get_retry 3
                delay_before_retry 10
                connect_port 443
        }
    }
}
! that's all

-- cut here --

Okay, let's see what keepalived -d shows us:

Sep 12 14:13:11 example-01 Keepalived: ------< Global definitions >------
Sep 12 14:13:11 example-01 Keepalived:  LVS ID = LVS_EXAMPLE_01
Sep 12 14:13:11 example-01 Keepalived:  Smtp server = 127.0.0.1
Sep 12 14:13:11 example-01 Keepalived:  Smtp server connection timeout = 100
Sep 12 14:13:11 example-01 Keepalived:  Email notification from = admins@example.com, fakepager@example.com
Sep 12 14:13:11 example-01 Keepalived:  Email notification = admins@example.com
Sep 12 14:13:11 example-01 Keepalived: ------< SSL definitions >------
Sep 12 14:13:11 example-01 Keepalived:  Using autogen SSL context
Sep 12 14:13:11 example-01 Keepalived: ------< VRRP Topology >------
Sep 12 14:13:11 example-01 Keepalived:  VRRP Instance = VI_1
Sep 12 14:13:11 example-01 Keepalived:    Want State = MASTER
Sep 12 14:13:11 example-01 Keepalived:    Runing on device = eth0
Sep 12 14:13:11 example-01 Keepalived:    Virtual Router ID = 51
Sep 12 14:13:11 example-01 Keepalived:    Priority = 150
Sep 12 14:13:11 example-01 Keepalived:    Advert interval = 1sec
Sep 12 14:13:11 example-01 Keepalived:    Preempt Active
Sep 12 14:13:11 example-01 Keepalived:    Authentication type = SIMPLE_PASSWORD
Sep 12 14:13:11 example-01 Keepalived:    Password = example
Sep 12 14:13:11 example-01 Keepalived:    VIP count = 2
Sep 12 14:13:11 example-01 Keepalived:      VIP1 = 192.168.1.11/32
Sep 12 14:13:11 example-01 Keepalived:      VIP1 = 192.168.1.12/32
Sep 12 14:13:11 example-01 Keepalived:  VRRP Instance = VI_GATEWAY
Sep 12 14:13:11 example-01 Keepalived:    Want State = MASTER
Sep 12 14:13:11 example-01 Keepalived:    Runing on device = eth1
Sep 12 14:13:11 example-01 Keepalived:    Virtual Router ID = 52
Sep 12 14:13:11 example-01 Keepalived:    Priority = 150
Sep 12 14:13:11 example-01 Keepalived:    Advert interval = 1sec
Sep 12 14:13:11 example-01 Keepalived:    Preempt Active
Sep 12 14:13:11 example-01 Keepalived:    Authentication type = SIMPLE_PASSWORD
Sep 12 14:13:11 example-01 Keepalived:    Password = example
Sep 12 14:13:11 example-01 Keepalived:    VIP count = 1
Sep 12 14:13:11 example-01 Keepalived:      VIP1 = 10.20.40.1/32
Sep 12 14:13:11 example-01 Keepalived: ------< VRRP Sync groups >------
Sep 12 14:13:11 example-01 Keepalived:  VRRP Sync Group = VG1, MASTER
Sep 12 14:13:11 example-01 Keepalived:    monitor = VI_1
Sep 12 14:13:11 example-01 Keepalived:    monitor = VI_GATEWAY
Sep 12 14:13:11 example-01 Keepalived: ------< LVS Topology >------
Sep 12 14:13:11 example-01 Keepalived:  System is compiled with LVS v1.0.4
Sep 12 14:13:11 example-01 Keepalived:  VIP = 192.168.1.11, VPORT = 22
Sep 12 14:13:11 example-01 Keepalived:    delay_loop = 10, lb_algo = rr
Sep 12 14:13:11 example-01 Keepalived:    protocol = TCP
Sep 12 14:13:11 example-01 Keepalived:    lb_kind = NAT
Sep 12 14:13:11 example-01 Keepalived:    RIP = 10.20.40.20, RPORT = 22, WEIGHT = 1
Sep 12 14:13:11 example-01 Keepalived:    RIP = 10.20.40.21, RPORT = 22, WEIGHT = 1
Sep 12 14:13:11 example-01 Keepalived:  VIP = 192.168.1.12, VPORT = 80
Sep 12 14:13:11 example-01 Keepalived:    VirtualHost = www.example.com
Sep 12 14:13:11 example-01 Keepalived:    delay_loop = 10, lb_algo = rr
Sep 12 14:13:11 example-01 Keepalived:    protocol = TCP
Sep 12 14:13:11 example-01 Keepalived:    lb_kind = NAT
Sep 12 14:13:11 example-01 Keepalived:    RIP = 10.20.40.10, RPORT = 80, WEIGHT = 1
Sep 12 14:13:11 example-01 Keepalived:    RIP = 10.20.40.11, RPORT = 80, WEIGHT = 1
Sep 12 14:13:11 example-01 Keepalived:  VIP = 192.168.1.12, VPORT = 443
Sep 12 14:13:11 example-01 Keepalived:    VirtualHost = www.example.com
Sep 12 14:13:11 example-01 Keepalived:    delay_loop = 10, lb_algo = rr
Sep 12 14:13:11 example-01 Keepalived:    persistence timeout = 360
Sep 12 14:13:11 example-01 Keepalived:    protocol = TCP
Sep 12 14:13:11 example-01 Keepalived:    lb_kind = NAT
Sep 12 14:13:11 example-01 Keepalived:    RIP = 10.20.40.10, RPORT = 443, WEIGHT = 1
Sep 12 14:13:11 example-01 Keepalived:    RIP = 10.20.40.11, RPORT = 443, WEIGHT = 1
Sep 12 14:13:11 example-01 Keepalived: ------< Health checkers >------
Sep 12 14:13:11 example-01 Keepalived:  10.20.40.20:22
Sep 12 14:13:11 example-01 Keepalived:    Keepalive method = TCP_CHECK
Sep 12 14:13:11 example-01 Keepalived:    Connection timeout = 10
Sep 12 14:13:11 example-01 Keepalived:    Connection port = 22
Sep 12 14:13:11 example-01 Keepalived:  10.20.40.21:22
Sep 12 14:13:11 example-01 Keepalived:    Keepalive method = TCP_CHECK
Sep 12 14:13:11 example-01 Keepalived:    Connection timeout = 10
Sep 12 14:13:11 example-01 Keepalived:    Connection port = 22
Sep 12 14:13:11 example-01 Keepalived:  10.20.40.10:80
Sep 12 14:13:11 example-01 Keepalived:    Keepalive method = HTTP_GET
Sep 12 14:13:11 example-01 Keepalived:    Connection port = 80
Sep 12 14:13:11 example-01 Keepalived:    Connection timeout = 10
Sep 12 14:13:11 example-01 Keepalived:    Nb get retry = 3
Sep 12 14:13:11 example-01 Keepalived:    Delay before retry = 10
Sep 12 14:13:11 example-01 Keepalived:    Checked url = /test.html, digest = 422834d1d2b972eee9e5b875e4bd8c33
Sep 12 14:13:11 example-01 Keepalived:  10.20.40.11:80
Sep 12 14:13:11 example-01 Keepalived:    Keepalive method = HTTP_GET
Sep 12 14:13:11 example-01 Keepalived:    Connection port = 80
Sep 12 14:13:11 example-01 Keepalived:    Connection timeout = 10
Sep 12 14:13:11 example-01 Keepalived:    Nb get retry = 3
Sep 12 14:13:11 example-01 Keepalived:    Delay before retry = 10
Sep 12 14:13:11 example-01 Keepalived:    Checked url = /test.html, digest = 422834d1d2b972eee9e5b875e4bd8c33
Sep 12 14:13:11 example-01 Keepalived:  10.20.40.10:443
Sep 12 14:13:11 example-01 Keepalived:    Keepalive method = SSL_GET
Sep 12 14:13:11 example-01 Keepalived:    Connection port = 443
Sep 12 14:13:11 example-01 Keepalived:    Connection timeout = 10
Sep 12 14:13:11 example-01 Keepalived:    Nb get retry = 3
Sep 12 14:13:11 example-01 Keepalived:    Delay before retry = 10
Sep 12 14:13:11 example-01 Keepalived:    Checked url = /test.html, digest = 422834d1d2b972eee9e5b875e4bd8c33
Sep 12 14:13:11 example-01 Keepalived:  10.20.40.11:443
Sep 12 14:13:11 example-01 Keepalived:    Keepalive method = SSL_GET
Sep 12 14:13:11 example-01 Keepalived:    Connection port = 443
Sep 12 14:13:11 example-01 Keepalived:    Connection timeout = 10
Sep 12 14:13:11 example-01 Keepalived:    Nb get retry = 3
Sep 12 14:13:11 example-01 Keepalived:    Delay before retry = 10
Sep 12 14:13:11 example-01 Keepalived:    Checked url = /test.html, digest = 422834d1d2b972eee9e5b875e4bd8c33

And ipvsadm:

example-01:~ # ipvsadm
IP Virtual Server version 1.0.4 (size=65536)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.1.11:ssh rr
  -> 10.20.40.20:ssh              Masq    1      0          0
  -> 10.20.40.21:ssh              Masq    1      0          1
TCP  192.168.1.12:http rr
  -> 10.20.40.10:http             Masq    1      0          0
  -> 10.20.40.11:http             Masq    1      0          0
TCP  192.168.1.12:https rr persistent 360
  -> 10.20.40.10:https            Masq    1      0          0
  -> 10.20.40.11:https            Masq    1      1          0
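
A quick sanity check on this table is to count the real-server lines and compare the number with what you put in keepalived.conf. The helper below is only a sketch of that idea. The name count_real_servers is made up for this HOWTO and is not part of ipvsadm or keepalived, and it assumes the real servers are listed by IPv4 address as in the output above.

```shell
#!/bin/sh
# count_real_servers (hypothetical helper, not shipped with any package):
# reads an ipvsadm listing on stdin and prints how many real servers it has.
count_real_servers() {
    # Real-server lines look like "  -> 10.20.40.20:ssh  Masq ...".
    # The column header ("  -> RemoteAddress:Port") starts with a letter
    # after the arrow, so requiring a digit there skips it.
    grep -c '^[[:space:]]*->[[:space:]]*[0-9]'
}

# Typical use on the director (needs root):
#   ipvsadm | count_real_servers
```

With the three virtual services of this example, each with two real servers, the count should be 6. A lower number usually means a health check has pulled a real server out of the pool.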

example-01:~ # ip addr list
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:81:21:bb:1c brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.9/24 brd 192.168.1.254 scope global eth0
    inet 192.168.1.11/32 scope global eth0
    inet 192.168.1.12/32 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:81:21:bb:1d brd ff:ff:ff:ff:ff:ff
    inet 10.20.40.2/24 brd 10.20.40.255 scope global eth1
    inet 10.20.40.1/32 scope global eth1
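
If you would rather not read this listing by eye after every restart, the same check can be scripted. This is a minimal sketch, assuming the iproute2 output format shown above. The function name has_address is my own invention for illustration.

```shell
#!/bin/sh
# has_address (hypothetical helper): reads `ip addr list` output on stdin
# and succeeds if the given IPv4 address is configured on any interface.
has_address() {
    # -F makes the dots literal; the trailing "/" stops 192.168.1.1
    # from matching 192.168.1.11.
    grep -qF "inet $1/"
}

# Typical use on the MASTER director, with the addresses from this example:
#   for vip in 192.168.1.11 192.168.1.12 10.20.40.1; do
#       ip addr list | has_address "$vip" || echo "missing $vip"
#   done
```

On the box that currently holds the MASTER state the loop should print nothing. On a BACKUP box the virtual addresses are expected to be missing.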

Remember, if we wanted to put this in a failover config, we could just add another box with the same config file (modified with a different lvs_id, state and priority) and start up keepalived on the backup box.
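
The three changes for the backup box can be made by hand, or generated from the master's file. The following is a rough sketch only. The function name make_backup_conf and the example values LVS_EXAMPLE_02 and 100 are assumptions, not anything keepalived provides. It also gives every VRRP instance in the file the same priority, which may not be what you want if your instances use different priorities.

```shell
#!/bin/sh
# make_backup_conf (hypothetical helper): reads the master's keepalived.conf
# on stdin and writes a variant for the backup box on stdout.
#   $1 = lvs_id to use on the backup box
#   $2 = priority to use on the backup box (should be lower than the master's)
# Assumes $1 and $2 contain no characters special to sed ("/", "&", "\").
make_backup_conf() {
    sed -e "s/^\([[:space:]]*lvs_id[[:space:]][[:space:]]*\).*/\1$1/" \
        -e "s/^\([[:space:]]*state[[:space:]][[:space:]]*\)MASTER/\1BACKUP/" \
        -e "s/^\([[:space:]]*priority[[:space:]][[:space:]]*\)[0-9][0-9]*/\1$2/"
}

# Typical use (file names are placeholders):
#   make_backup_conf LVS_EXAMPLE_02 100 < keepalived.conf > keepalived-backup.conf
```

Look over the generated file before copying it to the backup box. Everything else, including the virtual server definitions, is passed through unchanged.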

9. Example startup script for SuSE 8.0

#! /bin/sh
# Copyright (c) 1995-2002 SuSE Linux AG, Nuernberg, Germany.
# All rights reserved.
#
# Author of template: Kurt Garloff <feedback@suse.de>
# Modified for keepalived by Adam Fletcher <adamf+keepalived@csh.rit.edu>
#
# /etc/init.d/keepalived
#
#
# LSB compliant service control script; see http://www.linuxbase.org/spec/
#
# System startup script for the keepalived daemon
#
### BEGIN INIT INFO
# Provides: keepalived
# Required-Start: $remote_fs $syslog
# Required-Stop:  $remote_fs $syslog
# Default-Start:  3 5
# Default-Stop:   0 1 2 6
# Description:    Start keepalived (LVS health checks and VRRP failover)
### END INIT INFO
#
# Note on Required-Start: It does specify the init script ordering,
# not real dependencies. Dependencies have to be handled by admin
# resp. the configuration tools (s)he uses.

# Source SuSE config (if still necessary, most info has been moved)
test -r /etc/rc.config && . /etc/rc.config

# Check for missing binaries (stale symlinks should not happen)
KEEPALIVED_BIN=/usr/local/sbin/keepalived
test -x $KEEPALIVED_BIN || exit 5
# Shell functions sourced from /etc/rc.status:
#      rc_check         check and set local and overall rc status
#      rc_status        check and set local and overall rc status
#      rc_status -v     ditto but be verbose in local rc status
#      rc_status -v -r  ditto and clear the local rc status
#      rc_failed        set local and overall rc status to failed
#      rc_failed <num>  set local and overall rc status to <num>
#      rc_reset         clear local rc status (overall remains)
#      rc_exit          exit appropriate to overall rc status
#      rc_active        checks whether a service is activated by symlinks
. /etc/rc.status

# First reset status of this service
rc_reset

# Return values acc. to LSB for all commands but status:
# 0 - success
# 1 - generic or unspecified error
# 2 - invalid or excess argument(s)
# 3 - unimplemented feature (e.g. "reload")
# 4 - insufficient privilege
# 5 - program is not installed
# 6 - program is not configured
# 7 - program is not running
#
# Note that starting an already running service, stopping
# or restarting a not-running service as well as the restart
# with force-reload (in case signalling is not supported) are
# considered a success.

case "$1" in
    start)
        echo -n "Starting keepalived"
        ## Start daemon with startproc(8). If this fails
        ## the echo return value is set appropriately.

        # NOTE: startproc returns 0, even if service is
        # already running to match LSB spec.
        startproc $KEEPALIVED_BIN -d


        # Remember status and be verbose
        rc_status -v
        ;;
    stop)
        echo -n "Shutting down keepalived"
        ## Stop daemon with killproc(8) and if this fails
        ## set the echo return value.

        killproc -TERM $KEEPALIVED_BIN

        # Remember status and be verbose
        rc_status -v
        ;;
    try-restart)
        ## Stop the service and if this succeeds (i.e. the
        ## service was running before), start it again.
        ## Note: try-restart is not (yet) part of LSB (as of 0.7.5)
        $0 status >/dev/null &&  $0 restart

        # Remember status and be quiet
        rc_status
        ;;
    restart)
        ## Stop the service and regardless of whether it was
        ## running or not, start it again.
        $0 stop
        $0 start

        # Remember status and be quiet
        rc_status
        ;;
    force-reload)
        ## Signal the daemon to reload its config. Most daemons
        ## do this on signal 1 (SIGHUP).
        ## If it does not support it, restart.

        echo -n "Reload service keepalived"
        ## if it supports it:
        killproc -HUP $KEEPALIVED_BIN
        touch /var/run/keepalived.pid
        rc_status -v

        ## Otherwise:
        #$0 stop  &&  $0 start
        #rc_status
        ;;
    reload)
        ## Like force-reload, but if daemon does not support
        ## signalling, do nothing (!)

        # If it supports signalling:
        echo -n "Reload service keepalived"
        killproc -HUP $KEEPALIVED_BIN
        touch /var/run/keepalived.pid
        rc_status -v

        ## Otherwise if it does not support reload:
        #rc_failed 3
        #rc_status -v
        ;;
    status)
        echo -n "Checking for service keepalived: "
        ## Check status with checkproc(8), if process is running
        ## checkproc will return with exit status 0.

        # Return value is slightly different for the status command:
        # 0 - service running
        # 1 - service dead, but /var/run/  pid  file exists
        # 2 - service dead, but /var/lock/ lock file exists
        # 3 - service not running

        # NOTE: checkproc returns LSB compliant status values.
        checkproc $KEEPALIVED_BIN
        rc_status -v
        ;;
    probe)
        ## Optional: Probe for the necessity of a reload,
        ## print out the argument which is required for a reload.

        test /etc/keepalived/keepalived.conf -nt /var/run/keepalived.pid && echo reload
        ;;
    *)
        echo "Usage: $0 {start|stop|status|try-restart|restart|force-reload|reload|probe}"
        exit 1
        ;;
esac
rc_exit
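
The status return values listed in the script's comments are handy if you call it from cron or a monitoring job. The sketch below just turns those four values into text. The function name describe_status is hypothetical, and it assumes the script passes the checkproc result through as its exit code, which is what rc_status and rc_exit are meant to do.

```shell
#!/bin/sh
# describe_status (hypothetical helper): map the exit code of
# "/etc/init.d/keepalived status" to a short message, using the
# values documented in the status) branch of the script.
describe_status() {
    case "$1" in
        0) echo "running" ;;
        1) echo "dead, but pid file exists" ;;
        2) echo "dead, but lock file exists" ;;
        3) echo "not running" ;;
        *) echo "unknown status $1" ;;
    esac
}

# Typical use:
#   /etc/init.d/keepalived status >/dev/null 2>&1
#   describe_status $?
```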

10. Troubleshooting

Important caveats:
These may seem obvious, but they are all things that happened to me during my setup of keepalived.

If you have further questions, and have read this and all other keepalived documentation, please subscribe to the keepalived-devel mailing list, available at http://www.keepalived.org/mailinglist.html

11. Credits

Keepalived is written and maintained by Alexandre Cassen. It is primarily through his effort that it exists at all, and I thank him for that effort.

This HOWTO is written by Adam Fletcher, adamf+keepalived@csh.rit.edu.

This document is Copyright 2002 by Adam Fletcher. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is available at http://www.gnu.org/copyleft/fdl.html