Usefull links

Software Design

Keepalived is integrally written is pure ANSI/ISO C. The software is articulated around a central I/O multiplexer that provide realtime networking design. The main design focus were to provide an homogene modularity between all elements, this why a core library were created to remove code duplication. On the other hand, the goal were to produce a safe and secure code to ensure production robustness and stability.

To ensure robustness and stability, daemon is split into 3 distinct processes. The global design is based on a minimalistic parent process in charge with forked children process monitoring. Then 2 children processes, one responsible for VRRP framework and the other for healthchecking. Each children process has its own scheduling I/O multiplexer, that way VRRP scheduling jitter is optimized since VRRP scheduling is more sensible/critical than healthcheckers. On the other hand this split design minimalize for healthchecking the usage of foreign librairies and minimalize its own action down to and idle mainloop in order to avoid malfunctions caused by itself. The parent process monitoring framework is called watchdog, the design is : each children process open an accept unix domain socket, then while daemon bootstrap, parent process connect to those unix domain socket and send periodic (5s) hello packets to children. If parent cannot send hello packet to remote connected unix domain socket it simply restart children process. This watchdog design offers 2 benefits, first of all hello packets sent from parent process to remote connected children is done throught I/O multiplexer scheduler that way it can detect deadloop in the children scheduling framework. The second benefit is brought by the uses of sysV signal to detect dead children. When running you will see in process list :

  PID  
  111 Keepalived <-- Parent process monitoring children
  112 \_ Keepalived <-- VRRP child
  113 \_ Keepalived <-- Healthchecking child

All the atomic elements are introduced bellow :

Control Plane :
Keepalived configuration is done throught the file keepalived.conf. A compiler design is used for parsing. Parser work with a keyword tree hierarchy for mapping each configuration keyword with specifics handler. A central multi-level recursive function read the configuration file and traverse the keyword tree. During parsing, configuration file is translated into an internal memory representation.
Scheduler - I/O Multiplexer :
All the event are scheduled into the same process. Keepalived is a single process. Keepalived is a network routing software, it is so closed to I/O. The design used here is a central select(...) that is in charge of scheduling all internal task. POSIX thread libs are NOT used. This framework provide its own thread abstraction optimized for networking purpose.
Memory Management :
This framework provides acces to some generic memory managements functions like allocation, reallocation, release,... This framework can be used in two mode : normal_mode & debug_mode. When using debug_mode it provide a strong way to eradicate and track memory leaks. This low level env provide buffer under-run protection by tracking allocation memory and released. All the buffer used are length fixed to prevent against eventual buffer-overflow.
Core components :
This framework define some common and global libraries that are used in all the code. Those libraries are : html parsing, link-list, timer, vector, string formating, buffer dump, networking utils, daemon management, pid handling, low level TCP layer4. The goal here is to factorize code to the max to limite as possible code duplication to increase modularity.

WatchDog :
This framework provide children processes monitoring (VRRP & Healthchecking). Each child accept connection to its own watchdog unix domain socket. Parent process send "hello" messages to this child unix domain socket. Hello messages are sent using I/O multiplexer on the parent side and accepted/processed using I/O multiplexer on children side. If parent detect broken pipe it test using sysV signal if child is still alive and restart it.

Checkers :
This is one of the main Keepalived functionnality. Checkers are in charge of realserver healthchecking. A checker test if realserver is alive, this test end on a binary decision : remove or add realserver from/into the LVS topology. The internal checker design is realtime networking software, it use a fully multi-threaded FSM design (Finite State Machine). This checker stack provide LVS topology manipulation accoring to layer4 to layer5/7 test results. Its run in an independent process monitored by parent process.
VRRP Stack :
The other most important Keepalived functionnality. VRRP (Virtual Router Redundancy Protocol : RFC2338) is focused on director takeover, it provide low-level design for router backup. It implements full IETF RFC2338 standard with some provisions and extensions for LVS and Firewall design. It implements the vrrp_sync_group extension that guarantee persistence routing path after protocol takeover. It implements IPSEC-AH using MD5-96bit crypto provision for securing protocol adverts exchange. For more informations on VRRP please read the RFC. Important things : VRRP code can be used without the LVS support, it has been designed for independant use.Its run in an independent process monitored by parent process.
System call :
This framework offer the ability to launch extra system script. It is mainly used in the MISC checker. In VRRP framework it provides the ability to launch extra script during protocol state transition. The system call is done into a forked process to not pertube the global scheduling timer.

SMTP :
The SMTP protocol is used for administration notification. It implements the IETF RFC821 using a multi-threaded FSM design. Administration notifications are sent for healthcheckers activities and VRRP protocol state transition. SMTP is commonly used and can be interfaced with any other notification sub-system such as GSM-SMS, pagers, ...
Netlink Reflector :
Same as IPVS wrapper. Keepalived work with its own network interface representation. IP address and interface flags are set and monitored through kernel Netlink channel. The Netlink messaging sub-system is used for setting VRRP VIPs. On the other hand, the Netlink kernel messaging broadcast capability is used to reflect into our userspace Keepalived internal data representation any events related to interfaces. So any other userspace (others program) netlink manipulation is reflected to our Keepalived data representation via Netlink Kernel broadcast (RTMGRP_LINK & RTMGRP_IPV4_IFADDR).
IPVS wrapper :
This framework is used for sending rules to the Kernel IPVS code. It provides translation between Keepalived internal data representation and IPVS rule_user representation. It uses the IPVS libipvs to keep generic integration with IPVS code.
IPVS :
The Linux Kernel code provided by Wensong from LinuxVirtualServer.org OpenSource Project.

NETLINK :
The Linux Kernel code provided by Alexey Kuznetov with its very nice advanced routing framework and sub-system capabilities.