HAProxy

The Reliable, High Performance TCP/HTTP Load Balancer

Quick News

Apr 23rd, 2014 : 1.5-dev23

Feb 3rd, 2014 : 1.5-dev22

    The whole polling system was finally reworked to replace the speculative I/O model designed 7 years ago with a fresh new event cache. This was needed because the OpenSSL API is not totally compatible with a polled I/O model, since it stores data into its internal buffers. One of the benefits is that despite the complexity of the operation, the code is now less tricky and much safer. HTTP keep-alive is now enabled by default when no other option is mentioned, as it is what users expect when they don't specify anything and would otherwise end up with tunnel mode. A new option "tunnel" was thus added for users who would find a benefit in using tunnel mode. After an HTTP 401 or 407 response, we automatically stick to the same server, so there's normally no need for "option prefer-last-server" anymore. SSL buffer sizes are automatically adjusted to save round trips as suggested by Ilya Grigorik, reducing the handshake time by up to a factor of 3. The CLI reports more info in "show info" and now supports "show pools" to check memory usage. SSL supports "maxsslrate" to protect the SSL stack against connection rushes. The "tcp-check" directive now supports a "connect" action, enabling multi-port checking. Very few changes are pending before 1.5-final : ACL/map merge (being reviewed), HTTP body analyzer fixes (in progress), agent check update (started), per-listener process binding (experimentations completed). Source code is available here.
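    For illustration, multi-port checking with the new "tcp-check" directives can look like the following minimal sketch (backend name, server name and address are made up ; the directives themselves are the documented ones) :

        backend mail
            # check both the POP3 and IMAP ports of each server
            option tcp-check
            tcp-check connect port 110
            tcp-check expect string +OK
            tcp-check connect port 143
            tcp-check expect string OK
            server mail1 192.0.2.10 check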

Recent news...

Latest versions

Branch      | Description  | Last version | Released   | Links           | Notes
Development | Development  | 1.5-dev23    | 2014/04/23 | git / web / dir | may be broken
1.4         | 1.4-stable   | 1.4.25       | 2014/03/27 | git / web / dir | Stable version
1.3         | 1.3-stable   | 1.3.26       | 2011/08/05 | git / web / dir | Critical fixes only
1.3.15      | 1.3.15-maint | 1.3.15.13    | 2011/08/05 | git / web / dir | Critical fixes only
1.3.14      | 1.3.14-maint | 1.3.14.14    | 2009/07/27 | git / web / dir | Unmaintained
1.2         | 1.2-stable   | 1.2.18       | 2008/05/25 | git / web / dir | Unmaintained
1.1         | 1.1-stable   | 1.1.34       | 2006/01/29 | git / web / dir | Unmaintained
1.0         | 1.0-old      | 1.0.2        | 2001/12/30 | git / web / dir | Unmaintained

Description

HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer 7 processing. Supporting tens of thousands of connections is clearly realistic with today's hardware. Its mode of operation makes its integration into existing architectures very easy and risk-free, while still offering the possibility not to expose fragile web servers to the Net.

Currently, two major versions are supported :

  • version 1.4 - more flexibility
    This version has brought its share of long-awaited new features over 1.3, most notably :
      - client-side keep-alive to reduce the time to load heavy pages for clients over the net ;
      - TCP speedups to help the TCP stack save a few packets per connection ;
      - response buffering for an even lower number of concurrent connections on the servers ;
      - RDP protocol support with server stickiness and user filtering ;
      - source-based stickiness to attach a source address to a server ;
      - a much better stats interface reporting tons of useful information ;
      - more verbose health checks reporting precise statuses and responses in stats and logs ;
      - traffic-based health checks to fast-fail a server above a certain error threshold ;
      - support for HTTP authentication for any request including stats, with password encryption ;
      - server management from the CLI to enable/disable a server and change its weight without restarting haproxy ;
      - ACL-based persistence to maintain or disable persistence based on ACLs, regardless of the server's state ;
      - a log analyzer to generate fast reports from logs parsed at 1 Gbyte/s.
  • version 1.3 - content switching and extreme loads
    This version has brought a lot of new features and improvements over 1.2, among which :
      - content switching to select a server pool based on any request criteria ;
      - ACLs to write content switching rules ;
      - a wider choice of load-balancing algorithms for better integration ;
      - content inspection allowing unexpected protocols to be blocked ;
      - transparent proxy under Linux, which allows connecting directly to the server using the client's IP address ;
      - kernel TCP splicing to forward data between the two sides without copy, in order to reach multi-gigabit data rates ;
      - a layered design separating sockets, TCP and HTTP processing, for more robust and faster processing and easier evolutions ;
      - a fast and fair scheduler allowing better QoS by assigning priorities to some tasks ;
      - session rate limiting for colocated environments.

Version 1.2 has been in production use since 2006 and provided an improved performance level on top of 1.1. It is not maintained anymore, as most of its users switched to 1.3 a long time ago. Version 1.1, which has been keeping critical sites online since 2002, is not maintained anymore either. Users should upgrade to 1.4.

For a long time, HAProxy was used only by a few hundred people around the world, running very big sites serving several million hits and between tens of gigabytes and several terabytes per day to hundreds of thousands of clients, who needed 24x7 availability and had the internal skills to take the risk of maintaining a free software solution. Over the years things have changed a bit : HAProxy has become the de-facto standard load balancer, and it's often installed by default in cloud environments. Since it does not advertise itself, we only know it's used when the admins report it :-)

Design Choices and history

HAProxy implements an event-driven, single-process model which enables support for very high numbers of simultaneous connections at very high speeds. Multi-process or multi-threaded models can rarely cope with thousands of connections because of memory limits, system scheduler limits, and lock contention everywhere. Event-driven models do not have these problems because implementing all the tasks in user-space allows finer resource and time management. The downside is that such programs generally don't scale well on multi-processor systems. That's the reason why they must be optimized to get the most work done from every CPU cycle.
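To make the model concrete, here is a minimal sketch of such an event loop using Linux's epoll interface (discussed further down). This is not HAProxy's actual code, and handle_io() is a hypothetical placeholder for the accept/proxy logic :

    /* single process, single loop : one epoll descriptor watches all
     * connections, and the process never blocks on any single one */
    #include <sys/epoll.h>

    #define MAX_EVENTS 64

    static void handle_io(int fd)
    {
        /* hypothetical handler : accept new connections on the listening
         * socket, or move data on an established one (omitted here) */
        (void)fd;
    }

    int run_event_loop(int listen_fd)
    {
        struct epoll_event ev, events[MAX_EVENTS];
        int epfd = epoll_create(1);

        if (epfd < 0)
            return -1;

        ev.events = EPOLLIN;
        ev.data.fd = listen_fd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

        for (;;) {
            /* one syscall returns only the descriptors with pending events */
            int n = epoll_wait(epfd, events, MAX_EVENTS, -1);

            for (int i = 0; i < n; i++)
                handle_io(events[i].data.fd);
        }
    }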

It began in 1996 when I wrote Webroute, a very simple HTTP proxy able to set up a modem access. But its multi-process model clobbered its performance for usages other than home access. Two years later, in 1998, I wrote the event-driven ZProx, used to compress TCP traffic to accelerate modem lines. That was when I first understood the difficulty of event-driven models. In 2000, while benchmarking a buggy application, I heavily modified ZProx to introduce very dirty support for HTTP header rewriting. HAProxy's ancestor was born. The first versions did not perform the load balancing themselves, but it quickly proved necessary.

Now in 2009, the core engine is reliable and very robust. Event-driven programs are robust and fragile at the same time : their code needs very careful changes, but the resulting executable handles high loads and survives attacks without ever failing. This is the reason why HAProxy only supports a finite set of features. HAProxy has never ever crashed in a production environment. This is something people are not used to nowadays ; the most common thing new users tell me is that they're amazed it has never crashed ;-)

People often ask for SSL and keep-alive support. Both features would complicate the code and render it fragile for several releases. Moreover, both features have a negative impact on performance :

  • Having SSL in the load balancer itself means that it becomes the bottleneck. When the load balancer's CPU is saturated, the overall response times will increase, and the only solution will be to add more load balancers, with yet another load balancer in front of them. The only scalable solution is to have an SSL/Cache layer between the clients and the load balancer. Anyway, for small sites it still makes sense to embed SSL, and it's currently being studied. There has been some work on the CyaSSL library to ease integration with HAProxy, as it appears to be the only one out there that lets you manage your memory yourself. Update [2012/09/11] : native SSL support was implemented in 1.5-dev12. The points above about CPU usage are still valid though.
  • Keep-alive was invented to reduce CPU usage on servers when CPUs were 100 times slower. But what is not said is that persistent connections consume a lot of memory while not being usable by anybody except the client who opened them. Today in 2009, CPUs are very cheap, while memory is still limited to a few gigabytes by the architecture or the price. If a site needs keep-alive, there is a real problem. Highly loaded sites often disable keep-alive to support the maximum number of simultaneous clients. The real downside of not having keep-alive is a slightly increased latency to fetch objects. Browsers double the number of concurrent connections on non-keepalive sites to compensate for this. With version 1.4, keep-alive with the client was introduced. It resulted in lower access times to load pages composed of many objects, without the cost of maintaining an idle connection to the server. It is a good trade-off. 1.5 will bring keep-alive to the server, but it will probably make sense only with static servers.

However, I'm planning on implementing both features in future versions, because it appears that there are users who need availability above all, and for whom these features will not hurt performance while reducing the number of components.

Supported platforms

HAProxy is known to reliably run on the following OS/platforms :

  • Linux 2.4 and 2.6
  • Solaris 8, 9 and 10
  • FreeBSD
  • OpenBSD

The highest performance is achieved with haproxy versions newer than 1.2.5 running on Linux 2.6, or on an epoll-patched Linux 2.4 kernel. This is only due to a very OS-specific optimization : the default polling system for version 1.1 is select(), which is common among most OSes but can become slow when dealing with thousands of file descriptors. Versions 1.2 and 1.3 use poll() by default instead of select(), but on some systems it may be even slower. However, it is recommended on Solaris, whose implementation is rather good. Haproxy 1.3 automatically uses epoll on Linux 2.6 and patched Linux 2.4, and kqueue on FreeBSD and OpenBSD. Both mechanisms achieve constant performance at any load and are thus preferred over poll().

On very recent Linux 2.6 (>= 2.6.27.19), HAProxy can use the new splice() syscall to forward data between interfaces without any copy. Performance above 10 Gbps may only be achieved that way.
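For the curious, here is a minimal sketch of the idea (not HAProxy's actual code) : data is spliced from one socket into a pipe, then from the pipe into the other socket, so it never has to be copied into user-space memory :

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    /* forward everything from src_fd to dst_fd without any user-space copy */
    ssize_t forward(int src_fd, int dst_fd)
    {
        int pipefd[2];
        ssize_t in, out, total = 0;

        if (pipe(pipefd) < 0)
            return -1;

        while ((in = splice(src_fd, NULL, pipefd[1], NULL, 65536,
                            SPLICE_F_MOVE | SPLICE_F_MORE)) > 0) {
            /* pages are moved, not copied, between the two descriptors */
            out = splice(pipefd[0], NULL, dst_fd, NULL, in,
                         SPLICE_F_MOVE | SPLICE_F_MORE);
            if (out < 0)
                break;
            total += out;
        }
        close(pipefd[0]);
        close(pipefd[1]);
        return total;
    }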

Based on those facts, people looking for a very fast load balancer should consider the following options on x86 or x86_64 hardware, in this order :

  1. HAProxy 1.4 on Linux 2.6.32+
  2. HAProxy 1.4 on Linux 2.4 + epoll patch
  3. HAProxy 1.4 on FreeBSD
  4. HAProxy 1.4 on Solaris 10

Current typical 1U servers equipped with a dual-core Opteron or Xeon generally achieve between 15000 and 40000 hits/s and have no trouble saturating 2 Gbps under Linux.

Performance

Well, since a user's testimony is better than a long demonstration, please take a look at Chris Knight's experience with haproxy saturating a gigabit fiber on a video download site. Another big data provider I know constantly pushes between 3 and 4 Gbps of traffic 24 hours a day. Also, my experiments with Myricom's 10-Gig NICs might be of interest.

HAProxy involves several techniques commonly found in operating system architectures to achieve the absolute maximal performance :

  • a single-process, event-driven model considerably reduces the cost of context switch and the memory usage. Processing several hundreds of tasks in a millisecond is possible, and the memory usage is in the order of a few kilobytes per session while memory consumed in Apache-like models is more in the order of megabytes per process.
  • O(1) event checker on systems that allow it (Linux and FreeBSD) allowing instantaneous detection of any event on any connection among tens of thousands.
  • Single-buffering without any data copy between reads and writes whenever possible. This saves a lot of CPU cycles and useful memory bandwidth. Often, the bottleneck will be the I/O busses between the CPU and the network interfaces. At 10 Gbps, the memory bandwidth can become a bottleneck too.
  • Zero-copy forwarding is possible using the splice() system call under Linux, and results in real zero-copy starting with Linux 3.5. This allows a small sub-3 Watt device such as a Seagate Dockstar to forward HTTP traffic at one gigabit/s.
  • MRU memory allocator using fixed-size memory pools for immediate memory allocation, favoring hot cache regions over cold cache ones. This dramatically reduces the time needed to create a new session (a minimal sketch of the idea follows this list).
  • work factoring, such as multiple accept() at once, and the ability to limit the number of accept() per iteration when running in multi-process mode, so that the load is evenly distributed among processes.
  • tree-based storage, making heavy use of the Elastic Binary tree I have been developing for several years. This is used to keep timers ordered, to keep the run queue ordered, and to manage round-robin and least-conn queues, with only an O(log(N)) cost.
  • optimized HTTP header analysis : headers are parsed and interpreted on the fly, and the parsing is optimized to avoid re-reading any previously read memory area. Checkpointing is used when an end of buffer is reached with an incomplete header, so that the parsing does not start again from the beginning when more data is read. Parsing an average HTTP request typically takes 2 microseconds on a Pentium-M 1.7 GHz.
  • careful reduction of the number of expensive system calls. Most of the work is done in user-space by default, such as time reading, buffer aggregation, file-descriptor enabling/disabling.
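As announced above, here is a minimal sketch (not HAProxy's actual pool code) of the MRU fixed-size pool idea : freed objects are stacked and handed back last-in first-out, so the most recently used, cache-hot memory is always reused first :

    #include <stdlib.h>

    struct pool {
        void *free_list;   /* head of the MRU stack of free objects */
        size_t size;       /* fixed object size, must be >= sizeof(void *) */
    };

    void *pool_alloc(struct pool *p)
    {
        void *obj = p->free_list;

        if (obj) {
            /* pop the most recently freed, thus cache-hottest, object */
            p->free_list = *(void **)obj;
            return obj;
        }
        return malloc(p->size);  /* pool empty : fall back to malloc() */
    }

    void pool_free(struct pool *p, void *obj)
    {
        /* push on top of the stack : it will be the next one served */
        *(void **)obj = p->free_list;
        p->free_list = obj;
    }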

All these micro-optimizations result in very low CPU usage even on moderate loads. And even at very high loads, when the CPU is saturated, it is quite common to observe figures like 5% user and 95% system, which means that the HAProxy process consumes about 20 times less CPU than its system counterpart. This explains why the tuning of the operating system is very important. I personally build my own patched Linux 2.4 kernels and finely tune a lot of network sysctls to get the most out of a reasonable machine.

This also explains why Layer 7 processing has little impact on performance : even if the user-space work is doubled, the load distribution will look more like 10% user and 90% system, which means an effective loss of only about 5% of processing power. This is why, on high-end systems, HAProxy's Layer 7 performance can easily surpass that of hardware load balancers, in which complex processing that cannot be performed by ASICs has to be performed by slow CPUs. Here is the result of a quick benchmark performed on haproxy 1.3.9 at EXOSEC on a single-core Pentium 4 with PCI-Express interfaces :

    In short, a hit rate above 10000/s is sustained for objects smaller than 6 kB, and the Gigabit/s is sustained for objects larger than 40 kB.
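A quick back-of-the-envelope check of the 5% figure above : if a request costs 100 units of CPU split as 5 user + 95 system, doubling the layer 7 work brings it to 10 + 95 = 105 units. The split becomes roughly 10% user / 90% system, and the capacity only drops from 100/100 to 100/105, i.e. about 5%.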

In production, HAProxy has been installed several times as an emergency solution when very expensive, high-end hardware load balancers suddenly failed on Layer 7 processing. Hardware load balancers process requests at the packet level and have great difficulty supporting requests spread across multiple packets, as well as high response times, because they do no buffering at all. Software load balancers, on the other hand, use TCP buffering and are insensitive to long requests and high response times. A nice side effect of HTTP buffering is that it increases the server's connection acceptance by reducing the session duration, which leaves room for new requests. New benchmarks will be executed soon, and results will be published. Depending on the hardware, expected rates are in the order of a few tens of thousands of new connections/s with tens of thousands of simultaneous connections.

There are 3 important factors used to measure a load balancer's performance :

  • the session rate
  • the session concurrency
  • the data rate

A load balancer's performance related to these factors is generally announced for the best case (eg: empty objects for the session rate, large objects for the data rate). This is not because of a lack of honesty from the vendors, but because it is not possible to tell exactly how it will behave in every combination. So when those 3 limits are known, the customer should be aware that he will generally be below all of them. A good rule of thumb on software load balancers is to consider an average practical performance of half of the maximal session and data rates for average-sized objects.
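As an illustration using the figures quoted earlier : a software load balancer measured at 40000 sessions/s on empty objects and 2 Gbps on large objects should be expected to deliver roughly 20000 sessions/s and about 1 Gbps on a production mix of average-sized objects.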

You might be interested in checking the 10-Gigabit/s page.

Reliability - keeping high-traffic sites online since 2002

Being obsessed with reliability, I tried to do my best to ensure a total continuity of service by design. It's more difficult to design something reliable from the ground up in the short term, but in the long term it proves easier to maintain than broken code which tries to hide its own bugs behind respawning processes and similar tricks.

In single-process programs, you have no right to fail : the smallest bug will either crash your program, make it spin like mad or freeze. There has not been any such bug found in the code nor in production for the last 10 years.

HAProxy has been installed on Linux 2.4 systems serving millions of pages every day, and which have only known one reboot in 3 years for a complete OS upgrade. Obviously, they were not directly exposed to the Internet because they did not receive any patch at all. The kernel was a heavily patched 2.4 with Robert Love's jiffies64 patches to support time wrap-around at 497 days (which happened twice). On such systems, the software cannot fail without being immediately noticed !

Right now, it's used in several Fortune 500 companies around the world to reliably serve millions of pages per day or relay huge amounts of money. Some people even trust it so much that they use it as the default solution to solve simple problems (and I often tell them that they do it the dirty way). Such people sometimes still use versions 1.1 or 1.2, which see very limited evolution and target mission-critical usages. HAProxy is really suited for such environments because the indicators it returns provide a lot of valuable information about the application's health, behaviour and defects, which are used to make it even more reliable. Version 1.3 has now received far more testing than 1.1 and 1.2 combined, so users are strongly encouraged to migrate to a stable 1.3 for mission-critical usages.

As previously explained, most of the work is executed by the operating system. For this reason, a large part of the reliability involves the OS itself. Recent versions of Linux 2.4 offer the highest level of stability. However, they require a bunch of patches to achieve a high level of performance. Linux 2.6 includes the features needed to achieve this level of performance, but is not yet as stable for such usages. The kernel needs at least one upgrade every month to fix a bug or vulnerability. Some people prefer to run it on Solaris (or do not have the choice). Solaris 8 and 9 are known to be really stable right now, offering a level of performance comparable to Linux 2.4. Solaris 10 might show performance closer to Linux 2.6, but with the same code stability problem. I have too few reports from FreeBSD users, but it should be close to Linux 2.4 in terms of performance and reliability. OpenBSD sometimes shows socket allocation failures due to sockets staying in FIN_WAIT2 state when the client suddenly disappears. Also, I've noticed that hot reconfiguration does not work under OpenBSD.

Reliability can significantly decrease when the system is pushed to its limits. This is why finely tuning the sysctls is important. There is no general rule ; every system and every application is specific. However, it is important to ensure that the system will never run out of memory and that it will never swap. A correctly tuned system must be able to run for years at full load without slowing down or crashing.

Security - Not even one vulnerability in 10 years

Security is an important concern when deploying a software load balancer. It is possible to harden the OS, to limit the number of open ports and accessible services, but the load balancer itself stays exposed. For this reason, I have been very careful about programming style. The only vulnerability found so far dates back to early 2002 and only lasted for one week. It was introduced when the logs were reworked. It could be used to crash the process with bus errors, but it did not seem possible to execute code : the overflow concerned only 3 bytes, too short to store a pointer (and the adjacent bytes held another variable).

Anyway, much care is taken when writing code to manipulate headers. Impossible state combinations are checked for and rejected, and errors are handled from the creation to the death of a session. A few people around the world have reviewed the code and suggested cleanups for better clarity to ease auditing. Also, I routinely refuse patches that introduce suspect processing or do not take enough care of abnormal conditions.

I generally suggest starting HAProxy as root because it can then jail itself in a chroot and drop all of its privileges before starting the instances. This is not possible if it is not started as root because only root can execute chroot().
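A minimal sketch of that startup sequence (paths and IDs are illustrative, and this is not HAProxy's actual code) :

    #include <sys/types.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void jail_and_drop(const char *jail, uid_t uid, gid_t gid)
    {
        /* only root may call chroot(), hence the advice to start as root */
        if (chroot(jail) != 0 || chdir("/") != 0) {
            perror("chroot");
            exit(1);
        }
        /* drop the group first : setgid() would fail once uid is not 0 */
        if (setgid(gid) != 0 || setuid(uid) != 0) {
            perror("drop privileges");
            exit(1);
        }
        /* from here on, even a compromised process is confined to an
         * empty, unwritable directory with no privileges left */
    }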

Logs provide a lot of information to help maintain a satisfying security level. They can only be sent over UDP, because once chrooted, the /dev/log UNIX socket is unreachable, and it must not be possible to write to a file. The following information is particularly useful :

  • the requestor's source IP and port, which make it possible to find their origin in firewall logs ;
  • the session set-up date, which generally matches firewall logs, while the tear-down date often matches proxy dates ;
  • proper request encoding, which ensures the requestor cannot hide non-printable characters nor fool a terminal ;
  • arbitrary request and response header and cookie capture, to help detect scan attacks, proxies and infected hosts ;
  • timers, which help to differentiate hand-typed requests from browsers'.

HAProxy also provides regex-based header control. Parts of the request, as well as request and response headers, can be denied, allowed, removed, rewritten, or added. This is commonly used to block dangerous requests or encodings (eg: the Apache Chunk exploit), and to prevent accidental information leaks from the server to the client. Other features such as Cache-control checking ensure that no sensitive information gets accidentally cached by an upstream proxy as a consequence of a bug in the application server, for example.
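To give an idea of how the logging and header-control facilities above fit together, here is a minimal configuration sketch. The addresses, names and rules are illustrative only ; the directives themselves (log, capture, reqideny, rspidel, rspadd) are standard ones :

    global
        log   192.0.2.1 local0 notice   # UDP syslog, reachable from the chroot
        chroot /var/empty
        user  haproxy
        group haproxy

    defaults
        mode http
        log  global
        option httplog                  # logs timers, status and captures

    frontend web
        bind :80
        capture request header User-Agent len 64
        reqideny  ^TRACE                # deny TRACE requests (case-insensitive)
        rspidel   ^Server:              # hide the server software version
        rspadd    Cache-Control:\ private   # avoid accidental upstream caching
        default_backend servers

    backend servers
        server web1 192.0.2.10:80 check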

Download

The source code is covered by GPL v2. Source code and pre-compiled binaries for Linux/x86 and Solaris/Sparc can be downloaded right here :

Documentation

There are three types of documentation now : the Reference Manual, which explains how to configure HAProxy but is outdated ; the Architecture Guide, which will guide you through various typical setups ; and the new Configuration Manual, which replaces the Reference Manual with a more explicit explanation of the configuration language.

In addition to Cyril's HTML converter above, an automated format converter is being developed by Pavel Lang. At the time of writing these lines, it is able to produce a PDF from the documentation, and some heavy work is ongoing to support other output formats. Please consult the project's page for more information. Here's an example of what it is able to do on version 1.5 configuration manual.

Commercial Support

If you think you don't have the time and skills to set up and maintain a free load balancer, or if you're seeking commercial support to satisfy your customers or your boss, you should contact Exceliance. Another solution would be to use Exceliance's ALOHA appliances or the HAPEE distribution (see below).

Products using HAProxy

The following products or projects use HAProxy :

  • redWall Firewall
    From the site : "redWall is a bootable CD-ROM Firewall. Its goal is to provide a feature rich firewall solution, with the main goal, to provide a webinterface for all the logfiles generated!"
  • Exceliance's ALOHA Load Balancer appliance
    Exceliance is a French company that sells a complete haproxy-based solution embedding an optimized and hardened version of Formilux, packaged for ease of use via a full-featured Web interface, reduced maintenance, and enhanced availability through the use of VRRP for box fail-over, bonding for link fail-over, configuration synchronization, SSL, transparent mode, etc... (check the differences between HAProxy and Aloha). An evaluation version running in VMWare Player is available on the site. Since this is where I work, a lot of features are created there :-)
  • Exceliance's HAPEE distribution
    HAPEE is a 100%-software alternative to the ALOHA and to standard HAProxy, which runs on standard distributions. It offers pre-patched add-ons (eg: stunnel, ...), system settings, commented config files and command-line completion to ease the setup of a complete HAProxy-based load balancer, including VRRP and logging. It also comes with support contracts and assistance tickets.
  • Loadbalancer.org
    This company based in the UK has recently added HAProxy to their load-balancing solution in order to provide the basic layer 7 support that some customers were asking for. They're also among the rare commercial product makers who admit to using HAProxy and who have donated to the project.
  • Snapt HAPROXY
    Snapt develops graphical user interfaces for a few products, including HAProxy. They managed to build a dynamic configuration interface which allows the user to play with a very wide range of settings, including ACLs, and which proposes contextual choices when additional options are required (eg: backend lists for some ACLs). They have an online demo which is worth testing.

Add-on features and contributions

Some happy users have contributed code which may or may not be included. Others spent a long time analysing the code, and some maintain ports up to date. The most difficult internal changes have been contributed in the form of paid time by some big customers who can afford to pay a developer for several months working on an open-source project. Unfortunately some of them do not want to be listed, which is the case for the largest of them.

This table enumerates all known significant contributions, as well as proposed funding and features yet to be developed, waiting for spare time.

Some contributions were developed but not merged, most often for lack of interest from users, or because they overlap with pending changes in a way that could make future compatibility harder to maintain.

  • Geolocation support

    Quite some time ago now, Cyril Bonté contacted me about a very interesting feature he had developed, initially for 1.4, and which now supports both 1.4 and 1.5. This feature is geolocation, which many users have been asking for for a long time, and this implementation does not require splitting the IP files by country codes. In fact it's extremely easy and convenient to configure.

    The feature was not merged yet because it does for a specific purpose (GeoIP) what we want to have for more general use (map converters, session variables, and the use of variables in redirect URLs), which will allow the same features to be implemented with more flexibility (eg: extract the IP from a header, or pass the country code and/or AS number to a backend server, etc...). Cyril was very receptive to these arguments and accepted to maintain his patchset out of tree while waiting for those features to be implemented (Update: 1.5-dev20 with maps now makes this possible). Cyril's code is well maintained and used in production, so there is no risk in using it, except that the configuration statements will change a bit once the equivalent feature is merged later.

    The code and documentation are available here : https://github.com/cbonte/haproxy-patches/wiki/Geolocation

  • sFlow support

    Neil Mckee posted a patch to the list in early 2013, and unfortunately this patch did not receive any sign of interest nor any feedback, which is sad considering the amount of work that was done. I personally am clueless about sFlow and expressed my skepticism to Neil about the benefits of sampling some HTTP traffic when much more detailed information is available for free in the existing logs.

    Neil kindly responded with the following elements :

      I agree that the logging you already have in haproxy is more flexible and detailed, and I acknowledge that the benefit of exporting sFlow-HTTP records is not immediately obvious.

      The value that sFlow brings is that the measurements are standard, and are designed to integrate seamlessly with sFlow feeds from switches, routers, servers and applications to provide a comprehensive end to end picture of the performance of large scale multi-tier systems. So the purpose is not so much to troubleshoot haproxy in isolation, but to analyze the performance of the whole system that haproxy is part of.

      Perhaps the best illustration of this is the 1-in-N sampling feature. If you configure sampling.http to be, say, 1-in-400 then you might only see a handful of sFlow records per second from an haproxy instance, but that is enough to tell you a great deal about what is going on -- in real time. And the data will not bury you even if you have a bank of load-balancers, hundreds of web-servers, a huge memcache-cluster and a fast network interconnect all contributing their own sFlow feeds to the same analyzer.

    Even after that explanation, no discussion emerged on the subject on the list, so I guess there is little interest among users for now. I suspect that sFlow is probably more deployed among network equipment than application-layer equipment, which could explain this situation. The code is large (though not huge) and I am not convinced about the benefits of merging it and maintaining it if nobody shows even a little bit of interest. Thus for now I prefer to leave it out of tree. Neil has posted it on GitHub here : https://github.com/sflow/haproxy.

    Please, if you do use this patch, report your feedback to the mailing list, and invest some time helping with the code review and testing.

Some older code contributions which possibly do not appear in the table above are still listed here.

  • Application Cookies

    Aleksandar Lazic and Klaus Wagner implemented this feature, which was merged in 1.2. It allows the proxy to learn cookies sent by the server to the client, and to find them back in the URL to direct the client to the right server. The learned cookies are automatically purged after some period of inactivity.
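    This is the feature behind the "appsession" directive. A minimal configuration sketch, assuming an application which issues a JSESSIONID cookie (backend and server names are illustrative) :

        backend app
            mode http
            # learn the server's JSESSIONID value, forget it after 3h idle
            appsession JSESSIONID len 52 timeout 3h
            server app1 192.0.2.11:8080 check
            server app2 192.0.2.12:8080 check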

  • Least Connections load balancing algorithm

    This patch for haproxy-1.2.14 was submitted by Oleksandr Krailo. It implements a basic least connection algorithm. I've not merged this version into 1.3 because of scalability concerns, but I'm leaving it here for people who are tempted to include it into version 1.2, and the patch is really clean.

  • Soft Server-Stop

    Aleksandar Lazic sent me this patch against 1.1.28, which in fact does two things. The first interesting part allows one to write a file enumerating the servers which have to be stopped, and then to send a signal to the running proxy to tell it to re-read the file and stop using these servers. This will not be merged into mainline because it has indirect implications on security : the running process would have to access a file on the file-system, while the current version can run in a chrooted, empty, read-only directory. What is really needed is a way to send commands to the running process. However, I understand that some people might need this feature, so it is provided here. The second part of the patch has been merged. It allowed both an active and a backup server to share the same cookie. This may sound obvious but it was not possible earlier.

    Usage: Aleks says that you just have to write the server names that you want to stop in the file, then kill -USR2 the running process. I have not tested it though.

  • Server Weight

    Sébastien Brize sent me this patch against 1.1.27 which adds the 'weight' option to a server to provide smoother balancing between fast and slow servers. It is available here because there may be other people looking for this feature in version 1.1.

    I did not include this change because it has a side effect that with high or unequal weights, some servers might receive lots of consecutive requests. A different concept to provide a smooth and fair balancing has been implemented in 1.2.12, which also supports weighted hash load balancing.

    Usage: specify "weight X" on a server line.
    Note: configurations written with this patch applied will normally still work with future 1.2 versions.

  • IPv6 support for 1.1.27

    I implemented IPv6 support on the client side for 1.1.27, and merged it into haproxy-1.2. Nevertheless, the patch is still provided here for people who want to experiment with IPv6 on HAProxy-1.1.

  • Other patches

    Please browse the directory for other useful contributions.

Other Solutions

If you don't need all of HAProxy's features and are looking for a simpler solution, you may find what you need here :

  • Linux Virtual Servers (LVS)
    Very fast layer 3/4 load balancing merged in Linux 2.4 and 2.6 kernels. Should be coupled with Keepalived to monitor servers. This generally is the solution embedded by default in most IP-based load balancers.
  • Nginx ("engine X")
    Nginx is an excellent piece of software. Initially a very fast and reliable web server, it has grown into a full-featured proxy which can also offer load-balancing capabilities. Nginx's load-balancing features are less advanced than haproxy's, but it can do a lot more things (eg: compression, caching), which explains why they are very commonly found together. I strongly recommend it to whoever needs a fast, reliable and flexible web server !
  • Pound
    Pound can be seen as a complement to HAProxy. It supports SSL and can direct traffic according to the requested URL. Its code is very small and will stay small for easy auditing. Its configuration file is very small too. However, it does not support persistence, and the performance associated with its multi-threaded model limits its usage to medium-sized sites only.
  • Pen
    Pen is a very simple load balancer for TCP protocols. It supports source IP-based persistence for up to 2048 clients, as well as IP-based ACLs. It uses select() and supports higher loads than Pound, but will not scale very well to thousands of simultaneous connections.

Contacts

Feel free to contact me for any questions or comments :

Some people regularly ask if it is possible to send donations, so I have set up a Paypal account for this. Click here if you want to donate.

An IRC channel for haproxy has been opened on FreeNode (but don't look for me there, I'm not on it) :

External links

Here are some links to possibly useful external content I gathered on the net. I found most of them through their links to haproxy's site ;-)