This document describes an implementation of a large-scale messaging solution relying on the AXIGEN Mail Server software. The global architecture for the solution is described along with implementation details and operation and maintenance procedures.
The information is intended for users who are evaluating the benefits of a distributed, high-availability solution, as well as for integrators and operational personnel. The software and hardware components of such a solution are also listed in this document, so that readers can assess the overall associated costs.
Definitions, Terms and Abbreviations
• Vertical scalability – potential increase in processing capacity of a machine attainable by hardware upgrades;
• Horizontal scalability – potential increase in processing capacity of a cluster attainable by increasing the number of nodes (machines);
• Stateful services – services that provide access to persistent information (i.e. account configuration and mailbox) over multiple sessions. Typically refers to a service in the backend tier. Example: IMAP services for an account;
• Stateless services – services that do not store persistent information over multiple sessions. Typically refers to the services in the frontend tier. Example: IMAP proxy;
• Frontend tier – subnet with a medium security level that provides proxy services;
• Backend tier – subnet with a high security level that provides data storage and directory services;
• Frontend node – machine residing in the frontend network tier, providing proxy functionality;
• Backend node – machine residing in the backend network tier, participating in the high-availability cluster.
1.1 Stateful Services
Non-distributed email solutions, in which account information (configuration and messages) is stored on a single machine, allow vertical scalability through hardware upgrades (CPU, RAM, disk). However, a typical machine has hardware limits (e.g. a maximum of 2 CPUs or 4 GB of RAM), so a point is eventually reached where the machine can no longer be upgraded. We shall refer to this as the vertical scalability limit.
When the vertical scalability limit is reached, the only solution available is to distribute account information (configuration and mailbox) across more than one machine. We shall refer to this as horizontal scalability. Since the information for one account is atomic and cannot be spread across several machines, the solution is to distribute whole accounts across the machines. This way, for any single account, exactly one machine responds to requests (IMAP, POP, SMTP) for that account. Thus, when the overall capacity of the messaging solution (in terms of active accounts) is reached, adding one more machine and making sure new accounts are created on it provides a capacity upgrade, allowing virtually unlimited horizontal scalability.
It must be noted that, since each account of the system is serviced by a specific node, a centralized location directory must be available to provide location services. In our case an LDAP system will store information about which node is able to service requests for a specific account.
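The location lookup described above can be sketched as follows. This is only an illustration in Python: an in-memory dictionary stands in for the LDAP directory, and the class name, node names, and the "fewest accounts" placement rule are assumptions made for the example, not part of the AXIGEN product.

```python
from collections import Counter


class LocationDirectory:
    """Toy stand-in for the LDAP directory that maps accounts to backend nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)   # backend nodes able to host accounts
        self.location = {}         # account -> backend node servicing it

    def add_node(self, node):
        # Horizontal scaling: a new node starts empty and absorbs new accounts.
        self.nodes.append(node)

    def create_account(self, account):
        # Hypothetical placement rule: choose the node hosting the fewest accounts.
        load = Counter(self.location.values())
        node = min(self.nodes, key=lambda n: load[n])
        self.location[account] = node
        return node

    def lookup(self, account):
        # A frontend proxy would perform this query to route a request.
        return self.location[account]


directory = LocationDirectory(["backend1", "backend2"])
directory.create_account("alice@example.com")
directory.create_account("bob@example.com")
directory.add_node("backend3")              # capacity upgrade
directory.create_account("carol@example.com")
print(directory.lookup("carol@example.com"))
```

Each account maps to exactly one node, so existing accounts never move when a node is added; only new accounts land on the new machine.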
1.2 Stateless Services
Since stateless services do not store information over multiple sessions, any of several machines can service requests for the same account. Horizontal scalability can therefore be achieved by simply adding more machines that provide the same service in exactly the same configuration. The only remaining requirement is to ensure that requests to a specific service are distributed evenly among the machines providing that service (e.g. if the system contains two machines providing IMAP proxy services, half of the incoming IMAP connections must reach one machine and the other half must reach the other). This functionality is provided by a load balancer, be it hardware (dedicated) or software (a Linux machine running LVS).
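The even distribution of connections can be illustrated with a round-robin scheduler, which is one common balancing policy. The sketch below is a simplified Python model, not the configuration of any particular load balancer; the class and node names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out frontend nodes in strict rotation."""

    def __init__(self, nodes):
        self._rotation = cycle(nodes)   # endless iterator over the node list

    def pick(self):
        # Each new connection goes to the next node in the rotation.
        return next(self._rotation)


balancer = RoundRobinBalancer(["proxy1", "proxy2"])
assignments = [balancer.pick() for _ in range(6)]
print(assignments)
```

With two proxies and six incoming connections, each proxy receives three of them, which matches the half-and-half split described above.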
2 High Availability and Fault Tolerance
2.1 Stateful Services
For stateful services, requests for one specific account are made to a specific machine. If that machine experiences a fault and can no longer respond to requests, none of the other machines is able to service the account in question. A mechanism is therefore required to ensure that, in the event of a catastrophic failure on one machine, another node takes over the task of servicing requests for that account, thus providing high availability.
Red Hat Cluster Suite provides exactly this functionality: if one node running a stateful service fails, another node automatically detects the fault and starts the required service in place of the failed node, keeping downtime for that service to a minimum.
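The takeover behaviour can be modelled in a few lines. The following Python sketch is a conceptual illustration only: it does not use or reproduce any Red Hat Cluster Suite interface, it does not model how a real cluster detects faults or shares storage, and the service and node names are hypothetical.

```python
class Cluster:
    """Toy model of service failover between backend nodes."""

    def __init__(self, placement, standby):
        self.placement = dict(placement)   # service -> node currently running it
        self.standby = list(standby)       # idle nodes ready to take over
        self.alive = set(self.placement.values()) | set(self.standby)

    def mark_failed(self, node):
        # In a real cluster this would be the outcome of fault detection.
        self.alive.discard(node)

    def monitor(self):
        """Relocate every service whose node is no longer alive."""
        moved = {}
        for service, node in list(self.placement.items()):
            if node in self.alive:
                continue
            candidates = [n for n in self.standby if n in self.alive]
            if not candidates:
                raise RuntimeError(f"no healthy node left for {service}")
            target = candidates[0]
            self.standby.remove(target)       # the standby node is now in use
            self.placement[service] = target  # service restarts on the new node
            moved[service] = target
        return moved


cluster = Cluster({"mail-a": "node1", "mail-b": "node2"}, standby=["node3"])
cluster.mark_failed("node1")
moved = cluster.monitor()
print(moved)
```

Only the service that lived on the failed node is relocated; services on healthy nodes are left where they are.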
2.2 Stateless Services
In the case of stateless services, any of the nodes providing the same service is able to respond to requests for any account. The only requirement is therefore that the request distribution mechanism (load balancer) detects when one of the nodes no longer responds to requests and stops directing service requests to that node. The total request processing capacity decreases (the system responds more slowly, since one node no longer processes requests), but all service requests can still be processed.
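A balancer that skips unresponsive nodes might look like the sketch below. It is a simplified Python model under stated assumptions: health results are reported to the balancer by some external check (not shown), and all names are hypothetical rather than taken from a specific load balancer.

```python
class HealthAwareBalancer:
    """Round-robin balancer that skips nodes reported as down."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(nodes)   # nodes currently believed to respond
        self._next = 0              # position in the rotation

    def report(self, node, is_up):
        # Called with the result of a health check (the probe itself is not modelled).
        if is_up:
            self.healthy.add(node)
        else:
            self.healthy.discard(node)

    def pick(self):
        # Try each node at most once per call, starting from the rotation point.
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy node available")


balancer = HealthAwareBalancer(["proxy1", "proxy2", "proxy3"])
balancer.report("proxy2", False)            # proxy2 stops responding
picks = [balancer.pick() for _ in range(4)]
print(picks)
```

The remaining two proxies share all connections, so each carries a larger load than before, but no request is sent to the failed node.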
For details about the Solution Architecture, Requirements, Setup and Configuration, and Provisioning, see: High Availability Distributed Solution on AXIGEN Mail Server