This document contains hardware sizing guidelines for servers running GreenArrow Engine, or GreenArrow Engine and Studio. Many variables go into determining the optimal hardware configuration. Some of the factors to consider are:
- Is this a GreenArrow Engine only installation, or GreenArrow Engine and Studio?
- Will the server be running any non-GreenArrow applications?
- How many messages per hour and per day would you like the server to be capable of sending? The per-hour figure is more important to server sizing, but both are factors.
- What is the average message size?
- How many links will be in the average message?
- What is the average deferral and bounce rate? The lower these rates are, the better performance will be.
These additional factors apply to GreenArrow Studio installations:
- How many subscriber records will GreenArrow Studio’s database hold?
- How many custom fields will there be for each subscriber? A custom field is some piece of data other than the email address which gets imported. For example, if you imported the first and last name of each subscriber, then those are two custom fields.
- What is the average custom field length? For example, if you imported the first and last name of each subscriber, then you might estimate the average custom field length to be
These factors are also applicable to mail that’s injected via SMTP or another source outside of GreenArrow Studio:
- Is click and open tracking turned on?
Feel free to contact GreenArrow technical support if you’d like hardware sizing advice, or have a potential configuration that you’d like to have reviewed.
This document breaks GreenArrow Engine and Studio’s hardware requirements down into the following categories:
The disk subsystem is the most common bottleneck. CPU performance is the second most common bottleneck. Network performance usually isn’t a bottleneck unless mail is being injected into GreenArrow Engine by remote hosts via SMTP.
GreenArrow Engine and Studio support 64-bit and 32-bit x86 CPUs.
Most of GreenArrow’s performance sensitive components are multi-threaded, so you can think of a server’s CPU resources in terms of multiplying the number of CPU cores by the speed of each core.
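As a rough illustration of the "cores multiplied by speed" rule of thumb above, here is a minimal sketch. The function name and the two example servers are hypothetical, and the result is only a coarse comparison figure, not a benchmark. Pass the number of CPU cores, not hardware threads.

```python
def aggregate_cpu_ghz(cores: int, ghz_per_core: float) -> float:
    """Rough CPU capacity: number of CPU cores times per-core clock speed."""
    if cores < 1 or ghz_per_core <= 0:
        raise ValueError("cores must be >= 1 and ghz_per_core must be > 0")
    return cores * ghz_per_core


# Hypothetical servers, for comparison only.
fewer_fast_cores = aggregate_cpu_ghz(cores=4, ghz_per_core=3.0)
more_slow_cores = aggregate_cpu_ghz(cores=8, ghz_per_core=2.0)
print(fewer_fast_cores, more_slow_cores)
```

By this measure, the hypothetical 8-core server offers more aggregate capacity than the 4-core one, even though each of its cores is slower.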
CPU Cores vs CPU Threads
Two values which are often included in the specs of a CPU or server are the number of CPU cores, and the number of CPU threads. For example, a CPU might have 4 cores, and 8 threads. This document discusses CPU sizing in terms of CPU cores since we’ve found that metric to be more relevant to GreenArrow’s performance than CPU threads.
We recommend that all production GreenArrow servers have a minimum of 4GB of RAM. Additional RAM may be required, depending on your performance requirements.
Increasing the amount of installed RAM is an effective way of reducing disk IO overhead. The following components are GreenArrow Engine and Studio’s major RAM users:
- PostgreSQL - The ideal situation is to have enough RAM to buffer the entire database. This isn’t possible in most cases, but the higher the percentage of the database that can be buffered in RAM, the better from a performance standpoint. The amount of disk space that PostgreSQL consumes will be covered in the Disk Space Requirements section of this document.
- ram-queue - on most installations, this occupies
- Filesystem buffers - the OS typically allocates most remaining RAM to this area. The main area where filesystem buffers can improve performance is GreenArrow Engine’s disk-queue. The amount of disk space that the disk-queue consumes will be covered in the Disk Space Requirements section of this document.
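To make the RAM trade-off above concrete, here is a minimal sketch that estimates what share of the PostgreSQL database could be buffered in RAM. All parameter names and example figures are assumptions for illustration. In particular, the ram-queue size and the "other" overhead must be measured on your own server, and this simple model ignores the RAM that filesystem buffers for the disk-queue would also compete for.

```python
def buffered_db_fraction(total_ram_gb: float,
                         ram_queue_gb: float,
                         other_usage_gb: float,
                         db_size_gb: float) -> float:
    """Estimate the fraction (0.0 to 1.0) of the database that fits in spare RAM."""
    spare_gb = max(total_ram_gb - ram_queue_gb - other_usage_gb, 0.0)
    if db_size_gb <= 0:
        return 1.0  # nothing to buffer
    return min(spare_gb / db_size_gb, 1.0)


# Hypothetical figures: 16GB server, 1GB ram-queue, 3GB for the OS and
# other processes, and a 24GB database.
print(buffered_db_fraction(16, 1, 3, 24))
```

A result well below 1.0 suggests that adding RAM is likely to reduce disk IO, as described above.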
In most situations, the disk subsystem’s random IO performance is more important than sequential read and write performance. On GreenArrow Engine only installations, random disk IO performance usually becomes a bottleneck before CPU or RAM. This is often the case on GreenArrow Engine and Studio installations as well.
Magnetic drives are suitable up to a point, but for maximum performance, we recommend the use of solid state drives (SSDs).
SSDs are expensive when compared to magnetic storage in terms of how much data can be stored. When compared in terms of random IO performance, SSDs can become cost effective, though. One of the key metrics to look at when selecting the disk subsystem to use is the performance in IOPS (Input/Output Operations Per Second) for both random reads and random writes.
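The cost comparison above can be sketched as follows. The drive names, prices, capacities, and IOPS figures are made up for illustration and are not measurements of any real drive. The sketch uses the lower of the random read and random write IOPS figures as the drive's effective random IO rating, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    price_usd: float
    capacity_gb: float
    random_read_iops: float
    random_write_iops: float

    def cost_per_gb(self) -> float:
        return self.price_usd / self.capacity_gb

    def cost_per_iops(self) -> float:
        # Rate the drive by its weaker random IO figure.
        return self.price_usd / min(self.random_read_iops, self.random_write_iops)


# Hypothetical drives; substitute figures from the vendor's spec sheet.
magnetic = Drive("example-magnetic", 100.0, 2000.0, 150.0, 150.0)
ssd = Drive("example-ssd", 200.0, 250.0, 20000.0, 10000.0)

for drive in (magnetic, ssd):
    print(f"{drive.name}: ${drive.cost_per_gb():.3f}/GB, "
          f"${drive.cost_per_iops():.4f}/IOPS")
```

With these made-up figures, the magnetic drive is cheaper per GB while the SSD is cheaper per IOPS, which is the trade-off described above.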
GreenArrow currently uses two different tiers of SSD drives in its hosted environment:
- Servers which have sending speed requirements of one million messages per hour or less typically use Intel 320 Series and X-25M SSD drives. 320 Series drives are used for new installations, and X-25M drives were installed prior to 320 Series drives being released.
- Servers which have sending speed requirements of more than one million messages per hour typically use enterprise class SSD drives, such as Intel S3700 Series or Kingston E100 drives. These SSD drives cost significantly more than Intel 320 Series and X-25M drives, but also offer significantly better performance.
We’ve been happy with the performance of all of the above SSD drive models in the roles that they’ve been placed in.
Here are links to performance figures for some of the above-mentioned drives:
Other SSD drive models may also be suitable.
When possible, we recommend enabling TRIM on SSD drives and using proper partition alignment.
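On Linux, one way to check these two points is to read the kernel's sysfs entries. The sketch below assumes the standard `/sys/block/<device>/queue/discard_max_bytes` and `/sys/block/<device>/<partition>/start` files, and treats a 1MiB boundary as "properly aligned", which is a common convention rather than a GreenArrow requirement. A non-zero discard limit only shows that the device accepts TRIM requests; whether TRIM is actually issued depends on your filesystem and mount configuration.

```python
from pathlib import Path

SYS_BLOCK = Path("/sys/block")
SECTOR_BYTES = 512  # sysfs reports partition starts in 512-byte sectors


def _read_int(path: Path) -> int:
    return int(path.read_text().strip())


def discard_supported(device: str, sys_block: Path = SYS_BLOCK) -> bool:
    """True if the kernel reports a non-zero discard (TRIM) limit."""
    return _read_int(sys_block / device / "queue" / "discard_max_bytes") > 0


def partition_aligned(device: str, partition: str,
                      alignment_bytes: int = 1024 * 1024,
                      sys_block: Path = SYS_BLOCK) -> bool:
    """True if the partition's starting offset is a multiple of alignment_bytes."""
    start_sectors = _read_int(sys_block / device / partition / "start")
    return (start_sectors * SECTOR_BYTES) % alignment_bytes == 0
```

For example, `discard_supported("sda")` and `partition_aligned("sda", "sda1")` would check a device named `sda`; the device and partition names are placeholders for your own.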
When using SSDs, we also recommend reviewing the performance of the disk or RAID controller that the SSDs will be plugged into. In some cases, the disk or RAID controller becomes the disk subsystem’s bottleneck.
If SSD drives are not an option, we recommend using the fastest magnetic drive configuration available. For example:
- 7200RPM, or better yet, 15kRPM drives over
- SAS drives are usually faster than SATA drives.
- Multiple magnetic hard drives can be aggregated into a single RAID 10 array. Even if you don’t need the aggregate storage capacity, this is one method for increasing available disk IO.
Combining SSDs and Magnetic Drives
A combination of SSDs and magnetic drives can be used in order to take advantage of the strengths of each:
- Disk IO intensive components, such as the PostgreSQL database and GreenArrow Engine’s disk-queue can be stored on SSDs.
- Low disk IO components, such as the operating system, logs and backups can be stored on magnetic drives.
In order to optimize performance, and provide redundancy in case of hard drive failure, we recommend using RAID 1 and/or RAID 10 arrays. We recommend against using RAID 5 arrays. With all other things being equal, a GreenArrow installation using a RAID 10 array will usually outperform an installation that’s using a RAID 5 array.
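One common way to reason about the RAID 10 versus RAID 5 difference is the "write penalty" rule of thumb: each logical write costs roughly 2 physical IOs on RAID 1 or RAID 10, and roughly 4 on RAID 5. The sketch below applies that rule. The penalty values are a widely used approximation, not GreenArrow measurements, and the drive count, per-drive IOPS, and write fraction are hypothetical.

```python
# Approximate physical IOs per logical write (rule of thumb).
WRITE_PENALTY = {"raid1": 2, "raid10": 2, "raid5": 4}


def estimated_random_iops(level: str, drives: int,
                          iops_per_drive: float,
                          write_fraction: float) -> float:
    """Estimate usable random IOPS for an array under a mixed read/write load."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw_iops = drives * iops_per_drive
    penalty = WRITE_PENALTY[level]
    return raw_iops / ((1.0 - write_fraction) + write_fraction * penalty)


# Hypothetical array: 4 drives at 150 IOPS each, half of the IO is writes.
for level in ("raid10", "raid5"):
    print(level, round(estimated_random_iops(level, 4, 150.0, 0.5)))
```

Under this model the gap between the two levels widens as the write fraction grows, which matters for write-heavy workloads.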
Read and Write Caching
Some RAID controllers have the option of enabling a read and/or write cache.
Here are our recommendations for the write cache:
- Disable the RAID controller’s write cache if you don’t have a known good BBU (battery backup unit). This is because the write cache creates a data loss risk when there’s not a functioning BBU in place.
- If you do have a known good BBU installed, then it often makes sense to enable the RAID controller’s write cache. Doing this on a RAID array that uses non-SSD (magnetic) drives will almost always boost performance. Doing this on a RAID array that uses SSD drives will often, but not always improve performance. This is because SSD drives are fast enough so that sometimes the extra overhead of performing write caching has a performance impact that is greater than write caching’s performance boost.
- Some RAID controllers also have a configuration option to disable write caching if the BBU fails. If this is an option, we recommend enabling it. The Disks from the Perspective of a File System article on ACM.org contains more details about how write caching works.
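The write cache recommendations above can be summarized as a small decision helper. This is only a sketch of the guidance in this section: the function and field names are hypothetical, and it does not talk to any real RAID controller.

```python
from typing import NamedTuple


class WriteCacheAdvice(NamedTuple):
    enable_write_cache: bool
    auto_disable_on_bbu_failure: bool
    note: str


def write_cache_advice(has_known_good_bbu: bool,
                       uses_ssd: bool,
                       controller_can_auto_disable: bool) -> WriteCacheAdvice:
    """Translate the write cache guidance into a suggested configuration."""
    if not has_known_good_bbu:
        return WriteCacheAdvice(
            False, False,
            "No known good BBU: the write cache is a data loss risk.")
    if uses_ssd:
        note = "Often helps on SSD arrays, but not always: benchmark both settings."
    else:
        note = "Almost always boosts performance on magnetic arrays."
    # Use the controller's BBU-failure safeguard whenever it is offered.
    return WriteCacheAdvice(True, controller_can_auto_disable, note)


print(write_cache_advice(has_known_good_bbu=True, uses_ssd=True,
                         controller_can_auto_disable=True))
```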
Here are our recommendations for the read cache:
- If you enable the RAID controller’s write cache, then we recommend disabling its read cache so that more cache resources can be dedicated to writes.
- If you disabled the RAID controller’s write cache, then it often makes sense to enable the RAID controller’s read cache. Doing this on a RAID array that uses non-SSD (magnetic) drives will almost always boost performance. Doing this on a RAID array that uses SSD drives will often, but not always improve performance. This is because SSD drives are fast enough so that sometimes the extra overhead of performing read caching has a performance impact that is greater than read caching’s performance boost.
- We recommend disabling the RAID controller’s read cache. The operating system RAM can serve as a read cache and is more accessible to the CPU.
Disk Space Requirements
Coming soon. Feel free to contact GreenArrow technical support if you’d like us to help estimate your server’s disk space requirements.
SMTP is a latency sensitive protocol. GreenArrow Engine can open up multiple concurrent SMTP connections, so this usually isn’t an issue for mail being sent by GreenArrow Engine, but can be for mail that’s being injected into GreenArrow Engine via SMTP.
If you’re injecting messages into GreenArrow Engine via SMTP, then here are some ideas for reducing the impact of network latency:
- Reduce network latency as much as possible by having GreenArrow Engine and the injecting application reside in the same datacenter, or geographic area. A single threaded injecting process will run much more quickly with 5ms of latency than with
- Have the injecting application open up multiple concurrent SMTP sessions to GreenArrow Engine. Applications that generate transactional mail often do this by default, as each transaction occurs.
- Use a less latency sensitive protocol. The QMQP and QMQP-streaming protocols are less latency sensitive than SMTP.
- Use GreenArrow Studio’s API instead of SMTP.
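To see why latency and concurrency matter for SMTP injection, here is a minimal back-of-the-envelope sketch. It assumes each message costs a fixed number of network round trips on an already-open connection (4 is used below as an assumption covering MAIL FROM, RCPT TO, DATA, and the end of the message body) and ignores server processing time, so the result is an upper bound rather than a prediction. The latency values and session counts are hypothetical.

```python
def max_injection_rate_per_hour(latency_ms: float,
                                round_trips_per_message: int = 4,
                                concurrent_sessions: int = 1) -> float:
    """Upper bound on messages per hour when network latency is the only cost."""
    if latency_ms <= 0 or round_trips_per_message < 1 or concurrent_sessions < 1:
        raise ValueError("all arguments must be positive")
    seconds_per_message = round_trips_per_message * latency_ms / 1000.0
    return concurrent_sessions * 3600.0 / seconds_per_message


# Hypothetical scenarios: (latency in ms, concurrent SMTP sessions).
for latency_ms, sessions in ((5, 1), (50, 1), (50, 10)):
    rate = max_injection_rate_per_hour(latency_ms, concurrent_sessions=sessions)
    print(f"{latency_ms}ms latency, {sessions} session(s): about {rate:,.0f} messages/hour")
```

In this model, a tenfold increase in latency cuts a single session's ceiling tenfold, and opening ten concurrent sessions recovers it, which matches the first two suggestions above.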
The following example configurations assume the following:
- An average message size of
- Most delivery attempts are successful. A high bounce and/or deferral rate has a negative performance impact.
- The server is only running GreenArrow software. While GreenArrow can run on the same server as some other applications, the resources that those other applications will consume are not factored into these example configurations.
The configurations are ordered from lowest to highest volume.
Here’s a table summarizing the number of messages per hour we estimate each configuration can provide:
|Server||Studio||Engine with click and open tracking||Engine without click and open tracking|
|CPU||2x cores running at
|CPU||4x cores running at
|CPU||8x cores running at
|CPU||16x cores running at
|CPU||16x cores running at
GreenArrow Engine and Studio are compatible with virtualization technologies that provide a 32-bit or 64-bit x86 CPU and one of our Supported Linux Distributions. Some of the virtualization technologies that we’ve successfully installed GreenArrow Engine and Studio on in the past are:
- VMware ESX(i) and vSphere
- VMware Server using either Linux or Windows as the Host OS
- KVM - we use this in our hosted environment.
- Microsoft Hyper-V Server
From a deliverability perspective, we recommend being cautious about using a cloud provider to host a GreenArrow Engine installation. If a provider sells servers on a short term basis (by the hour, for example), its network could become a target for abuse, and therefore have a poor IP reputation.