What's in a number?

The Truth or Dare Nature of Describing Latency Performance in High Frequency Trading

Why is information about latency performance so difficult to understand and use? We are all familiar with the typical claims:

  • Technology Vendor: “The feed handler is benchmarked at 10 microseconds”
  • Market Center: “We can execute an order within 350 microseconds”
  • Data Provider: “The distribution latency of our direct feed is 2 milliseconds”
  • Telecom Provider: “Our latency is less than 55 milliseconds transatlantic”

Advertised latency numbers have taken on major commercial significance in the world of high frequency trading. Many in the industry publish numbers in an attempt to show their service in a favorable light and to demonstrate superior performance over the competition. Unfortunately, this method of describing latency does little to help end users understand the true performance of the underlying low-latency service. Market pressures are such that few dare to offer latency information that brings real insight and transparency, for fear of “appearing slower” than the competition. As a result, most informed customers of these services are forced to ignore the published latency claims and to measure and benchmark latency service levels independently.

The purpose of this note is to examine how latency performance can be better described such that useful comparison and real operational decisions can be made.

Why is a single number description of latency inadequate?

Latency is not a constant. It is a lot like the weather, always changing. If you don’t like a particular latency number, then wait a millisecond. It will change. Why does it change? Latency does not depend on geographical distance alone. It also depends on factors such as traffic load, network bandwidth, and processing capacity. If any of these factors changes, latency changes with it. The load on electronic trading systems is highly variable. Intraday patterns often show higher volumes at market open and close. We also see large variations in load at micro timescales, known as microbursts. Active components that process trading messages, such as trading engines, line handlers, gateways, firewalls, switches and routers, can be temporarily congested by these sudden microbursts, which adds further to the latency.
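To make the microburst idea concrete, here is a minimal Python sketch that compares the mean and peak message counts per short window. The function name `peak_to_mean_rate`, the 1 millisecond window and the arrival times are all illustrative assumptions, not measured data.

```python
from collections import Counter

def peak_to_mean_rate(arrival_us, window_us=1_000):
    """Return (mean, peak) message counts per window of width window_us.

    arrival_us: message arrival timestamps in microseconds (hypothetical input).
    """
    if not arrival_us:
        raise ValueError("no arrivals")
    # Count messages falling into each fixed-width window.
    counts = Counter(t // window_us for t in arrival_us)
    first, last = min(counts), max(counts)
    n_windows = last - first + 1          # empty windows count towards the mean
    mean = len(arrival_us) / n_windows
    peak = max(counts.values())
    return mean, peak

# Illustrative data only: one message every 100 us for 10 ms,
# plus a burst of 50 messages packed into 50 us.
arrivals = list(range(0, 10_000, 100)) + [5_000 + i for i in range(50)]
mean_rate, peak_rate = peak_to_mean_rate(arrivals)
print(mean_rate, peak_rate)
```

A load figure averaged over the whole interval hides the one window in which the burst arrives, and that window is where queues build and latency rises.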

Traders often need to understand the likely performance of their systems at particular times of day, i.e. when it is critical for them to trade. This often coincides with when everyone else also wants to trade, which is the busiest time of day. Message rates and trade volumes often reach their maximum at this time, and latency peaks with them. This is referred to as the “busy period”. The latency distribution measured over the busy period is very different from the distribution for the complete trading day, and is typically skewed towards larger latency values. Clearly, if key trades are scheduled to take place at particular times of the day or in reaction to certain events, then a detailed understanding of the end-to-end latency distributions for both market data and order execution paths during that particular period is essential.
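The difference between a whole-day distribution and a busy-period distribution can be sketched as follows. This is an illustration under stated assumptions: the sample layout `(time_s, latency_us)`, the helper names and the data are hypothetical, and the percentile uses the simple nearest-rank definition.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile of a non-empty list, for integer p in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def summarize(samples, start_s=None, end_s=None):
    """Summarize latencies for samples with start_s <= time < end_s.

    samples: list of (time_s, latency_us) pairs (hypothetical layout).
    """
    lat = [l for t, l in samples
           if (start_s is None or t >= start_s)
           and (end_s is None or t < end_s)]
    return {"n": len(lat),
            "median": percentile(lat, 50),
            "p99": percentile(lat, 99),
            "max": max(lat)}

# Illustrative data only: 900 quiet samples at 40 us, then 100 busy samples at 800 us.
samples = [(t, 40) for t in range(900)] + [(t, 800) for t in range(900, 1000)]
whole_day = summarize(samples)
busy = summarize(samples, start_s=900, end_s=1000)
print(whole_day, busy)
```

In this made-up data the whole-day median reflects only the quiet samples, while the busy-window median is twenty times larger. The same window filter can be applied separately to the market data path and the order execution path.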

Before you act - check the following:

When consuming and interpreting latency numbers, we recommend that you ask for clarification of the following:
1. Is the claimed latency performance based on an average measurement? An average sits well below the peak and upper-percentile values of a skewed latency distribution (so it appears faster), and averages are typically used in marketing collateral.
2. What was the period of time over which the measurement was taken? The longer the measurement period, the lower the average latency tends to be, because quiet periods dilute the busy ones.
3. What were the load conditions during the period of measurement? The lower the load, the lower the latency. Latency measurements are often made under load conditions that are not representative of intended operational use.
4. What was the time precision and accuracy of the underlying measurement system? Poor precision and longer sampling periods tend to smooth out short spikes and so deliver lower latency numbers. Look for systems with 1 microsecond precision or better.
5. Were 100% of packets or messages used for the measurement? Sampling of packets or messages, or use of synthetic measurement methods, can significantly underestimate latency.
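Points 1 and 5 can be illustrated with a short Python sketch. The data is invented for the purpose, and the 1-in-100 sampling scheme is a hypothetical stand-in for whatever sampling a measurement system might use.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Illustrative data only (not measured): 1000 packets at 50 us,
# with a 10-packet spike of 3700 us at positions 501..510.
latencies_us = [50] * 1000
for i in range(501, 511):
    latencies_us[i] = 3700

# Statistics over 100% of packets.
full_mean = mean(latencies_us)
full_max = max(latencies_us)

# Statistics over a hypothetical 1-in-100 sample (positions 0, 100, ..., 900).
sampled = latencies_us[::100]
sampled_mean = mean(sampled)
sampled_max = max(sampled)

print(full_mean, full_max, sampled_mean, sampled_max)
```

With this arrangement the sample misses the spike entirely, so both its mean and its maximum understate what the full packet set shows. A different sampling phase might catch part of the spike, so a sampled result depends partly on luck.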

Real-world latency profiling

Consider the real case shown in Figure 1. We recently measured the one-way latency profile of a direct feed market data service at a co-location center, for a client weighing the added benefit of locating their trading servers closer to the market center. The service was advertised with a latency service level agreement (SLA) of 100 microseconds or better. A highlighted feature of the service was a guaranteed maximum distance of 100 m from the distribution switch.

FIGURE 1:  CorvilClear measurement of one-way market data latency between market center feed assembler output and input to market participant price engine.

To measure the latency, CorvilNet appliances were located at the egress of the feed assembler in the market center and at the ingress to the market participant’s pricing engine. 100% of the packets and messages in the market feed were measured during the trading day with 1 microsecond precision. CorvilClear was enabled on both appliances allowing a peer-to-peer latency measurement channel to be established between the market participant equipment and the market center equipment. Therefore latency information could be exchanged in real time between CorvilNet appliances giving full transparency and precision monitoring of the market data latency.
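As a generic illustration of two-point one-way measurement (not a description of how CorvilNet or CorvilClear work internally), the sketch below matches packets seen at both capture points and subtracts their timestamps. It assumes that both points timestamp against synchronized clocks and that each packet carries a unique identifier such as a sequence number. The function and variable names are hypothetical.

```python
def one_way_latencies(egress_us, ingress_us):
    """Match packets by id and return (latency per id, ids never seen at ingress).

    egress_us / ingress_us: dicts mapping packet id -> timestamp in microseconds,
    assumed to come from synchronized clocks at the two capture points.
    """
    matched = {pid: ingress_us[pid] - sent
               for pid, sent in egress_us.items()
               if pid in ingress_us}
    unmatched = sorted(pid for pid in egress_us if pid not in ingress_us)
    return matched, unmatched

# Illustrative timestamps only.
egress = {1: 1_000, 2: 1_010, 3: 1_020}
ingress = {1: 1_045, 3: 4_720}
latency_by_id, missing = one_way_latencies(egress, ingress)
print(latency_by_id, missing)
```

Any clock offset between the two points appears directly as an error in every latency value, which is why the precision and accuracy of the timestamps matter as much as the matching itself.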

The results illustrated in Figure 2 show a variable latency profile, with latency spikes occurring at approximately 10:30am and 3:30pm. The maximum latency recorded was 3.7 milliseconds. The average latency over the measurement period was 52 microseconds, which is within the advertised SLA of 100 microseconds. However, 20% of packets exceeded a latency of 100 microseconds and 5% of packets exceeded 1 millisecond. Because the SLA is loosely specified, it is at best ambiguous whether the provider is meeting the advertised latency service level.
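The ambiguity can be shown in a few lines of Python: the same data set can pass an SLA read as a bound on the average and fail one read as a bound on every packet. The values below are invented for illustration and are not the measured data behind Figure 2. `sla_report` is a hypothetical helper.

```python
def sla_report(latencies_us, threshold_us):
    """Return the mean latency and the fraction of packets above threshold_us."""
    n = len(latencies_us)
    over = sum(1 for l in latencies_us if l > threshold_us)
    return {"mean_us": sum(latencies_us) / n,
            "frac_over": over / n}

# Illustrative data only: 80 packets at 20 us, 15 at 150 us, 5 at 1200 us.
latencies_us = [20] * 80 + [150] * 15 + [1200] * 5

vs_100us = sla_report(latencies_us, threshold_us=100)
vs_1ms = sla_report(latencies_us, threshold_us=1000)
print(vs_100us, vs_1ms)
```

Here the mean is below 100 microseconds even though a fifth of the packets exceed that threshold. An SLA stated as a percentile or a maximum exceedance fraction over a defined interval removes this ambiguity.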

FIGURE 2:  CorvilClear one-way latency measurement of market data latency between market center feed assembler output and input to market participant price engine over the trading day. Average latency = 52 microseconds. Maximum latency = 3.7 milliseconds. Measurement period = trading day.

For the case illustrated in Figure 2, the busy 1-second period (the busiest second of the trading day) occurs at 3:31:23pm. During this second the average latency rises to 1.03 milliseconds, while the maximum is still 3.7 milliseconds (as expected). Using a latency filter within CorvilNet, we find that over 97% of packets failed to meet the latency SLA of 100 microseconds during the busy second.
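A busy-second analysis of this kind can be approximated offline with a short script. The sketch below is an assumption-laden illustration, not the CorvilNet latency filter: it buckets hypothetical `(timestamp_s, latency_us)` samples by whole second, picks the second with the most packets, and reports statistics for that second only.

```python
from collections import defaultdict

def busiest_second(samples, sla_us=100):
    """Find the 1-second bucket with the most packets and summarize it.

    samples: iterable of (timestamp_s, latency_us); timestamps are
    non-negative seconds (hypothetical layout). Ties go to the bucket
    encountered first.
    """
    buckets = defaultdict(list)
    for t, lat in samples:
        buckets[int(t)].append(lat)
    second, lats = max(buckets.items(), key=lambda kv: len(kv[1]))
    return {"second": second,
            "count": len(lats),
            "mean_us": sum(lats) / len(lats),
            "max_us": max(lats),
            "frac_over_sla": sum(1 for l in lats if l > sla_us) / len(lats)}

# Illustrative data only: a quiet second followed by a busy one.
samples = [(0.1, 50), (0.5, 50), (0.9, 50)]
samples += [(1.0 + i / 10, 1000) for i in range(9)] + [(1.95, 80)]
busy = busiest_second(samples)
print(busy)
```

Ranking seconds by packet count is one possible definition of "busy". Ranking by message count or by bytes would be equally reasonable, and the choice should be stated alongside any reported result.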

As this example shows, latency during busy periods can be much larger than the values seen during the rest of the trading day. By definition, the busy periods are the times when most traders want to trade. Therefore statistics measured during these periods are key metrics for understanding performance. Simple service level targets based on daily or even monthly values are a poor guide to the conditions that traders will experience in practice.

Further Reading: To understand these issues in greater detail, and to see how to measure, describe and specify latency service levels in a manner that avoids the above pitfalls, please read the Corvil white paper “Measurement and Characterization of Latency in Trading Networks” by Fergal Toomey, Corvil Chief Scientist.


