Market Data Microbursts

A market data ‘microburst’ is a short burst of high activity in a feed, typically lasting much less than a second but capable of temporarily saturating the resources available to a receiver (and thereby adding latency). While the term ‘microburst’ might suggest a duration measured in microseconds, in practice it’s often used to describe any burst short enough to pass under the radar of traditional load-monitoring tools, which measure average loads over seconds or minutes. Detecting microbursts requires a monitoring system that can accurately sample activity patterns at very short timescales – a ‘microscope’, if you will.
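As a rough illustration of what such a ‘microscope’ computes, the following Python sketch bins packets into 1-millisecond windows and compares the peak windowed bit-rate with the long-term average. The input format (integer microsecond timestamps paired with packet sizes in bytes) and the function names are assumptions made for this example; they do not describe any particular monitoring product.

```python
from collections import defaultdict


def windowed_bit_rates(packets, window_us=1000):
    """Map each window index to its bit-rate in bits per second.

    packets: iterable of (timestamp_us, size_bytes) pairs (assumed format).
    Integer microsecond timestamps avoid floating-point binning errors.
    """
    bits = defaultdict(int)
    for ts_us, size_bytes in packets:
        bits[ts_us // window_us] += size_bytes * 8
    # bits per window -> bits per second
    return {w: b * 1_000_000 / window_us for w, b in bits.items()}


def peak_and_average(packets, window_us=1000):
    """Return (peak windowed rate, average rate) in bits per second.

    The average is taken over every window between the first and last
    packet, including windows in which nothing arrived.
    """
    rates = windowed_bit_rates(packets, window_us)
    n_windows = max(rates) - min(rates) + 1
    return max(rates.values()), sum(rates.values()) / n_windows


# Ten 1250-byte packets inside the first millisecond, then one more
# packet almost a second later.
packets = [(i * 100, 1250) for i in range(10)] + [(999_000, 1250)]
peak, average = peak_and_average(packets)
print(f"peak 1-ms rate: {peak / 1e6:.1f} Mbps, average: {average / 1e6:.2f} Mbps")
```

In this toy trace the 1-millisecond peak is 100 Mbps while the average over the whole second is only 0.11 Mbps, which is exactly the kind of gap that a tool averaging over seconds would never see.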

Microbursts are short, but they matter to market data practitioners, who can no longer tolerate the extra microseconds or milliseconds that a slow system might take to clear them. To some practitioners, a burst only qualifies as a microburst if it actually causes saturation at some point in the market data path. To others, any unusually high spike of activity qualifies. The latter definition is a reasonable one to apply if you do not know the capacities of the downstream systems that will have to handle the spike.

Real-world examples of microbursts in the ArcaBook and OpenBook market data feeds can be seen on the LatencyStats.com website operated by NYSE Technologies in partnership with Corvil. The volume metrics presented on the site show that both feeds contain microbursts, in the sense of short-term activity spikes that greatly exceed the long-term average data rate. A closer look at the activity pattern reveals that busy periods tend to consist of many high-rate bursts in succession. For example, here is a view of 1-millisecond bit-rates in the ArcaBook feed during a fairly active 1-minute period a few weeks before the time of writing (28 September 2010). The average bit-rate during this period is less than 10 Mbps, but there are multiple microbursts extending to over 200 Mbps:


[Figure: 1-millisecond bit-rates in the ArcaBook feed during an active 1-minute period]

Strikingly, the activity pattern in the OpenBook feed over the same timeframe reproduces the timing of the ArcaBook spikes with uncanny precision (albeit at lower peak bit-rates):


[Figure: 1-millisecond bit-rates in the OpenBook feed during an active 1-minute period]

The OpenBook and ArcaBook feeds are published by different servers in separate data centers and are delivered to the client connection point (where the LatencyStats.com measurements are made) via separate network paths. The spikes are clearly not an artifact of machine or network processing along either delivery path. Each burst consists of a very large number of order book updates covering a broad range of different securities. They represent events in which significant changes occur in both order-books almost simultaneously. Below is a close-up view of one of the microbursts in the chart for ArcaBook. Most of the activity (over 25,000 updates) occurs within the first 100 milliseconds after the onset of the event:


[Figure: Close-up view of a microburst in the ArcaBook feed]

It’s not surprising to find correlation between the ArcaBook and OpenBook feeds, since the corresponding markets trade many of the same securities and orders can be forwarded between them. Nevertheless, the microbursts in the two feeds coincide remarkably closely. The onset of the ArcaBook event shown above is separated from that of the equivalent OpenBook event by less than 1 millisecond.
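One simple way to quantify this kind of coincidence is to mark the window in which each burst begins in each feed and then measure the distance between matching onsets. The sketch below is only an illustration: the threshold-crossing definition of an onset, the function names and the sample numbers are all assumptions, and resolving offsets smaller than 1 millisecond would require windows shorter than 1 millisecond.

```python
def burst_onsets(rates, threshold):
    """Return indices of windows where the rate rises to or above threshold.

    rates: sequence of per-window rates (any consistent unit).
    An onset is a window at or above the threshold whose predecessor
    was below it (an assumed, deliberately simple definition).
    """
    onsets = []
    previous_above = False
    for i, rate in enumerate(rates):
        above = rate >= threshold
        if above and not previous_above:
            onsets.append(i)
        previous_above = above
    return onsets


def onset_offsets(onsets_a, onsets_b):
    """For each onset in feed A, signed distance (in windows) to the
    nearest onset in feed B. Returns an empty list if B has no onsets."""
    if not onsets_b:
        return []
    return [min((b - a for b in onsets_b), key=abs) for a in onsets_a]


# Hypothetical per-window rates in Mbps for two feeds.
feed_a = [1, 1, 50, 60, 2, 1, 70, 1]
feed_b = [1, 1, 1, 30, 45, 1, 55, 1]
a_on = burst_onsets(feed_a, threshold=40)
b_on = burst_onsets(feed_b, threshold=40)
print(a_on, b_on, onset_offsets(a_on, b_on))
```

Offsets that cluster around zero across many bursts would indicate the sort of synchronization described here; widely scattered offsets would suggest independent feeds.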

How should network and computing systems be engineered to handle microbursts? A simple but rather conservative approach is to ensure that the available capacity always exceeds the highest microburst data rate measured at some short timescale. For example, if the speed of a network link carrying a market data feed exceeds the feed’s highest 1-millisecond data rate, then the link will never be continuously busy for more than 1 millisecond at a time, and therefore no data will be delayed on the link for more than 1 millisecond.

In practice the capacity needed to keep latency below 1 millisecond is normally much less than the peak 1-millisecond data rate. This is because the microbursts in the feed tend not to be sustained over time. A 1-millisecond microburst exceeding link capacity will build up a queue of data waiting to be processed. But provided the system can buffer the queue and clear it quickly when the burst ends, it can still prevent any data from being delayed more than a millisecond. For example, LatencyStats.com displays both the peak 1-millisecond rate and the network bandwidth required to avoid 1-millisecond latency for each feed – the latter value is computed using an algorithm based on queuing theory. At the time of writing (28 September 2010) the values displayed for ArcaBook for the last seven days are:

Plainly the peak 1-millisecond data rate is a conservative measure of how much bandwidth is needed. Nevertheless the actual bandwidth required is much higher than the peak 1-second data rate – showing the influence of short timescale microbursts.
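To make the difference between the peak rate and the bandwidth actually required more concrete, here is a minimal Python sketch. It is not the algorithm used on LatencyStats.com, which is not described here; instead it assumes a single FIFO link with unlimited buffering, simulates the waiting time of each packet at a given link speed, and uses bisection to find the lowest speed that keeps the worst waiting time within 1 millisecond. The packet format and function names are assumptions for illustration.

```python
def max_queuing_delay(packets, rate_bps):
    """Worst waiting time in seconds before transmission starts.

    packets: list of (arrival_time_s, size_bytes), sorted by arrival.
    Models a single FIFO link at rate_bps with an unbounded buffer.
    A packet's own transmission time is not counted as delay.
    """
    link_free_at = float("-inf")
    worst = 0.0
    for arrival, size_bytes in packets:
        start = max(arrival, link_free_at)   # wait if the link is busy
        worst = max(worst, start - arrival)
        link_free_at = start + size_bytes * 8 / rate_bps
    return worst


def min_bandwidth(packets, max_delay_s=0.001, tol_bps=1000.0):
    """Lowest link rate (within tol_bps) keeping waits <= max_delay_s."""
    total_bits = sum(size * 8 for _, size in packets)
    lo = 0.0
    # Sending every bit within max_delay_s is always fast enough.
    hi = total_bits / max_delay_s
    while hi - lo > tol_bps:
        mid = (lo + hi) / 2
        if max_queuing_delay(packets, mid) <= max_delay_s:
            hi = mid
        else:
            lo = mid
    return hi


# A synthetic 1-ms burst: twenty 1250-byte packets, 50 microseconds apart,
# i.e. an arrival rate of 200 Mbps while the burst lasts.
burst = [(i * 50e-6, 1250) for i in range(20)]
print(f"required: {min_bandwidth(burst) / 1e6:.1f} Mbps")
```

For this isolated burst the sketch reports a requirement of about 97 Mbps, roughly half of the 200 Mbps burst rate, because the backlog built up during the burst can be cleared within the 1-millisecond budget once the burst ends.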

What about users who receive multiple market data feeds together, over the same infrastructure? Normally one might hope that microbursts in different feeds would rarely coincide, and therefore the same resources can be shared among the feeds without much risk of overload. This phenomenon – called multiplexing gain – is in fact what usually happens when network and computing workloads from different sources are combined together. The resources needed to handle the total workload are much less than the sum of what each workload needs individually, for the same latency performance.

However, we have seen that microbursts in the ArcaBook and OpenBook feeds do not occur independently and are in fact closely synchronized. Sudden spikes of activity in the feeds during busy periods of the day are found to coincide to within less than 1 millisecond. The correlated activity pattern implies that there will be little opportunity for multiplexing gain when these feeds are combined together over a shared resource.

To illustrate, I took a short two-minute period of data from each of the feeds (from the last hour of trading during a typical trading day) and calculated the network bandwidth needed to prevent 1-millisecond delays, for each feed individually and also for both feeds together.

The bandwidth needed when the feeds are combined is only slightly lower than the sum of their individual requirements.

Just to demonstrate the extent of multiplexing gain that you would normally expect to see when combining independent workloads of the same bursty nature, I also computed the same bandwidth values using feed data taken from periods on different days:

Taking the data from different days means that the microbursts from the two feeds no longer coincide. In that case, the bandwidth needed by the combined data set is only slightly larger than what ArcaBook needs by itself.
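The contrast between synchronized and independent bursts can be reproduced with a toy calculation. The Python sketch below uses the conservative peak 1-millisecond rate as a stand-in for the bandwidth requirement, rather than the queuing-based value used in the comparisons above, and the per-window numbers are invented purely for illustration.

```python
def multiplexing_gain(feed_a, feed_b):
    """Fraction of capacity saved by sharing one link between two feeds.

    feed_a, feed_b: equal-length lists of bits (or rates) per 1-ms window.
    Capacity is approximated by the peak window, a conservative proxy.
    Returns 0.0 when the peaks coincide exactly (no gain).
    """
    if len(feed_a) != len(feed_b):
        raise ValueError("feeds must cover the same windows")
    combined_peak = max(a + b for a, b in zip(feed_a, feed_b))
    separate_peaks = max(feed_a) + max(feed_b)
    return 1.0 - combined_peak / separate_peaks


# Hypothetical per-window loads in Mbps.
arca = [10, 200, 10, 10]
open_sync = [5, 100, 5, 5]      # burst in the same window as arca
open_shifted = [5, 5, 5, 100]   # same burst, but at a different time

print(f"synchronized gain: {multiplexing_gain(arca, open_sync):.2f}")
print(f"shifted gain:      {multiplexing_gain(arca, open_shifted):.2f}")
```

With synchronized bursts the combined peak equals the sum of the individual peaks and the gain is zero; once the bursts are moved apart, the combined peak is only slightly above the larger feed's own peak, so about a third of the capacity is saved in this example.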

The absence of any significant multiplexing gain when combining these feeds unfortunately means that provisioning for them will be more expensive than it would otherwise be. On the plus side, it also means that it’s easy to calculate the network bandwidth they need when combined: just add the numbers shown on LatencyStats.com for the individual feeds. For example, doing this for the 7-day values currently shown gives a total bandwidth requirement of 676 Mbps (for no more than 1 millisecond of queuing latency). Note that this value may grow in the future if the size of the feeds increases.

Fergal Toomey, Corvil Chief Scientist

Recommended Further Reading

To understand these issues in greater detail we recommend that you read the following:

  • Corvil and NYSE Technologies white paper: LatencyStats.com - Latency Transparency for Market Data
  • Corvil white paper: Managing Performance in Financial Trading Networks
