Corvil is known for granular monitoring of packet data, which gives network teams critical insights into the performance of complex trading environments. With the speed of change in capital markets well documented, it will come as no surprise that our solutions are constantly evolving.
As part of our monthly Action Pack releases, we are providing clients with the Corvil Network Health dashboard, an alternative way of identifying and troubleshooting network issues. A flexible and easy-to-use solution, it delivers faster insights for faster issue resolution. By democratizing our network analytics, we also give immediate value to new Corvil customers who will benefit from an intuitive way to enhance day-to-day network traffic monitoring with zero session configuration.
Our new dashboard is about making it easier for novice Corvil users to solve problems faster and avoid downtime. The best way to explain how it works is to describe its role in familiar troubleshooting scenarios, where the pressure is on network teams to identify the root cause of issues and keep the Mean Time To Resolve (MTTR) as low as possible.
A common problem is slow response times that affect trading portals, applications or web services. There are multiple possible causes. It could be an issue at the system or application level that manifests itself as a TCP Zero Window, suggesting a possible server overload. Now, with a single click in our dashboard, affected hosts will automatically appear in a table with the worst at the top.
Figure: Pinpoint Host Connectivity Issues and Causes with Corvil’s Network Health Dashboard
Sluggish performance could also be caused by a change in the way traffic is routed from client to server that affects round-trip times. It might also be out-of-sequence TCP packets or a high rate of retransmissions that affects performance, a network-level issue that could be caused by overloaded SPAN sessions. In both cases, a click in the Network Health dashboard will put the worst-affected hosts at the top of the table and the resolution process can begin.
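To make the "worst at the top" idea concrete, here is a minimal Python sketch of ranking hosts by a chosen TCP metric. It is an illustration only, not Corvil's implementation: the `HostStats` record, its field names and the sample counters are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    """Hypothetical per-host TCP counters; field names are illustrative only."""
    host: str
    packets: int
    retransmissions: int
    out_of_sequence: int
    zero_windows: int

    @property
    def retransmission_rate(self) -> float:
        # Fraction of packets that were retransmitted (0.0 if no traffic seen).
        return self.retransmissions / self.packets if self.packets else 0.0


def worst_hosts(stats, metric, top=3):
    """Return up to `top` hosts, sorted so the highest value of `metric` comes first."""
    return sorted(stats, key=lambda s: getattr(s, metric), reverse=True)[:top]


# Made-up sample data for three servers.
hosts = [
    HostStats("10.0.0.5", packets=10_000, retransmissions=50, out_of_sequence=4, zero_windows=0),
    HostStats("10.0.0.9", packets=8_000, retransmissions=640, out_of_sequence=90, zero_windows=2),
    HostStats("10.0.0.7", packets=12_000, retransmissions=120, out_of_sequence=10, zero_windows=31),
]

for h in worst_hosts(hosts, "retransmission_rate"):
    print(f"{h.host}  {h.retransmission_rate:.1%}")
```

Switching the metric to `"zero_windows"` would reorder the same table to surface a possibly overloaded server instead of a network-level problem.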
What’s happening here is that our appliances are able to search by the user/client’s IP and then isolate the cause (server, network or other) as well as anyone else on the network who might be affected. Once the issue has been resolved, we verify that all is back to normal.
Essentially, we are replacing cumbersome manual processes with streamlined, assisted workflows. At the heart of our solution is the ability to store summary information on every flow observed on the physical appliance ports. These flows are enriched with TCP metrics to create the Corvil Flow Index. Additionally, flows are linked to any relevant, previously created Corvil sessions to enable rapid access to deeper network and multi-tier application analytics.
High-speed retrieval occurs via APIs, and the data can be filtered and aggregated to build out the Flow Index. It’s designed to have a negligible impact on overall appliance performance. Flow retention depends on traffic rates and appliance size, but it’s typically weeks.
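The lookup described above (search by client IP, summarize per server, then find who else is affected) can be sketched in a few lines of Python. This is a toy model under stated assumptions: `FlowRecord`, its fields and the in-memory list standing in for the Flow Index are hypothetical, and the real product exposes this through its own APIs, which are not shown here.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class FlowRecord(NamedTuple):
    """Hypothetical summary of one observed flow, enriched with TCP metrics."""
    client_ip: str
    server_ip: str
    server_port: int
    rtt_ms: float
    retransmissions: int
    session: Optional[str]  # name of a linked, previously created session, if any


def flows_for_client(index, client_ip):
    """Filter step: keep only the flows that belong to one client."""
    return [f for f in index if f.client_ip == client_ip]


def summarize_by_server(flows):
    """Aggregate step: roll the filtered flows up into one row per server."""
    grouped = defaultdict(list)
    for f in flows:
        grouped[f.server_ip].append(f)
    return {
        server: {
            "flows": len(fs),
            "max_rtt_ms": max(f.rtt_ms for f in fs),
            "retransmissions": sum(f.retransmissions for f in fs),
        }
        for server, fs in grouped.items()
    }


def other_clients_of(index, server_ip, exclude):
    """Who else talks to the suspect server and might be affected?"""
    return sorted({f.client_ip for f in index
                   if f.server_ip == server_ip and f.client_ip != exclude})


# Made-up index contents for illustration.
index = [
    FlowRecord("192.168.1.20", "10.0.0.9", 443, 85.0, 12, "web-portal"),
    FlowRecord("192.168.1.20", "10.0.0.9", 443, 90.5, 7, "web-portal"),
    FlowRecord("192.168.1.20", "10.0.0.5", 8443, 1.2, 0, None),
    FlowRecord("192.168.1.31", "10.0.0.9", 443, 88.0, 9, "web-portal"),
]

summary = summarize_by_server(flows_for_client(index, "192.168.1.20"))
print(summary)
print(other_clients_of(index, "10.0.0.9", exclude="192.168.1.20"))
```

In this sample, the complaining client's slow flows all point at one server, and the second query shows another client sharing that server, which is the kind of evidence that helps separate a server problem from a network one.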
The two big benefits are faster problem resolution and less downtime. For the first, reducing MTTI (Mean Time To Innocence) and MTTR will mitigate negative impacts on trading outcomes. Whether it’s a problem with a mission-critical application or a stubborn network performance issue affecting a branch office’s productivity, the quicker you resolve it, the happier your users – and your business stakeholders – are going to be.
This last point shouldn’t be underestimated. Network troubleshooters and analytics administrators who use Corvil products will have their reputations enhanced by delivering useful insights to key stakeholders in a timely way – IT colleagues and end users as well as management.
Downtime, meanwhile, has a direct impact on the bottom line. The average cost, according to Gartner, is a staggering $5,600 per minute – and that’s just the average. In electronic trading it’s not hard to imagine costs running much higher if trading algorithms are receiving market data with gaps or the order lifecycle is interrupted.
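As a back-of-the-envelope illustration, the per-minute figure quoted above can be turned into a cost estimate with a few lines of Python. The only number taken from the text is $5,600 per minute; the outage durations and MTTR values below are hypothetical inputs chosen for the example.

```python
COST_PER_MINUTE_USD = 5_600  # average downtime cost quoted above


def downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Estimated cost of an outage lasting `minutes`."""
    return minutes * cost_per_minute


def savings_from_faster_resolution(old_mttr_min, new_mttr_min,
                                   cost_per_minute=COST_PER_MINUTE_USD):
    """Estimated saving per incident when MTTR drops from old to new (in minutes)."""
    return downtime_cost(old_mttr_min - new_mttr_min, cost_per_minute)


print(downtime_cost(60))                       # one hour at the average rate
print(savings_from_faster_resolution(45, 30))  # hypothetical 15-minute MTTR improvement
```

At the average rate, an hour of downtime works out to $336,000, and shaving a hypothetical 15 minutes off resolution time is worth $84,000 per incident.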
We are pleased to say that many Corvil users are already seeing the benefits of our new dashboard. One customer, a large bank, is now able to track resets and other metrics by pinning a default view of TCP stats for internet-facing firewalls. It proved so efficient that the company has swapped out a legacy monitoring solution for Corvil.
The Holy Grail for network teams is to move troubleshooting from a reactive to a proactive footing, creating an environment where you can pinpoint root causes more quickly and respond to network quality issues before they affect the business. That’s the journey we are taking our customers on, where network health checks become a continuous process rather than something you do by appointment.