Hubble and Reverse Traceroute: Systems for Improving Internet Performance and Reliability

Tuesday, 11/25/2008 7:00am

Ethan Katz-Bassett

Computer Science Building, Room 151

Arun Venkataramani

Although more and more people and services depend on the Internet, its reliability has improved only modestly in recent years, and poorly performing paths are just as prevalent as they were a decade ago. Part of the difficulty is that network operators have surprisingly few tools to identify and diagnose outages that occur. Lacking these tools, they often resort to ad hoc measures, such as using email to ask other operators for help in launching probes.
In this talk, I will describe two systems we have recently built to assist operators in improving end-to-end performance and reliability. First, Hubble is a system that continuously identifies and monitors black holes and reachability problems across the entire Internet. We found long-term reachability problems to be much more common than expected, with hundreds of events per day lasting hours and sometimes even days. Hubble uses traceroute and comparisons with historical data to help localize the source of the problem. For instance, we found many instances where multi-homed customers were reachable through one upstream provider and not another.
However, a fundamental limitation of traceroute is that it can measure only the forward path to an arbitrary destination, not the path back. The second system I will present, reverse traceroute, addresses this limitation. We use multiple vantage points, IP options, and a limited form of source address spoofing to measure the complete reverse path in 40% of cases. We use our reverse traceroute system to study previously unmeasurable aspects of the Internet: we uncover thousands of AS peering links invisible to current topology mapping efforts, and we present a case study of how a content provider could use our tool to troubleshoot poor performance.
Bio: Ethan Katz-Bassett grew up in Northampton, MA, and attended Williams College, graduating in 2001. For the next three years, he spent the summers and falls working at the Laboratory for Advanced Software Engineering Research (LASER) at UMass and the winters and springs teaching skiing in New Mexico. Since then, he has been pursuing a PhD in computer science at the University of Washington and expects to graduate next year. His primary research interests are in distributed systems and Internet measurement, especially measurement-based systems to aid network operators.