Bandwidth Estimation using Clink - hang fix


Part of my master's project work involves using network measurement tools to garner information about a path to a website. One useful type of tool that I don’t believe is used that often is the bandwidth estimation tool. These types of tools employ one of a variety of methods to estimate the available bandwidth between each TTL hop along a given router path to a host. To learn more about these tools, including clink, I recommend reading “Creating a Bandwidth Estimation Testbed Summer 2001 Status Report.”
One of these tools, clink, was written by Allen Downey and makes significant improvements over Van Jacobson’s similar tool, pathchar. Unfortunately, I noticed a problem where clink seemed to hang on certain hosts. I don’t believe I am alone in reporting this problem. In “Measuring Bandwidth between PlanetLab Nodes,” published in the proceedings of PAM 2005 (the Passive & Active Measurement Workshop), the researchers noticed that clink would hang on PlanetLab’s machines and attributed the hang to a possible Linux kernel version problem. It is possible that the kernel is the cause, but I ran into another situation where clink appeared to hang, and it might explain their hang as well.

When clink experiences a timeout on a probe to a TTL hop, it simply retries the probe. If the router has been set up not to respond to UDP packets, as many routers on today’s Internet are, clink will keep probing the router endlessly with no success. To the end user this looks like a hang, but a tcpdump will confirm that clink is still firing off the same UDP probe packet over and over. When clink was written in 1998-99, many routers were configured to (nicely) respond to a probe, but this is no longer the case.

Because I found clink’s bandwidth estimation using the even-odd technique, as described in the SIGCOMM paper, to be the best available, I rewrote part of the code to fix the infinite loop caused by router timeouts. I introduced two new program arguments: the first is a maximum probe retry value, and the second is a maximum number of probe failures per TTL hop. With the first argument, a probe of a specific size can be retried against a specific TTL hop multiple times before it is declared a failure. Then, if the number of probe failures on a specific TTL hop exceeds the second argument, the hop is marked as failed and skipped. Clink then goes on to measure the rest of the hops.
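
To make the two limits concrete, here is a minimal sketch of the retry logic in Python. It is not the actual patch (clink itself is written in C), and every name in it (`probe_hop`, `measure_path`, `max_retries`, `max_failures`, `fake_send_probe`) is made up for illustration. It also assumes the retry value counts the total number of attempts per probe, and it replaces real UDP probes with a simulated function in which hop 2 never answers.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical probe signature: (ttl, size_bytes) -> round-trip time in
# seconds, or None if the probe timed out.
ProbeFn = Callable[[int, int], Optional[float]]
Samples = List[Tuple[int, float]]


def probe_hop(send_probe: ProbeFn, ttl: int, sizes: List[int],
              max_retries: int, max_failures: int) -> Optional[Samples]:
    """Probe one TTL hop; return (size, rtt) samples, or None if the hop is skipped."""
    failures = 0
    samples: Samples = []
    for size in sizes:
        rtt = None
        # Assumption: max_retries is the total number of attempts per probe.
        for _ in range(max_retries):
            rtt = send_probe(ttl, size)
            if rtt is not None:
                break
        if rtt is None:
            # Every attempt timed out, so this probe counts as one failure.
            failures += 1
            if failures > max_failures:
                return None  # hop marked as failed and skipped
            continue
        samples.append((size, rtt))
    return samples


def measure_path(send_probe: ProbeFn, hops: int, sizes: List[int],
                 max_retries: int, max_failures: int) -> Dict[int, Optional[Samples]]:
    """Measure every hop in turn; a failed hop does not stop the later ones."""
    return {ttl: probe_hop(send_probe, ttl, sizes, max_retries, max_failures)
            for ttl in range(1, hops + 1)}


def fake_send_probe(ttl: int, size: int) -> Optional[float]:
    """Simulated network: hop 2 never answers, other hops return a made-up RTT."""
    if ttl == 2:
        return None
    return 0.001 * ttl + 1e-6 * size
```

Calling `measure_path(fake_send_probe, 3, [64, 128], 3, 1)` gives `None` for hop 2 and a list of samples for hops 1 and 3, so the silent router costs a bounded number of probes instead of stalling the whole run.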

I am not publishing the code patches yet as I am still testing them, but if you are interested in taking a peek, please comment and I’ll email you a copy.


Reader Comments

Hi there, I am Alfred and very new to this. I need your help in getting some information about “clink”. I have started on my master's back in Africa (Uganda, to be specific) and I have been told to do some research starting with this tool. The aim is to convince my classmates to take up this tool instead of the likes of pathrate, bing, etc. You seem to have good knowledge about it. Could I also get a copy of those patches? I really need to try it out. Advice and help from you will be welcome.

Hi,
I’m a student of information technology at the University of Ca’Foscari, in Venice. I have to do some research on clink and I have just read about your interesting improvement that solves the bug.
I would like, if possible, to receive a copy of the newly coded clink so I can present it in my report.
Sorry for my bad English…

I’ll wait for an answer soon

Giulia

Hi,
I am a senior student of telecommunication engineering at Beijing University of Posts and Telecommunications in China.
I am currently trying to use clink to estimate the bandwidth and latency of our lab's network, which is an ad hoc wireless network. But I’ve had a problem running clink as instructed in its doc.
Here it is:
If I type: clink host_name (for example 123.0.0.1), the result is the following: bash: clink: command not found.
I am not sure where it went wrong. I hope to get some help from you to make it work. Thanks in advance.

Btw, can I ask for a copy of the improved code patches? I am about to turn in a paper on the underlying mechanism and implementation of clink, and I think your work might do me a huge favor.
Hope to get your reply soon.^^