Measurement based IP geolocation in the ETOMIC infrastructure


An illustration
The main idea behind this project was to help scientists understand data by visualizing Internet measurements. The goal was to localize the components of the inter-ETOMIC network: routers and measurement nodes. Only the latter had known geographic positions, so we had to locate the routers along the network paths between ETOMIC nodes. Like others in the geolocation community, we used active measurements, but we were the first to apply end-to-end one-way delay (OWD) measurements to geolocation. In the literature, measurement nodes with known locations are called landmarks. In our setting the ETOMIC nodes played this role.


First, we developed a measurement configuration to collect topology, round-trip delay and DAG-driven one-way delay data. These data were integrated into the database of our Network Measurement Virtual Observatory.

1. The traceroute utility was used to discover the inter-ETOMIC topology, including the router interfaces. Using this technique we determined the network path between every pair of ETOMIC nodes. The result was a loosely connected topology graph in which every node represented a network interface. Every router has several interfaces, and traceroute runs from different directions can reveal different interfaces of the same router. This means that a single router can appear as more than one independent node in the topology map.

2. We used the mercator tool to reduce the number of nodes in the topology graph by clustering interfaces that belong to the same router. This tool sends UDP probes to different ports of network interfaces and then waits for the ICMP port-unreachable message generated by the router. If the source address of the ICMP response differs from the destination address of the UDP probe, the two addresses are aliases of each other. In this situation routers usually answer with an internal IP address as the source address. This clustering reduces the number of nodes in the network graph significantly. Before clustering our network graph contained 1192 nodes; after running mercator it contained 768. Mercator found 160 clusters covering 584 interfaces in all (1192 - 584 + 160 = 768).

3. After determining the intermediate routers and interfaces as target nodes, we made two kinds of latency measurements. We measured round-trip times (RTTs) to every target node and one-way delays between every landmark pair. Every RTT measurement consisted of 25 ICMP ping probes with a packet size of 64 bytes, and was repeated 5 times, dispersed in time. To measure OWDs between ETOMIC pairs, very small UDP packets were used. Every OWD measurement consisted of more than 100 time-dispersed UDP probes. These packets were sent and captured by high-precision Endace DAG cards.
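The first two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the project: the function name build_router_graph, the union-find helper and the example addresses (taken from the 192.0.2.0/24 documentation range) are all hypothetical. Traceroute paths give an interface-level graph, and alias pairs of the kind mercator reports are used to merge interfaces into routers.

```python
from collections import defaultdict


class DisjointSet:
    """Minimal union-find over hashable items (here: interface addresses)."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited item directly at the root.
        while self.parent[item] != root:
            self.parent[item], item = root, self.parent[item]
        return root

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def build_router_graph(paths, alias_pairs):
    """Collapse an interface-level topology into a router-level one.

    paths       -- traceroute paths, each a list of interface addresses
    alias_pairs -- (probed address, ICMP source address) pairs that differ
    """
    routers = DisjointSet()
    for path in paths:
        for address in path:
            routers.find(address)  # register every observed interface
    for probed, source in alias_pairs:
        routers.union(probed, source)

    # Consecutive hops on a path become an edge between their routers.
    edges = set()
    for path in paths:
        for a, b in zip(path, path[1:]):
            ra, rb = routers.find(a), routers.find(b)
            if ra != rb:
                edges.add(frozenset((ra, rb)))

    clusters = defaultdict(set)
    for address in list(routers.parent):
        clusters[routers.find(address)].add(address)
    return dict(clusters), edges


# Hypothetical example: two nodes probing each other across two routers,
# each router showing a different interface in each direction.
paths = [
    ["192.0.2.1", "192.0.2.10", "192.0.2.20", "192.0.2.2"],
    ["192.0.2.2", "192.0.2.21", "192.0.2.11", "192.0.2.1"],
]
alias_pairs = [("192.0.2.10", "192.0.2.11"), ("192.0.2.20", "192.0.2.21")]
clusters, edges = build_router_graph(paths, alias_pairs)
print(len(clusters), "nodes,", len(edges), "edges")
```

Without the alias pairs the example graph would have six nodes. After merging, it has four: the two end nodes and two routers.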


In our geolocation solution we combined topology-based and constraint-based techniques. To generate constraints from the measurements we used a detailed path-latency model. The delay experienced by a packet as it passes through the network is a sum of contributions from various factors, such as hop count, queuing delays, transmission delays and propagation delays. Based on this network model we introduced several types of constraints:

1. The RTT constraints determine the maximum possible geographical distance between landmarks and target nodes.

2. The OWD constraints give an upper limit on the possible path lengths between ETOMIC pairs.

3. Link-latency estimations approximate the length of edges in the topology map. This type of constraint is not very reliable.
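As a rough illustration of how a delay turns into a distance bound, the sketch below assumes that signals travel through fibre at about two thirds of the speed of light and that each hop adds a fixed delay. Both the factor and the per-hop parameter are assumptions made for this example. The path-latency model actually used is more detailed, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # speed of light in vacuum, km per ms
FIBRE_FACTOR = 2.0 / 3.0               # assumed propagation speed relative to c


def max_distance_from_rtt(rtt_ms, hops=0, per_hop_ms=0.0):
    """Upper bound (km) on the landmark-to-target distance from an RTT.

    Half of the RTT is taken as the one-way time. An assumed fixed
    per-hop delay is subtracted before converting time to distance.
    """
    one_way_ms = max(rtt_ms / 2.0 - hops * per_hop_ms, 0.0)
    return one_way_ms * SPEED_OF_LIGHT_KM_PER_MS * FIBRE_FACTOR


def max_path_length_from_owd(owd_ms, hops=0, per_hop_ms=0.0):
    """Upper bound (km) on the path length between two landmarks from an OWD."""
    propagation_ms = max(owd_ms - hops * per_hop_ms, 0.0)
    return propagation_ms * SPEED_OF_LIGHT_KM_PER_MS * FIBRE_FACTOR


# A 10 ms round trip cannot span more than roughly 1000 km of fibre.
print(round(max_distance_from_rtt(10.0), 1))
# Accounting for 10 hops at an assumed 0.1 ms each tightens the bound.
print(round(max_distance_from_rtt(10.0, hops=10, per_hop_ms=0.1), 1))
```

Subtracting a hop-dependent term is one simple way to use hop count to tighten a pure RTT bound.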

We solved this geolocation task as a global optimization problem, using Lagrange multipliers with a modified gradient method. More details on this briefly summarized solution can be found in one of our following papers.
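To give a feeling for what solving geolocation as an optimization problem means, here is a deliberately simplified toy. It is not the Lagrange-multiplier method mentioned above: it swaps in a plain quadratic penalty with fixed-step gradient descent, works in flat planar coordinates (km) instead of on the globe, and places a single target using only distance upper bounds from landmarks. All names and numbers are hypothetical.

```python
import math


def locate_by_penalty_descent(landmarks, bounds_km, start, step=0.1, iterations=300):
    """Find a point that satisfies distance upper bounds to all landmarks.

    Minimizes sum(max(0, dist_i - bound_i)**2) by gradient descent.
    Only violated constraints contribute to the gradient.
    """
    x, y = start
    for _ in range(iterations):
        grad_x = grad_y = 0.0
        for (lx, ly), bound in zip(landmarks, bounds_km):
            dist = math.hypot(x - lx, y - ly)
            violation = dist - bound
            if violation > 0.0:  # implies dist > 0 for non-negative bounds
                grad_x += 2.0 * violation * (x - lx) / dist
                grad_y += 2.0 * violation * (y - ly) / dist
        x -= step * grad_x
        y -= step * grad_y
    return x, y


# Three hypothetical landmarks, each bounding the target within 800 km.
landmarks = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
bounds_km = [800.0, 800.0, 800.0]
estimate = locate_by_penalty_descent(landmarks, bounds_km, start=(2000.0, 2000.0))
print(estimate)
```

The descent stops moving once every bound is met, so the result is one feasible point rather than a unique best estimate. A full solution would optimize all target nodes jointly and would also use the path-length and link-latency constraints.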

Performance analysis

CDF of absolute errors
Five different settings were examined in our study. In the first two we used only round-trip time constraints, without (Geo-R) and with hop count (Geo-Rh). In the third, link-latency constraints were added (Geo-RhL). In the fourth setting we used RTT and OWD constraints (Geo-RhO), and in the final setting every type of constraint was applied (Geo-RhOL).

We ran the performance tests on a reference dataset that was collected manually for European routers. Every ETOMIC node is located at an academic site, and the network paths between them, including the target nodes, belong to the European academic network. Because inter-ETOMIC OWD constraints were used, the accuracy measurements and comparisons were therefore made in an academic environment.

The results of our experiments are depicted in Figure 2. It is conspicuous that the OWD constraints reduce the mean and maximum errors significantly. Table 1 shows more details.

Accuracy of different constraints [km]
Settings    Mean error    Median error    Max. error    StdDev
Geo-Rh      576           75              17798         2028
Geo-RhL     562           93              17747         1937
Geo-RhO     172           141             586           153
Geo-RhOL    162           85              548           148
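Statistics like those in the table can be derived from per-router errors along the following lines. This is a sketch under assumptions: the error is taken to be the great-circle (haversine) distance between the estimated and the reference position on a spherical Earth, the population standard deviation is used, and the sample values are made up for illustration and are not data from the study.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def error_summary(errors_km):
    """Mean, median, maximum and population standard deviation of errors."""
    return {
        "mean": statistics.mean(errors_km),
        "median": statistics.median(errors_km),
        "max": max(errors_km),
        "stddev": statistics.pstdev(errors_km),
    }


def empirical_cdf(errors_km):
    """Points (error, fraction of targets with error <= that value)."""
    ordered = sorted(errors_km)
    n = len(ordered)
    return [(value, (i + 1) / n) for i, value in enumerate(ordered)]


# Made-up absolute errors in km, for illustration only.
sample_errors = [10.0, 20.0, 30.0, 100.0]
print(error_summary(sample_errors))
print(empirical_cdf(sample_errors))
```

A single badly placed router can dominate the mean and the maximum while leaving the median almost unchanged, which is why the table reports all of them.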

© 2017 The ETOMIC project