
Web Services in the ETOMIC System

Network Measurement Virtual Observatory

The Internet is a large-scale complex system, composed of a huge number of elements with diverse features. To understand a complex system we need complex models and a lot of measurement data to be able to verify the predictions of those models. To propose new models describing traffic or topology dynamics, we have to be able to answer challenging research questions about the network's spatio-temporal properties, which requires large-scale and long-term observations. Numerous projects have already collected a vast amount of data about various Internet traffic characteristics. However, the scope of these efforts is limited by several factors. Measurements capturing real Internet traffic are generally conducted on a dedicated measurement infrastructure, and a single infrastructure usually scans only a narrow segment of the entire Internet topology. Orchestrating a single centralized project is probably not the best answer to these challenges, and for practical reasons such as financing it is not feasible either. This calls for self-organization similar to the development of community sites, Wikipedia, or the Internet itself. If, following this philosophy, distinct datasets could be cross-matched and analyzed together, it would be possible to carry out comprehensive studies of the large-scale behaviour of the Internet. Furthermore, measurement archives containing historical Internet data could help data miners gain novel insights into the network's long-term evolution.

Some of the first large interconnected scientific data archives were introduced in the field of astronomy; hence such archives, and this style of research, are often called Virtual Observatories. Building on this experience, we propose here a new paradigm for the network research community and show a prototype node of a Network Measurement Virtual Observatory (nmVO), based on the data collected in a large-scale measurement effort. For more details, please see XXXXXX.

Web Services

The role of an nmVO goes beyond simple data collecting and archiving functions. On the consumer side there are researchers or network managers who want to poll the archive for either a (usually small) subset of the archive or some composite information. To sketch the principle through a possible application, consider a peer-to-peer overlay network that needs management information in order to optimise routing between peers. It would be unthinkable to use gzipped measurement data for such purposes. Instead, a feasible scenario is one where the client turns to an nmVO to get, say, the average loss of a link on Mondays between 2 and 3 o'clock, or the fastest path between two nodes. This means that beyond the data itself, easy-to-use tools are also needed to perform such data filtering and transformation queries efficiently. Recent efforts to enable database engines to run complex user code and define new datatypes (e.g. MS SQL Server CLR integration) make this task easier. Using these stored procedures we can move the typical filtering and preprocessing tasks, like getting slices or aggregates of the data, to the server side. Since end-users can call these functions remotely, as we will show later in a concrete example, they can reach the same results with less code and have to fetch a much smaller amount of data from the archive.
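As an illustration of this idea, the minimal sketch below shows how a client could ask the archive for such an aggregate with a single remote call. The table-valued function dbo.getLinkLoss and its columns loss and ts are hypothetical placeholders for a real server-side stored procedure; the proxy class and the SubmitShortJob call are those used in the full example further down this page.

using System.Data;

namespace nmVO_demo {
	static class ServerSideQueryDemo {
		// Average loss of the UPAR-ELTE link on Mondays between 2 and 3 o'clock,
		// computed entirely on the server side.
		static object AverageMondayLoss() {
			// dbo.getLinkLoss is a hypothetical table-valued function standing in
			// for a real stored procedure of the archive; weekday = 2 is Monday
			// with the default DATEFIRST setting.
			string sql =
				"SELECT AVG(loss) FROM dbo.getLinkLoss('UPAR', 'ELTE') " +
				"WHERE DATEPART(weekday, ts) = 2 AND DATEPART(hour, ts) = 2";

			// WSDL-generated proxy class, exactly as in the full example below
			hu.colbud.amd1.CasJobs ws = new nmVO_demo.hu.colbud.amd1.CasJobs();
			DataSet ds = ws.SubmitShortJob(1849700559, sql, "nmVO", "avgloss", 1);

			// Only this single aggregate value is transferred over the network,
			// not the raw loss samples.
			return ds.Tables[0].Rows[0][0];
		}
	}
}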

XML-based Web service technology combined with database management systems is a good candidate to solve these issues. It allows running either simple queries or more complicated functions that are stored on the server side, where the data resides, so there is no need to transfer unprocessed bulk data through the network. The developer can create a client application that invokes a series of Web services through remote procedure calls. This is done seamlessly, which means that Web services can be called as if they were local functions on the client side. In this way the filtering and selection of data subsets, as well as typical preprocessing functions used by many users, such as aggregates, variance calculation, and histograms, can be performed on the server side, making the task of the end-user simpler and significantly reducing the amount of data to be transferred. Data conversions into/from XML are done automatically, and messages are carried over the HTTP protocol. See Figure 1 for the Web service enabled nmVO architecture.

Standard access services

To go further in exploiting the service oriented architecture, we should standardise a few basic services that can be applied to most archives. Astronomers have the cone search service, which is nothing more than "give me all the data you have within a given radius of a given sky coordinate". In this way X-ray, optical and infra-red observations can be matched easily. For network measurements, a similar query in the time domain, a time slice service, could be an equally general and useful filter. An IP list service can return all data in the archive related to the IP addresses provided by the client. Of course the greatest flexibility is reached if full SQL access to the database is provided, as we will show later in an example for the ETOMIC archive. Beyond direct data services, a get schema service at each archive can provide the metadata describing all the datasets and the fields in each of them.
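To make these suggestions more concrete, the hypothetical C# interface below sketches how such standard access services could appear on the client side once the proxy classes have been generated. The method names and signatures are illustrative only and are not part of the current EtomicService.

using System;
using System.Data;

namespace nmVO_demo {
	// Illustrative signatures for the standard access services proposed above.
	interface IStandardArchiveAccess {
		// Time slice service: all records measured in the given interval.
		DataSet TimeSlice(DateTime from, DateTime to);

		// IP list service: all data related to the given IP addresses.
		DataSet IPList(string[] addresses);

		// Full SQL access, as provided for the ETOMIC archive (see the example below).
		DataSet SubmitQuery(string sql);

		// Get schema service: metadata describing the datasets and their fields.
		DataSet GetSchema();
	}
}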

Web service example

Our prototype nmVO is based on the technology developed for sharing astronomical data. Though it is not finalized yet, the service (EtomicService) is already working: the archive contains a few billion records at the moment, and the earliest data is from January 2006. The real strength of the service oriented architecture is that remote procedure calls and data transformations are carried out automatically and seamlessly, so using these functionalities from a remote client is quite easy. Below we show a short piece of C# code that can be compiled both on Windows with Visual Studio and under Linux using the Mono framework. With a simple command, the user first generates the proxy class, which defines the interface to the remote procedures and all the data structures. Then, using the namespace given by the server, the remote procedure can be called like a local one; the proxy hides the conversion of the data to XML, the transfer over HTTP and the conversion back into a simple user space format. The code is the following:

[NMVO_DEMO.CS]
using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
using System.Xml.Serialization;
using System.Data.SqlClient;
using System.Data;
namespace nmVO_demo {
	class Program {
		static void Main(string[] args) {

			// BUILD THE SQL QUERY STRING FROM THE COMMAND LINE ARGUMENTS
			string sql = "";
			for (int i = 0; i < args.Length; i++)
				sql += " " + args[i];

			// DECLARING THE WEB SERVICE
			hu.colbud.amd1.CasJobs ws = new nmVO_demo.hu.colbud.amd1.CasJobs();

			// CALLING THE SERVICE: QUERYING THE DATABASE
			DataSet ds = ws.SubmitShortJob(1849700559,sql, "nmVO", "foo", 1);

			// PARSING THE RESULT SET, RETURNING THE ROWS
			
			foreach (DataTable table in ds.Tables) {
				foreach (DataRow row in table.Rows) {
					foreach (DataColumn column in table.Columns) {
						Console.Write(row[column].ToString() + " ");
					}
					Console.WriteLine();
				}
				Console.WriteLine();
			}
		}
	}
}

To compile we need the following makefile:
casjobs.wsdl:
	disco http://amd1.colbud.hu/casjobs/casjobs.asmx

CasJobs.cs: casjobs.wsdl
	wsdl -n:nmVO_demo.hu.colbud.amd1 casjobs.wsdl

nmVO_demo.exe: CasJobs.cs nmVO_demo.cs
	gmcs nmVO_demo.cs CasJobs.cs -r:System.Data -r:System.Web.Services
Since this small example code writes its results to standard output, it can easily be piped into gnuplot or other tools, for example with a command like this:
plot [17:17.5] "< nmVO_demo.exe SELECT * FROM dbo.getOWDHistogram('UPAR', 'ELTE', '2006-11-01', '2007-12-01') ORDER BY delay" using ($1/1e3):2 w l

Figure 2: Schematic picture of the usage of an nmVO node. A human user can access the archive through a GUI, while computer code uses the Web service interface. Proxy classes of the available services are automatically provided through WSDL. The SOA architecture takes care of the remote procedure calls and of the seamless data transformation from the archive format to NetXML and from NetXML to a user-digestible format.


Demo video about creating and using a Web services enabled client application.
