Geographical Information System based Geo-Deflection Model For CDN


S.Manikandan
Research Scholar
Department of Computer Science and Engineering
PSG College of Technology, Coimbatore


A.Chitra
Professor
Department of Computer Science and Engineering
PSG College of Technology, Coimbatore


S.Chinnaraj
PG Student
SPA New Delhi, India



ABSTRACT
Over the past few years, making information available to a rapidly growing user population on the Internet has become a challenging problem. The increase in Internet access traffic has led to poor Quality of Service (QoS). A Content Distribution Network (CDN) addresses these issues with the aid of surrogate servers. To disseminate data effectively to end users, surrogate servers are placed in different geographical locations and maintain a large number of replicas on behalf of the origin servers. This reduces the load on the origin server, the traffic on the Internet and the latency, and considerably improves the response time seen by users. Request routing is a well-known technique for improving the accessibility of web sites. This paper presents a technique, referred to as the Geo-Deflection (GD) method, that uses a Geographical Information System (GIS) for request routing in a CDN and reduces client latency significantly. In the GD method, the geographical co-ordinates (latitude and longitude) of the local DNS are used to find the nearest surrogate server with minimum computation. The method enables faster redirection without the traditional proximity check based on the RTT value from the surrogate to the local DNS. Geo-Deflection is a GIS based model that blends geographical co-ordinates with classical mathematics to find the nearest replica in the Internet.

1. Introduction
CDNs burst onto the scene in 1999 to address the fact that the Internet was not designed to handle large transmissions of Web content over long distances. Network congestion and traffic bottlenecks, worsened by the growing payloads of Web traffic, degrade both individual Web site and network performance. Content delivery networks are dedicated collections of servers located strategically across the wide-area Internet. In short, the goal is to push content as close to the user as possible in order to minimize content latency and jitter and to maximize available bandwidth. Content providers, such as web sites or streaming video sources, contract with commercial CDNs [1,2,3] to host and distribute content. Once in a CDN, content is replicated across the wide area, and hence is highly available. Since most CDNs have servers in ISP points of presence, clients can access topologically nearby replicas with low latency. The largest CDNs have thousands of servers dispersed throughout the Internet and are capable of sustaining large workloads and traffic hot spots. CDNs are tightly integrated into the existing web architecture, relying either on DNS interposition [4, 5] or on URL rewriting at origin servers to redirect HTTP requests to the nearest CDN replica. As with the web, the unit of transfer in a CDN is an object, and objects are named by URLs. Unlike the web, content providers need not manage web servers, since clients’ requests are redirected to replicas hosted by the CDN.

Applications of content delivery networking, one of the hot topics in networking and one of the biggest IP trends, are quickly branching out. Content networks leverage higher-layer network intelligence to efficiently manage the delivery of data, which is increasingly multimedia in nature, such as images, advertisements, or media clips.

Previous research has investigated the use and effectiveness of content delivery networks [6], although the proprietary and closed nature of these systems tends to hinder investigation. Two recent studies [7, 8] confirm that CDNs reduce average download response times, but that DNS redirection techniques add perceptible overhead because of DNS latencies. In another study [9], the authors argue that the true benefit of CDNs is that they help clients avoid the worst-case of badly performing replicas, rather than routing clients to a truly optimal replica.

The rest of this paper is organized as follows. Section 2 describes the various request routing methods that are deployed in CDNs. Section 3 presents a discussion of the DNS based request routing method. The proposed work and the analysis of results are discussed in Section 4. Section 5 concludes the paper with final remarks.

2. Request Routing
A typical CDN contains active components such as the Request Routing System (RRS), the Distribution System (DS), the Accounting System (AS) and the surrogate servers, as shown in Fig1. A request-routing system [10,11] directs a client request to a suitable surrogate server. It consists of a set of network elements called request routers. A distribution system consists of a collection of network elements called content distributors. It moves a publisher’s content from the origin server to one or more surrogate servers (using either a push or a pull algorithm). The accounting system supports the measurement and recording of content distribution and delivery activities.

The servers of a CDN, called surrogate servers, are placed at different locations in the Internet. Clients typically access content from surrogate servers by first contacting a request router. The request router makes a server selection decision and returns a server address to the client. The client then retrieves the content from the specified surrogate server.

A primary issue for a request router is how to direct client requests for an object served by the surrogate servers within the network. Request routers choose the best server from their database using static and dynamic information about the various surrogate servers. Surrogate servers pass information such as server metrics, network conditions and client proximity to the request router.


Fig1: CDN Architecture.


Different techniques, such as DNS based, transport layer based, application layer based and content-based request routing, are implemented in CDN request routers. In this paper we describe DNS based request routing and its limitations. A new approach, called the GIS based Geo-Deflection method, is proposed to reduce the client latency.

3. DNS based Request Routing
Domain Name System (DNS) based request routing is extensively deployed in the Internet at present. DNS based request routing [12] is also deployed as a directory service in CDNs for resolving a client request to an appropriate surrogate server address. A specialized DNS server included in the DNS system performs the DNS resolution process [13-15]. This DNS server can return a single surrogate address or multiple surrogate addresses for the domain name of the desired web site or content.

A client requests a web site or content in the Internet, and the request goes to the nearest local DNS. The local DNS queries the CDN DNS, or request router, which returns the address of a suitable surrogate server; the reply can contain a single address or multiple addresses. The client receives the surrogate address from the local DNS and contacts the surrogate server for the web site or content. The content is then transferred from the surrogate server to the client. If multiple surrogate server addresses are returned, the client site DNS server decides which surrogate server to select from the reply. RFC 2782 (DNS SRV) provides guidance on the use of DNS for load balancing [16]. The methods described so far involve a single request-routing DNS server in the resolution. The most common mechanisms for inserting multiple request-routing DNS servers into a single DNS resolution are Name Server (NS) and Canonical Name (CNAME) records.

In NS redirection, NS records redirect authority to the next level of the DNS hierarchy, so multiple DNS servers are involved in the name resolution. For example, a client site DNS server resolving sample.test.edu would eventually request a resolution of sample.test.edu from the name server authoritative for test.edu. The name server authoritative for this domain might be a Request-Routing NS server. In this case the Request-Routing DNS server can either return a set of A records or use NS records to redirect the resolution of sample.test.edu to another DNS server at the next level of the hierarchy.

In CNAME redirection [12], the Request-Routing DNS server returns a CNAME record to direct resolution to an entirely new domain. In principle, the new domain might use a new set of Request-Routing DNS servers. Multiple physical DNS servers that combine request routing and metric measurement can share an anycast IP address [17]. The packet containing the DNS resolution request will reach the one of these DNS servers that is closest to the client site DNS server. After receiving the packet, the DNS server knows that it is the closest and can use this information in making its routing decision.
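
The CNAME mechanism described above can be illustrated with a toy resolver. This is a minimal sketch, not a real DNS implementation: the record table, the domain sample.cdn.example.net and the addresses are hypothetical, and caching, TTLs and NS delegation are ignored.

```python
# Toy record table. All names and addresses below are hypothetical.
RECORDS = {
    # The origin domain hands resolution over to a CDN-controlled domain.
    "sample.test.edu": ("CNAME", ["sample.cdn.example.net"]),
    # The CDN domain answers with the addresses of surrogate servers.
    "sample.cdn.example.net": ("A", ["192.0.2.10", "192.0.2.20"]),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAME records until A records are found and return them."""
    for _ in range(max_hops):
        if name not in records:
            raise KeyError(f"no record for {name}")
        record_type, values = records[name]
        if record_type == "A":
            return values
        # CNAME: restart the resolution in the new domain.
        name = values[0]
    raise RuntimeError("CNAME chain too long")


addresses = resolve("sample.test.edu")
```

The reply contains two addresses here, which corresponds to the multiple-reply case in which the client site DNS server chooses among the returned surrogates.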


Fig2: DNS based Request Routing


The Request Routing System makes a server selection decision and returns a server address to the client. In a DNS based request routing system, the server is selected by measuring the distance from each surrogate server to the local DNS server. The measurement is made by computing the Round Trip Time (RTT) from the surrogate to the local DNS. The measured results are passed to the request router, which compares them and selects the server with the least RTT as the best surrogate for the given request.

The measured results may vary from time to time, and measuring the network proximity of the surrogates makes the total client latency very large.
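
The selection rule of DNS based request routing can be sketched as follows. The surrogate names and RTT values are hypothetical placeholders, not measurements; in a real system each value would come from a probe sent by the surrogate to the local DNS, which is the step that adds latency.

```python
def select_by_rtt(rtt_ms):
    """Return the surrogate with the smallest measured RTT (milliseconds)."""
    if not rtt_ms:
        raise ValueError("no RTT measurements available")
    return min(rtt_ms, key=rtt_ms.get)


# Hypothetical RTTs reported by three surrogates for one local DNS server.
measured = {"surrogate-1": 48.0, "surrogate-2": 31.5, "surrogate-3": 76.2}
best = select_by_rtt(measured)
```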

In terms of an equation,

$$TCL = D_o + LD_o + RS_n \qquad (1)$$

Where,
TCL - Total Client Latency.
Do - TCP connection establishing time (millisecond) from client to local DNS server.
LDo - Local DNS connection establishing time (millisecond) to Request Routing System.
RSn - Resolving time of RRS.
Sn - TCP connection establishing time (millisecond) from RRS to surrogate.
RTTn - Round Trip Time (millisecond) from surrogate to local DNS server.

4. GIS based Geo Deflection
The GIS based Geo-Deflection method enables faster redirection without a proximity check based on the RTT value from the surrogate to the local DNS. Geo-co-ordinates, consisting of the latitude and longitude of a geographical location, are assigned to the surrogate servers placed at different locations in a CDN. When surrogate servers are placed using replica placement algorithms, the geo-co-ordinates are embedded in each server. The geo-co-ordinates of all the surrogates are also stored in the Request Routing System (RRS) of that CDN. Using this information, the RRS can readily locate the surrogates geographically.


Fig3: Geo-Deflection based Request Routing

The Geo-Deflection process has two steps, discussed below.

Step 1: Transfer of Geo Co-ordinates of Local DNS to RRS

In the first step, the client's request is processed by the local DNS (the nearest DNS), and the requested content (URL) is forwarded to the RRS of the CDN together with the geo-co-ordinates of the local DNS server. Every local DNS must therefore also be configured with its geo-co-ordinates.

Step 2: Compute the distance of Local DNS and Surrogates with Geo Co-ordinates

After receiving the geo-co-ordinates of the local DNS server, the RRS computes the distance between the requesting local DNS and each surrogate that holds the content of the URL. This computation is done with the aid of the geo-co-ordinates.
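
The two steps can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the Surrogate class, the geo_deflect function, the registry contents and the URLs are hypothetical, the city co-ordinates are approximate, and distance is taken as the planar Pythagorean distance in degrees (north and east positive), which ignores the curvature of the earth and the wrap-around of longitude at 180°.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Surrogate:
    name: str
    lat: float  # degrees, north positive
    lon: float  # degrees, east positive
    urls: set = field(default_factory=set)  # content held by this surrogate


def planar_distance(lat0, lon0, lat1, lon1):
    """Pythagorean distance between two points, in degrees."""
    return math.hypot(lat1 - lat0, lon1 - lon0)


def geo_deflect(url, dns_lat, dns_lon, surrogates):
    """Step 2: pick the nearest surrogate that holds the requested URL."""
    candidates = [s for s in surrogates if url in s.urls]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda s: planar_distance(dns_lat, dns_lon, s.lat, s.lon),
    )


# Hypothetical registry stored in the RRS.
registry = [
    Surrogate("chennai", 13.08, 80.27, {"/index.html"}),
    Surrogate("mumbai", 19.08, 72.88, {"/index.html", "/video.mp4"}),
    Surrogate("delhi", 28.67, 77.21, {"/index.html", "/video.mp4"}),
]

# Step 1: the local DNS (here near Coimbatore) forwards the URL together
# with its own co-ordinates.
nearest_page = geo_deflect("/index.html", 11.02, 76.96, registry)
nearest_video = geo_deflect("/video.mp4", 11.02, 76.96, registry)
```

No network probe is needed at request time; the only per-request cost is one distance evaluation per candidate surrogate.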

4.1 Distance Calculation
Using a Geographical Information System, locations on the earth are described by latitude and longitude, represented as lines on the globe.
  • Latitude lines go around the globe horizontally (parallel to the equator).
  • Longitude lines go around the globe vertically (perpendicular to the equator).

The earth can be approximated as a sphere. Each line of latitude makes a circle around the earth parallel to the equator, and each line of longitude (meridian) runs from pole to pole. A circle has 360 degrees (360°), and each line of latitude or longitude marks one degree: longitude spans 360° in total, while latitude spans 180° from the south pole to the north pole.

We only measure up to 180° of longitude and 90° of latitude, however, because the earth is divided into hemispheres. The northern hemisphere is north of the equator. The southern hemisphere is south of the equator. The eastern hemisphere is east of the Prime Meridian, which passes through Greenwich, England. The western hemisphere is west of the Prime Meridian. When we write a place’s location in latitude and longitude, we write degrees N (north) or S (south) and degrees E (east) or W (west). We also divide degrees of latitude and longitude into smaller measurements. Each degree can be divided into 60 minutes, and each minute into 60 seconds. These smaller measurements let us specify a place’s location more exactly on the surface of the earth. For example, Delhi, the capital of India, is located at 28.67N 77.21E. To find the distance between the Local DNS and the surrogates we use the Pythagorean Theorem [18].
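
The relation between degrees, minutes and seconds and the decimal form used in the Delhi example can be sketched as a small conversion. The function name dms_to_decimal and the sign convention (north and east positive, south and west negative) are assumptions made for illustration, and the minute and second values are chosen only to match the decimal figures quoted above.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N, S, E or W to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere}")
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are represented as negative values.
    return -value if hemisphere in ("S", "W") else value


lat = dms_to_decimal(28, 40, 12, "N")  # approximately 28.67
lon = dms_to_decimal(77, 12, 36, "E")  # approximately 77.21
```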

Applying the Pythagorean Theorem to Fig4, dropping perpendiculars from Surrogate1 (lat1, lon1) and Surrogate2 (lat2, lon2) onto the meridian through the Local DNS (lat0, lon0) gives locations A (lat1, lon0) and B (lat2, lon0) respectively.

La1 = Distance from Location A to Local DNS


Fig4: Pythagorean Theorem based Geo-Deflection
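
The geometry of Fig4 can be written out explicitly. This is a sketch under assumptions: the symbols Lo and d are introduced here for illustration, co-ordinates are treated as points on a flat plane measured in degrees, and the paper's own numbered distance equations may differ in form.

```latex
\begin{aligned}
La_1 &= |lat_1 - lat_0|, & Lo_1 &= |lon_1 - lon_0|, & d_1 &= \sqrt{La_1^{2} + Lo_1^{2}} \\
La_2 &= |lat_2 - lat_0|, & Lo_2 &= |lon_2 - lon_0|, & d_2 &= \sqrt{La_2^{2} + Lo_2^{2}}
\end{aligned}
```

Here La is the leg from the Local DNS to location A (or B) along its meridian, Lo is the leg from A (or B) to the surrogate, and d is the hypotenuse; the surrogate with the smaller d is the nearer one.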



Whenever a client requests content, the request goes to the Request Routing System via the Local DNS, which embeds its geo-co-ordinates (lat0, lon0) in the request. With the geo-co-ordinates of the Local DNS, the RRS computes the distance between each surrogate and the local DNS using equations 4 and 5. The Request Routing System selects the surrogate with the minimum computed distance as the best surrogate for the request. The time taken for this computation is CTo. For the example shown in Fig4, the RRS selects whichever of S1 and S2 has the smaller distance.
The equation for the GD method is

$$TCL = D_o + LD_o + CT_o$$

4.2 Analysis of the Results
DNS based request routing is simple to implement, but it has several limitations. Resolution is possible only at the domain level. Not all DNS implementations adhere to the standards. Name server based request routing works only with information about the client site DNS server and depends on replies with short time-to-live (TTL) values. This can sometimes cause timeouts and lead to exception handling conditions, so the choice of TTL value is critical. Such techniques can also overburden third party DNS servers, which should not be allowed [5]. RFC 2782 provides warnings on the use of DNS for load balancing [16].

In equation (1), the Total Client Latency (TCL) for DNS based request routing is the sum of the TCP connection establishing time (millisecond) from the client to the local DNS server (Do), the local DNS connection establishing time (millisecond) to the Request Routing System (LDo) and the resolving time of the RRS (RSn). When the TCL of DNS based request routing is compared with that of Geo-Deflection based request routing, the resolving time (RSn) is eliminated entirely and only a negligible computation time (CTo) is added. As a result, the TCL is reduced by 60%, as shown in the experimental results of Table 1. This indicates that the GIS-GD method is efficient in terms of minimizing client latency.
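
The comparison described above can be expressed as a short calculation. The function names are hypothetical, and the millisecond values are placeholders chosen only to show the arithmetic; they are not the paper's measurements and do not reproduce Table 1.

```python
def tcl_dns(d_o, ld_o, rs_n):
    """Total client latency for DNS based request routing: Do + LDo + RSn."""
    return d_o + ld_o + rs_n


def tcl_gd(d_o, ld_o, ct_o):
    """Total client latency for Geo-Deflection: RSn is replaced by CTo."""
    return d_o + ld_o + ct_o


def reduction_percent(baseline, improved):
    """Relative latency reduction of `improved` over `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Placeholder values in milliseconds (assumptions, not measured data).
dns_latency = tcl_dns(d_o=10, ld_o=20, rs_n=70)
gd_latency = tcl_gd(d_o=10, ld_o=20, ct_o=2)
saving = reduction_percent(dns_latency, gd_latency)
```

The saving grows with the share of RSn in the baseline latency, since Do and LDo are common to both methods.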



5. Conclusion
This paper compares DNS based request routing with GIS based Geo-Deflection request routing in content distribution networks. The proposed approach focuses on minimizing client latency relative to existing request routing mechanisms. The distance calculation based on GIS geo-co-ordinates uses the Pythagorean Theorem, which proves effective in computing the distance from the local DNS (client side) to the nearest surrogate. The model achieves scalability and faster response to clients with little overhead, and provides flexibility in indicating surrogate and client locations to the local DNS server.

References
  1. Akamai. Content Delivery Network. http://www.akamai.com.
  2. Digital Island. www.digitalisland.com.
  3. Speedera. www.speedera.com.
  4. J. Kangasharju, K.W. Ross, and J.W. Roberts. Performance evaluation of redirection schemes in content distribution networks. Computer Communications, 24(2):207–214, 2001.
  5. A. Shaikh, R. Tewari, and M. Agrawal. On the effectiveness of DNS-based server selection. In Proc. of IEEE INFOCOM 2001, Anchorage, AK, USA 2001.
  6. S. Gadde, J. Chase, and M. Rabinovich. Web caching and content distribution: A view from the interior. In Proc. of the 5th International Web Caching and Content Delivery Workshop, May 2000.
  7. M. Koletsou and G. M. Voelker. The Medusa proxy: A tool for exploring user-perceived web performance. In Proc. of the Sixth Int. Workshop on Web Caching and Content Distribution, June 2001.
  8. B. Krishnamurthy, C. Wills, and Y. Zhang. On the use and performance of content distribution networks. In Proc. of SIGCOMM IMW 2001, Nov. 2001.
  9. K. L. Johnson, J. F. Carr, M. S. Day, and M. Frans Kaashoek. The measured performance of content distribution networks. Computer Communications, 24(2), 2001.
  10. B. Cain, F. Douglis, M. Green, M. Hofmann, R. Nair, D. Potter, and O. Spatscheck, "Known CDN Request-Routing Mechanisms", IETF Internet-Draft, draft-cain-cdnp-known-request-routing-03.txt, November 2001.
  11. M. Day, B. Cain, and G. Tomlinson, "A Model for CDN Peering", http://www.ietf.org/internet-drafts/draft-day-cdnp-model.04.txt, November 2000.
  12. Md. Humayun Kabir, Eric G. Manning, and Gholamali C. Shoja, "Request-Routing Trends and Techniques in Content Distribution Network", Parallel, Networking, Distributed Applications (PANDA) Laboratory, University of Victoria, Canada.
  13. Eastlake, D. and A. Panitz, "Reserved Top Level DNS Names", BCP 32, RFC 2606, June 1999.
  14. Mockapetris, P., "Domain names – concepts and facilities", STD 13, RFC 1034, November 1987.
  15. Mockapetris, P., "Domain names – implementation and specification", STD 13, RFC 1035, November 1987.
  16. Gulbrandsen, A., Vixie, P. and L. Esibov, "A DNS RR for specifying the location of services (DNS SRV)", RFC 2782, February 2000.
  17. A. Barbir et al., "Known Content Network (CN) Request-Routing Mechanisms", RFC 3568, July 2003.
  18. http://www.cut-the-knot.org/pythagoras/index.shtml