High Performance Computing

A Review of High Performance Computing



Grid computing infrastructures and applications range from high bandwidth networks to wireless networks.

Grid computing will reach its vast potential only if the underlying networking infrastructure is able to transfer data across long distances in an effective manner. Experience shows that advanced distributed applications executing in existing large scale computational grids are often able to use only a small fraction of the available bandwidth. The reason for such poor performance is TCP, which was not designed for high bandwidth, high delay network environments. To overcome these problems, several new transport protocols have been introduced, but only a few are widely used in grid computing applications. We review such protocols, assess their performance, and point out some research activities that can be carried out on these protocols.

Index Terms—Protocol, Grid Computing, Bandwidth, Delay Networks, Performance


I. INTRODUCTION

GRID computing [1], [2] is a technology for coordinating large scale resource sharing and problem solving among various autonomous groups. Grid technologies are currently distinct from other major technical trends such as the internet, enterprise distributed networks, and peer to peer computing. Grids also raise cross-cutting issues in QoS, data management, scheduling, resource allocation, accounting, and performance.

Grids are built by various user communities to offer a good infrastructure which helps their members solve specific problems, often called grand challenge problems. A grid consists of different types of resources owned by different and typically independent organizations, which results in heterogeneity of resources and policies. Because of this, grid based services and applications experience different resource behavior than expected. Similarly, a distributed infrastructure with ambitious services puts more demands on the capabilities of the interconnecting networks than other environments do.

The Grid High Performance Networking group works on network research and grid infrastructure development. In their document [3], the authors list six main functional requirements, which are considered mandatory for grid applications. They are: i) high performance transport for bulk data transfer, ii) performance controllability, iii) dynamic network resource allocation and reservation, iv) security controllability, v) high availability and vi) multicast to efficiently distribute data to groups of resources.

New data communication protocols are needed for the grid for these reasons: i) network technology is evolving to incorporate very high bandwidth and long delay networks (e.g. LambdaGrid [4], OptIPuter [5], CANARIE [6]), as well as wireless and sensor networks [7], ii) communication patterns are shifting from point-to-point communication to multipoint-to-point and multipoint-to-multipoint communication and iii) each grid application has unique communication needs across a diverse range of transport characteristics, such as transmission rate, delay and loss ratio. For such data communications, standard transport protocols such as TCP [8] and UDP [9] are not always sufficient or optimal. For example, traditional TCP reacts adversely as bandwidth and delay increase [10], leading to poor performance in high bandwidth-delay product networks. It is still a challenge for grid applications to use the available bandwidth, due to the limitations of current network transport protocols. These limitations are one of the main reasons that it is difficult to scale data intensive applications from local clusters and MANs to WANs.

For instance, a remote visualization application may require unreliable fixed rate data transfer, whereas a fast message passing application may demand reliable data transfer with minimum end-to-end delay, and a wireless grid application may need to minimize data traffic to prolong battery life. However, since TCP is designed as a general reliable transport protocol and UDP is only a minimal transport protocol with essentially no guarantees, they cannot generally provide the optimal mix of communication attributes for each grid application. Many transport protocols have been developed and proposed over the last three decades. In this paper, we review the variants of such transport protocols based on TCP and UDP and compare the protocols on various points relevant to grid computing. Each protocol is reviewed based on its: i) operation, ii) operation mode, iii) implementation, iv) congestion control, v) fairness, vi) security, vii) quality of service and viii) usage scenario.

Section II highlights the issues that have to be taken care of when designing high performance protocols and outlines the important roles TCP and UDP play in implementing grid computing protocols. Section III surveys protocols created for bulk data transfer and high performance that are based on TCP. Section IV surveys UDP based protocols for bulk transfer. Section V deals with application layer protocols for grid computing, and the last section summarizes the paper.


II. ISSUES IN DESIGNING HIGH PERFORMANCE PROTOCOLS

The emerging high-performance grid encompasses a wide range of network infrastructures and communication patterns, as well as different types of scientific applications. The combination of all these factors highlights several new trends for data communication in grid environments: networks with very high bandwidth and long delays, shifting communication patterns, and a diverse range of transport characteristics, such as transmission rate, delay, and loss ratio. For these applications, the available standard protocols (TCP and UDP) are not sufficient or optimal because of their properties and lack of flexibility.

Performance of TCP depends not only on the transfer rate, but also on the product of the round-trip delay and the transfer rate. This bandwidth*delay product measures the amount of data that would fill the pipe; it is the buffer space required at the sender and receiver to obtain maximum throughput on the TCP connection over the path, i.e., to keep the pipeline full. TCP performance problems arise when the bandwidth*delay product becomes large. Three fundamental performance problems with TCP over high bandwidth delay network paths are: i) the window size limit, ii) recovery from losses, and iii) round-trip measurement.
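To make the window size limit concrete, the following sketch (in Python, with illustrative numbers not taken from this paper) computes the bandwidth*delay product for a 1 Gbit/s link with a 100 ms round trip and compares it with the largest window a TCP sender can use without window scaling [10]:

    # Worked example: bandwidth*delay product (BDP); values are illustrative.
    bandwidth_bps = 1_000_000_000          # 1 Gbit/s link
    rtt_s = 0.1                            # 100 ms round-trip time

    bdp_bytes = bandwidth_bps * rtt_s / 8  # data "in flight" needed to fill the pipe
    print(f"BDP = {bdp_bytes / 1e6:.1f} MB")                 # 12.5 MB

    # Without window scaling the 16-bit window field caps the
    # window at 64 KB, a tiny fraction of the required buffer.
    max_unscaled_window = 65_535
    print(f"utilization cap = {max_unscaled_window / bdp_bytes:.2%}")

A sender limited to a 64 KB window can keep only about 0.5% of such a pipe full, which is exactly the window size limit problem listed above.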

The grid high performance networking document [3] summarizes the networking issues in grid applications and gives some considerations for designing a grid protocol, such as: i) slow start, ii) congestion control, iii) ACK clocking and iv) connection setup and teardown. In [11], the author lists some parameters which relate to TCP performance in high speed networks: i) cwnd increase function, ii) responsiveness requirements, iii) scalability and iv) network dynamics.

The dramatic evolution of network infrastructures and communication patterns means that an increasing number of grid applications are demanding enhanced and diverse transport level functionality. The properties and QoS guarantees required can vary widely depending on the intended network environment (dedicated links or shared links), communication patterns (point-to-point or multipoint-to-point), and application requirements. In [12], a general set of transport attributes is listed for grid computing: i) connection oriented vs. connectionless, ii) reliability, iii) latency and jitter, iv) sequenced vs. unsequenced, v) rate shaping vs. best effort, vi) fairness and vii) congestion control.

A. Need for a High Performance Computing Protocol

TCP works well on the commodity internet, but has been found to be inefficient and unfair to concurrent flows as bandwidth and delay increase [13]–[16]. Its congestion control algorithm needs a very long time to probe the bandwidth and recover from loss on high BDP links. Moreover, the existence of random loss on the physical link, the lack of buffers on routers, and the existence of concurrent bursting flows prevent TCP from utilizing high bandwidth with a single flow. Furthermore, it exhibits a fairness problem for concurrent flows with different round trip times (RTTs), called RTT bias. The success of TCP is mainly due to its stability and the wide presence of short lived, web-like flows on the Internet. However, the usage of network resources in high performance data intensive applications is quite different from that of traditional internet applications. First, the data transfer often lasts a very long time at very high speeds. Second, the computation, memory replication, and disk I/O at the end hosts can cause bursting packet loss or time-outs in data transfer. Third, distributed applications need cooperation among multiple data connections. Finally, in grid computing over high performance networks, the abundant optical bandwidth is usually shared by a small number of bulk sources. These constraints have led researchers to design new transport protocols for high performance domains.

B. Previous Work

Several transport protocols for high-speed data transfer have been proposed, including NETBLT [17], FAST TCP [18], LTCP [19] and CUBIC [20], all of which use rate-based congestion control mechanisms. Researchers have also continually worked to improve TCP itself, producing TCP SACK [21], TCP Westwood [22], TCP Vegas [23], FAST TCP, HighSpeed TCP [24], Scalable TCP [25], and CUBIC.


III. TCP BASED PROTOCOLS

The new challenges for TCP have been addressed by several research groups in the last three decades and, as a result, a number of new TCP variants have been developed. An overview and survey of protocols designed for high speed bulk data transfer in high bandwidth delay networks is given here; the TCP variants analyzed in this paper are summarized in Table I.

Emerging networks bring new challenges. First, new architectures, heterogeneous networks, and mobile and wireless environments exhibit different network characteristics, requiring more attention to the dynamic aspects of operation. As a consequence of these new network environments and properties, the dynamic behavior of TCP flows has to be taken into consideration. On the one hand, it is obvious that dynamic effects have a significant impact on the performance and throughput of TCP flows [26]. On the other hand, fairness also needs to be reconsidered from the perspective of dynamic behavior. Performance analyses of various TCP variants are included in many papers.

These works mainly deal with the performance of a new proposal or the interaction of standard TCP and the new mechanism. In [27], [28], a simulation-based performance analysis of HighSpeed TCP is presented and its fairness to regular TCP is analyzed. In [25], the author deals with the performance of Scalable TCP and analyzes the aggregate throughput of standard TCP and Scalable TCP based on an experimental comparison. In [29], the performance of different TCP versions, such as HighSpeed TCP, Scalable TCP, and FAST TCP, is compared in an experimental test-bed environment. In all cases, the performance among connections of the same protocol sharing a bottleneck link is analyzed and different metrics such as throughput, fairness, responsiveness, and stability are presented.

A. Scalable TCP (S-TCP)

The scientific community pushes the edge of network performance with applications such as distributed simulation, remote laboratories, and frequent multi gigabyte transfers. TCP is blamed because it is slow in capturing the available bandwidth of high performance networks, mostly for two reasons: i) the size of the socket buffer at the end-hosts limits the transfer rate and the maximum throughput, and ii) packet losses cause large window reductions, with a subsequent slow (linear) window increase rate, reducing the transfer's average throughput. Researchers have focused on these problems, pursuing mostly three approaches: TCP modifications, parallel TCP transfers, and automatic buffer sizing.

Changes in TCP or new congestion control schemes, possibly with cooperation from routers, can lead to significant benefits for both applications and networks. Parallel TCP connections can increase the aggregate throughput that an application receives. This technique raises fairness issues, since the aggregate window increase rate of N connections is N times faster than that of a single connection. Several researchers have proposed TCP modifications, mostly focusing on the congestion control algorithm, aiming to make TCP more effective in high performance paths. Kelly proposed Scalable TCP [25]. The main point of Scalable TCP is that it uses constant window increase and decrease factors, giving a multiplicative increase rule when there is no congestion.

Scalable TCP is motivated by the poor performance of TCP when used for bulk transfers in high speed networks. The High Energy Physics, Bioinformatics, and Radio Astronomy communities require global distribution of data so that it can be analyzed effectively. Scalable TCP alters the congestion window size on the sender side, and it offers a good mechanism to increase performance in high speed wide area networks while using traditional TCP receivers. Scalable TCP is designed to be incrementally deployable and to behave identically to traditional TCP stacks when small windows are sufficient. Experimental results of Scalable TCP show that some problems persist with web traffic, which have to be taken into account when it is put into use; a more detailed study is required.
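As a rough sketch of the rules just described (assuming the a = 0.01, b = 0.125 constants and a legacy-window threshold as reported for Scalable TCP [25]; the code below is illustrative, not the reference implementation):

    # Scalable TCP window update sketch; cwnd is in segments.
    LWND = 16          # assumed legacy threshold: below it, behave like TCP
    A, B = 0.01, 0.125 # constant increase / decrease factors

    def on_ack(cwnd: float) -> float:
        """Per-ACK growth: a constant increment gives multiplicative increase."""
        if cwnd < LWND:
            return cwnd + 1.0 / cwnd   # standard TCP additive increase
        return cwnd + A                # Scalable TCP

    def on_loss(cwnd: float) -> float:
        """Reduction on a congestion event."""
        if cwnd < LWND:
            return cwnd / 2.0          # standard TCP halving
        return cwnd * (1.0 - B)        # Scalable TCP: cut by one eighth

Because the per-ACK increment is constant, the window grows by a fixed proportion per round trip regardless of its size, which is what makes the scheme "scalable".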

B. Hamilton-TCP (H-TCP)

Several approaches have been proposed for designing protocols for high-speed and long distance networks [34], ranging from minor modifications of conventional TCP to a complete protocol redesign. H-TCP falls into the former category and represents an evolution of conventional TCP rather than a departure from it. The motivations for this approach are: (i) TCP has proved to be remarkably effective and robust in regulating network congestion and (ii) it seems likely that TCP will continue to be deployed in a variety of networks in the future, so a variant should be backward compatible with conventional TCP. H-TCP is motivated by the simple observation that the increase parameter α_i should be small in conventional networks and large in high-speed and long distance networks. H-TCP concentrates on modifying the basic TCP paradigm by adjusting the rate α_i at which a source inserts packets into the network to react to the prevailing network conditions. The key innovative idea in the H-TCP approach is to make α_i increase as a function of the time elapsed since the last packet drop experienced by the i-th source. Specifically, H-TCP amends conventional TCP in the following manner: in high-speed mode the increase function of source i is α_i^H(Δ_i) and in low-speed mode it is α_i^L. The mode switch is governed by

    α_i = α_i^L           if Δ_i ≤ Δ_L,
    α_i = α_i^H(Δ_i)      if Δ_i > Δ_L,

where Δ_i is the time elapsed since the last congestion event experienced by the i-th source, α_i^L is the increase parameter for the low-speed regime (unity for backward compatibility), α_i^H(Δ_i) is the increase function for the high speed regime, and Δ_L is the threshold for switching from the low to the high speed regime. The increase function α_i^H is a design parameter; the quadratic form proposed for H-TCP is

    α_i^H(Δ_i) = 1 + 10(Δ_i − Δ_L) + ((Δ_i − Δ_L)/2)².
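A small sketch of this increase function, assuming the quadratic form above with Δ_L = 1 s (the values are illustrative):

    # H-TCP increase parameter as a function of time since the last loss.
    DELTA_L = 1.0   # seconds spent in the low-speed (TCP-compatible) mode

    def alpha(delta: float) -> float:
        if delta <= DELTA_L:
            return 1.0                              # low-speed mode: plain TCP
        d = delta - DELTA_L
        return 1.0 + 10.0 * d + (d / 2.0) ** 2      # high-speed mode

    for t in (0.5, 1.0, 2.0, 5.0):                  # growth accelerates over time
        print(t, alpha(t))

The longer a source survives without a congestion event, the more aggressively its window grows, while sources that recently lost a packet behave like conventional TCP.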

C. High Speed TCP (HS-TCP)

Applications seeking more bandwidth, such as bulk-data transfer, multimedia web streaming, and computational grids, are becoming more common in high performance computing. These applications often operate over networks with a high BDP, so performance over such networks has become a critical issue [30]. Recent experience indicates that TCP has difficulty utilizing high-speed connections. Because of this, network applications are not able to take full advantage of high-speed networks and often do not utilize the available bandwidth [56]. The packet drop rate needed to fill a gigabit pipe using the present TCP protocol is beyond the limit of currently achievable fiber optic error rates, and the congestion control is not very dynamic [31].

Most users are unlikely to achieve even 5 Mbps on a single stream wide-area TCP transfer, even when the underlying network infrastructure can support rates of 100 Mbps or more. HSTCP (HighSpeed TCP) is a recently proposed variant of TCP, specifically designed for use in high-speed, high bandwidth networks. Congestion management allows the protocol to react to and recover from congestion and to operate in a state of high throughput while sharing the link fairly with other traffic. Van Jacobson [32] proposed the original TCP congestion management algorithms. TCP's congestion management is composed of two important algorithms: the slow-start and congestion avoidance algorithms, which allow TCP to increase the data transmission rate without overwhelming the network. They use a variable called cwnd (congestion window); the TCP congestion window is the size of the sliding window used by the sender, and TCP cannot inject more than cwnd segments of unacknowledged data into the network. TCP's algorithms are referred to as AIMD (Additive Increase Multiplicative Decrease) and are the basis for its steady-state congestion control.

The performance of a TCP connection is dependent on the network bandwidth, round trip time, and packet loss rate. At present, TCP implementations can only reach the large congestion windows necessary to fill a pipe with a high bandwidth delay product when there is an exceedingly low packet loss rate. Otherwise, random losses lead to significant throughput deterioration when the product of the loss probability and the square of the bandwidth delay product is larger than one. HighSpeed TCP for large congestion windows was introduced [33] as a modification of the TCP congestion control mechanism; it is designed to have a different response in environments with a very low congestion event rate, and to have the standard TCP response in environments with higher packet loss rates. Since it leaves TCP behavior unchanged in environments with mild to heavy congestion, it does not increase the risk of congestion collapse. In environments with very low packet loss rates, HighSpeed TCP has a more aggressive response function. HighSpeed TCP introduces a new relation between the average congestion window w and the packet drop rate p. It uses three parameters, Low_Window, High_Window, and High_P, which are used to establish the point of transition and ensure compatibility.

HighSpeed TCP adopts the same response function as regular TCP when the current congestion window is at most Low_Window, and uses the HighSpeed TCP response function when the current congestion window is greater than Low_Window. High_Window and High_P are used to specify the upper end of the HighSpeed TCP response function. Table II presents some sample values for these parameters.
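Using the sample parameters reported in [33] (Low_Window = 38, High_Window = 83000, High_P = 10^-7), the response function can be sketched as the straight line through (Low_P, Low_Window) and (High_P, High_Window) on a log-log plot; the code below is a minimal illustration under those assumptions:

    import math

    # HighSpeed TCP response function w(p) with the sample parameters.
    LOW_WINDOW, HIGH_WINDOW, HIGH_P = 38, 83_000, 1e-7
    LOW_P = (1.2 / LOW_WINDOW) ** 2      # drop rate where standard TCP has w = 38

    # Log-log slope through the two anchor points; roughly -0.835.
    S = (math.log(HIGH_WINDOW) - math.log(LOW_WINDOW)) / \
        (math.log(HIGH_P) - math.log(LOW_P))

    def response(p: float) -> float:
        """Average congestion window (segments) for drop rate p."""
        w_std = 1.2 / math.sqrt(p)                 # standard TCP response
        if w_std <= LOW_WINDOW:
            return w_std                           # heavier congestion: plain TCP
        return LOW_WINDOW * (p / LOW_P) ** S       # low-loss, aggressive regime

    print(response(1e-3), response(1e-7))          # ~38 vs ~83000 segments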

The HighSpeed TCP response function is realized by new additive increase and multiplicative decrease parameters, which are adjusted according to the congestion window value. HighSpeed TCP performs well on high-speed long-distance links, and it falls back to standard TCP behavior if the loss ratio is high. With bursty traffic its link utilization is improved, but at the same time there are some issues regarding fairness.

D. Fast TCP

Advances in computing, communication, and storage technologies in global grid systems have started to provide the required capacities and an effective environment for computing and science [35]. The key problem addressed by Fast TCP is that the currently deployed congestion control algorithm does not scale to this advancement. The currently deployed TCP implementation takes a loss-based approach, using additive increase multiplicative decrease (AIMD). It works well at low speed, but as a network scales up in capacity, AI is too slow and MD too drastic, leading to low utilization. Moreover, it perpetually pushes the queue to overflow, and it discriminates against flows with large RTTs. To address these problems, Fast TCP was designed, which adopts a delay based approach.

Fast TCP has three key differences: i) it is an equation based algorithm, ii) it uses queuing delay as the primary congestion measure, and iii) it has stable flow dynamics and achieves weighted fairness in equilibrium, so that it does not penalize long flows. Using queuing delay as the congestion measure has two advantages. First, queuing delay can be more accurately estimated than loss probability, because packet losses in networks with a large bandwidth-delay product are rare events and also because loss samples provide coarser information than queuing delay samples. This makes it easier for an equation-based implementation to stabilize a network into a steady state with high throughput and high utilization. Second, the dynamics of queuing delay seem to have the right scaling with respect to network capacity, which helps maintain stability as a network scales up in capacity.

In Fast TCP, congestion is avoided while maintaining maximum link utilization, attained by adjusting each source's sending rate so that the resource is shared by all TCP connections. Fast TCP uses two control mechanisms to achieve its objective: i) dynamically adjusting the sending rate and ii) using the aggregate flow rate to calculate the congestion measure. Fast TCP performs well in terms of throughput, fairness, stability, and responsiveness, but its stability analysis was limited to a single link with heterogeneous sources, and feedback delay was ignored. Further, many experimental scenarios were designed to judge the properties of Fast TCP, but these scenarios are not very realistic.
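The periodic window update reported for Fast TCP in [18], [35] can be sketched as follows; the γ and α values here are illustrative assumptions:

    # FAST TCP window update: w <- min(2w, (1-g)*w + g*(baseRTT/RTT*w + a))
    GAMMA = 0.5     # smoothing factor
    ALPHA = 200.0   # target number of packets queued in the network per flow

    def update_window(w: float, base_rtt: float, rtt: float) -> float:
        """One delay-based update; rtt - base_rtt is the queuing delay."""
        target = (base_rtt / rtt) * w + ALPHA
        return min(2.0 * w, (1.0 - GAMMA) * w + GAMMA * target)

    w = 100.0
    for _ in range(5):                  # converges as queuing delay builds up
        w = update_window(w, base_rtt=0.050, rtt=0.055)
        print(round(w, 1))

When there is no queuing delay the window keeps growing; as queuing delay appears, the update settles at an equilibrium where each flow keeps about α packets buffered in the path.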

E. Summary

Table III presents the summary of TCP based protocols for high performance computing.


IV. UDP BASED PROTOCOLS

UDP-based protocols provide much better portability and are easy to install. Although implementations of user level protocols need less time to test and debug than in-kernel implementations, it is difficult to make them as efficient: because user level implementations cannot modify the kernel code, there may be additional context switches and memory copies. At high transfer speeds, these operations strongly affect CPU utilization and protocol performance. In fact, one of the purposes of the standard UDP protocol is to allow new transport protocols to be built on top of it. For example, the RTP protocol is built on top of UDP and supports streaming multimedia. In this section we survey some high performance transport protocols for data intensive grid applications. Table IV gives some of the UDP variants for high performance computing that are analyzed in this paper.


A. NETBLT

Bulk data transmission is needed by more and more applications in various fields and is a must for grid applications. The major performance concern of a bulk data transfer protocol is high throughput. In reality, the achievable end-to-end throughput over high bandwidth channels is often an order of magnitude lower than the provided bandwidth, because it is limited by the transport protocol's mechanisms; it is especially difficult to achieve high throughput and reliable data transmission across long delay, unreliable network paths.

Generally speaking, errors and variable delays are the two barriers to high performance for all transport protocols. A new transport protocol, NETBLT [17], [36], was designed for high throughput bulk data transmission applications and to conquer these two barriers. NETBLT was first proposed in 1985 (RFC 969). The primary goal of NETBLT is to achieve high throughput and robustness over long delay, high loss networks. Seeing the problems with window flow control and timers, the NETBLT designers decided that this goal would be achievable only by employing a new flow control mechanism and by gaining implementation and testing experience with rate control.

NETBLT works by opening a connection between two clients (the sender and the receiver), transferring data in a series of large numbered blocks (buffers), and then closing the connection. A NETBLT transfer works as follows: the sending client provides a buffer of data for the NETBLT layer to transfer. NETBLT breaks the buffer up into packets and sends these packets as internet datagrams. The receiving NETBLT layer loads the packets into a matching buffer provided by the receiving client. When the last packet in the buffer has arrived, the receiving NETBLT checks whether all packets in the buffer have been correctly received or whether some packets are missing. If any packets are missing, the receiver requests their retransmission. When the buffer has been completely transmitted, the receiving client is notified by its NETBLT layer; it disposes of the buffer and provides a new buffer to receive more data. The receiving NETBLT notifies the sender that the new buffer is ready, and this continues until all the data has been sent. Experimental results show that NETBLT can provide better performance predictability than parallel TCP.
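A toy sketch of the receiver-side buffer cycle just described (all names and sizes are illustrative, not taken from the NETBLT specification):

    # NETBLT-style receiver loop: collect one numbered buffer, then
    # request selective retransmission of whatever is missing.
    PACKETS_PER_BUFFER = 64

    def receive_buffer(recv_packets, request_resend) -> bytes:
        slots = [None] * PACKETS_PER_BUFFER
        while True:
            for seq, payload in recv_packets():       # drain arriving datagrams
                slots[seq] = payload
            missing = [i for i, p in enumerate(slots) if p is None]
            if not missing:
                return b"".join(slots)                # buffer complete: hand over
            request_resend(missing)                   # ask only for lost packets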

B. Reliable Blast UDP (RBUDP)

RBUDP [37] is designed for extremely high bandwidth, dedicated or quality of service enabled networks, in which high speed bulk data transfer is an important component. RBUDP has two goals: i) keeping the network pipe full during data transfer and ii) avoiding TCP's per-packet interaction by sending acknowledgements only at the end of a transmission.

Fig. 1 shows the data transfer scheme of RBUDP. In the data transfer phase (1 to 2 in Fig. 1, on the sender side), RBUDP sends the entire payload to the receiver at a receiver-specified rate using UDP datagrams. Since the full payload is sent using UDP, which is an unreliable protocol, some datagrams may be lost; therefore the receiver has to keep a tally of the datagrams received in order to determine which have to be retransmitted. At the end, the sender signals completion by sending 'DONE' via TCP (3 in Fig. 1, on the receiver side), and the receiver acknowledges by sending the sequence numbers of the received datagrams to the sender (4 in Fig. 1, on the sender side). The sender checks the acknowledgement and resends the missing datagrams to the receiver.
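The blast-then-repair cycle can be sketched on the sender side as follows (socket plumbing is abstracted away; the function arguments are illustrative stand-ins):

    # RBUDP-style sender: blast everything over UDP, then resend
    # whatever the receiver's tally says is missing.
    def rbudp_send(payload_packets, udp_send, tcp_send, tcp_recv):
        outstanding = dict(enumerate(payload_packets))
        while outstanding:
            for seq, pkt in outstanding.items():   # data transfer phase (blast)
                udp_send(seq, pkt)
            tcp_send(b"DONE")                      # end-of-blast signal over TCP
            received = tcp_recv()                  # sequence numbers that arrived
            outstanding = {s: p for s, p in outstanding.items()
                           if s not in received}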

In RBUDP, the most important input parameter is the sending rate of the UDP blasts. To minimize loss, the sending rate should not be larger than the bandwidth of the bottleneck link. Tools such as Iperf [38] and Netperf [39] are typically used to measure the bottleneck bandwidth. There are three versions of RBUDP available:

i) First version: without scatter/gather optimization - this is a naive implementation of RBUDP where each incoming packet is examined and then moved.

ii) Second version: with scatter/gather optimization - this implementation takes advantage of the fact that most incoming packets are likely to arrive in order, and that if transmission rates are below the maximum throughput of the network, packets are unlikely to be lost.

iii) Third version: Fake RBUDP - this implementation is the same as the scheme without the scatter/gather optimization, except that the incoming data is never moved to application memory. Implementation results show that RBUDP performs very efficiently over high speed, high bandwidth, quality of service enabled networks such as optically switched networks.


C. Tsunami

A reliable transfer protocol, Tsunami [40], is designed for transferring large files fast over high-speed networks. Tsunami features rate control via adjustment of the inter-packet delay rather than a sliding-window mechanism. Data blocks are transferred via UDP and control data are transferred via TCP. The goal of Tsunami is to increase the speed of file transfer in high speed networks relative to standard TCP. The datagram size and the length of the file are exchanged in the initial step (connection setup). A thread is created at each end (sender and receiver) which handles both network and disk activity. The receiver sends a retransmission signal, which has higher priority, whenever there are missing datagrams, and at the end it sends an end signal. A main advantage of Tsunami is that the user can configure parameters such as the datagram size, threshold error rate, size of the transmission queue, and acknowledgement interval time. The initial implementation of the Tsunami protocol consists of two user-space applications, a client and a server, whose structure is described below.

During a file transfer, the client has two running threads. The network thread handles all network communication, maintains the retransmission queue, and places blocks that are ready for disk storage into a ring buffer. The disk thread simply moves blocks from the ring buffer to the destination file on disk. The server creates a single thread in response to each client connection that handles all disk and network activity. The client initiates a Tsunami session by connecting to the TCP port of the server. Upon connection, the server sends random data to the client. The client XORs the random data with a shared secret key, calculates an MD5 checksum, and transmits it to the server. The server performs the same operation and compares checksums; if both are the same, the connection is set up. After authentication and connection setup, the client sends the name of the file to the server; the server checks whether the file is available and, if it is, sends a message to the client. After receiving a positive message from the server, the client sends its block size, transfer rate, and error threshold value. The server responds with the receiver parameters and sends a time-stamp. After receiving the timestamp, the client creates a port for receiving the file, and the server sends the file to the receiver.
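The challenge-response step can be sketched as follows; the exact on-the-wire framing is an assumption, but the XOR-then-MD5 computation follows the description above:

    import hashlib

    def auth_response(challenge: bytes, secret: bytes) -> bytes:
        """XOR the server's random challenge with the shared secret, then MD5."""
        xored = bytes(c ^ secret[i % len(secret)]
                      for i, c in enumerate(challenge))
        return hashlib.md5(xored).digest()

    # Both sides compute the digest; the server accepts when they match.
    challenge = b"\x01\x02\x03\x04" * 4
    assert auth_response(challenge, b"sharedkey") == \
           auth_response(challenge, b"sharedkey")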

D. Summary

Table V presents the summary of UDP based protocols for high performance computing.


V. APPLICATION LAYER PROTOCOLS

The need for high performance data transfer services is becoming more and more critical in today's distributed data intensive computing applications, such as remote data analysis and distributed data mining. Although efficiency (i.e. high throughput) is one of the common design objectives in most network transport protocols, efficiency often decreases as the bandwidth delay product (BDP) increases. Other considerations, like fairness and stability, make it more difficult to realize the goal of optimum efficiency. Another factor is that many of today's popular protocols were designed when bandwidth was only counted in bytes per second, so performance was not thoroughly examined in high BDP environments. Implementation also becomes critical to performance as the network BDP increases. A regular HTTP session may only send several messages per second, and it does not matter that message processing is delayed for a short time. However, in data intensive applications, the packet arrival rate can be as high as 10^5 packets per second. The protocol needs to process each event in a limited amount of time, and inefficient implementations can lead to packet loss or time-outs. People in the high performance computing field have been looking for application level solutions. One of the common solutions is to use parallel TCP connections and tune the TCP parameters, such as window size and number of flows. However, parallel TCP is inflexible because it needs to be tuned for each particular network scenario. Moreover, parallel TCP does not address fairness issues. In this section we review some application level protocols for high performance computing, especially for grid applications. Table VI gives some of the application layer protocols analyzed in this paper for high performance computing applications.

A. Fast Object Based File Transfer System (FOBS)

FOBS is an efficient, application level data transfer system for computational grids. While TCP is able to detect and respond to network congestion, its very aggressive congestion control mechanism results in poor bandwidth utilization even when the network is lightly loaded. User level protocols such as GridFTP [43]–[45] and RBUDP [37] are also able to obtain a very large percentage of the available bandwidth, but these approaches rely on characteristics of the network to provide congestion control. Another approach is SABUL [41], which provides an application level congestion control mechanism that is closely aligned with that of TCP.

FOBS is a simple, user level communication mechanism designed for large scale data transfers in the high bandwidth, high delay network environments typical of computational grids. It uses UDP as the data transport protocol, and provides reliability through an application level acknowledgement and retransmission mechanism. FOBS employs a simple acknowledgement and retransmission mechanism where the file to be transferred is divided into data units called chunks. Data is read from disk, transferred to the receiver, and written to disk in units of chunks. Currently, chunks are 100 MB, this number being chosen based on extensive experimentation. Each chunk is subdivided into segments, and the segments are further subdivided into packets. Packets are 1470 bytes (within the MTU of most transmission media), and a segment consists of 1000 packets. The receiver maintains a bitmap for each segment in the current chunk depicting the received/not-received status of each packet in the segment. These bitmaps are sent from the data receiver to the data sender at intervals dictated by the protocol, and trigger (at a time determined by the congestion/flow control algorithm) a retransmission of the lost packets. The bitmaps are sent over a TCP socket.
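The per-segment bookkeeping can be sketched as follows, using the packet and segment sizes quoted above (class and method names are illustrative):

    # FOBS-style receiver bitmap: received/not-received status per packet.
    PACKETS_PER_SEGMENT = 1000     # 1470-byte packets, 1000 per segment

    class SegmentBitmap:
        def __init__(self):
            self.received = [False] * PACKETS_PER_SEGMENT

        def mark(self, packet_index: int):
            self.received[packet_index] = True

        def missing(self):
            """Packet indices to report to the sender over the TCP socket."""
            return [i for i, ok in enumerate(self.received) if not ok]

    seg = SegmentBitmap()
    for i in range(0, PACKETS_PER_SEGMENT, 2):   # pretend half the packets arrived
        seg.mark(i)
    print(len(seg.missing()))                    # 500 packets to retransmit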

B. Light Object Based File Transfer System (LOBS)

LOBS is a mechanism for transferring files in high performance computing networks, optimized for high bandwidth delay networks and especially for computational grids. At the heart of this environment is the ability to transfer vast amounts of data in an efficient manner. LOBS rectifies the problems found in TCP as a data transfer mechanism for grid based computations. LOBS is used to transfer large files between two computational resources in a grid; the mechanism is lightweight and does not support all the functionality of GridFTP, but it supports the primary functionality required for computational grids (i.e. fast and robust file transfer). LOBS optimizes for performance and does not preserve the order in which data is delivered. LOBS is built directly on top of FOBS [46]. The TCP window size plays a vital role in achieving the best performance in high bandwidth delay networks, which leads to tuning the size at runtime. In LOBS, the TCP window size is tuned using a different approach, i.e. using a UDP stream. UDP is used for these reasons: i) it operates at user level, not kernel level, ii) it avoids multiplexing TCP streams at kernel level and iii) it allows user level enhancements.

Two protocols closely related to LOBS are RBUDP [37] and SABUL [41]. The primary differences among these protocols are how packet loss is interpreted and how its impact on the behavior of the protocol is minimized. SABUL assumes that packet losses indicate congestion, and it reduces the rate based on the perceived congestion, whereas LOBS assumes that some packet loss is inevitable and does not change the sending rate. The primary difference between LOBS and RBUDP is the type of network for which each protocol is designed.

The basic working concept of LOBS is that the sender creates threads to control its data buffer: a thread reads the file from disk and fills the data buffer, and once the buffer is full, it is transferred to the client over the network. While the data is in transfer, the other threads continue reading data from the file into the data buffer, and the steps are repeated until the full file is transferred. In LOBS, the goal is to overlap the network I/O and disk I/O operations to the largest extent possible.
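A minimal sketch of this disk/network overlap, assuming a simple double-buffering scheme (buffer size and function names are illustrative):

    import threading, queue

    BUF_SIZE = 4 * 1024 * 1024                 # 4 MB buffers (illustrative)

    def lobs_style_sender(path: str, net_send):
        buffers: queue.Queue = queue.Queue(maxsize=2)   # double buffering

        def disk_reader():
            with open(path, "rb") as f:
                while chunk := f.read(BUF_SIZE):
                    buffers.put(chunk)         # blocks if the network lags behind
            buffers.put(None)                  # end-of-file marker

        threading.Thread(target=disk_reader, daemon=True).start()
        while (chunk := buffers.get()) is not None:
            net_send(chunk)                    # network I/O overlaps next disk read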

C. Simple Available Bandwidth Utilization Library (SABUL)

SABUL [41], [42] is an application level library designed to transport data reliably for data intensive grid applications over high performance networks. SABUL uses UDP for the data channel, and it detects and retransmits dropped packets. Using TCP as the control channel reduces the complexity of the reliability mechanism. To use the available bandwidth efficiently, SABUL estimates the available bandwidth and recovers from congestion events as soon as possible. To improve performance, SABUL does not acknowledge every packet, but instead acknowledges packets at a constant time interval, which is called selective acknowledgement [21]. SABUL is designed to be fair, so that grid applications can employ parallelism, and it is designed so that all flows ultimately reach the same rate, independent of their initial sending rates and of the network delay. SABUL is designed as an application layer library so that it can be easily deployed without any changes to the operating system's network stack or to the network infrastructure.

SABUL is a reliable transfer protocol with loss detection and retransmission mechanisms. It is lightweight, with a small packet size and low computation overhead, and hence can be deployed easily in public networks. All SABUL flows, independent of their initial rates and network delays, reach similar rates and are TCP friendly. In SABUL, both the sender and the receiver maintain a list of the lost sequence numbers sorted in ascending order. The sender always checks the loss list first when it is time to send a packet. If the list is not empty, the first packet in the list is resent and removed; otherwise the sender checks whether the number of unacknowledged packets exceeds the flow control window size, and if not, it packs a new packet and sends it out. The sender then waits for the next sending time decided by the rate control. The flow window serves to limit the number of packets lost upon congestion when the TCP control channel reports the occurrence of delay; the maximum window size is set to Bandwidth × RTT (using SYN instead of RTT if SYN > RTT).

After each constant synchronization (SYN) time, the sender triggers a rate control event that updates the inter-packet time. The receiver receives and reorders data packets. The sequence numbers of lost packets are recorded in the loss list and removed when the resent packets are received. The receiver sends back an ACK periodically if there is any newly received packet; the ACK interval is the same as the SYN time, so the higher the throughput, the fewer ACK packets are generated. A NAK is sent once loss is detected. The loss is reported again if the retransmission has not been received after k × RTT, where k is initially set to 2 and is incremented by 1 each time the loss is reported. The increase of k avoids the sender being blocked by a continuous arrival of loss reports. Loss information carried in a NAK is compressed, considering that loss is often contiguous. In the worst case, there is one ACK for every received DATA packet if the packet arrival interval is not less than the SYN time, and there are M/2 NAKs for every M sent DATA packets when every other DATA packet is lost.
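The sender's per-packet decision described above can be sketched as follows (names and the window check are paraphrased, not taken from the SABUL source):

    # SABUL-style sender step: retransmissions take priority over new data.
    def next_packet(loss_list, unacked, window_size, new_seq):
        if loss_list:                          # 1) resend the earliest lost packet
            return ("retransmit", loss_list.pop(0))
        if len(unacked) >= window_size:        # 2) flow window full: wait
            return ("wait", None)
        seq = new_seq()                        # 3) otherwise send a new packet
        unacked.add(seq)
        return ("send", seq)

After each send, the sender sleeps until the next sending time dictated by the rate control, which is what shapes the flow to the target rate.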

D. UDP based Data Transfer Protocol(UDT)

UDT [47], [48] is a high performance data transfer protocol, intended as an alternative to TCP where TCP's performance degrades. The goal of UDT is to overcome TCP's inefficiency in high bandwidth-delay product networks; it is connection oriented, unicast, and duplex. The congestion control module is open, so that it can be used to implement and/or deploy different control algorithms. UDT also has a native/default control algorithm based on AIMD rate control. Rate control tunes the inter-packet time at every constant interval, which is called SYN. The value of SYN is 0.01 seconds, an empirical value reflecting a trade off among efficiency, fairness, and stability. For every SYN time, if the packet loss rate during the last SYN time is less than a threshold (the maximum possible link Bit Error Rate, BER), the number of packets that will be sent in the next SYN time is increased by:

    inc = max( 10^⌈log10((B − C) × MTU × 8)⌉ × β / MTU, 1/MTU )

where B is the estimated bandwidth and C is the current sending rate, both in packets per second, β is a constant with value 0.0000015, and MTU is the maximum transmission unit in bytes, which is the same as the UDT packet size. The inter-packet time is then recalculated using the total estimated number of packets to be sent during the next SYN time. The estimated bandwidth B is probed by sampling UDT data packet pairs. UDT is designed for scenarios where a small number of bulk sources share abundant bandwidth. In other scenarios, e.g., messaging or low BDP networks, UDT can still be used, but there may be no improvement in performance.
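The increment rule can be sketched directly from the equation above (B and C in packets per second; the ceiling on the logarithm follows the reconstruction given here):

    import math

    BETA = 0.0000015                 # constant from the text

    def udt_increase(B: float, C: float, mtu: int = 1500) -> float:
        """Packets added to the next SYN period's quota."""
        if B <= C:
            return 1.0 / mtu                       # minimal probe when saturated
        spare_bps = (B - C) * mtu * 8              # spare capacity in bits/s
        inc = 10 ** math.ceil(math.log10(spare_bps)) * BETA / mtu
        return max(inc, 1.0 / mtu)

    # About 1 Gbit/s of spare capacity yields an increase of one packet per SYN.
    print(udt_increase(B=90_000, C=10_000))

Because the increase depends on the order of magnitude of the spare capacity rather than on the current window, flows approaching the estimated bandwidth slow their probing down smoothly.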

E. GridFTP

A common data transfer protocol for the grid would ideally offer all the features currently available from any of the protocols in use. At a minimum, it must offer all of the features required for the types of scientific and engineering applications that we intend to support on the grid. The existing FTP standard was selected and, extended with some features, makes a suitable candidate for the common data transfer protocol for the grid, called 'GridFTP'. GridFTP is used as a data transfer protocol for effectively transferring large volumes of data in grid computing. It supports a feature called parallel data transfer, which improves throughput by creating multiple TCP connections in parallel, and automatic negotiation of the TCP socket buffer size. GridFTP uses TCP as its transport-level communication protocol. In order to get maximal data transfer throughput, it has to use optimal TCP send and receive socket buffer sizes for the link being used. The TCP congestion window never fully opens if the buffer size is too small. If the receiver buffers are too large, TCP flow control breaks, and the sender can overrun the receiver, thereby causing the TCP window to shut; this situation is likely to happen if the sending host is faster than the receiving host. The optimal buffer size is twice the bandwidth-delay product (BDP) of the link:

    Buffer size = 2 × bandwidth × delay

GridFTP is implemented in Globus [49] and uses multiple TCP streams for transferring files. Using multiple TCP streams improves performance for these reasons: i) the aggregate TCP buffer size is closer to the required size and ii) it partially circumvents congestion control. Several experiments have been done to analyze GridFTP. According to [49], globus-url-copy achieved a throughput very close to 95% of the available bandwidth. The window size was set to bandwidth × RTT; when more than one TCP stream was used, the window size was set to window size × number of streams. However, to achieve high throughput, the number of TCP connections has to be optimized according to the network condition. Problems persist with small files: when the end points want to transfer lots of small files, throughput is reduced. The performance of GridFTP depends on the number of connections used in parallel; the best performance is achieved with 4 connections, and using more connections creates too much control overhead.
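As a worked example of the sizing rule above (link parameters are illustrative):

    # Optimal TCP buffer = 2 * bandwidth * delay; per-stream share for
    # parallel GridFTP transfers is the total divided by the stream count.
    bandwidth_bps = 622_000_000            # e.g. a 622 Mbit/s path
    rtt_s = 0.060                          # 60 ms round trip

    bdp = bandwidth_bps * rtt_s / 8        # bytes
    buffer_size = 2 * bdp
    print(f"BDP = {bdp/1e6:.2f} MB, buffer = {buffer_size/1e6:.2f} MB")

    streams = 4                            # the empirically best stream count
    print(f"per-stream buffer = {buffer_size/streams/1e6:.2f} MB")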

F. GridCopy

GridCopy [50], or GCP, provides a simple user interface to this sophisticated functionality and takes care of all the details needed to get optimal performance for data transfers. GCP accepts scp-style source and destination specifications. If well-connected GridFTP servers can access the source file and/or the destination file, GCP translates the filenames into the corresponding names on the GridFTP servers. In addition to translating the filenames/URLs into GridFTP URLs, GCP adds appropriate protocol parameters, such as the TCP buffer size and the number of parallel streams, in order to attain optimal network performance for the specific source and destination.

Tools such as ping and synack can be used to estimate end-to-end delay, and tools such as IGI [51], abing [52], pathrate [53], Iperf [38] and Spruce [54] can be used to estimate end-to-end bandwidth. Latency estimation tools need to be run on one of the two nodes between which the latency is to be estimated. For data transfers between a client and a server, the tools mentioned above can be used to estimate the bandwidth-delay product. However, in grid environments, users often perform third-party data transfers, in which the client initiates transfers between two servers.

The end-to-end delay and bandwidth estimation tools cited above are not useful for third-party transfers. King [55], developed at the University of Washington at Seattle, makes it possible to calculate the round-trip time (RTT) between arbitrary hosts on the Internet. GCP uses King to estimate the RTT between the source and destination nodes in a transfer, and assumes a fixed 1 Gbit/s bandwidth for all source and destination pairs. King estimates the RTT between any two hosts on the internet by estimating the RTT between their domain name servers. For example, if King estimates the RTT between the source and the destination to be 50 ms, GCP sets the TCP buffer size to the corresponding bandwidth-delay product, 1 Gbit/s × 0.05 s (6.25 MB). GCP caches the source, destination, and buffer size in a configuration file in the home directory of the user running GCP. By default, GCP uses four parallel streams for the first transfer between two sites by a user. GCP calculates the TCP buffer size for each stream as BDP/max(1, streams/lf), where lf is set to 2 by default to accommodate the fact that streams that are hit by congestion will go slower while streams that are not hit by congestion will go faster.
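The buffer computation just described can be sketched as follows (the 1 Gbit/s figure is GCP's stated assumption; the rest mirrors the formula above):

    ASSUMED_BANDWIDTH_BPS = 1_000_000_000    # GCP's fixed 1 Gbit/s assumption

    def gcp_buffer_per_stream(rtt_s: float, streams: int = 4, lf: int = 2) -> float:
        """Per-stream TCP buffer size in bytes: BDP / max(1, streams / lf)."""
        bdp = ASSUMED_BANDWIDTH_BPS * rtt_s / 8
        return bdp / max(1, streams / lf)

    # King reports a 50 ms RTT: the BDP is 6.25 MB, so each of the four
    # default streams gets a 3.125 MB buffer (oversized by lf to absorb
    # the unequal progress of congested and uncongested streams).
    print(gcp_buffer_per_stream(0.050))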

The primary design goals of GCP are: i) to provide an scp-style interface for high performance, reliable, secure data transfers, ii) to calculate the optimal TCP buffer size and the optimal number of parallel TCP streams to maximize throughput and iii) to support configurable URL translations to optimize throughput.

G. Summary

Table VII presents the summary of application layer protocols for high performance computing.


VI. CONCLUSION

A detailed review of the most recent developments in network protocols for high performance computing in high bandwidth delay networks has been presented in this paper. We reviewed protocols based on TCP and UDP. Some points that have to be considered when developing high performance application level protocols are: i) using TCP inside another transport protocol should be avoided, ii) using packet delay as an indication of congestion can be hazardous to protocol reliability, iii) handling continuous loss is critical to performance and iv) knowing how much CPU time each part of the protocol costs helps to make an efficient implementation. To address many of the problems with TCP and UDP for high-performance networking in a distributed computational grid, we also have to concentrate on three inter-related research tasks, namely: i) dynamic right-sizing, ii) high-performance IP and iii) rate adjusting, which can lead to efficient high performance transport protocols. Tables VIII, IX, and X present the overall comparison chart for all the protocols reviewed.


REFERENCES

[1] I. Foster and C. Kesselman (1999), "The Grid: Blueprint for a New Computing Infrastructure", Morgan Kaufmann.

[2] I. Foster, C. Kesselman, J. M. Nick and S. Tuecke (2003), "The physiology of the Grid: An open grid services architecture for distributed systems integration", Grid Forum white paper.

[3] Volker Sander, "Networking Issues for Grid Infrastructure", GFD-I.037, November 22, 2004.

[4] T. DeFanti, C. D. Laat, J. Mambretti, K. Neggers and B. Arnaud (2003), "TransLight: A global-scale LambdaGrid for e-science", Communications of the ACM, 46(11), November.

[5] L. Smarr, A. Chien, T. DeFanti, J. Leigh and P. Papadopoulos (2003), "The OptIPuter", Communications of the ACM, 46(11), November.

[6] ”CANARIE.” http://www.canarie.ca

[7] M. Gaynor, M. Welsh, S. Moulton, A. Rowan, E. LaCombe and J.Wynne (2004), ”Integrating wireless sensor networks with the Grid.” Proc. IEEE Internet Computing, Special Issue on Wireless Grids, July/August.

[8] J. Postel (1981), ”Transmission control protocol.” RFC 793, September 1981.

[9] J. Postel (1980), "User datagram protocol", RFC 768, August 1980.

[10] V. Jacobson, R. Braden and D. Borman (1992), "TCP extensions for high performance", RFC 1323, May 1992.

[11] Douglas J. Leith and Robert N. Shorten (2008), "Next Generation TCP: Open Questions", International Workshop on Protocols for Fast Long-Distance Networks, PFLDNet, 5-7 March 2008, Manchester, UK.

[12] Ryan X. Wu, Andrew A. Chien et al. (2005), "A high performance configurable transport protocol for grid computing", Proc. of 5th IEEE International Symposium on Cluster Computing and the Grid, Vol. 2, pp 1117-1125.

[13] D. Katabi, M. Hardley, and C. Rohrs (2002),”Internet Congestion Control for Future High Bandwidth-Delay Product Environments”, ACM SIGCOMM , Pittsburgh, PA, Aug. 19 - 23, 2002, pp 89-102.

[14] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose (1998), ”Modeling TCP throughput: a simple model and its Empirical validation”, ACM Technical Report, UM-CS-1998-008, 1998.

[15] Y. Zhang, E. Yan, and S. K. Dao (1998), ”A measurement of TCP over Long-Delay Network”, Proc. of 6th International Conference on Telecommunication Systems, Modeling and Analysis, Nashville, TN, March 1998

[16] W. Feng et.al. (2000), ”The Failure of TCP in High Performance Computational Grids”, Super Computing, ACM/IEEE 2000 Conference, November, pp 37.

[17] David D. Clark, Mark L. Lambert and Lixia Zhang (1987), "NETBLT: A High Throughput Transport Protocol", ACM SIGCOMM Computer Communications Review, Vol 17, No. 5, 1987, pp 353-359.

[18] Cheng Jin et.al. (2005), ”FAST TCP: From Theory to Experiments” IEEE Network Communications, Vol. 19, No.1, January/February 2005, pp 4-11.

[19] Sumitha Bhandarkar, Saurabh Jain and A. L. Narasimha Reddy (2006), ”LTCP: Improving the Performance of TCP in HighSpeed Networks” , ACM SIGCOMMM Computer Communications Review, Vol.36, No.1, January 2006, pp 41-50.

[20] Injong Rhee et.al. (2008), ”CUBIC: A New TCP-Friendly High-Speed TCP Variant”, ACM SIGOPS Operating Systems Review, Vol.42, No.5, July 2008, pp 64-74.

[21] S. Floyd, M. Mathis and M. Podolsky (2000), "An Extension to the Selective Acknowledgement (SACK) Option for TCP", RFC 2883, July 2000.

[22] M. Gerla, M. Y. Sanadidi, R. Wang, A. Zanella, C. Casetti, and S. Mascolo (2001),”TCP Westwood:Congestion Window Control Using Bandwidth Estimation”, IEEE Globecom 2001,Volume: 3, pp 1698-1702.

[23] Lawrence S. Brakmo, Student Member, IEEE, and Larry L. Peterson (1995), ”TCP Vegas: End to End Congestion Avoidance on a Global Internet”, IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 13, NO. 8, OCTOBER 1995.

[24] S. Floyd, "HighSpeed TCP for Large Congestion Windows", RFC 3649, December 2003.

[25] Tom Kelly (2003), ”Scalable TCP: Improving performance in high speed wide area networks”, ACM SIGCOMM Computing Communications Review, Vol.33, No.2, April, pp 83-91.

[26] A. Gurtov (2001), "Effect of delays on TCP performance", Proceedings of IFIP Personal Wireless Communications 2001 (PWC2001), Aug 2001, Lappeenranta, Finland, pp. 8-10.

[27] E. Souza, D. Agarwal, ”A highspeed TCP study: characteristics and deployment issues”, Tech. Rep. LBNL-53215, Lawrence Berkeley National Lab (2003). Available from:http://www-itg.lbl.gov/evandro/hstcp/hstcp-lbnl-53215.pdf.

[28] T. A. Trinh, B. Sonkoly and S. Molnár, "A HighSpeed TCP study: observations and re-evaluation", Proc. of 10th Eunice Summer School and IFIP Workshop on Advances in Fixed and Mobile Networks, Tampere, Finland, Jun. 14-16, 2004.

[29] C. Jin, D. X. Wei and S. H. Low (2004), "FAST TCP: motivation, architecture, algorithms, performance", Proc. IEEE Infocom 2004, Vol. 4, Mar. 7-11, 2004, Hong Kong, China, pp. 2490-2501.

[30] M. Fisk and W. Feng (2001), ”Dynamic right-sizing in TCP” Proc.of International Conference on Computer Communication and Networks, October, pp 152-158.

[31] W. Huntoon, T. Dunigan, and B. Tierney, ”The Net100 project: Development of network-aware operating systems”. http://www.net100.org

[32] V. Jacobson (1988), "Congestion avoidance and control", Proc. ACM SIGCOMM Conference on Communications Architectures and Protocols, Vol. 18, Stanford, CA, August 1988, pp. 314-329.

[33] S. Floyd (2003), "Highspeed TCP for large congestion windows", February 2003, Internet Draft draft-floyd-tcp-highspeed-01.txt.

[34] T. Kelly, ”On engineering a stable and scalable TCP variant” Cambridge University Engineering Department Technical Report CUED/FINFENG/TR.435, 2002.

[35] David X. Wei, Cheng Jin and Steven H. Low (2006), "FAST TCP: Motivation, Architecture, Algorithms, Performance", IEEE/ACM Transactions on Networking, Vol. 14, No. 6, December, pp 1246-1259.

[36] Mark L. Lambert , Lixia Zhang (1985), ”NETBLT: A Bulk Data Transfer Protocol”, RFC 969, December 1985.

[37] Eric He, Jason Leigh, Oliver Yu, Thomas A. DeFanti (2002), ”Reliable Blast UDP: Predictable High Performance Bulk Data Transfer”, Fourth IEEE International Conference on Cluster Computing (CLUSTER'02), pp 317.

[38] http://dast.nlanr.net/Projects/Iperf/

[39] http://netperf.org/netperf/NetperfPage.html

[40] Mark R. Meiss (2008) ”Tsunami: A High-Speed Rate Controlled Protocol for File Transfer”, Indiana University, http://steinbeck.ucs.indiana.edu/mmeiss/papers

[41] Yunhong Gu and Robert Grossman (2003), ”SABUL: A Transport Protocol for Grid Computing”, Journal of Grid Computing, Vol.1, No.4, December, pp 377-386.

[42] Phoemphun Oothongsap, Yannis Viniotis, and Mladen Vouk (2008), ”Improvements of the SABUL Congestion Control Algorithm”, Proceedings of 1st International Symposium on Communication Systems Networks and Digital Signal Processing, July.

[43] W. Allcock, J. Bester, J. Bresnahan, A. Chervenak (2003), ”GridFTP: Protocol Extensions to FTP for the Grid”, GFD-20, April 2003.

[44] John Bresnahan, Michael Link, Gaurav Khanna, Zulfikar Imani, Rajkumar Kettimuthu and Ian Foster (2007), ”Globus GridFTP: Whats New in 2007”, Proceedings of International Conference on Networks for grid applications, Article 19, October.

[45] John Bresnahan, Michael Link, Rajkumar Kettimuthu, Dan Fraser and Ian Foster (2007), "GridFTP Pipelining", Proceedings of the 2007 TeraGrid Conference, June 2007.

[46] Phillip M. Dickens (2003), ”FOBS: A Lightweight Communication Protocol for Grid Computing”, Lecture Notes in Computer Science, V2790/2004, pp 938-946.

[47] Yunhong Gu and Robert L. Grossman (2007), "UDT: UDP-based Data Transfer for High-Speed Wide Area Networks", International Journal of Computer & Telecommunications Networks, Vol. 51, No. 7, May.

[48] Yunhong Gu and Robert L. Grossman (2004), "UDT: An Application Level Transport Protocol for Grid Computing", Second International Workshop on Protocols for Fast Long-Distance Networks, PFLDNet 2004, Feb 16-17, Argonne, Illinois, USA.

[49] Report on "Performance of Globus Striped GridFTP Server on TeraGrid".

[50] Rajkumar Kettimuthu et al. (2007), "GridCopy: Moving Data Fast on the Grid", Proceedings of Grid Computing, IEEE.

[51] www.cs.cmu.edu/~hnn/igi/

[52] www.wcisd.hpc.mil/tools

[53] http://www.cc.gatech.edu/fac/Constantinos.Dovrolis/bw-est/

[54] www.icir.org/models/tools.html

[55] K. P. Gummadi, S. Saroiu and S. D. Gribble, "King: Estimating latency between arbitrary internet end hosts", SIGCOMM Computer Communication Review, Vol. 32, No. 3, pp. 5-18, July 2002.

[56] S. Floyd, S. Ratnasamy and S. Shenker, "Modifying TCP's congestion control for high speeds", 2002, Preliminary Draft. URL: http://www.icir.org/floyd/papers/hstcp.pdf

Suresh Jaganathan received his Bachelor's Degree from Mepco Schlenk Engineering College, Sivakasi, in 1993 and his M.E. (Software Engineering) from Anna University, Chennai, in 2005. Currently he is pursuing his PhD at Jawaharlal Nehru Technological University (JNTU), Hyderabad, in the field of Grid Computing. He has published papers in the areas of ad hoc networks and grid computing in both national and international conferences. He has 12 years of teaching experience and is currently working as an Assistant Professor in the Department of Computer Science & Engineering at Sri Sivasubramania Nadar College of Engineering, Chennai. His research interests are Grid Computing, Distributed Computing and Neural Networks. He is a Member of IEEE and a Life Member of CSI and ISTE in India.

Dr. A. Srinivasan completed his M.E. and PhD in Computer Science and Engineering at Madras Institute of Technology, Anna University, Chennai, and finished his post-doctorate at Nanyang Technological University, Singapore. He has 17 years of teaching and research experience in the field of Computer Science and Engineering and one year of industrial experience. At present, he has five PhD students working under him. He has published more than 32 research publications in national and international journals and conferences. He is an editorial board member of the Journal of Computer Science and Information Technology [JCSIT] and a reviewer for four reputed international journals in the Computer Science and Engineering field. Currently he is working as a Professor in the Computer Science and Engineering Department, Sri Sivasubramania Nadar College of Engineering, Anna University, Chennai, India. His fields of interest are Digital Image Processing and Analysis and Distributed Systems. He is a Member of IEEE and ACM, and a Life Member of CSI and ISTE in India.

Dr. A. Damodaram obtained his M.Tech (CSE), completed his PhD in the field of Computer Science, and joined the Faculty of Computer Science and Engineering at JNTU, Hyderabad in 1989. He has worked at JNTU in various capacities since 1989. In his 19 years of service, Dr. A. Damodaram has served as Head of the Department and Vice-Principal, and presently he is the Director of the UGC Academic Staff College of JNT University, Hyderabad. He was Board of Studies chairman for the JNTU Computer Science and Engineering branch (JNTUCEH) for a period of 2 years. He is a life member of various professional bodies and a UGC nominated member of various expert/advisory committees of universities in India. He was a member of the NBA (AICTE) sectoral committee and a member of various committees in state and central governments. His research interests are in Distributed Computing and Grid Computing.
