Review of "A Case for NOW"

From: Guoli Li <gli_REMOVE_THIS_FROM_EMAIL_FIRST_at_cs.toronto.edu>
Date: Wed, 14 Sep 2005 19:33:35 -0400

Inexpensive computers such as PCs and workstations command a larger market than expensive mainframes. This paper proposes building a large system out of small, mass-produced computers that can outperform a single supercomputer or mainframe.

A Network of Workstations (NOW) is a parallel computing system consisting of a large number of workstations. The goal of NOW is to provide a powerful, scalable, low-latency, high-bandwidth interconnection environment for large parallel applications by connecting inexpensive components through switches. In NOW, the workstations are connected by switches that can be configured into arbitrary topologies and provide high-speed interconnections. This hardware architecture makes it possible for the workstations to share resources such as memory, the file system and process management.

Compared to MPPs and other existing solutions, NOW offers the following advantages. First, using the aggregate DRAM of a NOW as a disk cache improves program running time, cache miss rate and read response time. Active clients can use the memory of idle workstations without disturbing other active clients, and larger parallel programs can run in the network DRAM of NOW. Second, NOW builds a software RAID, which provides better availability than a hardware RAID and avoids the single point of failure introduced by the central management host in a hardware RAID. The file system built on shared memory and a software RAID provides highly available file service at low cost. Third, global services provided by GLUnix manage the resources in NOW. GLUnix is OS-independent, which means it can easily be built on top of other operating systems. Furthermore, the model is robust to failures: if a workstation in NOW fails, only the programs running on that workstation are affected, while other programs continue undisturbed. Moreover, hot-swap upgrades of both hardware and software are supported.
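
To make the availability argument for the software RAID concrete, here is a minimal sketch of how a stripe spread across several workstations can survive the loss of any one of them. It assumes RAID-5-style XOR parity, which this review does not state is the scheme NOW actually uses. The function names (xor_blocks, write_stripe, recover) and the sample data are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks):
    """Return one stripe: the data blocks followed by their parity block.

    Assumption: each element of the stripe is stored on a different
    workstation, so no single machine holds the whole stripe.
    """
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, failed):
    """Rebuild the block held by the failed workstation from the survivors."""
    survivors = [block for i, block in enumerate(stripe) if i != failed]
    return xor_blocks(survivors)


# Three data blocks plus one parity block, i.e. four workstations.
stripe = write_stripe([b"NOW-", b"xFS-", b"GLUx"])
rebuilt = recover(stripe, failed=1)  # workstation 1 is lost
```

Because every block, including the parity block, can be recomputed from the others, no central host is needed to keep the data available. That is the property the second advantage above relies on.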

However, the NOW system has some limitations. First, since a network process ID is attached to every message, sending a message requires invoking the operating system. This may add overhead to what is meant to be low-latency communication. Second, to support the features of NOW, the workstation operating system needs to be modified. For instance, xFS extends Solaris to provide a global, high-performance, highly available file system. Third, security problems are not fully addressed in NOW. NOW relies on the physical security provided by the host organization to keep out malicious users, and it trusts every machine connected to the system. This limits the settings in which NOW can be applied.

Received on Wed Sep 14 2005 - 18:38:17 EDT
