Clusters of workstations and PCs have gained increasing attention as low-cost parallel computing platforms in recent years. High-performance CPUs and interconnect technology make clusters an attractive hardware platform for many industrially relevant applications that have long been the domain of supercomputing. However, to make use of these cost-effective platforms, application software has to be parallelized.
Simulation-based design is one domain of particular interest to industry, especially to small and medium-sized companies that cannot afford access to traditional supercomputers or MPPs. Since there are a number of mature software packages in fluid dynamics, finite element analysis, and digital mockup, software projects that aim at exploiting cluster power will usually be parallelizations of existing software. Parallelizing large-scale industrial software packages, however, requires appropriate software engineering methods.
In the past, quite a few projects have parallelized industrial simulation codes. One example is the EUROPORT initiative, in which a number of industrial codes were ported to parallel computers. However, only very few software vendors actually offer parallel execution as an option in their products. The goal of TFCC's Technical Area Software Engineering is to overcome the obstacles that prevent parallel software from coming into widespread use on clusters. In particular, TFCC-SWE seeks to promote research and facilitate the exchange of experience related to the issues listed below.
PVM and MPI are currently the standards in distributed-memory
computing. However, symmetric multiprocessors (SMPs) have become increasingly popular. In fact,
most high-end workstations and PCs are SMPs with two to eight processors. Thus future clusters
will exhibit two levels of parallelism: between nodes and within each node. OpenMP and
POSIX threads are two standards for programming SMPs. Future research will have to address
hierarchical parallelism. In distributed computing, CORBA has
become established as a standard for interfacing software packages in a heterogeneous
environment. In particular, legacy applications can be integrated as components into new
distributed environments. In order to promote the exchange of experience, we provide a
list of projects that use any of the programming environments mentioned
above.
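The two levels of parallelism described above can be illustrated with a small sketch. It is only a model of the decomposition, not an MPI or OpenMP program: a plain loop stands in for the message-passing level between nodes, and a Python thread pool stands in for the shared-memory level within one SMP node. The names `block_range`, `node_sum`, and `hierarchical_sum` are hypothetical and chosen for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def block_range(n, parts, k):
    """Half-open index range [lo, hi) of block k when n items are split into `parts` blocks."""
    return k * n // parts, (k + 1) * n // parts


def node_sum(data, lo, hi, threads):
    """Inner level: the threads of one SMP node share the node's block [lo, hi)."""
    def partial(t):
        a, b = block_range(hi - lo, threads, t)
        return sum(data[lo + a:lo + b])

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(partial, range(threads)))


def hierarchical_sum(data, nodes, threads_per_node):
    """Outer level: one block per node (assumption: a loop replaces message passing)."""
    total = 0
    for node in range(nodes):
        lo, hi = block_range(len(data), nodes, node)
        # On a real cluster this accumulation would be a reduction across nodes.
        total += node_sum(data, lo, hi, threads_per_node)
    return total
```

Calling `hierarchical_sum(list(range(1000)), 4, 2)` splits the data into four node blocks and each block into two thread blocks. The sketch shows only how the index space is divided at both levels; it makes no claim about performance.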
In scientific computing, most parallel software projects are parallelizations of existing
software. Due to the limitations of automatic parallelization and data-parallel languages,
many software packages have to be parallelized manually. This process usually requires
interdisciplinary cooperation. To achieve high efficiency and scalability, a well-defined
software engineering process is required that is applied consistently throughout the
project. With a standardized programming model, parallel software can be developed for a
whole range of platforms. However, achieving good efficiency on clusters is a particular
challenge for a number of reasons. Interconnection networks (WANs, Ethernet-based LANs)
exhibit much higher latency than MPP interconnects; SCI-based clusters are a notable
exception. Clusters are usually heterogeneous, i.e., nodes have different types and numbers
of CPUs, different CPU power, and different memory capacity. In addition, multiple users compete for
resources. This makes resource management and dynamic load balancing particularly
important issues. Distributed object-oriented computing based on standards like CORBA allows
existing applications (including legacy systems) to interoperate in a cluster
environment.
We also provide a list of projects related to the software engineering issues mentioned
above.
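To illustrate why dynamic load balancing matters on heterogeneous clusters, the following sketch compares two scheduling strategies in a simple simulation. It is a toy model under stated assumptions: tasks are independent, communication cost is ignored, and each node has a fixed relative speed. The function names and the self-scheduling strategy are chosen for this illustration and are not taken from any of the projects listed.

```python
import heapq


def static_makespan(task_costs, speeds):
    """Equal-sized contiguous blocks of tasks per node, ignoring node speed."""
    n, p = len(task_costs), len(speeds)
    finish_times = []
    for i, speed in enumerate(speeds):
        lo, hi = i * n // p, (i + 1) * n // p
        finish_times.append(sum(task_costs[lo:hi]) / speed)
    return max(finish_times)


def dynamic_makespan(task_costs, speeds):
    """Self-scheduling: the node that becomes idle first takes the next task."""
    # Heap entries are (time at which the node becomes idle, node index).
    idle = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(idle)
    makespan = 0.0
    for cost in task_costs:
        t, i = heapq.heappop(idle)
        t += cost / speeds[i]
        makespan = max(makespan, t)
        heapq.heappush(idle, (t, i))
    return makespan
```

For twelve unit-cost tasks on three nodes with relative speeds 1, 2, and 3, the static split gives every node four tasks and finishes when the slowest node does, after 4.0 time units. With self-scheduling the faster nodes take more tasks and the run finishes after about 2.0 time units.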
Appropriate tools for program development and analysis are a key prerequisite for
productivity in engineering parallel and distributed software. EuroTools and PTools are two consortia that coordinate and promote
research in tools for parallel programming. They provide information on available tools
and ongoing research projects. At LRR-TUM, a
standard for on-line monitoring, OMIS, has
been defined that provides a basis for an integrated tool environment. A reference
implementation, OCM, is
currently under development. On top of OCM, the parallel debugger DETOP
will be made available on a series of common parallel platforms as the first part of the
integrated tool environment THE
TOOL-SET.
Program analysis tools are a Technical Area of their
own in the TFCC.
Contact:
Peter Luksch
LRR-TUM,
Institut für Informatik
Technische Universität München
D-80290 München
Germany
e-mail: Peter.Luksch@computer.org
phone: ++49-89-289-28164 or ++49-89-289-22382
fax: ++49-89-289-28232
This page is still under construction. Please send comments, suggestions, or
contributions to Peter.Luksch@in.tum.de