How to Build and Use a Beowulf Cluster

Prabhaker Mateti, Wright State University



Beowulf Cluster

Parallel computer built from commodity hardware and open source software.

Beowulf Cluster characteristics:
– Internal high-speed network
– Commodity off-the-shelf hardware
– Open source software and OS
– Support for parallel programming such as MPI, PVM


Beowulf Project

Originated at the Center of Excellence in Space Data and Information Sciences (CESDIS) at NASA Goddard Space Flight Center, led by Dr. Thomas Sterling and Donald Becker. “Beowulf is a project to produce the software for off-the-shelf clustered workstations based on commodity PC-class hardware, a high-bandwidth internal network, and the Linux operating system.”


Why Is Beowulf Good?

Low initial implementation cost
– Inexpensive PCs
– Standard components and networks
– Free software: Linux, GNU, MPI, PVM

Scalability: can grow and shrink

Familiar technology: easy for users to adopt the approach, and to use and maintain the system


Beowulf is getting bigger

Size of typical Beowulf systems increasing rapidly

[Chart: size of typical Beowulf systems, 1994-1999; vertical axis 0-1200 nodes]


Biggest Beowulf?

1000-node Beowulf cluster system

Used for genetic programming research by John Koza, Stanford University

http://www.genetic-programming.com/


Chiba City, Argonne National Laboratory

“Chiba City is a scalability testbed for the High Performance Computing communities to explore the issues of

– scalability of large scientific applications to thousands of nodes

– systems software and systems management tools for large-scale systems

– scalability of commodity technology”

– http://www.mcs.anl.gov/chiba


PC Components

Motherboard and case
CPU and memory
Hard disk
CD-ROM, floppy disk
Keyboard, monitor
Interconnection network


Motherboard

Largest cache possible (at least 512 KB)
FSB >= 100 MHz
Memory expansion
– Normal boards can go up to 512 MB
– Some server boards can expand up to 1-2 GB
– Number and type of slots


Motherboard

Built-in options?
– SCSI, IDE, floppy, sound, USB
– More reliable, less costly, but inflexible

Front-side bus speed: as fast as possible
Built-in hardware monitor
Wake-on-LAN for on-demand startup/shutdown
Compatibility with Linux


CPU

Intel, Cyrix 6x86, AMD – all OK
The Celeron processor seems to be a good alternative in many cases
The Athlon is a newly emerging high-performance processor


Memory

100 MHz SDRAM is almost obsolete
133 MHz is common
Rambus


Hard Disk

IDE
– inexpensive and fast
– controller typically built into the board
– large capacity: 75 GB available
– ATA-66 to ATA-100

SCSI
– generally faster than IDE
– more expensive


RAID Systems and Linux

RAID is a technology that uses multiple disks simultaneously to increase reliability and performance

Many Linux drivers are available (a software-RAID sketch follows below)
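As an illustration only, here is a minimal software-RAID sketch using the mdadm tool; mdadm and the device names are assumptions, not from the slides:

    # Build a two-disk RAID-1 mirror out of two partitions (names are
    # illustrative), then put a filesystem on the resulting array:
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hdb1 /dev/hdc1
    mke2fs /dev/md0             # create an ext2 filesystem on the array
    mdadm --detail /dev/md0     # verify that the array is healthy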


Keyboard, Monitor

Compute nodes don’t need a keyboard, monitor, or mouse

Front-end needs a monitor for the X Window System, software development, etc.

BIOS setup may be needed to disable the keyboard check on some systems

A keyboard/monitor/mouse switch is useful


Interconnection Network

ATM
– Fast (155 Mbps - 622 Mbps)
– Too expensive for this purpose

Myrinet
– Great; offers 1.2 Gigabit bandwidth
– Still expensive

Gigabit Ethernet
Fast Ethernet: inexpensive


Fast Ethernet

The most popular network for clusters
Getting cheaper fast
Offers good bandwidth
Limit: the TCP/IP stack can pump only about 30-60 Mbps
Future technology: VIA (Virtual Interface Architecture) by Intel; Berkeley has just released a VIA implementation on Myrinet


Network Interface Card

100 Mbps is typical
100Base-T, using CAT-5 cable
Linux drivers
– Some cards are not supported
– Some are supported but do not function properly


Performance Comparison (from SCL Lab, Iowa State University)


Gigabit Ethernet

Very standard; integrates easily into existing systems

Good support for Linux

Cost is dropping rapidly; expected to be much cheaper soon

http://www.syskonnect.com/

http://netgear.baynetworks.com/


Myrinet

Full-duplex 1.28+1.28 Gigabit/second links, switch ports, and interface ports.

Flow control, error control, and "heartbeat" continuity monitoring on every link.

Low-latency, cut-through, crossbar switches, with monitoring for high-availability applications.

Any network topology is allowed. Myrinet networks can scale to tens of thousands of hosts, with network-bisection data rates in Terabits per second. Myrinet can also provide alternative communication paths between hosts.

Host interfaces that execute a control program to interact directly with host processes ("OS bypass") for low-latency communication, and directly with the network to send, receive, and buffer packets.


Quick Guide for Installation

Planning the partitions
– Root filesystem ( / )
– Swap filesystem (twice the size of memory)
– Shared directories on a file server:
  /usr/local for global software installation
  /home for user home directories on all nodes

Planning IP addresses, netmask, domain name, NIS domain (a sample layout follows below)
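As a concrete illustration of this partition plan, here is a hypothetical /etc/fstab for a compute node with 256 MB of RAM; the device names, sizes, and the file server name “server” are assumptions, not from the slides:

    # Local partitions:
    /dev/hda1   /       ext2   defaults   1 1   # root filesystem
    /dev/hda2   swap    swap   defaults   0 0   # 512 MB, twice the RAM
    # Directories shared from the file server over NFS:
    server:/usr/local   /usr/local   nfs   ro   0 0   # global software
    server:/home        /home        nfs   rw   0 0   # user home directories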


Basic Linux Installation

Make a boot disk from the CD or network distribution
Partition the hard disk according to the plan
Select packages to install
– Complete installation for the front-end and file server
– Minimal installation on compute nodes

Set up the network, the X Window System, and accounts


Cautions

Linux is not fully plug-and-play; turn plug-and-play off in the BIOS setup

Set the interrupt and DMA of each card to different values to avoid conflicts

For nodes with two or more NICs, the kernel must be recompiled with IP masquerading and IP forwarding turned on (see the sketch below)
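A minimal sketch of enabling forwarding and masquerading on a front-end with a 2.2-era kernel; the internal network address is an assumption:

    # Enable IP forwarding at runtime (the kernel must have been built
    # with forwarding/masquerading support):
    echo 1 > /proc/sys/net/ipv4/ip_forward
    # Masquerade traffic from the compute nodes (assumes the internal
    # network is 192.168.1.0/24 and ipchains-era tools):
    ipchains -A forward -s 192.168.1.0/24 -j MASQ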


Setup a Single System View

A single file structure can be achieved using NFS
– Easy and reliable
– Scalability to really large clusters?
– The autofs system can be used to mount filesystems on demand

In OSIS, /cluster is shared from a single NFS server (see the export sketch below)
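A hedged sketch of what that sharing might look like; the node names and export options are assumptions, not from the slides:

    # On the NFS server, /etc/exports might contain:
    /cluster    node1(rw,no_root_squash) node2(rw,no_root_squash)
    # Re-export after editing, then mount the share on each node:
    exportfs -a
    mount -t nfs server:/cluster /cluster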


Centralized Accounts

Centralized accounts using NIS (Network Information System)
– Set the NIS domain using the domainname command
– Start “ypserv” on the NIS server (usually the file server or front-end)
– Run make in /var/yp
– Add “++” at the end of the /etc/passwd file and start “ypbind” on each node

/etc/hosts.equiv lists all nodes (a command-by-command sketch follows below)
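A minimal sketch of those NIS steps; the domain name “beowulf” is an assumption:

    # On the NIS server:
    domainname beowulf         # set the NIS domain
    ypserv                     # start the NIS server daemon
    cd /var/yp && make         # build the NIS maps from /etc/passwd etc.

    # On each compute node:
    domainname beowulf
    echo "++" >> /etc/passwd   # enable NIS lookups at the end of the file
    ypbind                     # bind to the NIS server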


MPI Installation (MPICH)

MPICH is a popular implementation by Argonne National Laboratory and Mississippi State University

Installation (in /cluster/mpich)
– Unpack the distribution
– Run configure
– make
– make prefix=/cluster/mpich install
– Set up the path and environment
(a sketch of these steps, with a first test run, follows below)


PVM Installation

Unpack the distribution
Set the environment
– PVM_ROOT to the pvm directory
– PVM_ARCH to LINUX
– Add $PVM_ROOT/bin and $PVM_ROOT/lib to the path

Go to the pvm directory and run make (see the sketch below)
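A minimal sketch of a PVM build; the archive name and install location are assumptions:

    # Unpack and build PVM:
    tar xzf pvm3.tar.gz && cd pvm3
    export PVM_ROOT=$HOME/pvm3                 # where PVM was unpacked
    export PVM_ARCH=LINUX
    export PATH=$PVM_ROOT/lib:$PVM_ROOT/bin/$PVM_ARCH:$PATH
    make                                       # builds the pvmd daemon and libraries

    # Start the PVM console; “add <hostname>” enlists another node,
    # “halt” shuts the virtual machine down:
    pvm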


Power Requirements

Performance of a Beowulf System


Little Blue Penguin: ACL / LANL

“The Little Blue Penguin (LBP) system is a parallel computer (a cluster) consisting of 64 dual Intel Pentium II/333 MHz nodes (128 CPUs) interconnected with a specialized low-latency gigabit networking system called Myrinet, and 1/2 terabyte of RAID disk storage.”


Performance Compared to SGI Origin 2000


Beowulf Systems for …

HPC platform for scientific applications
– This is the original purpose of the Beowulf project

Storage and processing of large data
– Satellite image processing
– Information retrieval, data mining

Scalable Internet/intranet servers
Computing systems in an academic environment


More Information on Clusters

www.beowulf.org

www.beowulf-underground.org "Unsanctioned and unfettered information on building and using Beowulf systems." Current events related to Beowulf.

www.extremelinux.org “Dedicated to take Linux beyond Beowulf into commodity cluster computing.”

http://www.ieeetfcc.org/ IEEE Task Force on Cluster Computing