2003 December




From angel at miami.edu Mon Dec 1 10:25:34 2003
From: angel at miami.edu (Angel Li)
Date: Mon, 01 Dec 2003 13:25:34 -0500
Subject: [Rocks-Discuss]cluster-fork
Message-ID: <[email protected]>

Hi,

I recently installed Rocks 3.0 on a Linux cluster and when I run the command "cluster-fork" I get this error:

apple* cluster-fork ls
Traceback (innermost last):
  File "/opt/rocks/sbin/cluster-fork", line 88, in ?
    import rocks.pssh
  File "/opt/rocks/lib/python/rocks/pssh.py", line 96, in ?
    import gmon.encoder
ImportError: Bad magic number in /usr/lib/python1.5/site-packages/gmon/encoder.pyc

Any thoughts? I'm also wondering where to find the python sources for files in /usr/lib/python1.5/site-packages/gmon.

Thanks,

Angel
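The "Bad magic number" error means the first four bytes of the .pyc file do not match the bytecode format expected by the interpreter importing it, which typically happens when the file was compiled by a different Python version. A minimal check, sketched with a modern Python's importlib for illustration (Python 1.5 exposed the same value via imp.get_magic()):

```python
import importlib.util


def pyc_magic_matches(pyc_path):
    """Compare a .pyc file's 4-byte magic number against the running
    interpreter's expected value; a mismatch is what raises
    "ImportError: Bad magic number"."""
    with open(pyc_path, "rb") as f:
        return f.read(4) == importlib.util.MAGIC_NUMBER


# Demo: compile a trivial module and confirm its magic number matches.
import os, py_compile, tempfile

src = os.path.join(tempfile.mkdtemp(), "demo.py")
with open(src, "w") as f:
    f.write("x = 1\n")
print(pyc_magic_matches(py_compile.compile(src)))  # True
```

Running the same check against a .pyc left behind by a different interpreter version would print False, matching the symptom in this thread.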

From jghobrial at uh.edu Mon Dec 1 11:35:06 2003
From: jghobrial at uh.edu (Joseph)
Date: Mon, 1 Dec 2003 13:35:06 -0600 (CST)
Subject: [Rocks-Discuss]cluster-fork
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

On Mon, 1 Dec 2003, Angel Li wrote:

Hello Angel, I have the same problem, and so far there has been no response since I posted about this a month ago.

Is your frontend an AMD setup??

I am thinking this is an AMD problem.

Thanks,
Joseph


From tim.carlson at pnl.gov Mon Dec 1 14:58:54 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Mon, 01 Dec 2003 14:58:54 -0800 (PST)
Subject: [Rocks-Discuss]odd kickstart problem
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

Trying to bring up an old dead node on a Rocks 2.3.2 cluster and I get the following error in /var/log/httpd/error_log:

Traceback (innermost last):
  File "/opt/rocks/sbin/kgen", line 530, in ?
    app.run()
  File "/opt/rocks/sbin/kgen", line 497, in run
    doc = FromXmlStream(file)
  File "/usr/lib/python1.5/site-packages/xml/dom/ext/reader/Sax2.py", line 386, in FromXmlStream
    return reader.fromStream(stream, ownerDocument)
  File "/usr/lib/python1.5/site-packages/xml/dom/ext/reader/Sax2.py", line 372, in fromStream
    self.parser.parse(s)
  File "/usr/lib/python1.5/site-packages/xml/sax/expatreader.py", line 58, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python1.5/site-packages/xml/sax/xmlreader.py", line 125, in parse
    self.close()
  File "/usr/lib/python1.5/site-packages/xml/sax/expatreader.py", line 154, in close
    self.feed("", isFinal = 1)
  File "/usr/lib/python1.5/site-packages/xml/sax/expatreader.py", line 148, in feed
    self._err_handler.fatalError(exc)
  File "/usr/lib/python1.5/site-packages/xml/dom/ext/reader/Sax2.py", line 340, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: <stdin>:3298:0: no element found

Doing a wget of http://frontend-0/install/kickstart.cgi\?arch=i386\&np=2\&project=rocks on one of the working internal nodes yields the same error.

Any thoughts on this?


I've also done a fresh "rocks-dist dist".

Tim

From sjenks at uci.edu Mon Dec 1 15:35:54 2003
From: sjenks at uci.edu (Stephen Jenks)
Date: Mon, 1 Dec 2003 15:35:54 -0800
Subject: [Rocks-Discuss]cluster-fork
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

FYI, I have a dual Athlon frontend and didn't have that problem. I know that doesn't exactly help you, but at least it doesn't fail on all AMD machines.

It looks like the .pyc file might be corrupt in your installation. The source .py file (encoder.py) is in the /usr/lib/python1.5/site-packages/gmon directory, so perhaps removing the .pyc file would regenerate it (if you run cluster-fork as root?)

The md5sum for encoder.pyc on my system is:

459c78750fe6e065e9ed464ab23ab73d  encoder.pyc

So you can check if yours is different.

Steve Jenks
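Steve's suggestion can be tried in isolation: importing a module compiles and caches its bytecode. A small sketch (the module name encmod is a stand-in, not from the thread; modern interpreters cache under __pycache__/ rather than writing encoder.pyc next to the source as Python 1.5 did):

```python
import importlib, os, sys, tempfile

# Create a throwaway module on disk (hypothetical stand-in for encoder.py).
d = tempfile.mkdtemp()
with open(os.path.join(d, "encmod.py"), "w") as f:
    f.write("VALUE = 42\n")

sys.path.insert(0, d)
mod = importlib.import_module("encmod")  # import compiles and caches bytecode
print(mod.VALUE)                         # 42

# Modern interpreters write the cache under __pycache__/; Python 1.5
# wrote encmod.pyc alongside the source instead.
print(os.path.isdir(os.path.join(d, "__pycache__")))
```

So deleting a stale .pyc is safe as long as the .py source is present and the importing user can write to the directory.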

On Dec 1, 2003, at 11:35 AM, Joseph wrote:


From mjk at sdsc.edu Mon Dec 1 19:03:16 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Mon, 1 Dec 2003 19:03:16 -0800
Subject: [Rocks-Discuss]odd kickstart problem
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

You'll need to run the kpp and kgen steps (what kickstart.cgi does for you) manually to find out if this is an XML error.

# cd /home/install/profiles/current
# kpp compute

This will generate a kickstart file for a compute node, although some information will be missing since it isn't specific to a particular node (unlike what ./kickstart.cgi --client=node-name generates). What this does do is traverse the XML graph and build a monolithic XML kickstart profile. If this step works, you can then pipe ("|") the output into kgen to convert the XML to kickstart syntax. Something in this procedure should fail and point to the error.

-mjk
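The "no element found" failure Tim reported is what expat raises when it is handed an empty or truncated document, which is why a broken kpp stage surfaces as a SAX error inside kgen. A minimal reproduction with the standard xml.sax API:

```python
import xml.sax
import xml.sax.handler
from io import StringIO

# Feeding expat an empty document triggers the same parse error kgen hit
# when the XML stream it was reading died early (e.g. a missing node file).
try:
    xml.sax.parse(StringIO(""), xml.sax.handler.ContentHandler())
except xml.sax.SAXParseException as e:
    print(e.getMessage())  # no element found
```

Running kpp by itself, as Mason suggests, separates "the XML graph is broken" from "kgen mis-parsed good XML": if kpp's output is empty or truncated, this is exactly the exception the downstream parser produces.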

On Dec 1, 2003, at 2:58 PM, Tim Carlson wrote:


From tim.carlson at pnl.gov Mon Dec 1 20:42:51 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Mon, 01 Dec 2003 20:42:51 -0800 (PST)
Subject: [Rocks-Discuss]odd kickstart problem
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Mon, 1 Dec 2003, Mason J. Katz wrote:

> You'll need to run the kpp and kgen steps (what kickstart.cgi does for
> you) manually to find if this is an XML error.
>
> # cd /home/install/profiles/current
> # kpp compute

That was the trick. This sent me down the correct path. I had uninstalled SGE on the frontend (I was having problems with SGE and wanted to start from scratch).

Adding the 2 SGE XML files back to /home/install/profiles/2.3.2/nodes/ fixed everything.

Thanks!

Tim


From landman at scalableinformatics.com Tue Dec 2 04:15:07 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 02 Dec 2003 07:15:07 -0500
Subject: [Rocks-Discuss]supermicro based MB's
Message-ID: <[email protected]>

Folks:

Working on integrating a Supermicro MB based cluster. Discovered early on that all of the compute nodes have an Intel based NIC that RedHat doesn't know anything about (any version of RH). Some of the administrative nodes have other, similar issues. I am seeing a surprising amount of misdetected or undetected hardware across the collection of MBs.

Anyone have advice on where to get modules/module source for Redhat for these things? It looks like I will need to rebuild the boot CD, though the several times I have tried this previously have failed to produce a working/bootable system. It looks like new modules need to be created/inserted into the boot process (head node and cluster nodes) kernels, as well as into the installable kernels.

Has anyone done this for a Supermicro MB based system? Thanks.

Joe

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

From jghobrial at uh.edu Tue Dec 2 08:28:08 2003
From: jghobrial at uh.edu (Joseph)
Date: Tue, 2 Dec 2003 10:28:08 -0600 (CST)
Subject: [Rocks-Discuss]cluster-fork
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Indeed my md5sum is different for encoder.pyc. However, when I pulled the file and ran "cluster-fork", Python complained about an import problem. So it seems that regeneration did not occur. Is there a flag I need to pass?

I have also tried to figure out what package provides encoder and reinstall the package, but an rpm query reveals nothing.

If this is a generated file, what generates it?

It seems that an rpm file query on ganglia shows that the files in the directory belong to the package, but encoder.pyc does not.

Thanks,


Joseph

On Mon, 1 Dec 2003, Stephen Jenks wrote:

From angel at miami.edu Tue Dec 2 09:02:55 2003
From: angel at miami.edu (Angel Li)
Date: Tue, 02 Dec 2003 12:02:55 -0500
Subject: [Rocks-Discuss]cluster-fork
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Joseph wrote:

> Indeed my md5sum is different for encoder.pyc. However, when I pulled the
> file and run "cluster-fork" python responds about an import problem. So it
> seems that regeneration did not occur. Is there a flag I need to pass?

I have finally found the python sources in the HPC rolls CD, filename ganglia-python-3.0.0-2.i386.rpm. I'm not familiar with python but it seems python "compiles" the .py files to ".pyc" and then deletes the source file the first time they are referenced? I also noticed that there are two versions of python installed. Maybe the pyc files from one version won't load into the other one?

Angel

From mjk at sdsc.edu Tue Dec 2 15:52:52 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 2 Dec 2003 15:52:52 -0800
Subject: [Rocks-Discuss]cluster-fork
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Python creates the .pyc files for you, and does not remove the original .py file. I would be extremely surprised if two "identical" .pyc files had the same md5 checksum. I'd expect this to be more like a C .o file, which always contains random data to pad out to the end of a page and 32/64-bit word sizes. Still, this is just a guess; the real point is you can always remove the .pyc files and the .py will regenerate them when imported (although standard UNIX file/dir permissions still apply).

What is the import error you get from cluster-fork?

-mjk
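Mason's guess about why the checksums differ can be made concrete: the .pyc header embeds the source file's modification time, so two compiles of identical source made at different times differ at the byte level even when the bytecode itself is identical. A sketch against the modern 16-byte header (magic, flags, mtime, size; the Python 1.5-era header was just magic + mtime):

```python
import os, py_compile, struct, tempfile

# Compile a trivial module and inspect the bytecode cache header.
src = os.path.join(tempfile.mkdtemp(), "demo.py")
with open(src, "w") as f:
    f.write("x = 1\n")
pyc = py_compile.compile(src)

with open(pyc, "rb") as f:
    magic = f.read(4)                                # bytecode magic number
    flags, mtime, size = struct.unpack("<III", f.read(12))

print(size)                                 # 6 -- length of "x = 1\n"
print(mtime == int(os.stat(src).st_mtime))  # True: header records source mtime
```

Since the header carries the source's timestamp, comparing md5sums of .pyc files across machines (as tried earlier in the thread) is unreliable; comparing the magic number, or the md5sum of the .py source, is the meaningful check.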

On Dec 2, 2003, at 9:02 AM, Angel Li wrote:


From vrowley at ucsd.edu Mon Dec 1 14:27:03 2003
From: vrowley at ucsd.edu (V. Rowley)
Date: Mon, 01 Dec 2003 14:27:03 -0800
Subject: [Rocks-Discuss]PXE boot problems
Message-ID: <[email protected]>

We have installed a ROCKS 3.0.0 frontend on a DL380 and are trying to install a compute node via PXE. We are getting an error similar to the one mentioned in the archives, e.g.

> Loading initrd.img....
> Ready
>
> Failed to free base memory


We have upgraded to syslinux-2.07-1, per the suggestion in the archives, but continue to get the same error. Any ideas?

-- 
Vicky Rowley                                email: vrowley at ucsd.edu
Biomedical Informatics Research Network    work: (858) 536-5980
University of California, San Diego        fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

From naihh at imcb.a-star.edu.sg Tue Dec 2 18:50:55 2003
From: naihh at imcb.a-star.edu.sg (Nai Hong Hwa Francis)
Date: Wed, 3 Dec 2003 10:50:55 +0800
Subject: [Rocks-Discuss]RE: When will Sun Grid Engine be included in Rocks 3 for Itanium?
Message-ID: <5E118EED7CC277468A275F11EEEC39B94CCC22@EXIMCB2.imcb.a-star.edu.sg>

Hi Laurence,

I just downloaded the Rocks 3.0 for IA32 and installed it, but SGE is still not working.

Any idea?

Nai Hong Hwa Francis
Institute of Molecular and Cell Biology (A*STAR)
30 Medical Drive
Singapore 117609.
DID: (65) 6874-6196

-----Original Message-----
From: Laurence Liew [mailto:laurence at scalablesys.com]
Sent: Thursday, November 20, 2003 2:53 PM
To: Nai Hong Hwa Francis
Cc: npaci-rocks-discussion at sdsc.edu
Subject: Re: [Rocks-Discuss]RE: When will Sun Grid Engine be included in Rocks 3 for Itanium?

Hi Francis

GridEngine roll is ready for ia32. We will get an ia64 native version ready as soon as we get back from SC2003. It will be released in a few weeks' time.

Globus GT2.4 is included in the Grid Roll

Cheers!
Laurence

On Thu, 2003-11-20 at 10:13, Nai Hong Hwa Francis wrote:

> Hi,
>
> Does anyone have any idea when will Sun Grid Engine be included as part
> of Rocks 3 distribution.
>
> I am a newbie to Grid Computing.
> Anyone have any idea on how to invoke Globus in Rocks to setup a Grid?
>
> Regards
>
> Nai Hong Hwa Francis
>
> Institute of Molecular and Cell Biology (A*STAR)
> 30 Medical Drive
> Singapore 117609
> DID: 65-6874-6196
>
> -----Original Message-----
> From: npaci-rocks-discussion-request at sdsc.edu
> [mailto:npaci-rocks-discussion-request at sdsc.edu]
> Sent: Thursday, November 20, 2003 4:01 AM
> To: npaci-rocks-discussion at sdsc.edu
> Subject: npaci-rocks-discussion digest, Vol 1 #613 - 3 msgs
>
> Today's Topics:
>
>   1. top500 cluster installation movie (Greg Bruno)
>   2. Re: Running Normal Application on Rocks Cluster -
>      Newbie Question (Laurence Liew)
>
> Message: 1
> To: npaci-rocks-discussion at sdsc.edu
> From: Greg Bruno <bruno at rocksclusters.org>
> Date: Tue, 18 Nov 2003 13:41:15 -0800
> Subject: [Rocks-Discuss]top500 cluster installation movie
>
> here's a crew of 7, installing the 201st fastest supercomputer in the
> world in under two hours on the showroom floor at SC 03:
>
> http://www.rocksclusters.org/rocks.mov
>
> warning: the above file is ~65MB.
>
> - gb
>
> Message: 2
> Subject: Re: [Rocks-Discuss]Running Normal Application on Rocks Cluster -
> Newbie Question
> From: Laurence Liew <laurenceliew at yahoo.com.sg>
> To: Leong Chee Shian <chee-shian.leong at schenker.com>
> Cc: npaci-rocks-discussion at sdsc.edu
> Date: Wed, 19 Nov 2003 12:31:18 +0800
>
> Chee Shian,
>
> Thanks for your call. We will take this off list and visit you next week
> in your office as you requested.
>
> Cheers!
> laurence
>
> > On Tue, 2003-11-18 at 17:29, Leong Chee Shian wrote:
> > I have just installed Rocks 3.0 with one frontend and two compute
> > node.
> >
> > A normal file based application is installed on the frontend and is
> > NFS shared to the compute nodes.
> >
> > Question is: When run 5 sessions of my applications, the CPU
> > utilization is all concentrated on the frontend node, nothing is
> > being passed on to the compute nodes. How do I make these 3 computers
> > to function as one and share the load?
> >
> > Thanks everyone as I am really new to this clustering stuff..
> >
> > PS: The idea of exploring rocks cluster is to use a few inexpensive
> > intel machines to replace our existing multi CPU sun server,
> > suggestions and recommendations are greatly appreciated.
> >
> > Leong

-- 
Laurence Liew
CTO, Scalable Systems Pte Ltd
7 Bedok South Road
Singapore 469272
Tel : 65 6827 3953
Fax : 65 6827 3922
Mobile: 65 9029 4312
Email : laurence at scalablesys.com
http://www.scalablesys.com

DISCLAIMER: This email is confidential and may be privileged. If you are not the intended recipient, please delete it and notify us immediately. Please do not copy or use it for any purpose, or disclose its contents to any other person as it may be an offence under the Official Secrets Act. Thank you.

From laurence at scalablesys.com Tue Dec 2 19:10:08 2003
From: laurence at scalablesys.com (Laurence Liew)
Date: Wed, 03 Dec 2003 11:10:08 +0800
Subject: [Rocks-Discuss]RE: When will Sun Grid Engine be included in Rocks 3 for Itanium?
In-Reply-To: <5E118EED7CC277468A275F11EEEC39B94CCC22@EXIMCB2.imcb.a-star.edu.sg>
References: <5E118EED7CC277468A275F11EEEC39B94CCC22@EXIMCB2.imcb.a-star.edu.sg>
Message-ID: <1070421007.2452.51.camel@scalable>

Hi,

SGE is in the SGE roll.

You need to download the base, hpc and sge roll.

The install is now different from V2.3.x

Cheers!
laurence

On Wed, 2003-12-03 at 10:50, Nai Hong Hwa Francis wrote:

> Hi Laurence,



From DGURGUL at PARTNERS.ORG Wed Dec 3 07:24:29 2003
From: DGURGUL at PARTNERS.ORG (Gurgul, Dennis J.)
Date: Wed, 3 Dec 2003 10:24:29 -0500
Subject: [Rocks-Discuss]RE: When will Sun Grid Engine be included in Rocks 3 for Itanium?
Message-ID: <BC447F1AD529D311B4DE0008C71BF2EB0AE157F7@phsexch7.mgh.harvard.edu>

Where do we find the SGE roll? Under Lhoste at http://rocks.npaci.edu/Rocks/ there is a "Grid" roll listed. Is SGE in that? The userguide doesn't mention SGE.

Dennis J. Gurgul
Partners Health Care System
Research Management
Research Computing Core
617.724.3169

-----Original Message-----
From: npaci-rocks-discussion-admin at sdsc.edu
[mailto:npaci-rocks-discussion-admin at sdsc.edu] On Behalf Of Laurence Liew
Sent: Tuesday, December 02, 2003 10:10 PM
To: Nai Hong Hwa Francis
Cc: npaci-rocks-discussion at sdsc.edu
Subject: RE: [Rocks-Discuss]RE: When will Sun Grid Engine be included in Rocks 3 for Itanium?


From bruno at rocksclusters.org Wed Dec 3 07:32:14 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 3 Dec 2003 07:32:14 -0800
Subject: [Rocks-Discuss]RE: When will Sun Grid Engine be included in Rocks 3 for Itanium?
In-Reply-To: <BC447F1AD529D311B4DE0008C71BF2EB0AE157F7@phsexch7.mgh.harvard.edu>
References: <BC447F1AD529D311B4DE0008C71BF2EB0AE157F7@phsexch7.mgh.harvard.edu>
Message-ID: <[email protected]>

> Where do we find the SGE roll? Under Lhoste at
> http://rocks.npaci.edu/Rocks/
> there is a "Grid" roll listed. Is SGE in that? The userguide doesn't
> mention SGE.


the SGE roll will be available in the upcoming v3.1.0 release. scheduled release date is december 15th.

- gb

From jlkaiser at fnal.gov Wed Dec 3 08:35:18 2003
From: jlkaiser at fnal.gov (Joe Kaiser)
Date: Wed, 03 Dec 2003 10:35:18 -0600
Subject: [Rocks-Discuss]supermicro based MB's
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi,

You don't say what version of Rocks you are using. The following is for the X5DPA-GG board and Rocks 3.0. It requires modifying only the pcitable in the boot image on the tftp server. I believe the procedure for 2.3.2 requires a heck of a lot more work (but it may not); I would have to dig deep for my notes about changing 2.3.2.

This should be done on the frontend:

cd /tftpboot/X86PC/UNDI/pxelinux/
cp initrd.img initrd.img.orig
cp initrd.img /tmp
cd /tmp
mv initrd.img initrd.gz
gunzip initrd.gz
mkdir /mnt/loop
mount -o loop initrd /mnt/loop
cd /mnt/loop/modules/
vi pcitable

Search for the e1000 drivers and add the following line:

0x8086  0x1013  "e1000"  "Intel Corp.|82546EB Gigabit Ethernet Controller"

Write the file, then:

cd /tmp
umount /mnt/loop
gzip initrd
mv initrd.gz initrd.img
mv initrd.img /tftpboot/X86PC/UNDI/pxelinux/

Then boot the node.
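The pcitable edit at the heart of the procedure above can be captured in a small script. This is only a sketch: it exercises the append against a scratch copy of the file rather than the real initrd, and the 82540EM sample row is invented for illustration; the 82546EB line is the one from the post.

```shell
#!/bin/sh
# Append the new PCI id -> driver mapping to pcitable, but only once,
# so re-running the script is harmless (idempotent).
workdir=$(mktemp -d)
pcitable="$workdir/pcitable"

# Stand-in for the real modules/pcitable inside the initrd.
printf '0x8086\t0x100e\t"e1000"\t"Intel Corp.|82540EM Gigabit Ethernet Controller"\n' > "$pcitable"

# The entry from the post (tab-separated fields).
entry="$(printf '0x8086\t0x1013\t"e1000"\t"Intel Corp.|82546EB Gigabit Ethernet Controller"')"
grep -q '0x1013' "$pcitable" || printf '%s\n' "$entry" >> "$pcitable"

grep -c '"e1000"' "$pcitable"    # two e1000 rows after the append
```

Against the real boot image you would run the same append between the `mount -o loop` and `umount` steps shown above.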

Hope this helps.

Thanks,

Joe

On Tue, 2003-12-02 at 06:15, Joe Landman wrote:


> Folks:
>
>   Working on integrating a Supermicro MB based cluster. Discovered early
> on that all of the compute nodes have an Intel based NIC that RedHat
> doesn't know anything about (any version of RH). Some of the
> administrative nodes have other similar issues. I am seeing simply a
> surprising number of mis/un detected hardware across the collection of MBs.
>
>   Anyone have advice on where to get modules/module source for Redhat
> for these things? It looks like I will need to rebuild the boot CD,
> though the several times I have tried this previously have failed to
> produce a working/bootable system. It looks like new modules need to be
> created/inserted into the boot process (head node and cluster nodes)
> kernels, as well as into the installable kernels.
>
>   Has anyone done this for a Supermicro MB based system? Thanks.
>
> Joe

-- 
===================================================================
Joe Kaiser - Systems Administrator

Fermi Lab CD/OSS-SCS          Never laugh at live dragons.
630-840-6444
jlkaiser at fnal.gov
===================================================================

From jghobrial at uh.edu Wed Dec 3 08:59:15 2003
From: jghobrial at uh.edu (Joseph)
Date: Wed, 3 Dec 2003 10:59:15 -0600 (CST)
Subject: [Rocks-Discuss]cluster-fork
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
	<[email protected]> <[email protected]>
	<[email protected]> <[email protected]>
Message-ID: <[email protected]>

Here is the error I receive when I remove the file encoder.pyc and run the command cluster-fork:

Traceback (innermost last):
  File "/opt/rocks/sbin/cluster-fork", line 88, in ?
    import rocks.pssh
  File "/opt/rocks/lib/python/rocks/pssh.py", line 96, in ?
    import gmon.encoder
ImportError: No module named encoder

Thanks,
Joseph

On Tue, 2 Dec 2003, Mason J. Katz wrote:

> Python creates the .pyc files for you, and does not remove the original


> .py file. I would be extremely surprised if two "identical" .pyc files
> had the same md5 checksum. I'd expect this to be more like a C .o file
> which always contains random data to pad out to the end of a page and
> 32/64 bit word sizes. Still this is just a guess; the real point is
> you can always remove the .pyc files and the .py will regenerate it
> when imported (although standard UNIX file/dir permissions still apply).
>
> What is the import error you get from cluster-fork?
>
> -mjk
>
> On Dec 2, 2003, at 9:02 AM, Angel Li wrote:
>
> > Joseph wrote:
> >
> >> Indeed my md5sum is different for encoder.pyc. However, when I pulled
> >> the file and run "cluster-fork" python responds about an import
> >> problem. So it seems that regeneration did not occur. Is there a flag
> >> I need to pass?
> >>
> >> I have also tried to figure out what package provides encoder and
> >> reinstall the package, but an rpm query reveals nothing.
> >>
> >> If this is a generated file, what generates it?
> >>
> >> It seems that an rpm file query on ganglia shows that files in the
> >> directory belong to the package, but encoder.pyc does not.
> >>
> >> Thanks,
> >> Joseph
> >>
> > I have finally found the python sources in the HPC rolls CD, filename
> > ganglia-python-3.0.0-2.i386.rpm. I'm not familiar with python but it
> > seems python "compiles" the .py files to ".pyc" and then deletes the
> > source file the first time they are referenced? I also noticed that
> > there are two versions of python installed. Maybe the pyc files from
> > one version won't load into the other one?
> >
> > Angel
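Mason's point about regeneration is easy to demonstrate. A sketch (the module name is invented; it assumes python3 is on PATH, and modern Pythons keep the bytecode cache under __pycache__ rather than beside the source as the Python 1.5 era did, but the bad-magic handling is the same in spirit — note it only works because the .py source is present, which was exactly what the Rocks gmon package lacked):

```shell
#!/bin/sh
# Corrupt a module's compiled bytecode, then show that importing the
# .py source still works: Python notices the bad magic number and
# recompiles from source instead of failing.
demo=$(mktemp -d)
echo 'VALUE = 42' > "$demo/encoder_demo.py"

python3 - "$demo" <<'EOF'
import py_compile, sys
d = sys.argv[1]
pyc = py_compile.compile(d + "/encoder_demo.py")   # write the cache
with open(pyc, "r+b") as f:
    f.write(b"\x00\x00\x00\x00")                   # clobber the magic number
sys.path.insert(0, d)
import encoder_demo                                # recompiled from the .py
print(encoder_demo.VALUE)
EOF
```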

From mjk at sdsc.edu Wed Dec 3 15:19:38 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Wed, 3 Dec 2003 15:19:38 -0800
Subject: [Rocks-Discuss]cluster-fork
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
	<[email protected]> <[email protected]>
	<[email protected]> <[email protected]>
	<[email protected]>
Message-ID: <[email protected]>

This file comes from a ganglia package; what does


# rpm -q ganglia-receptor

return?

-mjk

On Dec 3, 2003, at 8:59 AM, Joseph wrote:

> Here is the error I receive when I remove the file encoder.pyc and run
> the command cluster-fork
>
> Traceback (innermost last):
>   File "/opt/rocks/sbin/cluster-fork", line 88, in ?
>     import rocks.pssh
>   File "/opt/rocks/lib/python/rocks/pssh.py", line 96, in ?
>     import gmon.encoder
> ImportError: No module named encoder
>
> Thanks,
> Joseph
>
>
> On Tue, 2 Dec 2003, Mason J. Katz wrote:
>
>> Python creates the .pyc files for you, and does not remove the
>> original .py file. I would be extremely surprised if two "identical"
>> .pyc files had the same md5 checksum. I'd expect this to be more like
>> a C .o file which always contains random data to pad out to the end
>> of a page and 32/64 bit word sizes. Still this is just a guess; the
>> real point is you can always remove the .pyc files and the .py will
>> regenerate it when imported (although standard UNIX file/dir
>> permissions still apply).
>>
>> What is the import error you get from cluster-fork?
>>
>> -mjk
>>
>> On Dec 2, 2003, at 9:02 AM, Angel Li wrote:
>>
>>> Joseph wrote:
>>>
>>>> Indeed my md5sum is different for encoder.pyc. However, when I
>>>> pulled the file and run "cluster-fork" python responds about an
>>>> import problem. So it seems that regeneration did not occur. Is
>>>> there a flag I need to pass?
>>>>
>>>> I have also tried to figure out what package provides encoder and
>>>> reinstall the package, but an rpm query reveals nothing.
>>>>
>>>> If this is a generated file, what generates it?
>>>>
>>>> It seems that an rpm file query on ganglia shows that files in the
>>>> directory belong to the package, but encoder.pyc does not.
>>>>
>>>> Thanks,
>>>> Joseph
>>>>
>>> I have finally found the python sources in the HPC rolls CD, filename
>>> ganglia-python-3.0.0-2.i386.rpm. I'm not familiar with python but it
>>> seems python "compiles" the .py files to ".pyc" and then deletes the
>>> source file the first time they are referenced? I also noticed that
>>> there are two versions of python installed. Maybe the pyc files from
>>> one version won't load into the other one?
>>>
>>> Angel

From csamuel at vpac.org Wed Dec 3 18:09:26 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Thu, 4 Dec 2003 13:09:26 +1100
Subject: [Rocks-Discuss]Confirmation of Rocks 3.1.0 Opteron support & RHEL trademark removal ?
Message-ID: <[email protected]>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi folks,

Can someone confirm that the next Rocks release will support Opteron please ?

Also, I noticed that the current Rocks release on Itanium based on RHEL still has a lot of mentions of RedHat in it, which from my reading of their trademark guidelines is not permitted. Is that fixed in the new version?

cheers!
Chris
- -- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing  http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/zpdWO2KABBYQAh8RAqB8AJ9FG+IjIeem21qlFS6XYIHamIMPmwCghVTV
AgjAlVHWgdv/KzYQinHGPxs=
=IAWU
-----END PGP SIGNATURE-----

From bruno at rocksclusters.org Wed Dec 3 18:46:30 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 3 Dec 2003 18:46:30 -0800
Subject: [Rocks-Discuss]Confirmation of Rocks 3.1.0 Opteron support & RHEL trademark removal ?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> Can someone confirm that the next Rocks release will support Opteron
> please ?

yes, it will support opteron.

> Also, I noticed that the current Rocks release on Itanium based on RHEL
> still has a lot of mentions of RedHat in it, which from my reading of
> their trademark guidelines is not permitted, is that fixed in the new
> version ?

and yes, (even though it doesn't feel like the right thing to do, as redhat has offered to the community some outstanding technologies that we'd like to credit), all redhat trademarks will be removed from 3.1.0.

- gb

From fds at sdsc.edu Thu Dec 4 06:46:32 2003
From: fds at sdsc.edu (Federico Sacerdoti)
Date: Thu, 4 Dec 2003 06:46:32 -0800
Subject: [Rocks-Discuss]cluster-fork
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
	<[email protected]> <[email protected]>
	<[email protected]> <[email protected]>
	<[email protected]>
Message-ID: <[email protected]>

Please install the http://www.rocksclusters.org/errata/3.0.0/ganglia-python-3.0.1-2.i386.rpm package, which includes the correct encoder.py file. (This package is listed on the 3.0.0 errata page.)

-Federico

On Dec 3, 2003, at 8:59 AM, Joseph wrote:

> Here is the error I receive when I remove the file encoder.pyc and run
> the command cluster-fork
>
> Traceback (innermost last):
>   File "/opt/rocks/sbin/cluster-fork", line 88, in ?
>     import rocks.pssh
>   File "/opt/rocks/lib/python/rocks/pssh.py", line 96, in ?
>     import gmon.encoder
> ImportError: No module named encoder
>
> Thanks,
> Joseph
>
>
> On Tue, 2 Dec 2003, Mason J. Katz wrote:
>
>> Python creates the .pyc files for you, and does not remove the
>> original .py file. I would be extremely surprised if two "identical"
>> .pyc files had the same md5 checksum. I'd expect this to be more like
>> a C .o file which always contains random data to pad out to the end
>> of a page and 32/64 bit word sizes. Still this is just a guess; the
>> real point is you can always remove the .pyc files and the .py will
>> regenerate it when imported (although standard UNIX file/dir
>> permissions still apply).
>>
>> What is the import error you get from cluster-fork?
>>
>> -mjk
>>
>> On Dec 2, 2003, at 9:02 AM, Angel Li wrote:
>>
>>> Joseph wrote:
>>>
>>>> Indeed my md5sum is different for encoder.pyc. However, when I
>>>> pulled the file and run "cluster-fork" python responds about an
>>>> import problem. So it seems that regeneration did not occur. Is
>>>> there a flag I need to pass?
>>>>
>>>> I have also tried to figure out what package provides encoder and
>>>> reinstall the package, but an rpm query reveals nothing.
>>>>
>>>> If this is a generated file, what generates it?
>>>>
>>>> It seems that an rpm file query on ganglia shows that files in the
>>>> directory belong to the package, but encoder.pyc does not.
>>>>
>>>> Thanks,
>>>> Joseph
>>>>
>>> I have finally found the python sources in the HPC rolls CD, filename
>>> ganglia-python-3.0.0-2.i386.rpm. I'm not familiar with python but it
>>> seems python "compiles" the .py files to ".pyc" and then deletes the
>>> source file the first time they are referenced? I also noticed that
>>> there are two versions of python installed. Maybe the pyc files from
>>> one version won't load into the other one?
>>>
>>> Angel

Federico

Rocks Cluster Group, San Diego Supercomputing Center, CA


From jghobrial at uh.edu Thu Dec 4 07:14:21 2003
From: jghobrial at uh.edu (Joseph)
Date: Thu, 4 Dec 2003 09:14:21 -0600 (CST)
Subject: [Rocks-Discuss]cluster-fork
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
	<[email protected]> <[email protected]>
	<[email protected]> <[email protected]>
	<[email protected]> <[email protected]>
Message-ID: <[email protected]>

Thank you very much this solved the problem.

Joseph

On Thu, 4 Dec 2003, Federico Sacerdoti wrote:

> Please install the
> http://www.rocksclusters.org/errata/3.0.0/ganglia-python-3.0.1-2.i386.rpm
> package, which includes the correct encoder.py file. (This package is
> listed on the 3.0.0 errata page)
>
> -Federico
>
> On Dec 3, 2003, at 8:59 AM, Joseph wrote:
>
> > Here is the error I receive when I remove the file encoder.pyc and run
> > the command cluster-fork
> >
> > Traceback (innermost last):
> >   File "/opt/rocks/sbin/cluster-fork", line 88, in ?
> >     import rocks.pssh
> >   File "/opt/rocks/lib/python/rocks/pssh.py", line 96, in ?
> >     import gmon.encoder
> > ImportError: No module named encoder
> >
> > Thanks,
> > Joseph
> >
> >
> > On Tue, 2 Dec 2003, Mason J. Katz wrote:
> >
> >> Python creates the .pyc files for you, and does not remove the
> >> original .py file. I would be extremely surprised if two "identical"
> >> .pyc files had the same md5 checksum. I'd expect this to be more
> >> like a C .o file which always contains random data to pad out to
> >> the end of a page and 32/64 bit word sizes. Still this is just a
> >> guess; the real point is you can always remove the .pyc files and
> >> the .py will regenerate it when imported (although standard UNIX
> >> file/dir permissions still apply).
> >>
> >> What is the import error you get from cluster-fork?
> >>
> >> -mjk
> >>
> >> On Dec 2, 2003, at 9:02 AM, Angel Li wrote:
> >>
> >>> Joseph wrote:
> >>>
> >>>> Indeed my md5sum is different for encoder.pyc. However, when I
> >>>> pulled the file and run "cluster-fork" python responds about an
> >>>> import problem. So it seems that regeneration did not occur. Is
> >>>> there a flag I need to pass?
> >>>>
> >>>> I have also tried to figure out what package provides encoder and
> >>>> reinstall the package, but an rpm query reveals nothing.
> >>>>
> >>>> If this is a generated file, what generates it?
> >>>>
> >>>> It seems that an rpm file query on ganglia shows that files in the
> >>>> directory belong to the package, but encoder.pyc does not.
> >>>>
> >>>> Thanks,
> >>>> Joseph
> >>>>
> >>> I have finally found the python sources in the HPC rolls CD, filename
> >>> ganglia-python-3.0.0-2.i386.rpm. I'm not familiar with python but it
> >>> seems python "compiles" the .py files to ".pyc" and then deletes the
> >>> source file the first time they are referenced? I also noticed that
> >>> there are two versions of python installed. Maybe the pyc files from
> >>> one version won't load into the other one?
> >>>
> >>> Angel
>
> Federico
>
> Rocks Cluster Group, San Diego Supercomputing Center, CA

From vrowley at ucsd.edu Thu Dec 4 12:29:55 2003
From: vrowley at ucsd.edu (V. Rowley)
Date: Thu, 04 Dec 2003 12:29:55 -0800
Subject: [Rocks-Discuss]Re: PXE boot problems
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Uh, nevermind. We had upgraded syslinux on our frontend, not the node we were trying to PXE boot. Sigh.

V. Rowley wrote:


> We have installed a ROCKS 3.0.0 frontend on a DL380 and are trying to
> install a compute node via PXE. We are getting an error similar to the
> one mentioned in the archives, e.g.
>
>> Loading initrd.img....
>> Ready
>>
>> Failed to free base memory
>
> We have upgraded to syslinux-2.07-1, per the suggestion in the archives,
> but continue to get the same error. Any ideas?

-- 
Vicky Rowley                              email: vrowley at ucsd.edu
Biomedical Informatics Research Network   work: (858) 536-5980
University of California, San Diego       fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

From cdwan at mail.ahc.umn.edu Fri Dec 5 08:16:07 2003
From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB))
Date: Fri, 5 Dec 2003 10:16:07 -0600 (CST)
Subject: [Rocks-Discuss]Private NIS master
Message-ID: <[email protected]>

Hello all. Long time listener, first time caller. Thanks for all the great work.

I'm integrating a Rocks cluster into an existing NIS domain. I noticed that while the cluster database now supports a PrivateNISMaster, that variable doesn't make it into /etc/yp.conf on the compute nodes; they remain in broadcast mode.

Assume that, for whatever reason, I don't want to set up a repeater (slave) ypserv process on my frontend. I added the option "--nisserver <var name="Kickstart_PrivateNISMaster"/>" to the "profiles/3.0.0/nodes/nis-client.xml" file, removed the ypserver on my frontend, and it works like I want it to.

Am I missing anything fundamental here?

-Chris Dwan
 University of Minnesota

From wyzhong78 at msn.com Mon Dec 8 06:18:34 2003
From: wyzhong78 at msn.com (zhong wenyu)
Date: Mon, 08 Dec 2003 22:18:34 +0800
Subject: [Rocks-Discuss]3.0.0 problem: not able to boot up
Message-ID: <[email protected]>

Hi, everyone!

I installed Rocks 3.0.0 with the defaults; there wasn't any trouble during the install. But it hasn't been able to boot: it stops at the beginning, the message "GRUB" shows on the screen, and it waits.... My hardware is dual Xeon 2.4G, MSI 9138, and a Seagate SCSI disk. Any help is appreciated!

_________________________________________________________________
MSN Explorer: http://explorer.msn.com/lccn/

From angelini at vki.ac.be Mon Dec 8 06:20:45 2003
From: angelini at vki.ac.be (Angelini Giuseppe)
Date: Mon, 08 Dec 2003 15:20:45 +0100
Subject: [Rocks-Discuss]How to use MPICH with ssh
Message-ID: <[email protected]>

Dear rocks folk,

I have recently installed mpich with Lahey Fortran, and now that I can compile and link, I would like to run. But it seems that I have another problem; in fact I get the following error message when I try to run:

[panara at compute-0-7 ~]$ mpirun -np $NPROC -machinefile $PBS_NODEFILE $DPT/hybflow
p0_13226: p4_error: Path to program is invalid while starting
/dc_03_04/panara/PREPRO_TESTS/hybflow with /usr/bin/rsh on compute-0-7: -1
    p4_error: latest msg from perror: No such file or directory
p0_13226: p4_error: Child process exited while making connection to
remote process on compute-0-6: 0
p0_13226: (6.025133) net_send: could not write to fd=4, errno = 32
p0_13226: (6.025231) net_send: could not write to fd=4, errno = 32

I am wondering why it is looking for /usr/bin/rsh for the communication; I expected it to use ssh and not rsh.

Any help will be welcome.

Regards.

Giuseppe Angelini

From casuj at cray.com Mon Dec 8 07:31:21 2003
From: casuj at cray.com (John Casu)
Date: Mon, 8 Dec 2003 07:31:21 -0800
Subject: [Rocks-Discuss]How to use MPICH with ssh
In-Reply-To: <[email protected]>; from Angelini Giuseppe on Mon, Dec 08, 2003 at 03:20:45PM +0100
References: <[email protected]>
Message-ID: <[email protected]>


On Mon, Dec 08, 2003 at 03:20:45PM +0100, Angelini Giuseppe wrote:
>
> Dear rocks folk,
>
>
> I have recently installed mpich with Lahay Fortran and now that I can
> compile and link,
> I would like to run but it seems that I have another problem. In fact I
> have the following
> error message when I try to run:
>
> [panara at compute-0-7 ~]$ mpirun -np $NPROC -machinefile $PBS_NODEFILE
> $DPT/hybflow
> p0_13226: p4_error: Path to program is invalid while starting
> /dc_03_04/panara/PREPRO_TESTS/hybflow with /usr/bin/rsh on compute-0-7:
> -1
>     p4_error: latest msg from perror: No such file or directory
> p0_13226: p4_error: Child process exited while making connection to
> remote process on compute-0-6: 0
> p0_13226: (6.025133) net_send: could not write to fd=4, errno = 32
> p0_13226: (6.025231) net_send: could not write to fd=4, errno = 32
>
> I am wondering why it is looking for /usr/bin/rsh for the communication,
>
> I expected to use ssh and not rsh.
>
> Any help will be welcome.
>

build mpich thus:

RSHCOMMAND=ssh ./configure .....
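John's one-liner expands to a rebuild along these lines. This is only a sketch: the source directory, device flag, and install prefix are assumptions, not taken from the thread; RSHCOMMAND=ssh is the part that stops the ch_p4 device from hard-wiring /usr/bin/rsh.

```shell
# Rebuild MPICH so its ch_p4 device launches remote processes over ssh
# instead of the compiled-in /usr/bin/rsh default.
cd mpich-1.2.5        # hypothetical unpacked MPICH1 source tree
RSHCOMMAND=ssh ./configure --with-device=ch_p4 --prefix=/opt/mpich-ssh
make && make install
```

After installing, point your PATH (and any mpirun wrappers) at the new prefix so jobs pick up the ssh-enabled binaries.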

> Regards.
>
>
> Giuseppe Angelini

-- 
"Roses are red, Violets are blue, You lookin' at me ? YOU LOOKIN' AT ME ?!" -- Get Fuzzy.
=======================================================================
John Casu
Cray Inc.                               casuj at cray.com
411 First Avenue South, Suite 600       Tel: (206) 701-2173
Seattle, WA 98104-2860                  Fax: (206) 701-2500
=======================================================================

From davidow at molbio.mgh.harvard.edu Mon Dec 8 08:12:53 2003
From: davidow at molbio.mgh.harvard.edu (Lance Davidow)
Date: Mon, 8 Dec 2003 11:12:53 -0500
Subject: [Rocks-Discuss]How to use MPICH with ssh
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <p06002001bbfa51fea005@[132.183.190.222]>

Giuseppe,

Here's an answer from a newbie who just faced the same problem.

You are using the wrong flavor of mpich (and mpirun). There are several different distributions, which work differently in ROCKS. The one you are using in the default path expects serv_p4 daemons and .rhosts files in your home directory. The different flavors may be more compatible with different compilers as well.

[lance at rescluster2 lance]$ which mpirun
/opt/mpich-mpd/gnu/bin/mpirun

The one you probably want is
/opt/mpich/gnu/bin/mpirun

[lance at rescluster2 lance]$ locate mpirun
...
/opt/mpich-mpd/gnu/bin/mpirun
...
/opt/mpich/myrinet/gnu/bin/mpirun
...
/opt/mpich/gnu/bin/mpirun
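Which flavor you get is purely a PATH question. A quick illustration with stand-in directories (the temp directories below are fakes standing in for /opt/mpich-mpd/gnu/bin and /opt/mpich/gnu/bin, so nothing here touches a real install):

```shell
#!/bin/sh
# Two fake mpirun wrappers standing in for the mpd and p4 flavors.
mpd=$(mktemp -d); p4=$(mktemp -d)
printf '#!/bin/sh\necho mpd-flavor\n' > "$mpd/mpirun"
printf '#!/bin/sh\necho p4-flavor\n'  > "$p4/mpirun"
chmod +x "$mpd/mpirun" "$p4/mpirun"

# Whichever directory comes first on PATH wins -- so putting
# /opt/mpich/gnu/bin ahead of /opt/mpich-mpd/gnu/bin selects the
# plain (p4) mpirun that the reply above recommends.
PATH="$p4:$mpd:$PATH" mpirun
```

The same idea applied for real would be `export PATH=/opt/mpich/gnu/bin:$PATH`, or simply invoking mpirun by its full path.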

Cheers,
Lance

At 3:20 PM +0100 12/8/03, Angelini Giuseppe wrote:
>Dear rocks folk,
>
>
>I have recently installed mpich with Lahay Fortran and now that I can
>compile and link,
>I would like to run but it seems that I have another problem. In fact I
>have the following
>error message when I try to run:
>
>[panara at compute-0-7 ~]$ mpirun -np $NPROC -machinefile $PBS_NODEFILE
>$DPT/hybflow
>p0_13226: p4_error: Path to program is invalid while starting
>/dc_03_04/panara/PREPRO_TESTS/hybflow with /usr/bin/rsh on compute-0-7:
>-1
> p4_error: latest msg from perror: No such file or directory
>p0_13226: p4_error: Child process exited while making connection to
>remote process on compute-0-6: 0
>p0_13226: (6.025133) net_send: could not write to fd=4, errno = 32
>p0_13226: (6.025231) net_send: could not write to fd=4, errno = 32
>
>I am wondering why it is looking for /usr/bin/rsh for the communication,
>
>I expected to use ssh and not rsh.
>
>Any help will be welcome.
>
>
>Regards.
>
>Giuseppe Angelini

-- 
Lance Davidow, PhD
Director of Bioinformatics
Dept of Molecular Biology
Mass General Hospital
Boston MA 02114
davidow at molbio.mgh.harvard.edu
617.726-5955
Fax: 617.726-6893

From rscarce at caci.com Fri Dec 5 16:43:00 2003
From: rscarce at caci.com (Reed Scarce)
Date: Fri, 5 Dec 2003 19:43:00 -0500
Subject: [Rocks-Discuss]PXE and system images
Message-ID: <OFF783DCCA.8F016562-ON85256DF3.008001FC-85256DF7.00043E45@caci.com>

We want to initialize new hardware with a known good image from identical hardware currently in use. The process imagined would be to PXE boot to a disk image server, PXE would create a RAM system that would request the system disk image from the server, which would push the desired system disk image to the requesting system. Upon completion the system would be available as a cluster member.

The lab configuration is a PC grade frontend with two 3Com 905s and a single server grade cluster node with integrated Intel 82551 (10/100) (the only PXE interface) and two integrated Intel 82546 (10/100/1000). The cluster node is one of the stock of nodes for the expansion. The stock of nodes have a Linux OS pre-installed, which would be eliminated in the process.

Currently the node will PXE boot from the 10/100 and pick up an installation boot from one of the g-bit interfaces. From there kickstart wants to take over.

Any recommendations how to get kickstart to push an image to the disk?

Thanks,

Reed Scarce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031205/dad04521/attachment-0001.html

From wyzhong78 at msn.com Mon Dec 8 05:36:37 2003
From: wyzhong78 at msn.com (zhong wenyu)
Date: Mon, 08 Dec 2003 21:36:37 +0800
Subject: [Rocks-Discuss]Rocks 3.0.0 problem: not able to boot up
Message-ID: <[email protected]>

Hi, everyone!

I have installed Rocks 3.0.0 with default options successfully; there was not any trouble. But when I boot it up, it stops at the beginning, just showing "GRUB" on the screen and waiting...

Thanks for your help!

_________________________________________________________________
MSN Explorer: http://explorer.msn.com/lccn/

From daniel.kidger at quadrics.com Mon Dec 8 09:54:53 2003
From: daniel.kidger at quadrics.com (daniel.kidger at quadrics.com)
Date: Mon, 8 Dec 2003 17:54:53 -0000
Subject: [Rocks-Discuss]custom-kernels : naming conventions ? (Rocks 3.0.0)
Message-ID: <[email protected]>

Dear all,

Previously I have been installing a custom kernel on the compute nodes with an "extend-compute.xml" and an "/etc/init.d/qsconfigure" (to fix grub.conf).

However I am now trying to do it the 'proper' way. So I do (on :

# cp qsnet-RedHat-kernel-2.4.18-27.3.10qsnet.i686.rpm \
    /home/install/rocks-dist/7.3/en/os/i386/force/RPMS
# cd /home/install
# rocks-dist dist
# SSH_NO_PASSWD=1 shoot-node compute-0-0

Hence:

# find /home/install/ | xargs -l grep -nH qsnet

shows me that hdlist and hdlist2 now contain this RPM. (And indeed, if I duplicate my rpm in that directory, rocks-dist notices this and warns me.)

However the node always ends up with "2.4.20-20.7smp" again. anaconda-ks.cfg contains just "kernel-smp" and install.log has "Installing kernel-smp-2.4.20-20.7."

So my question is: it looks like my RPM has a name that Rocks doesn't understand properly. What is wrong with my name, and what are the rules for getting the correct name? (.i686.rpm is of course correct, but I don't have -smp. in the name. Is this the problem?)

cf. Greg Bruno's wisdom: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-April/001770.html

Yours,
Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.      daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK         0117 915 5505
----------------------- www.quadrics.com --------------------


From DGURGUL at PARTNERS.ORG Mon Dec 8 11:09:27 2003
From: DGURGUL at PARTNERS.ORG (Gurgul, Dennis J.)
Date: Mon, 8 Dec 2003 14:09:27 -0500
Subject: [Rocks-Discuss]cluster-fork --mpd strangeness
Message-ID: <BC447F1AD529D311B4DE0008C71BF2EB0AE15840@phsexch7.mgh.harvard.edu>

I just did "cluster-fork -Uvh /sourcedir/ganglia-python-3.0.1-2.i386.rpm" and then "cluster-fork service gschedule restart" (not sure I had to do the last). I also put 3.0.1-2 and restarted gschedule on the frontend.

Now I run "cluster-fork --mpd w".

I currently have a user who ssh'd to compute-0-8 from the frontend and one who ssh'd into compute-0-17 from the front end.

But the return shows the users on lines for 17 (for the user on 0-8) and 10 (for the user on 0-17):

17:  1:58pm  up 24 days, 3:20,  1 user,  load average: 0.00, 0.00, 0.03
17: USER     TTY      FROM              LOGIN@   IDLE    JCPU   PCPU  WHAT
17: lance    pts/0    rescluster2.mgh.  1:31pm   40.00s  0.02s  0.02s -bash

10:  1:58pm  up 24 days, 3:21,  1 user,  load average: 0.02, 0.04, 0.07
10: USER     TTY      FROM              LOGIN@   IDLE    JCPU   PCPU  WHAT
10: dennis   pts/0    rescluster2.mgh.  1:57pm   17.00s  0.02s  0.02s -bash

When I do "cluster-fork w" (without the --mpd), the users show up on the correct nodes.

Do the numbers on the left of the --mpd output correspond to the node names?

Thanks.

Dennis

Dennis J. Gurgul
Partners Health Care System
Research Management
Research Computing Core
617.724.3169

From DGURGUL at PARTNERS.ORG Mon Dec 8 11:28:30 2003
From: DGURGUL at PARTNERS.ORG (Gurgul, Dennis J.)
Date: Mon, 8 Dec 2003 14:28:30 -0500
Subject: [Rocks-Discuss]cluster-fork --mpd strangeness
Message-ID: <BC447F1AD529D311B4DE0008C71BF2EB0AE15843@phsexch7.mgh.harvard.edu>

Maybe this is a better description of the "strangeness".

I did "cluster-fork --mpd hostname":

1: compute-0-0.local
2: compute-0-1.local
3: compute-0-3.local
4: compute-0-13.local
5: compute-0-11.local
6: compute-0-15.local
7: compute-0-16.local
8: compute-0-19.local
9: compute-0-21.local
10: compute-0-17.local
11: compute-0-5.local
12: compute-0-20.local
13: compute-0-18.local
14: compute-0-12.local
15: compute-0-9.local
16: compute-0-4.local
17: compute-0-8.local
18: compute-0-14.local
19: compute-0-2.local
20: compute-0-6.local
0: compute-0-7.local
21: compute-0-10.local
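The numeric prefixes in output like the above appear to be the ranks the mpd launch assigned, in launch order, rather than node numbers (an inference from the output shape, not confirmed in the thread). One way to tame the interleaving when comparing runs is to sort on the numeric prefix; sample lines are inlined here so the command is self-contained:

```shell
#!/bin/sh
# Sort "rank: hostname" lines numerically by the rank prefix.
printf '10: compute-0-17.local\n2: compute-0-1.local\n0: compute-0-7.local\n' \
  | sort -t: -k1,1n
```

The same pipe applied to real `cluster-fork --mpd hostname` output gives a stable rank-to-node table.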

Dennis J. Gurgul
Partners Health Care System
Research Management
Research Computing Core
617.724.3169

-----Original Message-----
From: npaci-rocks-discussion-admin at sdsc.edu
[mailto:npaci-rocks-discussion-admin at sdsc.edu] On Behalf Of Gurgul, Dennis J.
Sent: Monday, December 08, 2003 2:09 PM
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]cluster-fork --mpd strangeness

I just did "cluster-fork -Uvh /sourcedir/ganglia-python-3.0.1-2.i386.rpm" and then "cluster-fork service gschedule restart" (not sure I had to do the last). I also put 3.0.1-2 and restarted gschedule on the frontend.

Now I run "cluster-fork --mpd w".

I currently have a user who ssh'd to compute-0-8 from the frontend and one who ssh'd into compute-0-17 from the front end.

But the return shows the users on lines for 17 (for the user on 0-8) and 10 (for the user on 0-17):

17:  1:58pm  up 24 days, 3:20,  1 user,  load average: 0.00, 0.00, 0.03
17: USER     TTY      FROM              LOGIN@   IDLE    JCPU   PCPU  WHAT
17: lance    pts/0    rescluster2.mgh.  1:31pm   40.00s  0.02s  0.02s -bash

10:  1:58pm  up 24 days, 3:21,  1 user,  load average: 0.02, 0.04, 0.07
10: USER     TTY      FROM              LOGIN@   IDLE    JCPU   PCPU  WHAT
10: dennis   pts/0    rescluster2.mgh.  1:57pm   17.00s  0.02s  0.02s -bash

When I do "cluster-fork w" (without the --mpd), the users show up on the correct nodes.

Do the numbers on the left of the -mpd output correspond to the node names?


Thanks.

Dennis

Dennis J. Gurgul
Partners Health Care System
Research Management
Research Computing Core
617.724.3169

From tim.carlson at pnl.gov Mon Dec 8 12:35:16 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Mon, 08 Dec 2003 12:35:16 -0800 (PST)
Subject: [Rocks-Discuss]PXE and system images
In-Reply-To: <OFF783DCCA.8F016562-ON85256DF3.008001FC-85256DF7.00043E45@caci.com>
Message-ID: <[email protected]>

On Fri, 5 Dec 2003, Reed Scarce wrote:

> We want to initialize new hardware with a known good image from identical
> hardware currently in use. The process imagined would be to PXE boot to a
> disk image server, PXE would create a RAM system that would request the
> system disk image from the server, which would push the desired system
> disk image to the requesting system. Upon completion the system would be
> available as a cluster member.
>
> The lab configuration is a PC grade frontend with two 3Com 905s and a
> single server grade cluster node with integrated Intel 82551 (10/100) (the
> only PXE interface) and two integrated Intel 82546 (10/100/1000). The
> cluster node is one of the stock of nodes for the expansion. The stock of
> nodes have a Linux OS pre-installed, which would be eliminated in the
> process.
>
> Currently the node will PXE boot from the 10/100 and pickup an
> installation boot from one of the g-bit interfaces. From there kickstart
> wants to take over.
>
> Any recommendations how to get kickstart to push an image to the disk?

This sounds like you want to use Oscar instead of ROCKS.

http://oscar.openclustergroup.org/tiki-index.php

I'm not exactly sure why you think that the kickstart process won't give you exactly the same image on every machine. If the hardware is the same, you'll get the same image on each machine.

We have boxes with the same setup, 10/100 PXE, and then dual gigabit. Our method for installing ROCKS on this type of hardware is the following:

1) Run insert-ethers and choose "manager" type of node.
2) Connect all the PXE interfaces to the switch and boot them all. Do not connect the gigabit interface.
3) Once all of the nodes have PXE booted, exit insert-ethers. Start insert-ethers again and this time choose compute node.
4) Hook up the gigabit interface and the PXE interface to your nodes. All of your machines will now install.
5) In our case, we now quickly disconnect the PXE interface because we don't want to have the machine continually install. The real ROCKS method would have you choose (HD/net) for booting in the BIOS, but if you already have an OS on your machine, you would have to go into the BIOS twice before the compute nodes were installed. We disable rocks-grub and just connect up the PXE cable if we need to reinstall.

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From tim.carlson at pnl.gov Mon Dec 8 12:42:23 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Mon, 08 Dec 2003 12:42:23 -0800 (PST)
Subject: [Rocks-Discuss]custom-kernels : naming conventions ? (Rocks 3.0.0)
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Mon, 8 Dec 2003 daniel.kidger at quadrics.com wrote:

I've gotten confused from time to time as to where to place custom RPMS (it's changed between releases), so my not-so-clean method is to just rip out the kernels in /home/install/rocks-dist/7.3/en/os/i386/Redhat/RPMS and drop my own in. Then do a

cd /home/install
rocks-dist dist
shoot-node

You are probably running into an issue where the "force" directory is more of an "in addition to" directory and your 2.4.18 kernel is being noted, but ignored since the 2.4.20 kernel is newer. I assume your nodes get both an SMP and a UP version of 2.4.20 and that your custom 2.4.18 is nowhere to be found on the compute node.
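The "newer kernel wins" behavior Tim describes can be illustrated with a toy version comparison. This is a sketch only, not the actual anaconda/RPM dependency logic:

```python
# Toy illustration (NOT anaconda's real depsolver): when two packages share
# the package name "kernel", the installer keeps the higher version, so a
# custom 2.4.18 dropped into force/RPMS loses to the stock 2.4.20.

def version_key(version):
    """Split a dotted version like '2.4.18' into a comparable integer tuple."""
    return tuple(int(part) for part in version.split("."))

def newest(versions):
    """Return the version a newest-wins installer would prefer."""
    return max(versions, key=version_key)

print(newest(["2.4.18", "2.4.20"]))  # prints 2.4.20 -- the custom kernel is ignored
```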

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

> Previously I have been installing a custom kernel on the compute nodes
> with an "extend-compute.xml" and an "/etc/init.d/qsconfigure" (to fix grub.conf).
>
> However I am now trying to do it the 'proper' way. So I do (on :
> # cp qsnet-RedHat-kernel-2.4.18-27.3.10qsnet.i686.rpm \
>      /home/install/rocks-dist/7.3/en/os/i386/force/RPMS
> # cd /home/install
> # rocks-dist dist
> # SSH_NO_PASSWD=1 shoot-node compute-0-0
>
> Hence:
> # find /home/install/ |xargs -l grep -nH qsnet


> shows me that hdlist and hdlist2 now contain this RPM. (and indeed if I
> duplicate my rpm in that directory rocks-dist notices this and warns me.)
>
> However the node always ends up with "2.4.20-20.7smp" again.
> anaconda-ks.cfg contains just "kernel-smp" and install.log has
> "Installing kernel-smp-2.4.20-20.7."
>
> So my question is:
> It looks like my RPM has a name that Rocks doesn't understand properly.
> What is wrong with my name ?
> and what are the rules for getting the correct name ?
> (.i686.rpm is of course correct, but I don't have -smp. in the name. Is this the problem ?)
>
> cf. Greg Bruno's wisdom:
> https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-April/001770.html
>
> Yours,
> Daniel.

From fds at sdsc.edu Mon Dec 8 12:51:12 2003
From: fds at sdsc.edu (Federico Sacerdoti)
Date: Mon, 8 Dec 2003 12:51:12 -0800
Subject: [Rocks-Discuss]cluster-fork --mpd strangeness
In-Reply-To: <BC447F1AD529D311B4DE0008C71BF2EB0AE15843@phsexch7.mgh.harvard.edu>
References: <BC447F1AD529D311B4DE0008C71BF2EB0AE15843@phsexch7.mgh.harvard.edu>
Message-ID: <[email protected]>

You are right, and I think this is a shortcoming of MPD. There is no obvious way to force the MPD numbering to correspond to the order the nodes were called out on the command line (cluster-fork --mpd actually makes a shell call to mpirun and it calls out all the node names explicitly). MPD seems to number the output differently, as you found out.

So mpd for now may be more useful for jobs that are not sensitive to this. If enough of you find this shortcoming to be a real annoyance, we could work on putting the node name label on the output by explicitly calling "hostname" or similar.
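The hostname-labeling idea could be approximated today as a post-processing step: run "cluster-fork --mpd hostname" once to learn which rank MPD gave each node, then use that map to relabel the output of any later --mpd command. A hypothetical sketch (the helper names and parsing are assumptions, not part of cluster-fork):

```python
# Hypothetical post-processing sketch: recover node names from MPD's
# rank-prefixed output using a previously captured rank->hostname map.

def rank_to_host(mpd_hostname_output):
    """Parse lines like '17: compute-0-8.local' into {rank: hostname}."""
    mapping = {}
    for line in mpd_hostname_output.splitlines():
        rank, sep, host = line.partition(": ")
        if sep and rank.strip().isdigit():
            mapping[int(rank)] = host.strip()
    return mapping

def relabel(mpd_output, mapping):
    """Replace each 'rank:' prefix with the node name MPD assigned that rank."""
    labeled = []
    for line in mpd_output.splitlines():
        rank, sep, rest = line.partition(": ")
        if sep and rank.strip().isdigit():
            labeled.append(f"{mapping[int(rank)]}: {rest}")
        else:
            labeled.append(line)
    return "\n".join(labeled)

hosts = rank_to_host("17: compute-0-8.local\n10: compute-0-17.local")
print(relabel("17: lance pts/0", hosts))  # prints compute-0-8.local: lance pts/0
```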

Good ideas are welcome :)
-Federico

On Dec 8, 2003, at 11:28 AM, Gurgul, Dennis J. wrote:

> Maybe this is a better description of the "strangeness".
>
> I did "cluster-fork --mpd hostname":
>
> 1: compute-0-0.local
> 2: compute-0-1.local
> 3: compute-0-3.local
> 4: compute-0-13.local
> 5: compute-0-11.local
> 6: compute-0-15.local
> 7: compute-0-16.local


> 8: compute-0-19.local
> 9: compute-0-21.local
> 10: compute-0-17.local
> 11: compute-0-5.local
> 12: compute-0-20.local
> 13: compute-0-18.local
> 14: compute-0-12.local
> 15: compute-0-9.local
> 16: compute-0-4.local
> 17: compute-0-8.local
> 18: compute-0-14.local
> 19: compute-0-2.local
> 20: compute-0-6.local
> 0: compute-0-7.local
> 21: compute-0-10.local
>
> Dennis J. Gurgul
> Partners Health Care System
> Research Management
> Research Computing Core
> 617.724.3169
>
> -----Original Message-----
> From: npaci-rocks-discussion-admin at sdsc.edu
> [mailto:npaci-rocks-discussion-admin at sdsc.edu]On Behalf Of Gurgul,
> Dennis J.
> Sent: Monday, December 08, 2003 2:09 PM
> To: npaci-rocks-discussion at sdsc.edu
> Subject: [Rocks-Discuss]cluster-fork --mpd strangeness
>
> I just did "cluster-fork -Uvh /sourcedir/ganglia-python-3.0.1-2.i386.rpm"
> and then "cluster-fork service gschedule restart" (not sure I had to do the
> last).
> I also put 3.0.1-2 and restarted gschedule on the frontend.
>
> Now I run "cluster-fork --mpd w".
>
> I currently have a user who ssh'd to compute-0-8 from the frontend and one
> who ssh'd into compute-0-17 from the front end.
>
> But the return shows the users on lines for 17 (for the user on 0-8) and 10
> (for the user on 0-17):
>
> 17:  1:58pm  up 24 days,  3:20,  1 user,  load average: 0.00, 0.00, 0.03
> 17: USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU  WHAT
> 17: lance    pts/0    rescluster2.mgh.  1:31pm 40.00s  0.02s  0.02s  -bash
>
> 10:  1:58pm  up 24 days,  3:21,  1 user,  load average: 0.02, 0.04,


> 0.07
> 10: USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU  WHAT
> 10: dennis   pts/0    rescluster2.mgh.  1:57pm 17.00s  0.02s  0.02s  -bash
>
> When I do "cluster-fork w" (without the --mpd) the users show up on the
> correct nodes.
>
> Do the numbers on the left of the -mpd output correspond to the node names?
>
> Thanks.
>
> Dennis
>
> Dennis J. Gurgul
> Partners Health Care System
> Research Management
> Research Computing Core
> 617.724.3169

Federico

Rocks Cluster Group, San Diego Supercomputing Center, CA

From DGURGUL at PARTNERS.ORG Mon Dec 8 12:55:13 2003
From: DGURGUL at PARTNERS.ORG (Gurgul, Dennis J.)
Date: Mon, 8 Dec 2003 15:55:13 -0500
Subject: [Rocks-Discuss]cluster-fork --mpd strangeness
Message-ID: <BC447F1AD529D311B4DE0008C71BF2EB0AE15847@phsexch7.mgh.harvard.edu>

Thanks.

On a related note, when I did "cluster-fork service gschedule restart" gschedule started with the "OK" output, but then the fork process hung on each node and I had to ^c out for it to go on to the next node.

I tried to ssh to a node and then did the gschedule restart. Even then, after I tried to "exit" out of the node, the session hung and I had to log back in and kill it from the frontend.

Dennis J. Gurgul
Partners Health Care System
Research Management
Research Computing Core
617.724.3169

-----Original Message-----
From: Federico Sacerdoti [mailto:fds at sdsc.edu]
Sent: Monday, December 08, 2003 3:51 PM
To: Gurgul, Dennis J.
Cc: npaci-rocks-discussion at sdsc.edu
Subject: Re: [Rocks-Discuss]cluster-fork --mpd strangeness


You are right, and I think this is a shortcoming of MPD. There is no obvious way to force the MPD numbering to correspond to the order the nodes were called out on the command line (cluster-fork --mpd actually makes a shell call to mpirun and it calls out all the node names explicitly). MPD seems to number the output differently, as you found out.

So mpd for now may be more useful for jobs that are not sensitive to this. If enough of you find this shortcoming to be a real annoyance, we could work on putting the node name label on the output by explicitly calling "hostname" or similar.

Good ideas are welcome :)
-Federico

On Dec 8, 2003, at 11:28 AM, Gurgul, Dennis J. wrote:

> Maybe this is a better description of the "strangeness".
>
> I did "cluster-fork --mpd hostname":
>
> 1: compute-0-0.local
> 2: compute-0-1.local
> 3: compute-0-3.local
> 4: compute-0-13.local
> 5: compute-0-11.local
> 6: compute-0-15.local
> 7: compute-0-16.local
> 8: compute-0-19.local
> 9: compute-0-21.local
> 10: compute-0-17.local
> 11: compute-0-5.local
> 12: compute-0-20.local
> 13: compute-0-18.local
> 14: compute-0-12.local
> 15: compute-0-9.local
> 16: compute-0-4.local
> 17: compute-0-8.local
> 18: compute-0-14.local
> 19: compute-0-2.local
> 20: compute-0-6.local
> 0: compute-0-7.local
> 21: compute-0-10.local
>
> Dennis J. Gurgul
> Partners Health Care System
> Research Management
> Research Computing Core
> 617.724.3169
>
> -----Original Message-----
> From: npaci-rocks-discussion-admin at sdsc.edu
> [mailto:npaci-rocks-discussion-admin at sdsc.edu]On Behalf Of Gurgul,
> Dennis J.
> Sent: Monday, December 08, 2003 2:09 PM
> To: npaci-rocks-discussion at sdsc.edu


> Subject: [Rocks-Discuss]cluster-fork --mpd strangeness
>
> I just did "cluster-fork -Uvh /sourcedir/ganglia-python-3.0.1-2.i386.rpm"
> and then "cluster-fork service gschedule restart" (not sure I had to do the
> last).
> I also put 3.0.1-2 and restarted gschedule on the frontend.
>
> Now I run "cluster-fork --mpd w".
>
> I currently have a user who ssh'd to compute-0-8 from the frontend and one
> who ssh'd into compute-0-17 from the front end.
>
> But the return shows the users on lines for 17 (for the user on 0-8) and 10
> (for the user on 0-17):
>
> 17:  1:58pm  up 24 days,  3:20,  1 user,  load average: 0.00, 0.00, 0.03
> 17: USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU  WHAT
> 17: lance    pts/0    rescluster2.mgh.  1:31pm 40.00s  0.02s  0.02s  -bash
>
> 10:  1:58pm  up 24 days,  3:21,  1 user,  load average: 0.02, 0.04, 0.07
> 10: USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU  WHAT
> 10: dennis   pts/0    rescluster2.mgh.  1:57pm 17.00s  0.02s  0.02s  -bash
>
> When I do "cluster-fork w" (without the --mpd) the users show up on the
> correct nodes.
>
> Do the numbers on the left of the -mpd output correspond to the node names?
>
> Thanks.
>
> Dennis
>
> Dennis J. Gurgul
> Partners Health Care System
> Research Management
> Research Computing Core
> 617.724.3169

Federico

Rocks Cluster Group, San Diego Supercomputing Center, CA

From mjk at sdsc.edu Mon Dec 8 12:58:22 2003


From: mjk at sdsc.edu (Mason J. Katz)
Date: Mon, 8 Dec 2003 12:58:22 -0800
Subject: [Rocks-Discuss]PXE and system images
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

On Dec 8, 2003, at 12:35 PM, Tim Carlson wrote:

> 5) In our case, we now quickly disconnect the PXE interface because we
> don't want to have the machine continually install. The real ROCKS
> method would have you choose (HD/net) for booting in the BIOS, but
> if you already have an OS on your machine, you would have to go into
> the BIOS twice before the compute nodes were installed. We disable
> rocks-grub and just connect up the PXE cable if we need to reinstall.

For most boxes we've seen that support PXE there is an option to hit <F12> to force a network PXE boot; this allows you to force a PXE boot even when a valid OS/boot block exists on your hard disk. If you don't have this, you do indeed need to go into BIOS twice -- a pain.

-mjk

From fds at sdsc.edu Mon Dec 8 13:26:46 2003
From: fds at sdsc.edu (Federico Sacerdoti)
Date: Mon, 8 Dec 2003 13:26:46 -0800
Subject: [Rocks-Discuss]cluster-fork --mpd strangeness
In-Reply-To: <BC447F1AD529D311B4DE0008C71BF2EB0AE15847@phsexch7.mgh.harvard.edu>
References: <BC447F1AD529D311B4DE0008C71BF2EB0AE15847@phsexch7.mgh.harvard.edu>
Message-ID: <[email protected]>

I've seen this before as well. I believe it has something to do with the way the color "[ OK ]" characters are interacting with the ssh session from the normal cluster-fork. We have yet to characterize this bug adequately.

-Federico

On Dec 8, 2003, at 12:55 PM, Gurgul, Dennis J. wrote:

> Thanks.
>
> On a related note, when I did "cluster-fork service gschedule restart"
> gschedule started with the "OK" output, but then the fork process hung
> on each node and I had to ^c out for it to go on to the next node.
>
> I tried to ssh to a node and then did the gschedule restart. Even then,
> after I tried to "exit" out of the node, the session hung and I had to
> log back in and kill it from the frontend.


> Dennis J. Gurgul
> Partners Health Care System
> Research Management
> Research Computing Core
> 617.724.3169
>
> -----Original Message-----
> From: Federico Sacerdoti [mailto:fds at sdsc.edu]
> Sent: Monday, December 08, 2003 3:51 PM
> To: Gurgul, Dennis J.
> Cc: npaci-rocks-discussion at sdsc.edu
> Subject: Re: [Rocks-Discuss]cluster-fork --mpd strangeness
>
> You are right, and I think this is a shortcoming of MPD. There is no
> obvious way to force the MPD numbering to correspond to the order the
> nodes were called out on the command line (cluster-fork --mpd actually
> makes a shell call to mpirun and it calls out all the node names
> explicitly). MPD seems to number the output differently, as you found
> out.
>
> So mpd for now may be more useful for jobs that are not sensitive to
> this. If enough of you find this shortcoming to be a real annoyance, we
> could work on putting the node name label on the output by explicitly
> calling "hostname" or similar.
>
> Good ideas are welcome :)
> -Federico
>
> On Dec 8, 2003, at 11:28 AM, Gurgul, Dennis J. wrote:
>
>> Maybe this is a better description of the "strangeness".
>>
>> I did "cluster-fork --mpd hostname":
>>
>> 1: compute-0-0.local
>> 2: compute-0-1.local
>> 3: compute-0-3.local
>> 4: compute-0-13.local
>> 5: compute-0-11.local
>> 6: compute-0-15.local
>> 7: compute-0-16.local
>> 8: compute-0-19.local
>> 9: compute-0-21.local
>> 10: compute-0-17.local
>> 11: compute-0-5.local
>> 12: compute-0-20.local
>> 13: compute-0-18.local
>> 14: compute-0-12.local
>> 15: compute-0-9.local
>> 16: compute-0-4.local
>> 17: compute-0-8.local
>> 18: compute-0-14.local
>> 19: compute-0-2.local
>> 20: compute-0-6.local
>> 0: compute-0-7.local


>> 21: compute-0-10.local
>>
>> Dennis J. Gurgul
>> Partners Health Care System
>> Research Management
>> Research Computing Core
>> 617.724.3169
>>
>> -----Original Message-----
>> From: npaci-rocks-discussion-admin at sdsc.edu
>> [mailto:npaci-rocks-discussion-admin at sdsc.edu]On Behalf Of Gurgul,
>> Dennis J.
>> Sent: Monday, December 08, 2003 2:09 PM
>> To: npaci-rocks-discussion at sdsc.edu
>> Subject: [Rocks-Discuss]cluster-fork --mpd strangeness
>>
>> I just did "cluster-fork -Uvh /sourcedir/ganglia-python-3.0.1-2.i386.rpm"
>> and then "cluster-fork service gschedule restart" (not sure I had to do the
>> last).
>> I also put 3.0.1-2 and restarted gschedule on the frontend.
>>
>> Now I run "cluster-fork --mpd w".
>>
>> I currently have a user who ssh'd to compute-0-8 from the frontend and one
>> who ssh'd into compute-0-17 from the front end.
>>
>> But the return shows the users on lines for 17 (for the user on 0-8) and 10
>> (for the user on 0-17):
>>
>> 17:  1:58pm  up 24 days,  3:20,  1 user,  load average: 0.00, 0.00, 0.03
>> 17: USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU  WHAT
>> 17: lance    pts/0    rescluster2.mgh.  1:31pm 40.00s  0.02s  0.02s  -bash
>>
>> 10:  1:58pm  up 24 days,  3:21,  1 user,  load average: 0.02, 0.04, 0.07
>> 10: USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU  WHAT
>> 10: dennis   pts/0    rescluster2.mgh.  1:57pm 17.00s  0.02s  0.02s  -bash
>>
>> When I do "cluster-fork w" (without the --mpd) the users show up on the
>> correct nodes.
>>
>> Do the numbers on the left of the -mpd output correspond to the node names?


>>
>> Thanks.
>>
>> Dennis
>>
>> Dennis J. Gurgul
>> Partners Health Care System
>> Research Management
>> Research Computing Core
>> 617.724.3169
>
> Federico
>
> Rocks Cluster Group, San Diego Supercomputing Center, CA

Federico

Rocks Cluster Group, San Diego Supercomputing Center, CA

From bruno at rocksclusters.org Mon Dec 8 15:31:08 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 8 Dec 2003 15:31:08 -0800
Subject: [Rocks-Discuss]Rocks 3.0.0 problem:not able to boot up
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> I have installed Rocks 3.0.0 with default options successful, there was
> not any trouble. But I boot it up, it stopped at beginning, just show
> "GRUB" on the screen and waiting...

when you built the frontend, did you start with the rocks base CD then add the HPC roll?

- gb

From bruno at rocksclusters.org Mon Dec 8 15:37:46 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 8 Dec 2003 15:37:46 -0800
Subject: [Rocks-Discuss]custom-kernels : naming conventions ? (Rocks 3.0.0)
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> Previously I have been installing a custom kernel on the compute nodes
> with an "extend-compute.xml" and an "/etc/init.d/qsconfigure" (to fix grub.conf).
>
> However I am now trying to do it the 'proper' way. So I do (on :
> # cp qsnet-RedHat-kernel-2.4.18-27.3.10qsnet.i686.rpm \
>      /home/install/rocks-dist/7.3/en/os/i386/force/RPMS
> # cd /home/install
> # rocks-dist dist
> # SSH_NO_PASSWD=1 shoot-node compute-0-0


>
> Hence:
> # find /home/install/ |xargs -l grep -nH qsnet
> shows me that hdlist and hdlist2 now contain this RPM. (and indeed if I
> duplicate my rpm in that directory rocks-dist notices this and warns me.)
>
> However the node always ends up with "2.4.20-20.7smp" again.
> anaconda-ks.cfg contains just "kernel-smp" and install.log has
> "Installing kernel-smp-2.4.20-20.7."
>
> So my question is:
> It looks like my RPM has a name that Rocks doesn't understand properly.
> What is wrong with my name ?
> and what are the rules for getting the correct name ?
> (.i686.rpm is of course correct, but I don't have -smp. in the name. Is this the problem ?)

the anaconda installer looks for kernel packages with a specific format:

kernel-<kernel ver>-<redhat ver>.i686.rpm

and for smp nodes:

kernel-smp-<kernel ver>-<redhat ver>.i686.rpm
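As a rough illustration of why the qsnet RPM is skipped, the filename can be checked against the pattern above. This is a simplification for illustration only; anaconda's real package matching is more involved:

```python
# Simplified check (assumption: pattern taken from the convention quoted
# above, not from anaconda's actual source): does an RPM filename look like
# a kernel package the installer will pick up?
import re

# kernel-<kernel ver>-<redhat ver>.i686.rpm, optionally kernel-smp-...
KERNEL_RPM = re.compile(r"^kernel(-smp)?-[\d.]+-[^-]+\.i686\.rpm$")

def installer_recognizes(filename):
    return bool(KERNEL_RPM.match(filename))

print(installer_recognizes("kernel-smp-2.4.20-20.7.i686.rpm"))                   # True
print(installer_recognizes("qsnet-RedHat-kernel-2.4.18-27.3.10qsnet.i686.rpm"))  # False
```

The second filename fails because it does not begin with "kernel-", which is consistent with the node falling back to the stock kernel-smp package.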

we have made the necessary patches to files under /usr/src/linux-2.4 in order to produce redhat-compliant kernels. see:

http://www.rocksclusters.org/rocks-documentation/3.0.0/customization-kernel.html

also, would you be interested in making your changes for the quadrics interconnect available to the general rocks community?

- gb

From purikk at hotmail.com Mon Dec 8 20:23:35 2003
From: purikk at hotmail.com (purushotham komaravolu)
Date: Mon, 8 Dec 2003 23:23:35 -0500
Subject: [Rocks-Discuss]AMD Opteron
References: <[email protected]>
Message-ID: <[email protected]>

Hello, I am a newbie to ROCKS cluster. I wanted to set up clusters on 32-bit architectures (Intel and AMD) and 64-bit architectures (Intel and AMD). I found the 64-bit download for Intel on the website but not for AMD. Does it work for AMD Opteron? If not, what is the ETA for AMD-64? We are planning to buy AMD-64 bit machines shortly, and I would like to volunteer for the beta testing if needed.
Thanks
Regards,
Puru


From mjk at sdsc.edu Tue Dec 9 07:28:51 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 9 Dec 2003 07:28:51 -0800
Subject: [Rocks-Discuss]AMD Opteron
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

We have a beta right now that we have sent to a few people. We plan on a release this month, and AMD_64 will be part of this release along with the usual x86, IA64 support.

If you want to help accelerate this process please talk to your vendor about loaning/giving us some hardware for testing. Having access to a variety of Opteron hardware (we own two boxes) is the only way we can have good support for this chip.

-mjk

On Dec 8, 2003, at 8:23 PM, purushotham komaravolu wrote:

> Hello,
> I am a newbie to ROCKS cluster. I wanted to setup clusters on
> 32-bit Architectures( Intel and AMD) and 64-bit Architecture( Intel and
> AMD).
> I found the 64-bit download for Intel on the website but not for AMD. Does
> it work for AMD opteron? if not what is the ETA for AMD-64.
> We are planning to but AMD-64 bit machines shortly, and I would like to
> volunteer for the beta testing if needed.
> Thanks
> Regards,
> Puru

From cdmaest at sandia.gov Tue Dec 9 07:48:31 2003
From: cdmaest at sandia.gov (Christopher D. Maestas)
Date: Tue, 09 Dec 2003 08:48:31 -0700
Subject: [Rocks-Discuss]AMD Opteron
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

What do I have to do to sign up to test? We have Opteron systems we can test on here.

On Tue, 2003-12-09 at 08:28, Mason J. Katz wrote:
> We have a beta right now that we have sent to a few people. We plan on
> a release this month, and AMD_64 will be part of this release along
> with the usual x86, IA64 support.
>


> If you want to help accelerate this process please talk to your vendor
> about loaning/giving us some hardware for testing. Having access to a
> variety of Opteron hardware (we own two boxes) is the only way we can
> have good support for this chip.
>
> -mjk
>
> On Dec 8, 2003, at 8:23 PM, purushotham komaravolu wrote:
>
>> Hello,
>> I am a newbie to ROCKS cluster. I wanted to setup clusters on
>> 32-bit Architectures( Intel and AMD) and 64-bit Architecture( Intel and
>> AMD).
>> I found the 64-bit download for Intel on the website but not for AMD. Does
>> it work for AMD opteron? if not what is the ETA for AMD-64.
>> We are planning to but AMD-64 bit machines shortly, and I would like to
>> volunteer for the beta testing if needed.
>> Thanks
>> Regards,
>> Puru

From vincent_b_fox at yahoo.com Tue Dec 9 11:10:40 2003
From: vincent_b_fox at yahoo.com (Vincent Fox)
Date: Tue, 9 Dec 2003 11:10:40 -0800 (PST)
Subject: [Rocks-Discuss]ATLAS rpm build problems on PII platform
Message-ID: <[email protected]>

I tried doing a rebuild of the ATLAS libraries on a PII test cluster and no go. Did an export PATH=/opt/gcc32/bin:$PATH first to make it easy on myself.

The "make rpm" appears to get stuck in a loop on the xconfig part. I pause it and it seems like the prompt is defining f77=-O and f77 FLAGS=y which doesn't work of course. My guess is the spec file doesn't have an answer for a previous question, so the /usr/bin/g77 answer is getting set for the previous prompt, and since no f77 is defined, it gets stuck.

Anyhow thought I would note this problem on the list for those more qualified to address it.


From bryan at UCLAlumni.net Tue Dec 9 12:14:16 2003


From: bryan at UCLAlumni.net (Bryan Littlefield)
Date: Tue, 09 Dec 2003 12:14:16 -0800
Subject: [Rocks-Discuss]Rocks-Discuss] AMD Opteron - Contact Appro
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi Mason,

I suggest contacting Appro. We are using Rocks on our Opteron cluster and Appro would likely love to help. I will contact them as well to see if they could help with getting an Opteron machine for testing. Contact info below:

Thanks
--Bryan

Jian Chang - Regional Sales Manager
(408) 941-8100 x 202
(800) 927-5464 x 202
(408) 941-8111 Fax
jian at appro.com
http://www.appro.com

npaci-rocks-discussion-request at sdsc.edu wrote:

>From: "Mason J. Katz" <mjk at sdsc.edu>
>Subject: Re: [Rocks-Discuss]AMD Opteron
>Date: Tue, 9 Dec 2003 07:28:51 -0800
>To: "purushotham komaravolu" <purikk at hotmail.com>
>
>We have a beta right now that we have sent to a few people. We plan on
>a release this month, and AMD_64 will be part of this release along
>with the usual x86, IA64 support.
>
>If you want to help accelerate this process please talk to your vendor
>about loaning/giving us some hardware for testing. Having access to a
>variety of Opteron hardware (we own two boxes) is the only way we can
>have good support for this chip.
>
> -mjk
>
>On Dec 8, 2003, at 8:23 PM, purushotham komaravolu wrote:
>
>> Cc: <npaci-rocks-discussion at sdsc.edu>
>>
>>Hello,
>> I am a newbie to ROCKS cluster. I wanted to setup clusters on
>>32-bit Architectures( Intel and AMD) and 64-bit Architecture( Intel and
>>AMD).
>>I found the 64-bit download for Intel on the website but not for AMD. Does
>>it work for AMD opteron? if not what is the ETA for AMD-64.
>>We are planning to but AMD-64 bit machines shortly, and I would like to
>>volunteer for the beta testing if needed.


>>Thanks
>>Regards,
>>Puru
>
>_______________________________________________
>npaci-rocks-discussion mailing list
>npaci-rocks-discussion at sdsc.edu
>http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion
>
>End of npaci-rocks-discussion Digest
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031209/611e65b4/attachment-0001.html

From vincent_b_fox at yahoo.com Tue Dec 9 13:22:59 2003
From: vincent_b_fox at yahoo.com (Vincent Fox)
Date: Tue, 9 Dec 2003 13:22:59 -0800 (PST)
Subject: [Rocks-Discuss]ATLAS rpm build problems on PII platform
Message-ID: <[email protected]>

Okay, came up my own quick hack:

Edit atlas.spec.in, go to the "other x86" section, remove 2 lines right above "linux"; seems to make rpm now.

A more formal patch would be to put in a section for cpuid eq 4 with this correction, I suppose.


From landman at scalableinformatics.com Tue Dec 9 13:49:06 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 09 Dec 2003 16:49:06 -0500
Subject: [Rocks-Discuss]Has anyone tried Gaussian binary only on the ROCKS 3.1.0 beta?
Message-ID: <[email protected]>

Hi Folks

Working on building the same cluster from last week. The admin nodes are up and functional (plain old RH9+XFS).

I want to get the head nodes up, with one of the requirements being running the Gaussian binary-only code. Gaussian's page lists RH9.0 support, so I wanted to see if someone has tried the beta with this code.

Thanks.


Joe

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

From landman at scalableinformatics.com Tue Dec 9 13:59:37 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 09 Dec 2003 16:59:37 -0500
Subject: [Rocks-Discuss]a name for pain ... modules/kernels/ethernets ...
Message-ID: <[email protected]>

Folks:

As indicated previously, I am wrestling with a Supermicro based cluster. None of the RH distributions come with the correct E1000 driver, so a new kernel is needed (in the boot CD, and for installation).

The problem I am running into is that it isn't at all obvious/easy how to install a new kernel/modules into ROCKS (3.0 or otherwise) to enable this thing to work. Following the examples in the documentation has not met with success. Running "rocks-dist cdrom" with the new kernels (2.4.23 works nicely on the nodes) in the force/RPMS directory generates a bootable CD with the original 2.4.18BOOT kernel.

What I (and I think others) need is a simple, easy to follow method that will generate a bootable CD with the correct Linux kernel, and the correct modules.

Is this in process somewhere? What would be tremendously helpful is if we can generate a binary module, and put that into the boot process by placing it into the force/modules/binary directory (assuming one exists) with the appropriate entry of a similar name in the force/modules/meta directory as a simple XML document giving pci-ids, description, name, etc.

Anything close to this coming? Modules are killing future ROCKS installs; the inability to easily inject a new module in there has created a problem whereby ROCKS does not function (as the underlying RH does not function).

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615


From tim.carlson at pnl.gov Tue Dec 9 14:11:43 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Tue, 09 Dec 2003 14:11:43 -0800 (PST)
Subject: [Rocks-Discuss]a name for pain ... modules/kernels/ethernets ...
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Tue, 9 Dec 2003, Joe Landman wrote:

> The problem I am running into is that it isn't at all obvious/easy how
> to install a new kernel/modules into ROCKS (3.0 or otherwise) to enable
> this thing to work. Following the examples in the documentation have
> not met with success. Running "rocks-dist cdrom" with the new kernels
> (2.4.23 works nicely on the nodes) in the force/RPMS directory generates
> a bootable CD with the original 2.4.18BOOT kernel.

So you built a 2.4.23BOOT rpm? The problem people have is with the naming convention of kernels. A kernel.org spec file isn't going to generate proper kernel rpms IMHO. What you really want to do (and maybe you are already doing this) is steal the bit of the Redhat spec building scripts that generate the -smp, .i686, and BOOT rpms.

New hardware is tough for any distro.

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From tmartin at physics.ucsd.edu Tue Dec 9 15:57:17 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Tue, 09 Dec 2003 15:57:17 -0800
Subject: [Rocks-Discuss]Intel MT based Gigabit controllers
Message-ID: <[email protected]>

Does Rocks 3.0 support the Intel MT based Gigabit controllers (PCI 8086:1013) without any modifications? My new cluster has these new controllers.

Rocks 2.3.1 does not seem to detect/drive these cards correctly (the install fails to detect them and the e1000 driver does not seem to work). So I was going to go ahead and move my new head node to 3.0.0 and was wondering if I am going to have to do additional work to get the Intel drivers on the boot image (for cluster nodes) to have the working Intel driver with these cards.

Terrence

From tmartin at physics.ucsd.edu  Tue Dec  9 15:59:29 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Tue, 09 Dec 2003 15:59:29 -0800
Subject: [Rocks-Discuss]how to include custom driver
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Tim Carlson wrote:
> On Mon, 9 Jun 2003, Greg Bruno wrote:
>
>> what driver did you have to add?
>>
>> we may be able to provide a patch for your compute nodes.
>
> Ah!!!.. I didn't see this repsonse before I sent off my reply to Matthew.
> Can I please have the aic79xx driver and while your at it can I get a
> module-info file that has this entry for gigabit? Not sure if it is
> already in there? ;)
>
> 0x8086 0x100f "e1000" "Intel Corp. 82545EM Gigabit Ethernet Controller rev (01)"
>
> It is also quite possible that I burned the 2.3.0 media instead of
> 2.3.2. It was late in the day when I tried to do my install.
>
> Tim
>
> Tim Carlson
> Voice: (509) 376 3423
> Email: Tim.Carlson at pnl.gov
> EMSL UNIX System Support

I would also like to request that this driver/change be made. I have a cluster with these newer Intel gigabit chipsets.

Terrence

From tmartin at physics.ucsd.edu  Tue Dec  9 16:33:18 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Tue, 09 Dec 2003 16:33:18 -0800
Subject: [Rocks-Discuss]a name for pain ... modules/kernels/ethernets ...
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Tim Carlson wrote:
> On Tue, 9 Dec 2003, Joe Landman wrote:
>
>> The problem I am running into is that it isn't at all obvious/easy how
>> to install a new kernel/modules into ROCKS (3.0 or otherwise) to enable
>> this thing to work.  Following the examples in the documentation have
>> not met with success.  Running "rocks-dist cdrom" with the new kernels
>> (2.4.23 works nicely on the nodes) in the force/RPMS directory generates
>> a bootable CD with the original 2.4.18BOOT kernel.
>
> So you built a 2.4.23BOOT rpm? The problem people have is with the naming
> convention of kernels. A kernel.org spec file isn't going to generate
> proper kernel rpms IMHO. What you really want to do (and maybe you are
> already doing this) is steal the bit of the Redhat spec building scripts
> that generage the -smp .i686 and BOOT rpms.
>
> New hardware is tough for any distro.
>
> Tim
>
> Tim Carlson
> Voice: (509) 376 3423
> Email: Tim.Carlson at pnl.gov
> EMSL UNIX System Support

Where do you start if you want to update the PXE boot image to support a new kernel?

Terrence

From tmartin at physics.ucsd.edu  Tue Dec  9 16:58:08 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Tue, 09 Dec 2003 16:58:08 -0800
Subject: [Rocks-Discuss]Could not allocate requested partitions
Message-ID: <[email protected]>

I am getting the following error when trying to install a Rocks 3.0.0 headnode. The headnode works fine in Rocks 2.3.2.

Could not allocate requested partitions: Partitioning failed: Could not allocate partitions as primary partitions

What is also odd is that when I alt-F2 and run fdisk /dev/hda, it tells me it cannot find that device (unable to open /dev/hda). However, when I watch the boot messages, hda definitely comes up. Also, the headnode works fine with 2.3.2.

Any ideas?

Terrence

From tmartin at physics.ucsd.edu  Tue Dec  9 17:33:24 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Tue, 09 Dec 2003 17:33:24 -0800
Subject: [Rocks-Discuss]Could not allocate requested partitions
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Terrence Martin wrote:
> I am getting the following error when trying to install a Rocks 3.0.0
> headnode. The headnode works find in rocks 2.3.2.
>
> Could not allocate requested partitions: Partitioning failed: Could not
> allocate partitions as primary partitions
>
> What is also odd is when I alt-f2 and run fdisk /dev/hda it tells me it
> cannot find that device (unable to open /dev/hda). However when I watch
> the boot messages hda definitely comes up. Also the headnode works fine
> with 2.3.2.
>
> Any ideas?
>
> Terrence

Figured it out; apparently Rocks 3.0.0 did not like my partitions from Rocks 2.3.2. I booted Knoppix, blew away the partition table, and so far so good on the head node.

Terrence

From mjk at sdsc.edu  Tue Dec  9 17:54:01 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 9 Dec 2003 17:54:01 -0800
Subject: [Rocks-Discuss]a name for pain ... modules/kernels/ethernets ...
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

If the underlying RedHat doesn't support your hardware you are pretty much dead in the water. We do at times include drivers that RH does not, but this is an exception and only for hardware we physically have access to. The rocks-boot package (rocks/src/rocks/boot in CVS) controls the boot kernel and module selection. You can look into this to see what it would take to add your own module. We do plan on refining and documenting this, but not for several months. We also have some very good ideas on how we can track this faster than RH, but again nothing coming in the next few months.

To continue my earlier rant for today: until more hardware vendors start taking the Linux marketplace seriously, buying bleeding-edge hardware and CPUs is asking for problems. It takes several months for any new hardware to become supported by RedHat, and several years for any new CPU to be supported well. This isn't killing future Rocks installs, it's just correctly delaying them until the underlying OS supports the hardware.

-mjk

On Dec 9, 2003, at 1:59 PM, Joe Landman wrote:

> Folks:
>
>   As indicated previously, I am wrestling with a Supermicro based
> cluster.  None of the RH distributions come with the correct E1000
> driver, so a new kernel is needed (in the boot CD, and for
> installation).
>
>   The problem I am running into is that it isn't at all obvious/easy how
> to install a new kernel/modules into ROCKS (3.0 or otherwise) to enable
> this thing to work.  Following the examples in the documentation have
> not met with success.  Running "rocks-dist cdrom" with the new kernels
> (2.4.23 works nicely on the nodes) in the force/RPMS directory generates
> a bootable CD with the original 2.4.18BOOT kernel.
>
>   What I (and I think others) need, is a simple/easy to follow method
> that will generate a bootable CD with the correct linux kernel, and the
> correct modules.
>
>   Is this in process somewhere?  What would be tremendously helpful is
> if we can generate a binary module, and put that into the boot process
> by placing it into the force/modules/binary directory (assuming one
> exists) with the appropriate entry of a similar name in the
> force/modules/meta directory as a simple XML document giving pci-ids,
> description, name, etc.
>
>   Anything close to this coming?  Modules are killing future ROCKS
> installs, the inability to easily inject a new module in there has
> created a problem whereby ROCKS does not function (as the underlying RH
> does not function).
>
> --
> Joseph Landman, Ph.D
> Scalable Informatics LLC,
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
> phone: +1 734 612 4615

From gotero at linuxprophet.com  Tue Dec  9 18:02:23 2003
From: gotero at linuxprophet.com (gotero at linuxprophet.com)
Date: Tue, 09 Dec 2003 18:02:23 -0800 (PST)
Subject: [Rocks-Discuss]custom-kernels : naming conventions ? (Rocks 3.0.0)
Message-ID: <20031209180224.24711.h014.c001.wm@mail.linuxprophet.com.criticalpath.net>

Daniel-

I recently had the same problem when building a quadrics cluster on Rocks 2.3.2 with the qsnet-RedHat-kernel-2.4.18-27.3.4qsnet.i686.rpms. The problem is definitely in the naming of the rpms, in that anaconda running on the compute nodes is not going to recognize kernel rpms that begin with 'qsnet' as potential boot options. Unfortunately, being under a severe time constraint, I resorted to manually installing the qsnet kernel on all nodes of the cluster, which isn't the Rocks way. The long-term solution is to mangle the kernel makefiles so that the qsnet kernel rpms have conventional kernel rpm names, which is what Greg's post referred to.

Glen

On Mon, 8 Dec 2003 17:54:53 -0000, daniel.kidger at quadrics.com wrote:

> Dear all,
> Previously I have been installing a custom kernel on the compute nodes
> with an "extend-compute.xml" and an "/etc/init.d/qsconfigure" (to fix grub.conf).
>
> However I am now trying to do it the 'proper' way. So I do (on :
> # cp qsnet-RedHat-kernel-2.4.18-27.3.10qsnet.i686.rpm \
>      /home/install/rocks-dist/7.3/en/os/i386/force/RPMS
> # cd /home/install
> # rocks-dist dist
> # SSH_NO_PASSWD=1 shoot-node compute-0-0
>
> Hence:
> # find /home/install/ |xargs -l grep -nH qsnet
> shows me that hdlist and hdlist2 now contain this RPM. (and indeed if I
> duplicate my rpm in that directory rocks-dist notices this and warns me.)
>
> However the node always ends up with "2.4.20-20.7smp" again.
> anaconda-ks.cfg contains just "kernel-smp" and install.log has "Installing
> kernel-smp-2.4.20-20.7."
>
> So my question is:
> It looks like my RPM has a name that Rocks doesn't understand properly.
> What is wrong with my name? And what are the rules for getting the
> correct name?
> (.i686.rpm is of course correct, but I don't have -smp. in the name. Is
> this the problem?)
>
> cf. Greg Bruno's wisdom:
> https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-April/001770.html
>
> Yours,
> Daniel.
>
> --------------------------------------------------------------
> Dr. Dan Kidger, Quadrics Ltd.      daniel.kidger at quadrics.com
> One Bridewell St., Bristol, BS1 2AA, UK         0117 915 5505
> ----------------------- www.quadrics.com --------------------

Glen Otero, Ph.D.
Linux Prophet

From gotero at linuxprophet.com  Tue Dec  9 18:05:04 2003
From: gotero at linuxprophet.com (gotero at linuxprophet.com)
Date: Tue, 09 Dec 2003 18:05:04 -0800 (PST)
Subject: [Rocks-Discuss]Could not allocate requested partitions
Message-ID: <20031209180504.716.h014.c001.wm@mail.linuxprophet.com.criticalpath.net>

On Tue, 09 Dec 2003 17:33:24 -0800, Terrence Martin wrote:

> Terrence Martin wrote:
> > I am getting the following error when trying to install a Rocks 3.0.0
> > headnode. The headnode works find in rocks 2.3.2.
> >
> > Could not allocate requested partitions: Partitioning failed: Could not
> > allocate partitions as primary partitions
> >
> > What is also odd is when I alt-f2 and run fdisk /dev/hda it tells me it
> > cannot find that device (unable to open /dev/hda). However when I watch
> > the boot messages hda definitely comes up. Also the headnode works fine
> > with 2.3.2.
> >
> > Any ideas?
> >
> > Terrence
>
> Figured it out, aparently rocks 3.0.0 did not like my partitions from
> rocks 2.3.2. I booted knoppix, blew away the partition table and so far
> so good on the head node.

I had the same problem with moving from 2.3.2 to 3.1. I'll try your solution.

Glen

> > Terrence

Glen Otero, Ph.D.
Linux Prophet

From jorge at phys.ufl.edu  Tue Dec  9 18:55:02 2003
From: jorge at phys.ufl.edu (Jorge L. Rodriguez)
Date: Tue, 09 Dec 2003 21:55:02 -0500
Subject: [Rocks-Discuss]Adding partitions that are not reformatted under hard boots or shoot-node
Message-ID: <[email protected]>

Hi,

How do I add an extra partition to my compute nodes and retain the data on all non-/ partitions when the system hard boots or is shot? I tried the suggestion in the documentation under "Customizing your ROCKS Installation" where you replace auto-partition.xml, but hard boots or shoot-node on these nodes reformat all partitions instead of just /. I have also tried to modify installclass.xml so that an extra partition is added into the python code (see below). This does mostly what I want, but now I can't shoot-node, even though a hard boot reinstalls without reformatting anything but /. Is this the right approach? I'd rather avoid having to replace installclass, since I don't really want to partition all nodes this way, but if I must I will.

Jorge

        #
        # set up the root partition
        #
        args = [ "/" , "--size" , "4096", "--fstype", "&fstype;",
                 "--ondisk", devnames[0] ]
        KickstartBase.definePartition(self, id, args)

        # ---- Jorge, I added this args
        args = [ "/state/partition1" , "--size" , "55000", "--fstype", "&fstype;",
                 "--ondisk", devnames[0] ]
        KickstartBase.definePartition(self, id, args)
        # -----

        args = [ "swap" , "--size" , "1000", "--ondisk", devnames[0] ]
        KickstartBase.definePartition(self, id, args)

        #
        # greedy partitioning
        #
        # ----- Jorge, I change this from i = 1
        i = 2
        # -----
        for devname in devnames:
            partname = "/state/partition%d" % (i)
            args = [ partname, "--size", "1", "--fstype", "&fstype;",
                     "--grow", "--ondisk", devname ]
            KickstartBase.definePartition(self, id, args)

            i = i + 1

From bruno at rocksclusters.org  Tue Dec  9 22:43:04 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Tue, 9 Dec 2003 22:43:04 -0800
Subject: [Rocks-Discuss]ATLAS rpm build problems on PII platform
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> Okay, came up my own quick hack:
>
> Edit atlas.spec.in, go to "other x86" section, remove
> 2 lines right above "linux", seems to make rpm now.
>
> A more formal patch would be put in a section for
> cpuid eq 4 with this correction I suppose.

if you provide the patch, we'll include it in our next release.

- gb

From tlw at cs.unm.edu  Tue Dec  9 23:23:43 2003
From: tlw at cs.unm.edu (Tiffani Williams)
Date: Wed, 10 Dec 2003 00:23:43 -0700
Subject: [Rocks-Discuss]PBS errors
Message-ID: <[email protected]>

Hello,

I am trying to submit a job through PBS, but I receive 2 errors. The first error is

  Job cannot be executed
  See job standard error file

The second error is that the standard error file cannot be written into my home directory.

I downloaded the sample script at http://rocks.npaci.edu/papers/rocks-documentation/launching-batch-jobs.html and have tried a simpler script with PBS directives and echo commands.

I do not know what I am doing wrong. I have used PBS successfully on other clusters.

Does anyone have any suggestions?

Tiffani

From bruno at rocksclusters.org  Tue Dec  9 23:35:59 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Tue, 9 Dec 2003 23:35:59 -0800
Subject: [Rocks-Discuss]PBS errors
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> I am trying to submit a job through PBS, but I receive 2 errors. The
> first error is
> Job cannot be executed
> See job standard error file
>
> The second error is that the standard error file cannot be written
> into my home directory.
> I downloaded the sample script at
> http://rocks.npaci.edu/papers/rocks-documentation/launching-batch-jobs.html
> and have tried a more simple script with PBS directives and echo
> commands.
>
> I do not know what I am doing wrong? I have used PBS successfully on
> other clusters.
>
> Does anyone have any suggestions?

can you login to the compute nodes successfully?

if not, try restarting autofs on all the compute nodes. on the frontend, execute:

# ssh-agent $SHELL
# ssh-add

# cluster-fork "/etc/rc.d/init.d/autofs restart"

we've found the startup of autofs to be flaky at times.

- gb

From tlw at cs.unm.edu  Wed Dec 10 00:03:13 2003
From: tlw at cs.unm.edu (Tiffani Williams)
Date: Wed, 10 Dec 2003 01:03:13 -0700
Subject: [Rocks-Discuss]PBS errors
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

>> I am trying to submit a job through PBS, but I receive 2 errors.
>> The first error is
>> Job cannot be executed
>> See job standard error file
>>
>> The second error is that the standard error file cannot be written
>> into my home directory.
>> I downloaded the sample script at
>> http://rocks.npaci.edu/papers/rocks-documentation/launching-batch-jobs.html
>> and have tried a more simple script with PBS directives and echo
>> commands.
>>
>> I do not know what I am doing wrong? I have used PBS successfully
>> on other clusters.
>>
>> Does anyone have any suggestions?
>
> can you login to the compute nodes successfully?
>
> if not, try restarting autofs on all the compute nodes. on the
> frontend, execute:
>
> # ssh-agent $SHELL
> # ssh-add
>
> # cluster-fork "/etc/rc.d/init.d/autofs restart"
>
> we've found the startup of autofs to be flaky at times.
>
> - gb

Do these commands have to be run by an administrator? If so, I do not have such privileges. I can ssh to the compute nodes, but I am denied entry. Am I supposed to be able to login to a compute node as a user?

Tiffani

From bruno at rocksclusters.org  Wed Dec 10 06:37:05 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 10 Dec 2003 06:37:05 -0800
Subject: [Rocks-Discuss]PBS errors
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Dec 10, 2003, at 12:03 AM, Tiffani Williams wrote:

>>> I am trying to submit a job through PBS, but I receive 2 errors.
>>> The first error is
>>> Job cannot be executed
>>> See job standard error file
>>>
>>> The second error is that the standard error file cannot be written
>>> into my home directory.
>>> I downloaded the sample script at
>>> http://rocks.npaci.edu/papers/rocks-documentation/launching-batch-jobs.html
>>> and have tried a more simple script with PBS directives and echo
>>> commands.
>>>
>>> I do not know what I am doing wrong? I have used PBS successfully
>>> on other clusters.
>>>
>>> Does anyone have any suggestions?
>>
>> can you login to the compute nodes successfully?
>>
>> if not, try restarting autofs on all the compute nodes. on the
>> frontend, execute:
>>
>> # ssh-agent $SHELL
>> # ssh-add
>>
>> # cluster-fork "/etc/rc.d/init.d/autofs restart"
>>
>> we've found the startup of autofs to be flaky at times.
>>
>> - gb
>
> Do these commands have to be run by an administrator? If so, I do not
> have such privileges. I can ssh to the compute nodes, but I am denied
> entry. Am I supposed to be able to login to a compute node as a user.

yes, you need to be 'root'.

it appears your home directory is not being mounted when you login -- have your administrator run the commands above.

- gb

From mjk at sdsc.edu  Wed Dec 10 07:20:47 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Wed, 10 Dec 2003 07:20:47 -0800
Subject: [Rocks-Discuss]PBS errors
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

This is most likely the dreaded NIS-crash. You'll need to restart the ypserver on the frontend and the ypbind daemon on all the nodes. We've seen this on our clusters maybe 4 times (on production systems) in the last several years. Others have seen this on a weekly basis. This is why NIS is dead in Rocks 3.1 - it served us reasonably well but never matured to a stable system.

-mjk

On Dec 10, 2003, at 6:37 AM, Greg Bruno wrote:

> On Dec 10, 2003, at 12:03 AM, Tiffani Williams wrote:
>
>>>> I am trying to submit a job through PBS, but I receive 2 errors.
>>>> The first error is
>>>> Job cannot be executed
>>>> See job standard error file
>>>>
>>>> The second error is that the standard error file cannot be written
>>>> into my home directory.
>>>> I downloaded the sample script at
>>>> http://rocks.npaci.edu/papers/rocks-documentation/launching-batch-jobs.html
>>>> and have tried a more simple script with PBS directives and echo
>>>> commands.
>>>>
>>>> I do not know what I am doing wrong? I have used PBS successfully
>>>> on other clusters.
>>>>
>>>> Does anyone have any suggestions?
>>>
>>> can you login to the compute nodes successfully?
>>>
>>> if not, try restarting autofs on all the compute nodes. on the
>>> frontend, execute:
>>>
>>> # ssh-agent $SHELL
>>> # ssh-add
>>>
>>> # cluster-fork "/etc/rc.d/init.d/autofs restart"
>>>
>>> we've found the startup of autofs to be flaky at times.
>>>
>>> - gb
>>
>> Do these commands have to be run by an administrator? If so, I do not
>> have such privileges. I can ssh to the compute nodes, but I am
>> denied entry. Am I supposed to be able to login to a compute node as
>> a user.
>
> yes, you need to be 'root'.
>
> it appears your home directory is not being mounted when you login --
> have your administrator run the commands above.
>
> - gb

From vincent_b_fox at yahoo.com  Wed Dec 10 07:59:14 2003
From: vincent_b_fox at yahoo.com (Vincent Fox)
Date: Wed, 10 Dec 2003 07:59:14 -0800 (PST)
Subject: [Rocks-Discuss]one node short in "labels"
Message-ID: <[email protected]>

So I go to the "labels" selection on the web page to print out the pretty labels. What a nice idea by the way! EXCEPT....it's one node short! I go up to 0-13 and this stops at 0-12. Any ideas where I should check to fix this?

---------------------------------
Do you Yahoo!?
New Yahoo! Photos - easier uploading and sharing
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031210/c5bf5e79/attachment-0001.html

From cdwan at mail.ahc.umn.edu  Wed Dec 10 12:04:53 2003
From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB))
Date: Wed, 10 Dec 2003 14:04:53 -0600 (CST)
Subject: [Rocks-Discuss]Non-homogenous legacy hardware
Message-ID: <[email protected]>

I am integrating legacy systems into a ROCKS cluster, and have hit a
snag with the auto-partition configuration: The new (old) systems have
SCSI disks, while old (new) ones contain IDE. This is a non-issue so
long as the initial install does its default partitioning. However, I
have a "replace-auto-partition.xml" file which is unworkable for the SCSI
based systems since it makes specific reference to "hda" rather than
"sda."

I would like to have a site-nodes/replace-auto-partition.xml file with a
conditional such that "hda" or "sda" is used, based on the name of the
node (or some other criterion).

Is this possible?

Thanks, in advance. If this is out there on the mailing list archives, a
pointer would be greatly appreciated.

-Chris Dwan
 The University of Minnesota

From tmartin at physics.ucsd.edu  Wed Dec 10 12:09:11 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Wed, 10 Dec 2003 12:09:11 -0800
Subject: [Rocks-Discuss]Error during Make when building a new install floppy
Message-ID: <[email protected]>

I get the following error when I try to rebuild a boot floppy for rocks.

This is with the default CVS checkout with an update today according to the rocks userguide. I have not actually attempted to make any changes.

make[3]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3/loader'
make[2]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3'
strip -o loader anaconda-7.3/loader/loader
strip: anaconda-7.3/loader/loader: No such file or directory
make[1]: *** [loader] Error 1
make[1]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader'
make: *** [loader] Error 2

Of course I could avoid all of this altogether and just put my binary module into the appropriate location in the boot image.

Would it be correct to modify the following image file with my changes and then write it to a floppy via dd?

/home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/images/bootnet.img

Basically I am injecting an updated e1000 driver with changes to pcitable to support the address of my gigabit cards.

Terrence

From tim.carlson at pnl.gov  Wed Dec 10 12:40:41 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Wed, 10 Dec 2003 12:40:41 -0800 (PST)
Subject: [Rocks-Discuss]Error during Make when building a new install floppy
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Wed, 10 Dec 2003, Terrence Martin wrote:

> I get the following error when I try to rebuild a boot floppy for rocks.>

You can't make a boot floppy with Rocks 3.0. That isn't supported. Or at least it wasn't the last time I checked.

> Of course I could avoid all of this together and just put my binary
> module into the appropriate location in the boot image.
>
> Would it be correct to modify the following image file with my changes
> and then write it to a floppy via dd?
>
> /home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/images/bootnet.img
>
> Basically I am injecting an updated e1000 driver with changes to
> pcitable to support the address of my gigabit cards.

Modifying the bootnet.img is about 1/3 of what you need to do if you go down that path. You also need to work on netstg1.img, and you'll need to update the driver in the kernel rpm that gets installed on the box. None of this is trivial.

If it were me, I would go down the same path I took for updating the AIC79XX driver:

https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-October/003533.html

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From tim.carlson at pnl.gov  Wed Dec 10 12:52:38 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Wed, 10 Dec 2003 12:52:38 -0800 (PST)
Subject: [Rocks-Discuss]Non-homogenous legacy hardware
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Wed, 10 Dec 2003, Chris Dwan (CCGB) wrote:

> I am integrating legacy systems into a ROCKS cluster, and have hit a
> snag with the auto-partition configuration: The new (old) systems have
> SCSI disks, while old (new) ones contain IDE. This is a non-issue so
> long as the initial install does its default partitioning. However, I
> have a "replace-auto-partition.xml" file which is unworkable for the SCSI
> based systems since it makes specific reference to "hda" rather than
> "sda."

If you have just a single drive, then you should be able to skip the "--ondisk" bits of your "part" command.

Otherwise, you would first have to do something ugly like the following:

http://penguin.epfl.ch/slides/kickstart/ks.cfg

You could probably (maybe) wrap most of that in an

<eval sh="bash"></eval>

block in the <main> block.

Just guessing.. haven't tried this.
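The conditional Tim sketches reduces to picking the disk name once and emitting the same "part" arguments either way. A minimal sketch in the installclass style seen elsewhere in this thread (Jorge's definePartition arguments); the helper and its probe flag are hypothetical, and a real version would make the decision at kickstart-generation time, e.g. inside the <eval sh="bash"> block suggested above:

```python
# Hypothetical helper: build the argument lists for definePartition()
# for either an IDE (hda) or SCSI (sda) first disk. Sizes follow the
# examples earlier in this thread; only the disk name varies.
def partition_args(has_ide_disk):
    disk = 'hda' if has_ide_disk else 'sda'
    return [
        ["/", "--size", "4096", "--ondisk", disk],
        ["swap", "--size", "1000", "--ondisk", disk],
    ]

# e.g. for a SCSI-only legacy node:
for args in partition_args(has_ide_disk=False):
    print(args)
```

The point is that only the probe differs per node; the partition layout itself stays in one place.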

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From agrajag at dragaera.net  Wed Dec 10 10:21:07 2003
From: agrajag at dragaera.net (Jag)
Date: Wed, 10 Dec 2003 13:21:07 -0500
Subject: [Rocks-Discuss]ssh_known_hosts and ganglia
Message-ID: <1071080467.4693.6.camel@pel>

I noticed a previous post on this list
(https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html)
indicating that Rocks distributes ssh keys for all the nodes over
ganglia. Can anyone enlighten me as to how this is done?

I looked through the ganglia docs and didn't see anything indicating how
to do this, so I'm assuming Rocks made some changes. Unfortunately the
rocks iso images don't seem to contain srpms, so I'm now coming here.
What did Rocks do to ganglia to make the distribution of ssh keys work?

Also, does anyone know where Rocks SRPMs can be found? I've done quite
a bit of searching, but haven't found them anywhere.

From mjk at sdsc.edu  Wed Dec 10 14:39:15 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Wed, 10 Dec 2003 14:39:15 -0800
Subject: [Rocks-Discuss]ssh_known_hosts and ganglia
In-Reply-To: <1071080467.4693.6.camel@pel>
References: <1071080467.4693.6.camel@pel>
Message-ID: <[email protected]>

Most of the SRPMS are on our FTP site, but we've screwed this up before. The SRPMS are entirely Rocks specific, so they are of little value outside of Rocks. You can also check out our CVS tree (cvs.rocksclusters.org), where rocks/src/ganglia shows what we add. We have a ganglia-python package we created to allow us to write our own metrics at a higher level than the provided gmetric application. We've also moved from this method to a single cluster-wide ssh key for Rocks 3.1.

-mjk

On Dec 10, 2003, at 10:21 AM, Jag wrote:

> I noticed a previous post on this list
> (https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html)
> indicating that Rocks distributes ssh keys for all the nodes over
> ganglia. Can anyone enlighten me as to how this is done?
>
> I looked through the ganglia docs and didn't see anything indicating how
> to do this, so I'm assuming Rocks made some changes. Unfortunately the
> rocks iso images don't seem to contain srpms, so I'm now coming here.
> What did Rocks do to ganglia to make the distribution of ssh keys work?
>
> Also, does anyone know where Rocks SRPMs can be found? I've done quite
> a bit of searching, but haven't found them anywhere.

From vrowley at ucsd.edu  Wed Dec 10 14:43:49 2003
From: vrowley at ucsd.edu (V. Rowley)
Date: Wed, 10 Dec 2003 14:43:49 -0800
Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
Message-ID: <[email protected]>

When I run this:

[root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ; rocks-dist --dist=cdrom cdrom

on a server installed with ROCKS 3.0.0, I eventually get this:

> Cleaning distribution
> Resolving versions (RPMs)
> Resolving versions (SRPMs)
> Adding support for rebuild distribution from source
> Creating files (symbolic links - fast)
> Creating symlinks to kickstart files
> Fixing Comps Database
> Generating hdlist (rpm database)
> Patching second stage loader (eKV, partioning, ...)
>         patching "rocks-ekv" into distribution ...
>         patching "rocks-piece-pipe" into distribution ...
>         patching "PyXML" into distribution ...
>         patching "expat" into distribution ...
>         patching "rocks-pylib" into distribution ...
>         patching "MySQL-python" into distribution ...
>         patching "rocks-kickstart" into distribution ...
>         patching "rocks-kickstart-profiles" into distribution ...
>         patching "rocks-kickstart-dtds" into distribution ...
>         building CRAM filesystem ...
> Cleaning distribution
> Resolving versions (RPMs)
> Resolving versions (SRPMs)
> Creating symlinks to kickstart files
> Generating hdlist (rpm database)
> Segregating RPMs (rocks, non-rocks)
> sh: ./kickstart.cgi: No such file or directory
> sh: ./kickstart.cgi: No such file or directory
> Traceback (innermost last):
>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>     app.run()
>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>     eval('self.command_%s()' % (command))
>   File "<string>", line 0, in ?
>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>     builder.build()
>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>     (rocks, nonrocks) = self.segregateRPMS()
>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>     for pkg in ks.getSection('packages'):
> TypeError: loop over non-sequence

Any ideas?

--
Vicky Rowley                              email: vrowley at ucsd.edu
Biomedical Informatics Research Network   work: (858) 536-5980
University of California, San Diego       fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

From bruno at rocksclusters.org  Wed Dec 10 15:12:49 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 10 Dec 2003 15:12:49 -0800
Subject: [Rocks-Discuss]one node short in "labels"
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> So I go to the "labels" selection on the web page to print out the
> pretty labels. What a nice idea by the way!
>
> EXCEPT....it's one node short! I go up to 0-13 and this stops at
> 0-12. Any ideas where I should check to fix this?

yeah, we found this corner case -- it'll be fixed in the next release.

thanks for the bug report.

- gb

From mjk at sdsc.edu  Wed Dec 10 15:16:27 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Wed, 10 Dec 2003 15:16:27 -0800
Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

It looks like someone moved the profiles directory to profiles.orig.

-mjk

[root at rocks14 install]# ls -l
total 56
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:16 cdrom
drwxrwsr-x    5 root     wheel        4096 Dec 10 20:38 contrib.orig
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:07 ftp.rocksclusters.org
drwxr-sr-x    3 root     wheel        4096 Dec 10 20:38 ftp.rocksclusters.org.orig
-r-xrwsr-x    1 root     wheel       19254 Sep  3 12:40 kickstart.cgi
drwxr-xr-x    3 root     root         4096 Dec 10 20:38 profiles.orig
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:15 rocks-dist
drwxrwsr-x    3 root     wheel        4096 Dec 10 20:38 rocks-dist.orig
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:02 src
drwxr-sr-x    4 root     wheel        4096 Dec 10 20:49 src.foo

On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:

> When I run this:
>
> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
> rocks-dist --dist=cdrom cdrom
>
> on a server installed with ROCKS 3.0.0, I eventually get this:
>
>> Cleaning distribution
>> Resolving versions (RPMs)
>> Resolving versions (SRPMs)
>> Adding support for rebuild distribution from source
>> Creating files (symbolic links - fast)
>> Creating symlinks to kickstart files
>> Fixing Comps Database
>> Generating hdlist (rpm database)
>> Patching second stage loader (eKV, partioning, ...)
>> patching "rocks-ekv" into distribution ...
>> patching "rocks-piece-pipe" into distribution ...
>> patching "PyXML" into distribution ...
>> patching "expat" into distribution ...
>> patching "rocks-pylib" into distribution ...
>> patching "MySQL-python" into distribution ...
>> patching "rocks-kickstart" into distribution ...
>> patching "rocks-kickstart-profiles" into distribution ...
>> patching "rocks-kickstart-dtds" into distribution ...
>> building CRAM filesystem ...
>> Cleaning distribution
>> Resolving versions (RPMs)
>> Resolving versions (SRPMs)
>> Creating symlinks to kickstart files
>> Generating hdlist (rpm database)
>> Segregating RPMs (rocks, non-rocks)
>> sh: ./kickstart.cgi: No such file or directory
>> sh: ./kickstart.cgi: No such file or directory
>> Traceback (innermost last):
>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>>     app.run()
>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>>     eval('self.command_%s()' % (command))
>>   File "<string>", line 0, in ?
>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>>     builder.build()
>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>>     (rocks, nonrocks) = self.segregateRPMS()
>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>>     for pkg in ks.getSection('packages'):
>> TypeError: loop over non-sequence
>
> Any ideas?
>
> --
> Vicky Rowley                               email: vrowley at ucsd.edu
> Biomedical Informatics Research Network    work: (858) 536-5980
> University of California, San Diego        fax: (858) 822-0828
> 9500 Gilman Drive
> La Jolla, CA 92093-0715
>
>
> See pictures from our trip to China at
> http://www.sagacitech.com/Chinaweb

From vrowley at ucsd.edu Wed Dec 10 16:50:16 2003
From: vrowley at ucsd.edu (V. Rowley)
Date: Wed, 10 Dec 2003 16:50:16 -0800
Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Yep, I did that, but only *AFTER* getting the error. [Thought it was generated by the rocks-dist sequence, but apparently not.] Go ahead. Move it back. Same difference.

Vicky

Mason J. Katz wrote:
> It looks like someone moved the profiles directory to profiles.orig.
>
> -mjk
>
>
> [root at rocks14 install]# ls -l
> total 56
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:16 cdrom
> drwxrwsr-x    5 root     wheel        4096 Dec 10 20:38 contrib.orig
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:07 ftp.rocksclusters.org
> drwxr-sr-x    3 root     wheel        4096 Dec 10 20:38 ftp.rocksclusters.org.orig
> -r-xrwsr-x    1 root     wheel       19254 Sep  3 12:40 kickstart.cgi
> drwxr-xr-x    3 root     root         4096 Dec 10 20:38 profiles.orig
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:15 rocks-dist
> drwxrwsr-x    3 root     wheel        4096 Dec 10 20:38 rocks-dist.orig
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:02 src
> drwxr-sr-x    4 root     wheel        4096 Dec 10 20:49 src.foo
> On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:
>
>> When I run this:
>>
>> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
>> rocks-dist --dist=cdrom cdrom
>>
>> on a server installed with ROCKS 3.0.0, I eventually get this:
>>
>>> Cleaning distribution
>>> Resolving versions (RPMs)
>>> Resolving versions (SRPMs)
>>> Adding support for rebuild distribution from source
>>> Creating files (symbolic links - fast)
>>> Creating symlinks to kickstart files
>>> Fixing Comps Database
>>> Generating hdlist (rpm database)
>>> Patching second stage loader (eKV, partioning, ...)
>>> patching "rocks-ekv" into distribution ...
>>> patching "rocks-piece-pipe" into distribution ...
>>> patching "PyXML" into distribution ...
>>> patching "expat" into distribution ...
>>> patching "rocks-pylib" into distribution ...
>>> patching "MySQL-python" into distribution ...
>>> patching "rocks-kickstart" into distribution ...
>>> patching "rocks-kickstart-profiles" into distribution ...
>>> patching "rocks-kickstart-dtds" into distribution ...
>>> building CRAM filesystem ...
>>> Cleaning distribution
>>> Resolving versions (RPMs)
>>> Resolving versions (SRPMs)
>>> Creating symlinks to kickstart files
>>> Generating hdlist (rpm database)
>>> Segregating RPMs (rocks, non-rocks)
>>> sh: ./kickstart.cgi: No such file or directory
>>> sh: ./kickstart.cgi: No such file or directory
>>> Traceback (innermost last):
>>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>>>     app.run()
>>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>>>     eval('self.command_%s()' % (command))
>>>   File "<string>", line 0, in ?
>>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>>>     builder.build()
>>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>>>     (rocks, nonrocks) = self.segregateRPMS()
>>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>>>     for pkg in ks.getSection('packages'):
>>> TypeError: loop over non-sequence
>>
>>
>> Any ideas?
>>
>> --
>> Vicky Rowley                               email: vrowley at ucsd.edu
>> Biomedical Informatics Research Network    work: (858) 536-5980
>> University of California, San Diego        fax: (858) 822-0828
>> 9500 Gilman Drive
>> La Jolla, CA 92093-0715
>>
>>
>> See pictures from our trip to China at
>> http://www.sagacitech.com/Chinaweb
>
>
>

--
Vicky Rowley                               email: vrowley at ucsd.edu
Biomedical Informatics Research Network    work: (858) 536-5980
University of California, San Diego        fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

From tim.carlson at pnl.gov Wed Dec 10 17:23:25 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Wed, 10 Dec 2003 17:23:25 -0800 (PST)
Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Wed, 10 Dec 2003, V. Rowley wrote:

Did you remove python by chance? kickstart.cgi calls python directly in
/usr/bin/python while rocks-dist does an "env python"

Tim
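Tim's diagnosis can be checked in a few lines. A sketch (the helper name is ours; only the two lookups, the PATH search done by "env python" and the hard-coded /usr/bin/python, come from the thread):

```python
import os
import shutil

def python_locations(hardcoded='/usr/bin/python'):
    """Return (what 'env python' would find on the PATH,
    whether the hard-coded interpreter path is executable)."""
    return shutil.which('python'), os.access(hardcoded, os.X_OK)

# If the first value is set but the second is False (or vice versa),
# rocks-dist and kickstart.cgi will disagree, as described above.
on_path, hardcoded_ok = python_locations()
print('env python finds:', on_path)
print('/usr/bin/python executable:', hardcoded_ok)
```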

> Yep, I did that, but only *AFTER* getting the error. [Thought it was
> generated by the rocks-dist sequence, but apparently not.] Go ahead.
> Move it back. Same difference.
>
> Vicky
>
> Mason J. Katz wrote:
> > It looks like someone moved the profiles directory to profiles.orig.
> >
> > -mjk
> >
> >
> > [root at rocks14 install]# ls -l
> > total 56
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 21:16 cdrom
> > drwxrwsr-x    5 root     wheel        4096 Dec 10 20:38 contrib.orig
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 21:07 ftp.rocksclusters.org
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 20:38 ftp.rocksclusters.org.orig
> > -r-xrwsr-x    1 root     wheel       19254 Sep  3 12:40 kickstart.cgi
> > drwxr-xr-x    3 root     root         4096 Dec 10 20:38 profiles.orig
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 21:15 rocks-dist
> > drwxrwsr-x    3 root     wheel        4096 Dec 10 20:38 rocks-dist.orig
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 21:02 src
> > drwxr-sr-x    4 root     wheel        4096 Dec 10 20:49 src.foo
> > On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:
> >
> >> When I run this:
> >>
> >> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
> >> rocks-dist --dist=cdrom cdrom
> >>
> >> on a server installed with ROCKS 3.0.0, I eventually get this:
> >>
> >>> Cleaning distribution
> >>> Resolving versions (RPMs)
> >>> Resolving versions (SRPMs)
> >>> Adding support for rebuild distribution from source
> >>> Creating files (symbolic links - fast)
> >>> Creating symlinks to kickstart files
> >>> Fixing Comps Database
> >>> Generating hdlist (rpm database)
> >>> Patching second stage loader (eKV, partioning, ...)
> >>> patching "rocks-ekv" into distribution ...
> >>> patching "rocks-piece-pipe" into distribution ...
> >>> patching "PyXML" into distribution ...
> >>> patching "expat" into distribution ...
> >>> patching "rocks-pylib" into distribution ...
> >>> patching "MySQL-python" into distribution ...
> >>> patching "rocks-kickstart" into distribution ...
> >>> patching "rocks-kickstart-profiles" into distribution ...
> >>> patching "rocks-kickstart-dtds" into distribution ...
> >>> building CRAM filesystem ...
> >>> Cleaning distribution
> >>> Resolving versions (RPMs)
> >>> Resolving versions (SRPMs)
> >>> Creating symlinks to kickstart files
> >>> Generating hdlist (rpm database)
> >>> Segregating RPMs (rocks, non-rocks)
> >>> sh: ./kickstart.cgi: No such file or directory
> >>> sh: ./kickstart.cgi: No such file or directory
> >>> Traceback (innermost last):
> >>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
> >>>     app.run()
> >>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
> >>>     eval('self.command_%s()' % (command))
> >>>   File "<string>", line 0, in ?
> >>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
> >>>     builder.build()
> >>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
> >>>     (rocks, nonrocks) = self.segregateRPMS()
> >>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
> >>>     for pkg in ks.getSection('packages'):
> >>> TypeError: loop over non-sequence
> >>
> >>
> >> Any ideas?
> >>
> >> --
> >> Vicky Rowley                               email: vrowley at ucsd.edu
> >> Biomedical Informatics Research Network    work: (858) 536-5980
> >> University of California, San Diego        fax: (858) 822-0828
> >> 9500 Gilman Drive
> >> La Jolla, CA 92093-0715
> >>
> >>
> >> See pictures from our trip to China at http://www.sagacitech.com/Chinaweb
> >
> >
>
> --
> Vicky Rowley                               email: vrowley at ucsd.edu
> Biomedical Informatics Research Network    work: (858) 536-5980
> University of California, San Diego        fax: (858) 822-0828
> 9500 Gilman Drive
> La Jolla, CA 92093-0715
>
>
> See pictures from our trip to China at http://www.sagacitech.com/Chinaweb
>
>

From naihh at imcb.a-star.edu.sg Wed Dec 10 17:45:18 2003
From: naihh at imcb.a-star.edu.sg (Nai Hong Hwa Francis)
Date: Thu, 11 Dec 2003 09:45:18 +0800
Subject: [Rocks-Discuss]RE: Do you have a list of the various models of Gigabit Ethernet Interfaces compatible to Rocks 3?
Message-ID: <5E118EED7CC277468A275F11EEEC39B94CCD66@EXIMCB2.imcb.a-star.edu.sg>

Hi All,

Do you have a list of the various gigabit Ethernet interfaces that are
compatible with Rocks 3?

I am changing my nodes' connectivity from 10/100 to 1000.

Has anyone done that, and what are the differences in performance or
turnaround time?

Has anyone successfully built a set of grid compute nodes using Rocks 3?


Thanks and Regards

Nai Hong Hwa Francis
Institute of Molecular and Cell Biology (A*STAR)
30 Medical Drive
Singapore 117609.
DID: (65) 6874-6196

-----Original Message-----
From: npaci-rocks-discussion-request at sdsc.edu
[mailto:npaci-rocks-discussion-request at sdsc.edu]
Sent: Thursday, December 11, 2003 9:25 AM
To: npaci-rocks-discussion at sdsc.edu
Subject: npaci-rocks-discussion digest, Vol 1 #641 - 13 msgs

Send npaci-rocks-discussion mailing list submissions to
	npaci-rocks-discussion at sdsc.edu

To subscribe or unsubscribe via the World Wide Web, visit

	http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion
or, via email, send a message with subject or body 'help' to

npaci-rocks-discussion-request at sdsc.edu

You can reach the person managing the list at
	npaci-rocks-discussion-admin at sdsc.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of npaci-rocks-discussion digest..."

Today's Topics:

   1. Non-homogenous legacy hardware (Chris Dwan (CCGB))
   2. Error during Make when building a new install floppy (Terrence Martin)
   3. Re: Error during Make when building a new install floppy (Tim Carlson)
   4. Re: Non-homogenous legacy hardware (Tim Carlson)
   5. ssh_known_hosts and ganglia (Jag)
   6. Re: ssh_known_hosts and ganglia (Mason J. Katz)
   7. "TypeError: loop over non-sequence" when trying to build CD distro (V. Rowley)
   8. Re: one node short in "labels" (Greg Bruno)
   9. Re: "TypeError: loop over non-sequence" when trying to build CD distro (Mason J. Katz)
  10. Re: "TypeError: loop over non-sequence" when trying to build CD distro (V. Rowley)
  11. Re: "TypeError: loop over non-sequence" when trying to build CD distro (Tim Carlson)

--__--__--

Message: 1
Date: Wed, 10 Dec 2003 14:04:53 -0600 (CST)
From: "Chris Dwan (CCGB)" <cdwan at mail.ahc.umn.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]Non-homogenous legacy hardware

I am integrating legacy systems into a ROCKS cluster, and have hit a
snag with the auto-partition configuration: The new (old) systems have
SCSI disks, while old (new) ones contain IDE. This is a non-issue so
long as the initial install does its default partitioning. However, I
have a "replace-auto-partition.xml" file which is unworkable for the
SCSI-based systems since it makes specific reference to "hda" rather
than "sda."

I would like to have a site-nodes/replace-auto-partition.xml file with a
conditional such that "hda" or "sda" is used, based on the name of the
node (or some other criterion).

Is this possible?

Thanks, in advance. If this is out there on the mailing list archives, a
pointer would be greatly appreciated.

-Chris Dwan
 The University of Minnesota

--__--__--

Message: 2
Date: Wed, 10 Dec 2003 12:09:11 -0800
From: Terrence Martin <tmartin at physics.ucsd.edu>
To: npaci-rocks-discussion <npaci-rocks-discussion at sdsc.edu>
Subject: [Rocks-Discuss]Error during Make when building a new install floppy

I get the following error when I try to rebuild a boot floppy for rocks.

This is with the default CVS checkout with an update today according to the rocks userguide. I have not actually attempted to make any changes.

make[3]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3/loader'
make[2]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3'
strip -o loader anaconda-7.3/loader/loader
strip: anaconda-7.3/loader/loader: No such file or directory
make[1]: *** [loader] Error 1
make[1]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader'
make: *** [loader] Error 2

Of course I could avoid all of this together and just put my binary module into the appropriate location in the boot image.

Would it be correct to modify the following image file with my changes and then write it to a floppy via dd?

/home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/images/bootnet.img


Basically I am injecting an updated e1000 driver with changes to pcitable to support the address of my gigabit cards.

Terrence

--__--__--

Message: 3
Date: Wed, 10 Dec 2003 12:40:41 -0800 (PST)
From: Tim Carlson <tim.carlson at pnl.gov>
Subject: Re: [Rocks-Discuss]Error during Make when building a new install floppy
To: Terrence Martin <tmartin at physics.ucsd.edu>
Cc: npaci-rocks-discussion <npaci-rocks-discussion at sdsc.edu>
Reply-to: Tim Carlson <tim.carlson at pnl.gov>

On Wed, 10 Dec 2003, Terrence Martin wrote:

> I get the following error when I try to rebuild a boot floppy for
> rocks.
>

You can't make a boot floppy with Rocks 3.0. That isn't supported. Or at
least it wasn't the last time I checked.

> Of course I could avoid all of this together and just put my binary
> module into the appropriate location in the boot image.
>
> Would it be correct to modify the following image file with my changes
> and then write it to a floppy via dd?
>
> /home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/images/bootnet.img
>
> Basically I am injecting an updated e1000 driver with changes to
> pcitable to support the address of my gigabit cards.

Modifying the bootnet.img is about 1/3 of what you need to do if you go
down that path. You also need to work on netstg1.img and you'll need to
update the driver in the kernel rpm that gets installed on the box. None
of this is trivial.

If it were me, I would go down the same path I took for updating the
AIC79XX driver:

https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-October/003533.html

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support


--__--__--

Message: 4
Date: Wed, 10 Dec 2003 12:52:38 -0800 (PST)
From: Tim Carlson <tim.carlson at pnl.gov>
Subject: Re: [Rocks-Discuss]Non-homogenous legacy hardware
To: "Chris Dwan (CCGB)" <cdwan at mail.ahc.umn.edu>
Cc: npaci-rocks-discussion at sdsc.edu
Reply-to: Tim Carlson <tim.carlson at pnl.gov>

On Wed, 10 Dec 2003, Chris Dwan (CCGB) wrote:

>
> I am integrating legacy systems into a ROCKS cluster, and have hit a
> snag with the auto-partition configuration: The new (old) systems have
> SCSI disks, while old (new) ones contain IDE. This is a non-issue so
> long as the initial install does its default partitioning. However, I
> have a "replace-auto-partition.xml" file which is unworkable for the SCSI
> based systems since it makes specific reference to "hda" rather than
> "sda."

If you have just a single drive, then you should be able to skip the
"--ondisk" bits of your "part" command.

Otherwise, you would first have to do something ugly like the following:

http://penguin.epfl.ch/slides/kickstart/ks.cfg

You could probably (maybe) wrap most of that in an

<eval sh="bash"></eval>

block in the <main> block.

Just guessing.. haven't tried this.

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support
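In the same spirit as Tim's guess above, here is a sketch of the conditional Chris asked for (our illustration, not a tested Rocks recipe): inspect /proc/partitions at install time and emit the right disk name for the "part ... --ondisk" lines. The function name is ours, and the file path is a parameter so the logic can be exercised against a canned file.

```python
def pick_root_disk(partitions_path='/proc/partitions'):
    """Return 'sda' when a SCSI disk is registered with the kernel,
    otherwise fall back to 'hda' (the IDE case)."""
    try:
        with open(partitions_path) as fh:
            # Last whitespace-separated field on each line is the
            # device name (the header line's "name" is harmless here).
            names = [line.split()[-1] for line in fh if line.strip()]
    except OSError:
        return 'hda'
    return 'sda' if 'sda' in names else 'hda'

print(pick_root_disk())
```

The same test could of course be written directly in shell inside the `<eval>` block; the point is only that one lookup decides "hda" vs "sda" instead of hard-coding it per node.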

--__--__--

Message: 5
From: Jag <agrajag at dragaera.net>
To: npaci-rocks-discussion at sdsc.edu
Date: Wed, 10 Dec 2003 13:21:07 -0500
Subject: [Rocks-Discuss]ssh_known_hosts and ganglia

I noticed a previous post on this list
(https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html)
indicating that Rocks distributes ssh keys for all the nodes over
ganglia. Can anyone enlighten me as to how this is done?


I looked through the ganglia docs and didn't see anything indicating how
to do this, so I'm assuming Rocks made some changes. Unfortunately the
rocks iso images don't seem to contain srpms, so I'm now coming here.
What did Rocks do to ganglia to make the distribution of ssh keys work?

Also, does anyone know where Rocks SRPMs can be found? I've done quite
a bit of searching, but haven't found them anywhere.

--__--__--

Message: 6
Cc: npaci-rocks-discussion at sdsc.edu
From: "Mason J. Katz" <mjk at sdsc.edu>
Subject: Re: [Rocks-Discuss]ssh_known_hosts and ganglia
Date: Wed, 10 Dec 2003 14:39:15 -0800
To: Jag <agrajag at dragaera.net>

Most of the SRPMS are on our FTP site, but we've screwed this up before.
The SRPMS are entirely Rocks specific so they are of little value
outside of Rocks. You can also check out our CVS tree
(cvs.rocksclusters.org), where rocks/src/ganglia shows what we add. We
have a ganglia-python package we created to allow us to write our own
metrics at a higher level than the provided gmetric application. We've
also moved from this method to a single cluster-wide ssh key for Rocks
3.1.

-mjk

On Dec 10, 2003, at 10:21 AM, Jag wrote:

> I noticed a previous post on this list
> (https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html)
> indicating that Rocks distributes ssh keys for all the nodes over
> ganglia. Can anyone enlighten me as to how this is done?
>
> I looked through the ganglia docs and didn't see anything indicating
> how
> to do this, so I'm assuming Rocks made some changes. Unfortunately the
> rocks iso images don't seem to contain srpms, so I'm now coming here.
> What did Rocks do to ganglia to make the distribution of ssh keys work?
>
> Also, does anyone know where Rocks SRPMs can be found? I've done quite
> a bit of searching, but haven't found them anywhere.

--__--__--

Message: 7
Date: Wed, 10 Dec 2003 14:43:49 -0800
From: "V. Rowley" <vrowley at ucsd.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro


When I run this:

[root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ; rocks-dist --dist=cdrom cdrom

on a server installed with ROCKS 3.0.0, I eventually get this:

> Cleaning distribution
> Resolving versions (RPMs)
> Resolving versions (SRPMs)
> Adding support for rebuild distribution from source
> Creating files (symbolic links - fast)
> Creating symlinks to kickstart files
> Fixing Comps Database
> Generating hdlist (rpm database)
> Patching second stage loader (eKV, partioning, ...)
> patching "rocks-ekv" into distribution ...
> patching "rocks-piece-pipe" into distribution ...
> patching "PyXML" into distribution ...
> patching "expat" into distribution ...
> patching "rocks-pylib" into distribution ...
> patching "MySQL-python" into distribution ...
> patching "rocks-kickstart" into distribution ...
> patching "rocks-kickstart-profiles" into distribution ...
> patching "rocks-kickstart-dtds" into distribution ...
> building CRAM filesystem ...
> Cleaning distribution
> Resolving versions (RPMs)
> Resolving versions (SRPMs)
> Creating symlinks to kickstart files
> Generating hdlist (rpm database)
> Segregating RPMs (rocks, non-rocks)
> sh: ./kickstart.cgi: No such file or directory
> sh: ./kickstart.cgi: No such file or directory
> Traceback (innermost last):
>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>     app.run()
>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>     eval('self.command_%s()' % (command))
>   File "<string>", line 0, in ?
>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>     builder.build()
>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>     (rocks, nonrocks) = self.segregateRPMS()
>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>     for pkg in ks.getSection('packages'):
> TypeError: loop over non-sequence

Any ideas?

--
Vicky Rowley                               email: vrowley at ucsd.edu
Biomedical Informatics Research Network    work: (858) 536-5980
University of California, San Diego        fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715


See pictures from our trip to China at
http://www.sagacitech.com/Chinaweb

--__--__--

Message: 8
Cc: rocks <npaci-rocks-discussion at sdsc.edu>
From: Greg Bruno <bruno at rocksclusters.org>
Subject: Re: [Rocks-Discuss]one node short in "labels"
Date: Wed, 10 Dec 2003 15:12:49 -0800
To: Vincent Fox <vincent_b_fox at yahoo.com>

> So I go to the "labels" selection on the web page to print out the
> pretty labels. What a nice idea by the way!
>
> EXCEPT....it's one node short! I go up to 0-13 and this stops at
> 0-12. Any ideas where I should check to fix this?

yeah, we found this corner case -- it'll be fixed in the next release.

thanks for the bug report.

- gb

--__--__--

Message: 9
Cc: npaci-rocks-discussion at sdsc.edu
From: "Mason J. Katz" <mjk at sdsc.edu>
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
Date: Wed, 10 Dec 2003 15:16:27 -0800
To: "V. Rowley" <vrowley at ucsd.edu>

It looks like someone moved the profiles directory to profiles.orig.

-mjk

[root at rocks14 install]# ls -l
total 56
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:16 cdrom
drwxrwsr-x    5 root     wheel        4096 Dec 10 20:38 contrib.orig
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:07 ftp.rocksclusters.org
drwxr-sr-x    3 root     wheel        4096 Dec 10 20:38 ftp.rocksclusters.org.orig
-r-xrwsr-x    1 root     wheel       19254 Sep  3 12:40 kickstart.cgi
drwxr-xr-x    3 root     root         4096 Dec 10 20:38 profiles.orig
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:15 rocks-dist
drwxrwsr-x    3 root     wheel        4096 Dec 10 20:38 rocks-dist.orig
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:02 src
drwxr-sr-x    4 root     wheel        4096 Dec 10 20:49 src.foo

On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:

> When I run this:
>
> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
> rocks-dist --dist=cdrom cdrom
>
> on a server installed with ROCKS 3.0.0, I eventually get this:
>
>> Cleaning distribution
>> Resolving versions (RPMs)
>> Resolving versions (SRPMs)
>> Adding support for rebuild distribution from source
>> Creating files (symbolic links - fast)
>> Creating symlinks to kickstart files
>> Fixing Comps Database
>> Generating hdlist (rpm database)
>> Patching second stage loader (eKV, partioning, ...)
>> patching "rocks-ekv" into distribution ...
>> patching "rocks-piece-pipe" into distribution ...
>> patching "PyXML" into distribution ...
>> patching "expat" into distribution ...
>> patching "rocks-pylib" into distribution ...
>> patching "MySQL-python" into distribution ...
>> patching "rocks-kickstart" into distribution ...
>> patching "rocks-kickstart-profiles" into distribution ...
>> patching "rocks-kickstart-dtds" into distribution ...
>> building CRAM filesystem ...
>> Cleaning distribution
>> Resolving versions (RPMs)
>> Resolving versions (SRPMs)
>> Creating symlinks to kickstart files
>> Generating hdlist (rpm database)
>> Segregating RPMs (rocks, non-rocks)
>> sh: ./kickstart.cgi: No such file or directory
>> sh: ./kickstart.cgi: No such file or directory
>> Traceback (innermost last):
>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>>     app.run()
>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>>     eval('self.command_%s()' % (command))
>>   File "<string>", line 0, in ?
>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>>     builder.build()
>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>>     (rocks, nonrocks) = self.segregateRPMS()
>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>>     for pkg in ks.getSection('packages'):
>> TypeError: loop over non-sequence
>
> Any ideas?
>
> --
> Vicky Rowley                               email: vrowley at ucsd.edu
> Biomedical Informatics Research Network    work: (858) 536-5980
> University of California, San Diego        fax: (858) 822-0828
> 9500 Gilman Drive
> La Jolla, CA 92093-0715
>
>
> See pictures from our trip to China at
> http://www.sagacitech.com/Chinaweb

--__--__--

Message: 10
Date: Wed, 10 Dec 2003 16:50:16 -0800
From: "V. Rowley" <vrowley at ucsd.edu>
To: "Mason J. Katz" <mjk at sdsc.edu>
CC: npaci-rocks-discussion at sdsc.edu
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro

Yep, I did that, but only *AFTER* getting the error. [Thought it was generated by the rocks-dist sequence, but apparently not.] Go ahead. Move it back. Same difference.

Vicky

Mason J. Katz wrote:
> It looks like someone moved the profiles directory to profiles.orig.
>
> -mjk
>
>
> [root at rocks14 install]# ls -l
> total 56
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:16 cdrom
> drwxrwsr-x    5 root     wheel        4096 Dec 10 20:38 contrib.orig
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:07 ftp.rocksclusters.org
> drwxr-sr-x    3 root     wheel        4096 Dec 10 20:38 ftp.rocksclusters.org.orig
> -r-xrwsr-x    1 root     wheel       19254 Sep  3 12:40 kickstart.cgi
> drwxr-xr-x    3 root     root         4096 Dec 10 20:38 profiles.orig
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:15 rocks-dist
> drwxrwsr-x    3 root     wheel        4096 Dec 10 20:38 rocks-dist.orig
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:02 src
> drwxr-sr-x    4 root     wheel        4096 Dec 10 20:49 src.foo
> On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:
>
>> When I run this:
>>
>> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
>> rocks-dist --dist=cdrom cdrom
>>
>> on a server installed with ROCKS 3.0.0, I eventually get this:
>>
>>> Cleaning distribution
>>> Resolving versions (RPMs)
>>> Resolving versions (SRPMs)
>>> Adding support for rebuild distribution from source
>>> Creating files (symbolic links - fast)
>>> Creating symlinks to kickstart files
>>> Fixing Comps Database
>>> Generating hdlist (rpm database)
>>> Patching second stage loader (eKV, partioning, ...)
>>> patching "rocks-ekv" into distribution ...
>>> patching "rocks-piece-pipe" into distribution ...
>>> patching "PyXML" into distribution ...
>>> patching "expat" into distribution ...
>>> patching "rocks-pylib" into distribution ...
>>> patching "MySQL-python" into distribution ...
>>> patching "rocks-kickstart" into distribution ...
>>> patching "rocks-kickstart-profiles" into distribution ...
>>> patching "rocks-kickstart-dtds" into distribution ...
>>> building CRAM filesystem ...
>>> Cleaning distribution
>>> Resolving versions (RPMs)
>>> Resolving versions (SRPMs)
>>> Creating symlinks to kickstart files
>>> Generating hdlist (rpm database)
>>> Segregating RPMs (rocks, non-rocks)
>>> sh: ./kickstart.cgi: No such file or directory
>>> sh: ./kickstart.cgi: No such file or directory
>>> Traceback (innermost last):
>>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>>>     app.run()
>>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>>>     eval('self.command_%s()' % (command))
>>>   File "<string>", line 0, in ?
>>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>>>     builder.build()
>>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>>>     (rocks, nonrocks) = self.segregateRPMS()
>>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>>>     for pkg in ks.getSection('packages'):
>>> TypeError: loop over non-sequence
>>
>>
>> Any ideas?
>>
>> --
>> Vicky Rowley                               email: vrowley at ucsd.edu
>> Biomedical Informatics Research Network    work: (858) 536-5980
>> University of California, San Diego        fax: (858) 822-0828
>> 9500 Gilman Drive
>> La Jolla, CA 92093-0715
>>
>>
>> See pictures from our trip to China at http://www.sagacitech.com/Chinaweb
>
>
>

--
Vicky Rowley                               email: vrowley at ucsd.edu
Biomedical Informatics Research Network    work: (858) 536-5980
University of California, San Diego        fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at
http://www.sagacitech.com/Chinaweb

--__--__--

Message: 11
Date: Wed, 10 Dec 2003 17:23:25 -0800 (PST)
From: Tim Carlson <tim.carlson at pnl.gov>
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
To: "V. Rowley" <vrowley at ucsd.edu>
Cc: "Mason J. Katz" <mjk at sdsc.edu>, npaci-rocks-discussion at sdsc.edu
Reply-to: Tim Carlson <tim.carlson at pnl.gov>

On Wed, 10 Dec 2003, V. Rowley wrote:

Did you remove python by chance? kickstart.cgi calls python directly in
/usr/bin/python while rocks-dist does an "env python"

Tim

> Yep, I did that, but only *AFTER* getting the error. [Thought it was
> generated by the rocks-dist sequence, but apparently not.] Go ahead.
> Move it back. Same difference.
>
> Vicky
>
> Mason J. Katz wrote:
> > It looks like someone moved the profiles directory to profiles.orig.
> >
> >     -mjk
> >
> > [root at rocks14 install]# ls -l
> > total 56
> > drwxr-sr-x    3 root  wheel   4096 Dec 10 21:16 cdrom
> > drwxrwsr-x    5 root  wheel   4096 Dec 10 20:38 contrib.orig
> > drwxr-sr-x    3 root  wheel   4096 Dec 10 21:07 ftp.rocksclusters.org
> > drwxr-sr-x    3 root  wheel   4096 Dec 10 20:38 ftp.rocksclusters.org.orig
> > -r-xrwsr-x    1 root  wheel  19254 Sep  3 12:40 kickstart.cgi
> > drwxr-xr-x    3 root  root    4096 Dec 10 20:38 profiles.orig
> > drwxr-sr-x    3 root  wheel   4096 Dec 10 21:15 rocks-dist
> > drwxrwsr-x    3 root  wheel   4096 Dec 10 20:38 rocks-dist.orig
> > drwxr-sr-x    3 root  wheel   4096 Dec 10 21:02 src
> > drwxr-sr-x    4 root  wheel   4096 Dec 10 20:49 src.foo
> > On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:
> >
> >> When I run this:
> >>
> >> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
> >> rocks-dist --dist=cdrom cdrom
> >>
> >> on a server installed with ROCKS 3.0.0, I eventually get this:
> >>
> >>> Cleaning distribution
> >>> Resolving versions (RPMs)
> >>> Resolving versions (SRPMs)
> >>> Adding support for rebuild distribution from source
> >>> Creating files (symbolic links - fast)
> >>> Creating symlinks to kickstart files
> >>> Fixing Comps Database
> >>> Generating hdlist (rpm database)
> >>> Patching second stage loader (eKV, partioning, ...)
> >>>   patching "rocks-ekv" into distribution ...
> >>>   patching "rocks-piece-pipe" into distribution ...
> >>>   patching "PyXML" into distribution ...
> >>>   patching "expat" into distribution ...
> >>>   patching "rocks-pylib" into distribution ...
> >>>   patching "MySQL-python" into distribution ...
> >>>   patching "rocks-kickstart" into distribution ...
> >>>   patching "rocks-kickstart-profiles" into distribution ...
> >>>   patching "rocks-kickstart-dtds" into distribution ...
> >>>   building CRAM filesystem ...
> >>> Cleaning distribution
> >>> Resolving versions (RPMs)
> >>> Resolving versions (SRPMs)
> >>> Creating symlinks to kickstart files
> >>> Generating hdlist (rpm database)
> >>> Segregating RPMs (rocks, non-rocks)
> >>> sh: ./kickstart.cgi: No such file or directory
> >>> sh: ./kickstart.cgi: No such file or directory
> >>> Traceback (innermost last):
> >>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
> >>>     app.run()
> >>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
> >>>     eval('self.command_%s()' % (command))
> >>>   File "<string>", line 0, in ?
> >>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
> >>>     builder.build()
> >>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
> >>>     (rocks, nonrocks) = self.segregateRPMS()
> >>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in
> >>> segregateRPMS
> >>>     for pkg in ks.getSection('packages'):
> >>> TypeError: loop over non-sequence
> >>
> >> Any ideas?
> >>
> >> --
> >> Vicky Rowley                             email: vrowley at ucsd.edu
> >> Biomedical Informatics Research Network  work:  (858) 536-5980
> >> University of California, San Diego      fax:   (858) 822-0828
> >> 9500 Gilman Drive
> >> La Jolla, CA 92093-0715
> >>
> >> See pictures from our trip to China at
> >> http://www.sagacitech.com/Chinaweb


>
> --
> Vicky Rowley                                email: vrowley at ucsd.edu
> Biomedical Informatics Research Network     work:  (858) 536-5980
> University of California, San Diego         fax:   (858) 822-0828
> 9500 Gilman Drive
> La Jolla, CA 92093-0715
>
> See pictures from our trip to China at
> http://www.sagacitech.com/Chinaweb

--__--__--

_______________________________________________
npaci-rocks-discussion mailing list
npaci-rocks-discussion at sdsc.edu
http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion

End of npaci-rocks-discussion Digest

DISCLAIMER: This email is confidential and may be privileged. If you are not the intended recipient, please delete it and notify us immediately. Please do not copy or use it for any purpose, or disclose its contents to any other person as it may be an offence under the Official Secrets Act. Thank you.

From tmartin at physics.ucsd.edu Wed Dec 10 18:03:41 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Wed, 10 Dec 2003 18:03:41 -0800
Subject: [Rocks-Discuss]Rocks 3.0.0
Message-ID: <[email protected]>

I am having a problem on install of rocks 3.0.0 on my new cluster.

The python error occurs right after anaconda starts and just before the install asks for the roll CDROM.

The error refers to an inability to find or load rocks.file. The error is associated, I think, with the window that pops up and asks you to put the roll CDROM in.

The process I followed to get to this point is

Put the Rocks 3.0.0 CDROM into the CDROM drive
Boot the system
At the prompt type frontend
Wait till anaconda starts
Error referring to unable to load rocks.file

I have successfully installed rocks on a smaller cluster but that has


different hardware. I used the same CDROM for both installs.

Any thoughts?

Terrence

From vrowley at ucsd.edu Wed Dec 10 19:52:49 2003
From: vrowley at ucsd.edu (V. Rowley)
Date: Wed, 10 Dec 2003 19:52:49 -0800
Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Looks like python is okay:

> [root at rocks14 birn-oracle1]# which python
> /usr/bin/python
> [root at rocks14 birn-oracle1]# python --help
> Unknown option: --
> usage: python [option] ... [-c cmd | file | -] [arg] ...
> Options and arguments (and corresponding environment variables):
> -d     : debug output from parser (also PYTHONDEBUG=x)
> -i     : inspect interactively after running script, (also PYTHONINSPECT=x)
>          and force prompts, even if stdin does not appear to be a terminal
> -O     : optimize generated bytecode (a tad; also PYTHONOPTIMIZE=x)
> -OO    : remove doc-strings in addition to the -O optimizations
> -S     : don't imply 'import site' on initialization
> -t     : issue warnings about inconsistent tab usage (-tt: issue errors)
> -u     : unbuffered binary stdout and stderr (also PYTHONUNBUFFERED=x)
> -v     : verbose (trace import statements) (also PYTHONVERBOSE=x)
> -x     : skip first line of source, allowing use of non-Unix forms of #!cmd
> -X     : disable class based built-in exceptions
> -c cmd : program passed in as string (terminates option list)
> file   : program read from script file
> -      : program read from stdin (default; interactive mode if a tty)
> arg ...: arguments passed to program in sys.argv[1:]
> Other environment variables:
> PYTHONSTARTUP: file executed on interactive startup (no default)
> PYTHONPATH   : ':'-separated list of directories prefixed to the
>                default module search path. The result is sys.path.
> PYTHONHOME   : alternate <prefix> directory (or <prefix>:<exec_prefix>).
>                The default module search path uses <prefix>/python1.5.
> [root at rocks14 birn-oracle1]#

Tim Carlson wrote:
> On Wed, 10 Dec 2003, V. Rowley wrote:
>
> Did you remove python by chance? kickstart.cgi calls python directly in
> /usr/bin/python while rocks-dist does an "env python"
>
> Tim


> >> Yep, I did that, but only *AFTER* getting the error. [Thought it was
> >> generated by the rocks-dist sequence, but apparently not.] Go ahead.
> >> Move it back. Same difference.
> >>
> >> Vicky
> >>
> >> [remainder of the earlier thread quoted verbatim: Mason's profiles.orig
> >> directory listing, the rocks-dist command, and the "TypeError: loop over
> >> non-sequence" traceback, as in Message 11 above]

-- 
Vicky Rowley                                email: vrowley at ucsd.edu
Biomedical Informatics Research Network     work:  (858) 536-5980
University of California, San Diego         fax:   (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

From wyzhong78 at msn.com Wed Dec 10 20:38:53 2003
From: wyzhong78 at msn.com (zhong wenyu)
Date: Thu, 11 Dec 2003 12:38:53 +0800
Subject: [Rocks-Discuss]Rocks 3.0.0 problem:not able to boot up
Message-ID: <[email protected]>

>From: Greg Bruno <bruno at rocksclusters.org>
>To: "zhong wenyu" <wyzhong78 at msn.com>
>CC: npaci-rocks-discussion at sdsc.edu
>Subject: Re: [Rocks-Discuss]Rocks 3.0.0 problem:not able to boot up
>Date: Mon, 8 Dec 2003 15:31:08 -0800
>
>> I have installed Rocks 3.0.0 with default options successful,there
>> was not any trouble.But I boot it up,it stopped at beginning,just
>> show "GRUB" on the screen and waiting...
>
> when you built the frontend, did you start with the rocks base CD
> then add the HPC roll?
>
>  - gb

I have resolved this trouble, but I don't know why. I have one SCSI hard disk and one IDE disk on the frontend. I chose the SCSI disk to be the first HDD and installed "/" on it; then it could not boot up. Even with the IDE HDD disabled and a reinstall, it still could not boot. At last I chose the SCSI disk as the first HDD for the install, then made the IDE HDD the first disk to boot up, and it's OK! Must GRUB be installed on the IDE HDD?

thanks!


From wyzhong78 at msn.com Wed Dec 10 20:44:09 2003
From: wyzhong78 at msn.com (zhong wenyu)
Date: Thu, 11 Dec 2003 12:44:09 +0800
Subject: [Rocks-Discuss]I can't use xpbs in rocks
Message-ID: <[email protected]>

Hi, everyone!

I have installed rocks 2.3.2 and 3.0.0; xpbs can not be used in either of them.

typed: xpbs [enter]
showed: xpbs: initialization failed! output: invalid command name "Pref_Init"

thanks!



From phil at sdsc.edu Wed Dec 10 21:26:50 2003
From: phil at sdsc.edu (Philip Papadopoulos)
Date: Wed, 10 Dec 2003 21:26:50 -0800
Subject: [Rocks-Discuss]Rocks 3.0.0 problem:not able to boot up
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

There is a conflict between the way the BIOS numbers drives and the way the install kernel numbers them (and this is not standard). You should check in your BIOS whether you can select which is the boot device. If it just says "Hard Disk" (no choice between IDE and SCSI), then you are stuck with needing to have GRUB on the device that the BIOS thinks is the boot device. If you can choose, then SCSI can probably be made to work.

These sorts of issues (this is a general redhat/linux problem) can be quite troublesome (and annoying). We had some older HW that had two different types of SCSI controllers with drives on each controller. The boot kernel labeled /sda differently than the BIOS did. The install went fine, but the dreaded "OS Not Found" BIOS message appeared when rebooting. The cause was that the GRUB loader was being put on Linux's notion of /sda, but when the BIOS loaded, it found nothing (because GRUB was installed on the BIOS's idea of /sdb). For this particular machine, we were not able to change the BIOS's notion -- we had to force Linux to put the bootloader on Linux's idea of /sdb.

-P

zhong wenyu wrote:

>> From: Greg Bruno <bruno at rocksclusters.org>
>> To: "zhong wenyu" <wyzhong78 at msn.com>
>> CC: npaci-rocks-discussion at sdsc.edu
>> Subject: Re: [Rocks-Discuss]Rocks 3.0.0 problem:not able to boot up
>> Date: Mon, 8 Dec 2003 15:31:08 -0800
>>
>>> I have installed Rocks 3.0.0 with default options successful,there
>>> was not any trouble.But I boot it up,it stopped at beginning,just
>>> show "GRUB" on the screen and waiting...
>>
>> when you built the frontend, did you start with the rocks base CD
>> then add the HPC roll?
>>
>>  - gb
>
> I have raveled out this trouble.But I don't know why.
> I have one SCSI harddisk and one IDE disk On the frontend,I choose
> SCSI to be the first HDD and installed "/" on it.then it can not boot
> up.Even disabled the IDE HDD and install it again,It can not boot up
> also.at last I choose SCSI to be the first HDD and install,then choose
> IDE HDD to be the first and boot up, it's ok!
> GRUB must be installed on IDE HDD?
> thanks!
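Phil's point about GRUB's disk numbering can be checked in grub's device map. A minimal sketch (the map contents below are made up for illustration, not taken from the poster's machine; on a real frontend, inspect /boot/grub/device.map itself):

```shell
# Build a sample grub device.map (contents invented for this sketch).
map=$(mktemp)
printf '(hd0)\t/dev/hda\n(hd1)\t/dev/sda\n' > "$map"

# Whatever disk is mapped to (hd0) is where "setup (hd0)" writes the
# boot loader -- it must match the disk the BIOS actually boots first.
awk '$1 == "(hd0)" {print $2}' "$map"
rm -f "$map"
```

If the BIOS insists on booting the IDE disk, the map (and hence the boot loader) has to point there, which matches zhong's observation that making the IDE disk first finally booted.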

From mjk at sdsc.edu Wed Dec 10 22:04:57 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Wed, 10 Dec 2003 22:04:57 -0800
Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Hi Vicky,

The following directory cannot resolve its symlinks anymore. If you move the profiles and mirror directories around, Rocks cannot find them to build kickstart files.

-mjk

[root at rocks14 default]# ls -l
total 16
lrwxrwxrwx    1 root  root    113 Nov 13 20:19 core.xml -> /home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/build/graphs/default/core.xml
-rwxrwsr-x    1 root  wheel  3123 Sep  3 17:10 hpc.xml
-rwxr-xr-x    1 root  root    495 Sep  9 22:55 patch.xml
-rwxrwsr-x    1 root  wheel   452 Sep  3 17:10 root.xml
lrwxrwxrwx    1 root  root    112 Nov 13 20:19 rsh.xml -> /home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/build/graphs/default/rsh.xml
-rwxrwsr-x    1 root  wheel   923 Sep  3 17:10 sge.xml

On Dec 10, 2003, at 7:52 PM, V. Rowley wrote:

> Looks like python is okay:
>
>> [root at rocks14 birn-oracle1]# which python
>> /usr/bin/python
>> [root at rocks14 birn-oracle1]# python --help
>> Unknown option: --
>> usage: python [option] ... [-c cmd | file | -] [arg] ...
>> [full "python --help" output quoted verbatim, as in V. Rowley's
>> message above]
>
> Tim Carlson wrote:
>> On Wed, 10 Dec 2003, V. Rowley wrote:
>> Did you remove python by chance? kickstart.cgi calls python directly in
>> /usr/bin/python while rocks-dist does an "env python"
>> Tim
>>> [remainder of the earlier thread quoted verbatim: the profiles.orig
>>> directory listing, the rocks-dist command, and the "TypeError: loop
>>> over non-sequence" traceback, as in Message 11 above]
>
> --
> Vicky Rowley                                email: vrowley at ucsd.edu
> Biomedical Informatics Research Network     work:  (858) 536-5980
> University of California, San Diego         fax:   (858) 822-0828
> 9500 Gilman Drive
> La Jolla, CA 92093-0715
>
> See pictures from our trip to China at
> http://www.sagacitech.com/Chinaweb

From bruno at rocksclusters.org Wed Dec 10 22:31:11 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 10 Dec 2003 22:31:11 -0800
Subject: [Rocks-Discuss]Rocks 3.0.0
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> I am having a problem on install of rocks 3.0.0 on my new cluster.
>
> The python error occurs right after anaconda starts and just before
> the install asks for the roll CDROM.
>
> The error refers to an inability to find or load rocks.file. The error
> is associated I think with the window that pops up and asks you to put
> the roll CDROM in.
>
> The process I followed to get to this point is
>
> Put the Rocks 3.0.0 CDROM into the CDROM drive
> Boot the system
> At the prompt type frontend
> Wait till anaconda starts
> Error referring to unable to load rocks.file.
>
> I have successfully installed rocks on a smaller cluster but that has
> different hardware. I used the same CDROM for both installs.
>
> Any thoughts?

hard to say -- but some folks had similar problems due to bad memory:

https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-February/001246.html

- gb

From vincent_b_fox at yahoo.com Wed Dec 10 22:43:21 2003
From: vincent_b_fox at yahoo.com (Vincent Fox)
Date: Wed, 10 Dec 2003 22:43:21 -0800 (PST)
Subject: [Rocks-Discuss]ATLAS rpm build problems on PII platform
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

Okay, here's the context diff as plain text. I test-applied it using "patch -p0 < atlas.patch" and did a compile on my PII box successfully. I can send it as an attachment or submit to CVS or some other way if you need:

*** atlas.spec.in.orig	Thu Dec 11 06:27:13 2003
--- atlas.spec.in	Thu Dec 11 06:30:46 2003
***************
*** 111,117 ****
--- 111,133 ----
  	y
  	" | make
+ elif [ $CPUID -eq 4 ]
+ then
+ 	#
+ 	# Pentium II
+ 	#
+ 	echo "0
+ 	y
+ 	y
+ 	n
+ 	y
+ 	linux
+ 	0
+ 	/usr/bin/g77
+ 	-O
+ 	y
+ 	" | make
  else
  	#

Greg Bruno <bruno at rocksclusters.org> wrote:
> Okay, came up my own quick hack:
>
> Edit atlas.spec.in, go to "other x86" section, remove
> 2 lines right above "linux", seems to make rpm now.
>
> A more formal patch would be put in a section for
> cpuid eq 4 with this correction I suppose.

if you provide the patch, we'll include it in our next release.

- gb

---------------------------------
Do you Yahoo!?
New Yahoo! Photos - easier uploading and sharing
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031210/be5c8b04/attachment-0001.html

From naihh at imcb.a-star.edu.sg Thu Dec 11 00:08:14 2003
From: naihh at imcb.a-star.edu.sg (Nai Hong Hwa Francis)
Date: Thu, 11 Dec 2003 16:08:14 +0800
Subject: [Rocks-Discuss]RE: Have anyone successfully build a set of grid compute nodes using Rocks?
Message-ID: <5E118EED7CC277468A275F11EEEC39B94CCDB9@EXIMCB2.imcb.a-star.edu.sg>

Hi,

Has anyone successfully built a set of grid compute nodes using Rocks 3? Anyone care to share?

Nai Hong Hwa Francis
Institute of Molecular and Cell Biology (A*STAR)
30 Medical Drive
Singapore 117609.
DID: (65) 6874-6196

-----Original Message-----
From: npaci-rocks-discussion-request at sdsc.edu
[mailto:npaci-rocks-discussion-request at sdsc.edu]
Sent: Thursday, December 11, 2003 11:54 AM
To: npaci-rocks-discussion at sdsc.edu
Subject: npaci-rocks-discussion digest, Vol 1 #642 - 4 msgs

Send npaci-rocks-discussion mailing list submissions to
	npaci-rocks-discussion at sdsc.edu

To subscribe or unsubscribe via the World Wide Web, visit


	http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion
or, via email, send a message with subject or body 'help' to

npaci-rocks-discussion-request at sdsc.edu

You can reach the person managing the list at
	npaci-rocks-discussion-admin at sdsc.edu

When replying, please edit your Subject line so it is more specific than "Re: Contents of npaci-rocks-discussion digest..."

Today's Topics:

 1. RE: Do you have a list of the various models of Gigabit Ethernet
    Interfaces compatible to Rocks 3? (Nai Hong Hwa Francis)
 2. Rocks 3.0.0 (Terrence Martin)
 3. Re: "TypeError: loop over non-sequence" when trying to build CD
    distro (V. Rowley)

--__--__--

Message: 1
Date: Thu, 11 Dec 2003 09:45:18 +0800
From: "Nai Hong Hwa Francis" <naihh at imcb.a-star.edu.sg>
To: <npaci-rocks-discussion at sdsc.edu>
Subject: [Rocks-Discuss]RE: Do you have a list of the various models of Gigabit Ethernet Interfaces compatible to Rocks 3?

Hi All,

Do you have a list of the various gigabit Ethernet interfaces that are compatible with Rocks 3?

I am changing my nodes connectivity from 10/100 to 1000.

Has anyone done that, and what are the differences in performance or turnaround time?

Thanks and Regards

Nai Hong Hwa Francis
Institute of Molecular and Cell Biology (A*STAR)
30 Medical Drive
Singapore 117609.
DID: (65) 6874-6196

-----Original Message-----
From: npaci-rocks-discussion-request at sdsc.edu
[mailto:npaci-rocks-discussion-request at sdsc.edu]
Sent: Thursday, December 11, 2003 9:25 AM
To: npaci-rocks-discussion at sdsc.edu
Subject: npaci-rocks-discussion digest, Vol 1 #641 - 13 msgs

Send npaci-rocks-discussion mailing list submissions to
	npaci-rocks-discussion at sdsc.edu


To subscribe or unsubscribe via the World Wide Web, visit
	http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion
or, via email, send a message with subject or body 'help' to

npaci-rocks-discussion-request at sdsc.edu

You can reach the person managing the list at
	npaci-rocks-discussion-admin at sdsc.edu

When replying, please edit your Subject line so it is more specific than "Re: Contents of npaci-rocks-discussion digest..."

Today's Topics:

 1. Non-homogenous legacy hardware (Chris Dwan (CCGB))
 2. Error during Make when building a new install floppy (Terrence Martin)
 3. Re: Error during Make when building a new install floppy (Tim Carlson)
 4. Re: Non-homogenous legacy hardware (Tim Carlson)
 5. ssh_known_hosts and ganglia (Jag)
 6. Re: ssh_known_hosts and ganglia (Mason J. Katz)
 7. "TypeError: loop over non-sequence" when trying to build CD distro (V. Rowley)
 8. Re: one node short in "labels" (Greg Bruno)
 9. Re: "TypeError: loop over non-sequence" when trying to build CD distro (Mason J. Katz)
 10. Re: "TypeError: loop over non-sequence" when trying to build CD distro (V. Rowley)
 11. Re: "TypeError: loop over non-sequence" when trying to build CD distro (Tim Carlson)

--__--__--

Message: 1
Date: Wed, 10 Dec 2003 14:04:53 -0600 (CST)
From: "Chris Dwan (CCGB)" <cdwan at mail.ahc.umn.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]Non-homogenous legacy hardware

I am integrating legacy systems into a ROCKS cluster, and have hit a snag with the auto-partition configuration: the new (old) systems have SCSI disks, while the old (new) ones contain IDE. This is a non-issue so long as the initial install does its default partitioning. However, I have a "replace-auto-partition.xml" file which is unworkable for the SCSI-based systems, since it makes specific reference to "hda" rather than "sda."

I would like to have a site-nodes/replace-auto-partition.xml file with a conditional such that "hda" or "sda" is used, based on the name of the node (or some other criterion).

Is this possible?

Thanks, in advance. If this is out there on the mailing list archives, a pointer would be greatly appreciated.

-Chris Dwan
 The University of Minnesota

--__--__--

Message: 2
Date: Wed, 10 Dec 2003 12:09:11 -0800
From: Terrence Martin <tmartin at physics.ucsd.edu>
To: npaci-rocks-discussion <npaci-rocks-discussion at sdsc.edu>
Subject: [Rocks-Discuss]Error during Make when building a new install floppy

I get the following error when I try to rebuild a boot floppy for rocks.

This is with the default CVS checkout, with an update today according to the rocks userguide. I have not actually attempted to make any changes.

make[3]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3/loader'
make[2]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3'
strip -o loader anaconda-7.3/loader/loader
strip: anaconda-7.3/loader/loader: No such file or directory
make[1]: *** [loader] Error 1
make[1]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader'
make: *** [loader] Error 2

Of course I could avoid all of this altogether and just put my binary module into the appropriate location in the boot image.

Would it be correct to modify the following image file with my changes and then write it to a floppy via dd?

/home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/images/bootnet.img

Basically I am injecting an updated e1000 driver with changes to pcitable to support the address of my gigabit cards.

Terrence
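For reference, anaconda's pcitable is a plain tab-separated file mapping PCI vendor/device ids to driver modules, so "changes to pcitable" amounts to appending a line for the new card. A sketch on a scratch copy (all device ids below are invented for illustration; the real file lives inside the boot/stage images, not at a fixed path):

```shell
# Work on a scratch copy; a real fix would edit the pcitable that lives
# inside bootnet.img / netstg1.img. Ids below are made up for the sketch.
pcitable=$(mktemp)
printf '0x8086\t0x100e\t"e1000"\t"Intel Gigabit"\n' > "$pcitable"

# Append an id the stock table does not know about yet.
printf '0x8086\t0x1234\t"e1000"\t"Intel Gigabit (newer card)"\n' >> "$pcitable"

grep -c '"e1000"' "$pcitable"   # now two e1000 entries
rm -f "$pcitable"
```

As Tim notes below, the pcitable edit alone is only part of the job; the matching driver module also has to exist in the images and the installed kernel rpm.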

--__--__--

Message: 3
Date: Wed, 10 Dec 2003 12:40:41 -0800 (PST)
From: Tim Carlson <tim.carlson at pnl.gov>
Subject: Re: [Rocks-Discuss]Error during Make when building a new install floppy
To: Terrence Martin <tmartin at physics.ucsd.edu>
Cc: npaci-rocks-discussion <npaci-rocks-discussion at sdsc.edu>
Reply-to: Tim Carlson <tim.carlson at pnl.gov>


On Wed, 10 Dec 2003, Terrence Martin wrote:

> I get the following error when I try to rebuild a boot floppy for rocks.

You can't make a boot floppy with Rocks 3.0. That isn't supported. Or at least it wasn't the last time I checked.

> Of course I could avoid all of this together and just put my binary
> module into the appropriate location in the boot image.
>
> Would it be correct to modify the following image file with my changes
> and then write it to a floppy via dd?
>
> /home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/images/bootnet.img
>
> Basically I am injecting an updated e1000 driver with changes to
> pcitable to support the address of my gigabit cards.

Modifying the bootnet.img is about 1/3 of what you need to do if you go down that path. You also need to work on netstg1.img, and you'll need to update the driver in the kernel rpm that gets installed on the box. None of this is trivial.

If it were me, I would go down the same path I took for updating the AIC79XX driver

https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-October/003533.html

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

--__--__--

Message: 4
Date: Wed, 10 Dec 2003 12:52:38 -0800 (PST)
From: Tim Carlson <tim.carlson at pnl.gov>
Subject: Re: [Rocks-Discuss]Non-homogenous legacy hardware
To: "Chris Dwan (CCGB)" <cdwan at mail.ahc.umn.edu>
Cc: npaci-rocks-discussion at sdsc.edu
Reply-to: Tim Carlson <tim.carlson at pnl.gov>

On Wed, 10 Dec 2003, Chris Dwan (CCGB) wrote:

> I am integrating legacy systems into a ROCKS cluster, and have hit a
> snag with the auto-partition configuration: The new (old) systems have
> SCSI disks, while old (new) ones contain IDE. This is a non-issue so
> long as the initial install does its default partitioning. However, I
> have a "replace-auto-partition.xml" file which is unworkable for the SCSI
> based systems since it makes specific reference to "hda" rather than
> "sda."

If you have just a single drive, then you should be able to skip the "--ondisk" bits of your "part" command

Otherwise, you would first have to do something ugly like the following:

http://penguin.epfl.ch/slides/kickstart/ks.cfg

You could probably (maybe) wrap most of that in an <eval sh="bash"></eval>

block in the <main> block.

Just guessing.. haven't tried this.
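A sketch of what that could look like, in the spirit of the ks.cfg trick above — entirely untested, the tag usage is only a reading of the Rocks XML kickstart convention, and the sizes and the SCSI probe are illustrative:

```xml
<!-- Hypothetical site-nodes/replace-auto-partition.xml sketch (untested). -->
<main>
  <eval sh="bash">
    # Emit "part" lines for whichever disk this node actually has:
    # SCSI nodes get sda, IDE nodes get hda.
    if grep -q "Direct-Access" /proc/scsi/scsi 2>/dev/null; then
      disk=sda
    else
      disk=hda
    fi
    echo "part / --size 4096 --ondisk $disk"
    echo "part swap --size 1024 --ondisk $disk"
    echo "part /export --size 1 --grow --ondisk $disk"
  </eval>
</main>
```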

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

-- __--__--

Message: 5
From: Jag <agrajag at dragaera.net>
To: npaci-rocks-discussion at sdsc.edu
Date: Wed, 10 Dec 2003 13:21:07 -0500
Subject: [Rocks-Discuss]ssh_known_hosts and ganglia

I noticed a previous post on this list (https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html) indicating that Rocks distributes ssh keys for all the nodes over ganglia. Can anyone enlighten me as to how this is done?

I looked through the ganglia docs and didn't see anything indicating how to do this, so I'm assuming Rocks made some changes. Unfortunately the rocks iso images don't seem to contain srpms, so I'm now coming here. What did Rocks do to ganglia to make the distribution of ssh keys work?

Also, does anyone know where Rocks SRPMs can be found? I've done quite a bit of searching, but haven't found them anywhere.

-- __--__--

Message: 6
Cc: npaci-rocks-discussion at sdsc.edu
From: "Mason J. Katz" <mjk at sdsc.edu>
Subject: Re: [Rocks-Discuss]ssh_known_hosts and ganglia
Date: Wed, 10 Dec 2003 14:39:15 -0800
To: Jag <agrajag at dragaera.net>


Most of the SRPMS are on our FTP site, but we've screwed this up before. The SRPMS are entirely Rocks specific so they are of little value outside of Rocks. You can also check out our CVS tree (cvs.rocksclusters.org) where rocks/src/ganglia shows what we add. We have a ganglia-python package we created to allow us to write our own metrics at a higher level than the provided gmetric application. We've also moved from this method to a single cluster-wide ssh key for Rocks 3.1.

-mjk

On Dec 10, 2003, at 10:21 AM, Jag wrote:

> I noticed a previous post on this list
> (https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html)
> indicating that Rocks distributes ssh keys for all the nodes over
> ganglia. Can anyone enlighten me as to how this is done?
>
> I looked through the ganglia docs and didn't see anything indicating
> how to do this, so I'm assuming Rocks made some changes. Unfortunately the
> rocks iso images don't seem to contain srpms, so I'm now coming here.
> What did Rocks do to ganglia to make the distribution of ssh keys work?
>
> Also, does anyone know where Rocks SRPMs can be found? I've done quite
> a bit of searching, but haven't found them anywhere.

-- __--__--

Message: 7
Date: Wed, 10 Dec 2003 14:43:49 -0800
From: "V. Rowley" <vrowley at ucsd.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro

When I run this:

[root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ; rocks-dist --dist=cdrom cdrom

on a server installed with ROCKS 3.0.0, I eventually get this:

> Cleaning distribution
> Resolving versions (RPMs)
> Resolving versions (SRPMs)
> Adding support for rebuild distribution from source
> Creating files (symbolic links - fast)
> Creating symlinks to kickstart files
> Fixing Comps Database
> Generating hdlist (rpm database)
> Patching second stage loader (eKV, partioning, ...)
>   patching "rocks-ekv" into distribution ...
>   patching "rocks-piece-pipe" into distribution ...
>   patching "PyXML" into distribution ...
>   patching "expat" into distribution ...
>   patching "rocks-pylib" into distribution ...
>   patching "MySQL-python" into distribution ...
>   patching "rocks-kickstart" into distribution ...
>   patching "rocks-kickstart-profiles" into distribution ...
>   patching "rocks-kickstart-dtds" into distribution ...
>   building CRAM filesystem ...
> Cleaning distribution
> Resolving versions (RPMs)
> Resolving versions (SRPMs)
> Creating symlinks to kickstart files
> Generating hdlist (rpm database)
> Segregating RPMs (rocks, non-rocks)
> sh: ./kickstart.cgi: No such file or directory
> sh: ./kickstart.cgi: No such file or directory
> Traceback (innermost last):
>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>     app.run()
>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>     eval('self.command_%s()' % (command))
>   File "<string>", line 0, in ?
>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>     builder.build()
>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>     (rocks, nonrocks) = self.segregateRPMS()
>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>     for pkg in ks.getSection('packages'):
> TypeError: loop over non-sequence

Any ideas?

--
Vicky Rowley email: vrowley at ucsd.edu
Biomedical Informatics Research Network work: (858) 536-5980
University of California, San Diego fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

-- __--__--

Message: 8
Cc: rocks <npaci-rocks-discussion at sdsc.edu>
From: Greg Bruno <bruno at rocksclusters.org>
Subject: Re: [Rocks-Discuss]one node short in "labels"
Date: Wed, 10 Dec 2003 15:12:49 -0800
To: Vincent Fox <vincent_b_fox at yahoo.com>

> So I go to the "labels" selection on the web page to print out the
> pretty labels. What a nice idea by the way!
>
> EXCEPT....it's one node short! I go up to 0-13 and this stops at
> 0-12. Any ideas where I should check to fix this?

yeah, we found this corner case -- it'll be fixed in the next release.

thanks for the bug report.

- gb

-- __--__--

Message: 9
Cc: npaci-rocks-discussion at sdsc.edu
From: "Mason J. Katz" <mjk at sdsc.edu>
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
Date: Wed, 10 Dec 2003 15:16:27 -0800
To: "V. Rowley" <vrowley at ucsd.edu>

It looks like someone moved the profiles directory to profiles.orig.

-mjk

[root at rocks14 install]# ls -l
total 56
drwxr-sr-x 3 root wheel 4096 Dec 10 21:16 cdrom
drwxrwsr-x 5 root wheel 4096 Dec 10 20:38 contrib.orig
drwxr-sr-x 3 root wheel 4096 Dec 10 21:07 ftp.rocksclusters.org
drwxr-sr-x 3 root wheel 4096 Dec 10 20:38 ftp.rocksclusters.org.orig
-r-xrwsr-x 1 root wheel 19254 Sep 3 12:40 kickstart.cgi
drwxr-xr-x 3 root root 4096 Dec 10 20:38 profiles.orig
drwxr-sr-x 3 root wheel 4096 Dec 10 21:15 rocks-dist
drwxrwsr-x 3 root wheel 4096 Dec 10 20:38 rocks-dist.orig
drwxr-sr-x 3 root wheel 4096 Dec 10 21:02 src
drwxr-sr-x 4 root wheel 4096 Dec 10 20:49 src.foo

On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:

> When I run this:
>
> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ; rocks-dist --dist=cdrom cdrom
>
> on a server installed with ROCKS 3.0.0, I eventually get this:
> [snip]

-- __--__--

Message: 10
Date: Wed, 10 Dec 2003 16:50:16 -0800
From: "V. Rowley" <vrowley at ucsd.edu>
To: "Mason J. Katz" <mjk at sdsc.edu>
CC: npaci-rocks-discussion at sdsc.edu
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro

Yep, I did that, but only *AFTER* getting the error. [Thought it was generated by the rocks-dist sequence, but apparently not.] Go ahead. Move it back. Same difference.

Vicky

Mason J. Katz wrote:
> It looks like someone moved the profiles directory to profiles.orig.
> [snip]

--
Vicky Rowley email: vrowley at ucsd.edu
Biomedical Informatics Research Network work: (858) 536-5980
University of California, San Diego fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

-- __--__--

Message: 11
Date: Wed, 10 Dec 2003 17:23:25 -0800 (PST)
From: Tim Carlson <tim.carlson at pnl.gov>
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
To: "V. Rowley" <vrowley at ucsd.edu>
Cc: "Mason J. Katz" <mjk at sdsc.edu>, npaci-rocks-discussion at sdsc.edu
Reply-to: Tim Carlson <tim.carlson at pnl.gov>

On Wed, 10 Dec 2003, V. Rowley wrote:

Did you remove python by chance? kickstart.cgi calls python directly in /usr/bin/python while rocks-dist does an "env python"
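That would tie the two symptoms in the traceback together: if kickstart.cgi never runs (the `sh: ./kickstart.cgi: No such file or directory` lines), the profile plausibly ends up with no 'packages' section, and looping over the resulting non-sequence is exactly the TypeError shown. A small stand-in illustration — the Profile class and its dict are hypothetical; only the getSection name is borrowed from the traceback:

```python
# Stand-in for the kickstart profile object used by build.py: getSection()
# plausibly returns None when kickstart.cgi produced no output, and
# iterating None is what "loop over non-sequence" complains about.
class Profile:
    def __init__(self, sections):
        self.sections = sections

    def getSection(self, name):
        # None when the section is missing, a list when it is present
        return self.sections.get(name)

broken = Profile({})                                # kickstart.cgi never ran
ok = Profile({'packages': ['rocks-ekv', 'expat']})  # normal case

try:
    for pkg in broken.getSection('packages'):
        pass
except TypeError:
    print('missing section -> TypeError, as in the rocks-dist traceback')

print(sorted(ok.getSection('packages')))
```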

Tim

> Yep, I did that, but only *AFTER* getting the error. [Thought it was
> generated by the rocks-dist sequence, but apparently not.] Go ahead.
> Move it back. Same difference.
>
> Vicky
> [snip]

-- __--__--

_______________________________________________
npaci-rocks-discussion mailing list
npaci-rocks-discussion at sdsc.edu
http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion

End of npaci-rocks-discussion Digest

DISCLAIMER:
This email is confidential and may be privileged. If you are not the intended recipient, please delete it and notify us immediately. Please do not copy or use it for any purpose, or disclose its contents to any other person as it may be an offence under the Official Secrets Act. Thank you.

--__--__--

Message: 2
Date: Wed, 10 Dec 2003 18:03:41 -0800
From: Terrence Martin <tmartin at physics.ucsd.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]Rocks 3.0.0

I am having a problem installing rocks 3.0.0 on my new cluster.

The python error occurs right after anaconda starts and just before the install asks for the roll CDROM.

The error refers to an inability to find or load rocks.file. The error is associated, I think, with the window that pops up and asks you to put the roll CDROM in.

The process I followed to get to this point is

Put the Rocks 3.0.0 CDROM into the CDROM drive
Boot the system
At the prompt type frontend
Wait till anaconda starts
Error referring to unable to load rocks.file

I have successfully installed rocks on a smaller cluster but that has different hardware. I used the same CDROM for both installs.

Any thoughts?

Terrence

--__--__--


Message: 3
Date: Wed, 10 Dec 2003 19:52:49 -0800
From: "V. Rowley" <vrowley at ucsd.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro

Looks like python is okay:

> [root at rocks14 birn-oracle1]# which python
> /usr/bin/python
> [root at rocks14 birn-oracle1]# python --help
> Unknown option: --
> usage: python [option] ... [-c cmd | file | -] [arg] ...
> Options and arguments (and corresponding environment variables):
> -d     : debug output from parser (also PYTHONDEBUG=x)
> -i     : inspect interactively after running script, (also PYTHONINSPECT=x)
>          and force prompts, even if stdin does not appear to be a terminal
> -O     : optimize generated bytecode (a tad; also PYTHONOPTIMIZE=x)
> -OO    : remove doc-strings in addition to the -O optimizations
> -S     : don't imply 'import site' on initialization
> -t     : issue warnings about inconsistent tab usage (-tt: issue errors)
> -u     : unbuffered binary stdout and stderr (also PYTHONUNBUFFERED=x)
> -v     : verbose (trace import statements) (also PYTHONVERBOSE=x)
> -x     : skip first line of source, allowing use of non-Unix forms of #!cmd
> -X     : disable class based built-in exceptions
> -c cmd : program passed in as string (terminates option list)
> file   : program read from script file
> -      : program read from stdin (default; interactive mode if a tty)
> arg ...: arguments passed to program in sys.argv[1:]
> Other environment variables:
> PYTHONSTARTUP: file executed on interactive startup (no default)
> PYTHONPATH   : ':'-separated list of directories prefixed to the
>                default module search path. The result is sys.path.
> PYTHONHOME   : alternate <prefix> directory (or <prefix>:<exec_prefix>).
>                The default module search path uses <prefix>/python1.5.
> [root at rocks14 birn-oracle1]#

Tim Carlson wrote:
> On Wed, 10 Dec 2003, V. Rowley wrote:
>
> Did you remove python by chance? kickstart.cgi calls python directly in
> /usr/bin/python while rocks-dist does an "env python"
>
> Tim
> [snip]

--
Vicky Rowley email: vrowley at ucsd.edu
Biomedical Informatics Research Network work: (858) 536-5980
University of California, San Diego fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715


See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

--__--__--

_______________________________________________
npaci-rocks-discussion mailing list
npaci-rocks-discussion at sdsc.edu
http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion

End of npaci-rocks-discussion Digest


From naihh at imcb.a-star.edu.sg Thu Dec 11 00:09:34 2003
From: naihh at imcb.a-star.edu.sg (Nai Hong Hwa Francis)
Date: Thu, 11 Dec 2003 16:09:34 +0800
Subject: [Rocks-Discuss]RE: Install rocks on Titan64 Superblade Classic with Dual Opteron 244
Message-ID: <5E118EED7CC277468A275F11EEEC39B94CCDBA@EXIMCB2.imcb.a-star.edu.sg>

Hi,

Has anyone successfully installed rocks on Titan64 Superblade Classic with Dual Opteron 244?

Nai Hong Hwa Francis
Institute of Molecular and Cell Biology (A*STAR)
30 Medical Drive
Singapore 117609.
DID: (65) 6874-6196

-----Original Message-----
From: npaci-rocks-discussion-request at sdsc.edu
[mailto:npaci-rocks-discussion-request at sdsc.edu]
Sent: Thursday, December 11, 2003 11:54 AM
To: npaci-rocks-discussion at sdsc.edu
Subject: npaci-rocks-discussion digest, Vol 1 #642 - 4 msgs

Send npaci-rocks-discussion mailing list submissions to
	npaci-rocks-discussion at sdsc.edu

To subscribe or unsubscribe via the World Wide Web, visit
	http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion
or, via email, send a message with subject or body 'help' to
	npaci-rocks-discussion-request at sdsc.edu

You can reach the person managing the list at
	npaci-rocks-discussion-admin at sdsc.edu

When replying, please edit your Subject line so it is more specific than "Re: Contents of npaci-rocks-discussion digest..."

Today's Topics:

1. RE: Do you have a list of the various models of Gigabit Ethernet Interfaces compatible to Rocks 3? (Nai Hong Hwa Francis)
2. Rocks 3.0.0 (Terrence Martin)
3. Re: "TypeError: loop over non-sequence" when trying to build CD distro (V. Rowley)

--__--__--

Message: 1
Date: Thu, 11 Dec 2003 09:45:18 +0800
From: "Nai Hong Hwa Francis" <naihh at imcb.a-star.edu.sg>
To: <npaci-rocks-discussion at sdsc.edu>
Subject: [Rocks-Discuss]RE: Do you have a list of the various models of Gigabit Ethernet Interfaces compatible to Rocks 3?

Hi All,

Do you have a list of the various gigabit Ethernet interfaces that are compatible with Rocks 3?

I am changing my nodes' connectivity from 10/100 to 1000.

Has anyone done that, and what are the differences in performance or turnaround time?

Has anyone successfully built a set of grid compute nodes using Rocks 3?

Thanks and Regards

Nai Hong Hwa Francis
Institute of Molecular and Cell Biology (A*STAR)
30 Medical Drive
Singapore 117609.
DID: (65) 6874-6196

-----Original Message-----
From: npaci-rocks-discussion-request at sdsc.edu
[mailto:npaci-rocks-discussion-request at sdsc.edu]
Sent: Thursday, December 11, 2003 9:25 AM
To: npaci-rocks-discussion at sdsc.edu
Subject: npaci-rocks-discussion digest, Vol 1 #641 - 13 msgs

Send npaci-rocks-discussion mailing list submissions to
	npaci-rocks-discussion at sdsc.edu


To subscribe or unsubscribe via the World Wide Web, visit
	http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion
or, via email, send a message with subject or body 'help' to
	npaci-rocks-discussion-request at sdsc.edu

You can reach the person managing the list at
	npaci-rocks-discussion-admin at sdsc.edu

When replying, please edit your Subject line so it is more specific than "Re: Contents of npaci-rocks-discussion digest..."

Today's Topics:

1. Non-homogenous legacy hardware (Chris Dwan (CCGB))
2. Error during Make when building a new install floppy (Terrence Martin)
3. Re: Error during Make when building a new install floppy (Tim Carlson)
4. Re: Non-homogenous legacy hardware (Tim Carlson)
5. ssh_known_hosts and ganglia (Jag)
6. Re: ssh_known_hosts and ganglia (Mason J. Katz)
7. "TypeError: loop over non-sequence" when trying to build CD distro (V. Rowley)
8. Re: one node short in "labels" (Greg Bruno)
9. Re: "TypeError: loop over non-sequence" when trying to build CD distro (Mason J. Katz)
10. Re: "TypeError: loop over non-sequence" when trying to build CD distro (V. Rowley)
11. Re: "TypeError: loop over non-sequence" when trying to build CD distro (Tim Carlson)

-- __--__--

Message: 1
Date: Wed, 10 Dec 2003 14:04:53 -0600 (CST)
From: "Chris Dwan (CCGB)" <cdwan at mail.ahc.umn.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]Non-homogenous legacy hardware

I am integrating legacy systems into a ROCKS cluster, and have hit a snag with the auto-partition configuration: The new (old) systems have SCSI disks, while old (new) ones contain IDE. This is a non-issue so long as the initial install does its default partitioning. However, I have a "replace-auto-partition.xml" file which is unworkable for the SCSI based systems since it makes specific reference to "hda" rather than "sda."

I would like to have a site-nodes/replace-auto-partition.xml file with a conditional such that "hda" or "sda" is used, based on the name of the node (or some other criterion).

Is this possible?

Thanks, in advance. If this is out there on the mailing list archives, a pointer would be greatly appreciated.

-Chris Dwan
 The University of Minnesota

-- __--__--

Message: 2
Date: Wed, 10 Dec 2003 12:09:11 -0800
From: Terrence Martin <tmartin at physics.ucsd.edu>
To: npaci-rocks-discussion <npaci-rocks-discussion at sdsc.edu>
Subject: [Rocks-Discuss]Error during Make when building a new install floppy

I get the following error when I try to rebuild a boot floppy for rocks.

This is with the default CVS checkout with an update today according to the rocks userguide. I have not actually attempted to make any changes.

make[3]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3/loader'
make[2]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3'
strip -o loader anaconda-7.3/loader/loader
strip: anaconda-7.3/loader/loader: No such file or directory
make[1]: *** [loader] Error 1
make[1]: Leaving directory `/home/install/rocks/src/rocks/boot/7.3/loader'
make: *** [loader] Error 2

Of course I could avoid all of this altogether and just put my binary module into the appropriate location in the boot image.

Would it be correct to modify the following image file with my changes and then write it to a floppy via dd?

/home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/images/bootnet.img

Basically I am injecting an updated e1000 driver with changes to pcitable to support the address of my gigabit cards.

Terrence

-- __--__--

Message: 3
Date: Wed, 10 Dec 2003 12:40:41 -0800 (PST)
From: Tim Carlson <tim.carlson at pnl.gov>
Subject: Re: [Rocks-Discuss]Error during Make when building a new install floppy
To: Terrence Martin <tmartin at physics.ucsd.edu>
Cc: npaci-rocks-discussion <npaci-rocks-discussion at sdsc.edu>
Reply-to: Tim Carlson <tim.carlson at pnl.gov>


On Wed, 10 Dec 2003, Terrence Martin wrote:

> I get the following error when I try to rebuild a boot floppy for rocks.

You can't make a boot floppy with Rocks 3.0. That isn't supported. Or at least it wasn't the last time I checked.

> Of course I could avoid all of this together and just put my binary
> module into the appropriate location in the boot image.
>
> Would it be correct to modify the following image file with my changes
> and then write it to a floppy via dd?
>
> /home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/7.3/en/os/i386/images/bootnet.img
>
> Basically I am injecting an updated e1000 driver with changes to
> pcitable to support the address of my gigabit cards.

Modifying the bootnet.img is about 1/3 of what you need to do if you go down that path. You also need to work on netstg1.img and you'll need to update the driver in the kernel rpm that gets installed on the box. None of this is trivial.

If it were me, I would go down the same path I took for updating the AIC79XX driver:

https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-October/003533.html

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

-- __--__--

Message: 4
Date: Wed, 10 Dec 2003 12:52:38 -0800 (PST)
From: Tim Carlson <tim.carlson at pnl.gov>
Subject: Re: [Rocks-Discuss]Non-homogenous legacy hardware
To: "Chris Dwan (CCGB)" <cdwan at mail.ahc.umn.edu>
Cc: npaci-rocks-discussion at sdsc.edu
Reply-to: Tim Carlson <tim.carlson at pnl.gov>

On Wed, 10 Dec 2003, Chris Dwan (CCGB) wrote:

> I am integrating legacy systems into a ROCKS cluster, and have hit a
> snag with the auto-partition configuration: The new (old) systems have
> SCSI disks, while old (new) ones contain IDE. This is a non-issue so
> long as the initial install does its default partitioning. However, I
> have a "replace-auto-partition.xml" file which is unworkable for the SCSI
> based systems since it makes specific reference to "hda" rather than
> "sda."

If you have just a single drive, then you should be able to skip the "--ondisk" bits of your "part" command.

Otherwise, you would first have to do something ugly like the following:

http://penguin.epfl.ch/slides/kickstart/ks.cfg

You could probably (maybe) wrap most of that in an <eval sh="bash"></eval> block in the <main> block.

Just guessing.. haven't tried this.
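To make Tim's suggestion concrete, here is a rough sketch of the bash logic that could live inside such an <eval sh="bash"> block. Everything here is an assumption for illustration: the function names, the /proc/partitions probing, and the partition sizes are invented, and none of it has been tried on a real Rocks node.

```shell
# Emit "part" lines for whichever disk the node actually has.
# The partitions file is a parameter so the logic can be exercised
# off-node; inside a real <eval> block it would be /proc/partitions.
pick_root_disk() {
    # Prefer sda when the kernel reports a SCSI disk, else fall back to hda.
    if awk '{print $4}' "$1" | grep -qx sda; then
        echo sda
    else
        echo hda
    fi
}

emit_part_lines() {
    disk=$(pick_root_disk "$1")
    echo "part /    --size 4096 --ondisk $disk"
    echo "part swap --size 1024 --ondisk $disk"
    echo "part /var --size 2048 --ondisk $disk"
}
```

As Tim notes above, if each node has only one drive, simply dropping "--ondisk" from the part lines avoids the whole problem.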

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

-- __--__--

Message: 5
From: Jag <agrajag at dragaera.net>
To: npaci-rocks-discussion at sdsc.edu
Date: Wed, 10 Dec 2003 13:21:07 -0500
Subject: [Rocks-Discuss]ssh_known_hosts and ganglia

I noticed a previous post on this list (https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html) indicating that Rocks distributes ssh keys for all the nodes over ganglia. Can anyone enlighten me as to how this is done?

I looked through the ganglia docs and didn't see anything indicating how to do this, so I'm assuming Rocks made some changes. Unfortunately the rocks iso images don't seem to contain srpms, so I'm now coming here. What did Rocks do to ganglia to make the distribution of ssh keys work?

Also, does anyone know where Rocks SRPMs can be found? I've done quite a bit of searching, but haven't found them anywhere.

-- __--__--

Message: 6
Cc: npaci-rocks-discussion at sdsc.edu
From: "Mason J. Katz" <mjk at sdsc.edu>
Subject: Re: [Rocks-Discuss]ssh_known_hosts and ganglia
Date: Wed, 10 Dec 2003 14:39:15 -0800
To: Jag <agrajag at dragaera.net>


Most of the SRPMS are on our FTP site, but we've screwed this up before. The SRPMS are entirely Rocks specific so they are of little value outside of Rocks. You can also checkout our CVS tree (cvs.rocksclusters.org) where rocks/src/ganglia shows what we add. We have a ganglia-python package we created to allow us to write our own metrics at a higher level than the provided gmetric application. We've also moved from this method to a single cluster-wide ssh key for Rocks 3.1.

-mjk

On Dec 10, 2003, at 10:21 AM, Jag wrote:

> I noticed a previous post on this list
> (https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html)
> indicating that Rocks distributes ssh keys for all the nodes over
> ganglia. Can anyone enlighten me as to how this is done?
>
> I looked through the ganglia docs and didn't see anything indicating
> how to do this, so I'm assuming Rocks made some changes. Unfortunately the
> rocks iso images don't seem to contain srpms, so I'm now coming here.
> What did Rocks do to ganglia to make the distribution of ssh keys work?
>
> Also, does anyone know where Rocks SRPMs can be found? I've done quite
> a bit of searching, but haven't found them anywhere.

-- __--__--

Message: 7
Date: Wed, 10 Dec 2003 14:43:49 -0800
From: "V. Rowley" <vrowley at ucsd.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro

When I run this:

[root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ; rocks-dist --dist=cdrom cdrom

on a server installed with ROCKS 3.0.0, I eventually get this:

> Cleaning distribution
> Resolving versions (RPMs)
> Resolving versions (SRPMs)
> Adding support for rebuild distribution from source
> Creating files (symbolic links - fast)
> Creating symlinks to kickstart files
> Fixing Comps Database
> Generating hdlist (rpm database)
> Patching second stage loader (eKV, partioning, ...)
>  patching "rocks-ekv" into distribution ...
>  patching "rocks-piece-pipe" into distribution ...
>  patching "PyXML" into distribution ...
>  patching "expat" into distribution ...
>  patching "rocks-pylib" into distribution ...
>  patching "MySQL-python" into distribution ...
>  patching "rocks-kickstart" into distribution ...
>  patching "rocks-kickstart-profiles" into distribution ...
>  patching "rocks-kickstart-dtds" into distribution ...
>  building CRAM filesystem ...
> Cleaning distribution
> Resolving versions (RPMs)
> Resolving versions (SRPMs)
> Creating symlinks to kickstart files
> Generating hdlist (rpm database)
> Segregating RPMs (rocks, non-rocks)
> sh: ./kickstart.cgi: No such file or directory
> sh: ./kickstart.cgi: No such file or directory
> Traceback (innermost last):
>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>     app.run()
>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>     eval('self.command_%s()' % (command))
>   File "<string>", line 0, in ?
>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>     builder.build()
>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>     (rocks, nonrocks) = self.segregateRPMS()
>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>     for pkg in ks.getSection('packages'):
> TypeError: loop over non-sequence

Any ideas?

--
Vicky Rowley                              email: vrowley at ucsd.edu
Biomedical Informatics Research Network   work: (858) 536-5980
University of California, San Diego       fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

-- __--__--

Message: 8
Cc: rocks <npaci-rocks-discussion at sdsc.edu>
From: Greg Bruno <bruno at rocksclusters.org>
Subject: Re: [Rocks-Discuss]one node short in "labels"
Date: Wed, 10 Dec 2003 15:12:49 -0800
To: Vincent Fox <vincent_b_fox at yahoo.com>

> So I go to the "labels" selection on the web page to print out the
> pretty labels. What a nice idea by the way!
>
> EXCEPT....it's one node short! I go up to 0-13 and this stops at
> 0-12. Any ideas where I should check to fix this?

yeah, we found this corner case -- it'll be fixed in the next release.

thanks for the bug report.

- gb

-- __--__--

Message: 9
Cc: npaci-rocks-discussion at sdsc.edu
From: "Mason J. Katz" <mjk at sdsc.edu>
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
Date: Wed, 10 Dec 2003 15:16:27 -0800
To: "V. Rowley" <vrowley at ucsd.edu>

It looks like someone moved the profiles directory to profiles.orig.

-mjk

[root at rocks14 install]# ls -l
total 56
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:16 cdrom
drwxrwsr-x    5 root     wheel        4096 Dec 10 20:38 contrib.orig
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:07 ftp.rocksclusters.org
drwxr-sr-x    3 root     wheel        4096 Dec 10 20:38 ftp.rocksclusters.org.orig
-r-xrwsr-x    1 root     wheel       19254 Sep  3 12:40 kickstart.cgi
drwxr-xr-x    3 root     root         4096 Dec 10 20:38 profiles.orig
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:15 rocks-dist
drwxrwsr-x    3 root     wheel        4096 Dec 10 20:38 rocks-dist.orig
drwxr-sr-x    3 root     wheel        4096 Dec 10 21:02 src
drwxr-sr-x    4 root     wheel        4096 Dec 10 20:49 src.foo

On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:

> When I run this:
>
> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
> rocks-dist --dist=cdrom cdrom
>
> on a server installed with ROCKS 3.0.0, I eventually get this:
>
>> Cleaning distribution
>> Resolving versions (RPMs)
>> Resolving versions (SRPMs)
>> Adding support for rebuild distribution from source
>> Creating files (symbolic links - fast)
>> Creating symlinks to kickstart files
>> Fixing Comps Database
>> Generating hdlist (rpm database)
>> Patching second stage loader (eKV, partioning, ...)
>>  patching "rocks-ekv" into distribution ...
>>  patching "rocks-piece-pipe" into distribution ...
>>  patching "PyXML" into distribution ...
>>  patching "expat" into distribution ...
>>  patching "rocks-pylib" into distribution ...
>>  patching "MySQL-python" into distribution ...
>>  patching "rocks-kickstart" into distribution ...
>>  patching "rocks-kickstart-profiles" into distribution ...
>>  patching "rocks-kickstart-dtds" into distribution ...
>>  building CRAM filesystem ...
>> Cleaning distribution
>> Resolving versions (RPMs)
>> Resolving versions (SRPMs)
>> Creating symlinks to kickstart files
>> Generating hdlist (rpm database)
>> Segregating RPMs (rocks, non-rocks)
>> sh: ./kickstart.cgi: No such file or directory
>> sh: ./kickstart.cgi: No such file or directory
>> Traceback (innermost last):
>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>>     app.run()
>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>>     eval('self.command_%s()' % (command))
>>   File "<string>", line 0, in ?
>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>>     builder.build()
>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>>     (rocks, nonrocks) = self.segregateRPMS()
>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>>     for pkg in ks.getSection('packages'):
>> TypeError: loop over non-sequence
>
> Any ideas?
>
> --
> Vicky Rowley                              email: vrowley at ucsd.edu
> Biomedical Informatics Research Network   work: (858) 536-5980
> University of California, San Diego       fax: (858) 822-0828
> 9500 Gilman Drive
> La Jolla, CA 92093-0715
>
> See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

-- __--__--

Message: 10
Date: Wed, 10 Dec 2003 16:50:16 -0800
From: "V. Rowley" <vrowley at ucsd.edu>
To: "Mason J. Katz" <mjk at sdsc.edu>
CC: npaci-rocks-discussion at sdsc.edu
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro

Yep, I did that, but only *AFTER* getting the error. [Thought it was generated by the rocks-dist sequence, but apparently not.] Go ahead. Move it back. Same difference.

Vicky

Mason J. Katz wrote:
> It looks like someone moved the profiles directory to profiles.orig.
>
> -mjk
>
> [root at rocks14 install]# ls -l
> total 56
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:16 cdrom
> drwxrwsr-x    5 root     wheel        4096 Dec 10 20:38 contrib.orig
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:07 ftp.rocksclusters.org
> drwxr-sr-x    3 root     wheel        4096 Dec 10 20:38 ftp.rocksclusters.org.orig
> -r-xrwsr-x    1 root     wheel       19254 Sep  3 12:40 kickstart.cgi
> drwxr-xr-x    3 root     root         4096 Dec 10 20:38 profiles.orig
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:15 rocks-dist
> drwxrwsr-x    3 root     wheel        4096 Dec 10 20:38 rocks-dist.orig
> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:02 src
> drwxr-sr-x    4 root     wheel        4096 Dec 10 20:49 src.foo
> On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:
>
>> When I run this:
>>
>> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
>> rocks-dist --dist=cdrom cdrom
>>
>> on a server installed with ROCKS 3.0.0, I eventually get this:
>>
>>> Cleaning distribution
>>> Resolving versions (RPMs)
>>> Resolving versions (SRPMs)
>>> Adding support for rebuild distribution from source
>>> Creating files (symbolic links - fast)
>>> Creating symlinks to kickstart files
>>> Fixing Comps Database
>>> Generating hdlist (rpm database)
>>> Patching second stage loader (eKV, partioning, ...)
>>>  patching "rocks-ekv" into distribution ...
>>>  patching "rocks-piece-pipe" into distribution ...
>>>  patching "PyXML" into distribution ...
>>>  patching "expat" into distribution ...
>>>  patching "rocks-pylib" into distribution ...
>>>  patching "MySQL-python" into distribution ...
>>>  patching "rocks-kickstart" into distribution ...
>>>  patching "rocks-kickstart-profiles" into distribution ...
>>>  patching "rocks-kickstart-dtds" into distribution ...
>>>  building CRAM filesystem ...
>>> Cleaning distribution
>>> Resolving versions (RPMs)
>>> Resolving versions (SRPMs)
>>> Creating symlinks to kickstart files
>>> Generating hdlist (rpm database)
>>> Segregating RPMs (rocks, non-rocks)
>>> sh: ./kickstart.cgi: No such file or directory
>>> sh: ./kickstart.cgi: No such file or directory
>>> Traceback (innermost last):
>>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>>>     app.run()
>>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>>>     eval('self.command_%s()' % (command))
>>>   File "<string>", line 0, in ?
>>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>>>     builder.build()
>>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>>>     (rocks, nonrocks) = self.segregateRPMS()
>>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>>>     for pkg in ks.getSection('packages'):
>>> TypeError: loop over non-sequence
>>
>> Any ideas?
>>
>> --
>> Vicky Rowley                              email: vrowley at ucsd.edu
>> Biomedical Informatics Research Network   work: (858) 536-5980
>> University of California, San Diego       fax: (858) 822-0828
>> 9500 Gilman Drive
>> La Jolla, CA 92093-0715
>>
>> See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

--
Vicky Rowley                              email: vrowley at ucsd.edu
Biomedical Informatics Research Network   work: (858) 536-5980
University of California, San Diego       fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715

See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

-- __--__--

Message: 11
Date: Wed, 10 Dec 2003 17:23:25 -0800 (PST)
From: Tim Carlson <tim.carlson at pnl.gov>
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro
To: "V. Rowley" <vrowley at ucsd.edu>
Cc: "Mason J. Katz" <mjk at sdsc.edu>, npaci-rocks-discussion at sdsc.edu
Reply-to: Tim Carlson <tim.carlson at pnl.gov>

On Wed, 10 Dec 2003, V. Rowley wrote:

Did you remove python by chance? kickstart.cgi calls python directly in /usr/bin/python while rocks-dist does an "env python".

Tim

> Yep, I did that, but only *AFTER* getting the error. [Thought it was
> generated by the rocks-dist sequence, but apparently not.] Go ahead.
> Move it back. Same difference.
>
> Vicky
>
> Mason J. Katz wrote:
> > It looks like someone moved the profiles directory to profiles.orig.
> >
> > -mjk
> >
> > [root at rocks14 install]# ls -l
> > total 56
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 21:16 cdrom
> > drwxrwsr-x    5 root     wheel        4096 Dec 10 20:38 contrib.orig
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 21:07 ftp.rocksclusters.org
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 20:38 ftp.rocksclusters.org.orig
> > -r-xrwsr-x    1 root     wheel       19254 Sep  3 12:40 kickstart.cgi
> > drwxr-xr-x    3 root     root         4096 Dec 10 20:38 profiles.orig
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 21:15 rocks-dist
> > drwxrwsr-x    3 root     wheel        4096 Dec 10 20:38 rocks-dist.orig
> > drwxr-sr-x    3 root     wheel        4096 Dec 10 21:02 src
> > drwxr-sr-x    4 root     wheel        4096 Dec 10 20:49 src.foo
> > On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:
> >
> >> When I run this:
> >>
> >> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
> >> rocks-dist --dist=cdrom cdrom
> >>
> >> on a server installed with ROCKS 3.0.0, I eventually get this:
> >>
> >>> Cleaning distribution
> >>> Resolving versions (RPMs)
> >>> Resolving versions (SRPMs)
> >>> Adding support for rebuild distribution from source
> >>> Creating files (symbolic links - fast)
> >>> Creating symlinks to kickstart files
> >>> Fixing Comps Database
> >>> Generating hdlist (rpm database)
> >>> Patching second stage loader (eKV, partioning, ...)
> >>>  patching "rocks-ekv" into distribution ...
> >>>  patching "rocks-piece-pipe" into distribution ...
> >>>  patching "PyXML" into distribution ...
> >>>  patching "expat" into distribution ...
> >>>  patching "rocks-pylib" into distribution ...
> >>>  patching "MySQL-python" into distribution ...
> >>>  patching "rocks-kickstart" into distribution ...
> >>>  patching "rocks-kickstart-profiles" into distribution ...
> >>>  patching "rocks-kickstart-dtds" into distribution ...
> >>>  building CRAM filesystem ...
> >>> Cleaning distribution
> >>> Resolving versions (RPMs)
> >>> Resolving versions (SRPMs)
> >>> Creating symlinks to kickstart files
> >>> Generating hdlist (rpm database)
> >>> Segregating RPMs (rocks, non-rocks)
> >>> sh: ./kickstart.cgi: No such file or directory
> >>> sh: ./kickstart.cgi: No such file or directory
> >>> Traceback (innermost last):
> >>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
> >>>     app.run()
> >>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
> >>>     eval('self.command_%s()' % (command))
> >>>   File "<string>", line 0, in ?
> >>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
> >>>     builder.build()
> >>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
> >>>     (rocks, nonrocks) = self.segregateRPMS()
> >>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
> >>>     for pkg in ks.getSection('packages'):
> >>> TypeError: loop over non-sequence
> >>
> >> Any ideas?
> >>
> >> --
> >> Vicky Rowley                              email: vrowley at ucsd.edu
> >> Biomedical Informatics Research Network   work: (858) 536-5980
> >> University of California, San Diego       fax: (858) 822-0828
> >> 9500 Gilman Drive
> >> La Jolla, CA 92093-0715
> >>
> >> See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

-- __--__--

_______________________________________________
npaci-rocks-discussion mailing list
npaci-rocks-discussion at sdsc.edu
http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion

End of npaci-rocks-discussion Digest


--__--__--

Message: 2
Date: Wed, 10 Dec 2003 18:03:41 -0800
From: Terrence Martin <tmartin at physics.ucsd.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]Rocks 3.0.0

I am having a problem on install of rocks 3.0.0 on my new cluster.

The python error occurs right after anaconda starts and just before the install asks for the roll CDROM.

The error refers to an inability to find or load rocks.file. I think the error is associated with the window that pops up and asks you to put the roll CDROM in.

The process I followed to get to this point is

Put the Rocks 3.0.0 CDROM into the CDROM drive
Boot the system
At the prompt, type "frontend"
Wait till anaconda starts
Error referring to being unable to load rocks.file

I have successfully installed rocks on a smaller cluster but that has different hardware. I used the same CDROM for both installs.

Any thoughts?

Terrence

--__--__--


Message: 3
Date: Wed, 10 Dec 2003 19:52:49 -0800
From: "V. Rowley" <vrowley at ucsd.edu>
To: npaci-rocks-discussion at sdsc.edu
Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when trying to build CD distro

Looks like python is okay:

> [root at rocks14 birn-oracle1]# which python
> /usr/bin/python
> [root at rocks14 birn-oracle1]# python --help
> Unknown option: --
> usage: python [option] ... [-c cmd | file | -] [arg] ...
> Options and arguments (and corresponding environment variables):
> -d     : debug output from parser (also PYTHONDEBUG=x)
> -i     : inspect interactively after running script, (also PYTHONINSPECT=x)
>          and force prompts, even if stdin does not appear to be a terminal
> -O     : optimize generated bytecode (a tad; also PYTHONOPTIMIZE=x)
> -OO    : remove doc-strings in addition to the -O optimizations
> -S     : don't imply 'import site' on initialization
> -t     : issue warnings about inconsistent tab usage (-tt: issue errors)
> -u     : unbuffered binary stdout and stderr (also PYTHONUNBUFFERED=x)
> -v     : verbose (trace import statements) (also PYTHONVERBOSE=x)
> -x     : skip first line of source, allowing use of non-Unix forms of #!cmd
> -X     : disable class based built-in exceptions
> -c cmd : program passed in as string (terminates option list)
> file   : program read from script file
> -      : program read from stdin (default; interactive mode if a tty)
> arg ...: arguments passed to program in sys.argv[1:]
> Other environment variables:
> PYTHONSTARTUP: file executed on interactive startup (no default)
> PYTHONPATH   : ':'-separated list of directories prefixed to the
>                default module search path. The result is sys.path.
> PYTHONHOME   : alternate <prefix> directory (or <prefix>:<exec_prefix>).
>                The default module search path uses <prefix>/python1.5.
> [root at rocks14 birn-oracle1]#

Tim Carlson wrote:
> On Wed, 10 Dec 2003, V. Rowley wrote:
>
> Did you remove python by chance? kickstart.cgi calls python directly in
> /usr/bin/python while rocks-dist does an "env python"
>
> Tim
>
>> Yep, I did that, but only *AFTER* getting the error. [Thought it was
>> generated by the rocks-dist sequence, but apparently not.] Go ahead.


>> Move it back. Same difference.
>>
>> Vicky
>>
>> Mason J. Katz wrote:
>>> It looks like someone moved the profiles directory to profiles.orig.
>>>
>>> -mjk
>>>
>>> [root at rocks14 install]# ls -l
>>> total 56
>>> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:16 cdrom
>>> drwxrwsr-x    5 root     wheel        4096 Dec 10 20:38 contrib.orig
>>> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:07 ftp.rocksclusters.org
>>> drwxr-sr-x    3 root     wheel        4096 Dec 10 20:38 ftp.rocksclusters.org.orig
>>> -r-xrwsr-x    1 root     wheel       19254 Sep  3 12:40 kickstart.cgi
>>> drwxr-xr-x    3 root     root         4096 Dec 10 20:38 profiles.orig
>>> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:15 rocks-dist
>>> drwxrwsr-x    3 root     wheel        4096 Dec 10 20:38 rocks-dist.orig
>>> drwxr-sr-x    3 root     wheel        4096 Dec 10 21:02 src
>>> drwxr-sr-x    4 root     wheel        4096 Dec 10 20:49 src.foo
>>> On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:
>>>
>>>> When I run this:
>>>>
>>>> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;
>>>> rocks-dist --dist=cdrom cdrom
>>>>
>>>> on a server installed with ROCKS 3.0.0, I eventually get this:
>>>>
>>>>> Cleaning distribution
>>>>> Resolving versions (RPMs)
>>>>> Resolving versions (SRPMs)
>>>>> Adding support for rebuild distribution from source
>>>>> Creating files (symbolic links - fast)
>>>>> Creating symlinks to kickstart files
>>>>> Fixing Comps Database
>>>>> Generating hdlist (rpm database)
>>>>> Patching second stage loader (eKV, partioning, ...)
>>>>>  patching "rocks-ekv" into distribution ...
>>>>>  patching "rocks-piece-pipe" into distribution ...
>>>>>  patching "PyXML" into distribution ...
>>>>>  patching "expat" into distribution ...
>>>>>  patching "rocks-pylib" into distribution ...
>>>>>  patching "MySQL-python" into distribution ...
>>>>>  patching "rocks-kickstart" into distribution ...
>>>>>  patching "rocks-kickstart-profiles" into distribution ...
>>>>>  patching "rocks-kickstart-dtds" into distribution ...
>>>>>  building CRAM filesystem ...
>>>>> Cleaning distribution
>>>>> Resolving versions (RPMs)
>>>>> Resolving versions (SRPMs)
>>>>> Creating symlinks to kickstart files
>>>>> Generating hdlist (rpm database)
>>>>> Segregating RPMs (rocks, non-rocks)
>>>>> sh: ./kickstart.cgi: No such file or directory
>>>>> sh: ./kickstart.cgi: No such file or directory
>>>>> Traceback (innermost last):
>>>>>   File "/opt/rocks/bin/rocks-dist", line 807, in ?
>>>>>     app.run()
>>>>>   File "/opt/rocks/bin/rocks-dist", line 623, in run
>>>>>     eval('self.command_%s()' % (command))
>>>>>   File "<string>", line 0, in ?
>>>>>   File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom
>>>>>     builder.build()
>>>>>   File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build
>>>>>     (rocks, nonrocks) = self.segregateRPMS()
>>>>>   File "/opt/rocks/lib/python/rocks/build.py", line 1107, in segregateRPMS
>>>>>     for pkg in ks.getSection('packages'):
>>>>> TypeError: loop over non-sequence
>>>>
>>>> Any ideas?
>>>>
>>>> --
>>>> Vicky Rowley                              email: vrowley at ucsd.edu
>>>> Biomedical Informatics Research Network   work: (858) 536-5980
>>>> University of California, San Diego       fax: (858) 822-0828
>>>> 9500 Gilman Drive
>>>> La Jolla, CA 92093-0715
>>>>
>>>> See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

--
Vicky Rowley                              email: vrowley at ucsd.edu
Biomedical Informatics Research Network   work: (858) 536-5980
University of California, San Diego       fax: (858) 822-0828
9500 Gilman Drive
La Jolla, CA 92093-0715


See pictures from our trip to China at http://www.sagacitech.com/Chinaweb

--__--__--

_______________________________________________
npaci-rocks-discussion mailing list
npaci-rocks-discussion at sdsc.edu
http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion

End of npaci-rocks-discussion Digest


From wyzhong78 at msn.com  Thu Dec 11 07:27:39 2003
From: wyzhong78 at msn.com (zhong wenyu)
Date: Thu, 11 Dec 2003 23:27:39 +0800
Subject: [Rocks-Discuss]3.0.0 problem:Does my namd job allocate to each node?
Message-ID: <[email protected]>

I have built a rocks cluster with four dual-Xeon computers to run namd: one frontend and the other three as compute nodes. With intel's hyperthreading technology I have 16 cpus in all. Now I have some troubles; maybe someone can help me. I created the pbs script below, named mytask:

#!/bin/csh
#PBS -N NAMD
#PBS -m be
#PBS -l ncpus=8
#PBS -l nodes=2
#
cd $PBS_O_WORKDIR/
charmrun namd2 +p8 mytask.namd

i typed:
qsub mytask
qrun N

then i use qstat -f N

the message feedback showed (i'm sorry, i can't copy the original message, just the meaning):

host: compute-0-0/0+compute-0-0/1+compute-0-1/0+compute-0-1/1
cpu used: 8

it's strange why 4 hosts and 8 cpu used?


but when i looked at ganglia for the cluster status, it showed me only one node used (for example, compute-0-0); both of the other two were idle. i want to know whether the job was being done by one node or two. so i created a new task assigned specifically to compute-0-1, and the message feedback showed no resource available. when the task ended, i checked the information and found that the cpu time per step was half that of 4 cpus (1 node), but the whole time (including wall time) was equal. Does my namd job get allocated to each node? please help me! thanks


From bruno at rocksclusters.org  Thu Dec 11 07:55:17 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Thu, 11 Dec 2003 07:55:17 -0800
Subject: [Rocks-Discuss]ATLAS rpm build problems on PII platform
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

outstanding -- thanks for the patch!

i just committed the change to cvs. the fix will be reflected in the upcoming release (or immediately for anyone who has the rocks source tree checked out on their local frontend).

- gb

On Dec 10, 2003, at 10:43 PM, Vincent Fox wrote:

> Okay, here's the context diff as plain text. I test-applied it using
> "patch -p0 < atlas.patch" and did a compile on my PII box
> successfully. I can send it as attachment or submit to CVS or some
> other way if you need:
>
> *** atlas.spec.in.orig  Thu Dec 11 06:27:13 2003
> --- atlas.spec.in       Thu Dec 11 06:30:46 2003
> ***************
> *** 111,117 ****
> --- 111,133 ----
>   y
>   " | make
> + elif [ $CPUID -eq 4 ]
> + then
> + #
> + # Pentium II
> + #
> + echo "0
> + y
> + y
> + n
> + y
> + linux


> + 0
> + /usr/bin/g77
> + -O
> + y
> + " | make
>   else
>   #
>
> Greg Bruno <bruno at rocksclusters.org> wrote:
> > Okay, came up my own quick hack:
> >
> > Edit atlas.spec.in, go to "other x86" section, remove
> > 2 lines right above "linux", seems to make rpm now.
> >
> > A more formal patch would be put in a section for
> > cpuid eq 4 with this correction I suppose.
>
> if you provide the patch, we'll include it in our next release.
>
> - gb
>
> Do you Yahoo!?
> New Yahoo! Photos - easier uploading and sharing

From phil at sdsc.edu  Thu Dec 11 08:00:06 2003
From: phil at sdsc.edu (Philip Papadopoulos)
Date: Thu, 11 Dec 2003 12:00:06 -0400
Subject: [Rocks-Discuss]3.0.0 problem:Does my namd job allocate to each node?
Message-ID: <1920451470-1071158479-cardhu_blackberry.rim.net-21416-@engine05>

The important thing to understand is that pbs only gives an allocation of nodes (listed in the PBS_NODES environment variable) when the job is run. It is the user's responsibility to actually start the code on multiple nodes. This is the way pbs works on all platforms, not just rocks.

Pbs will start the submitted code (usually a script) on the first node listed in PBS_NODES. This environment variable is only available once the queued job is running. Your mytask script must explicitly start on the allocated nodes.

Pbs (actually maui) will pack jobs onto nodes by default, so allocating 8-cpu jobs to four nodes is normal, but changeable.

-p
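As a sketch of the launch step Philip describes, the mytask script could build a Charm++ nodelist from the allocation before calling charmrun. This assumes the standard PBS allocation file (usually exported as PBS_NODEFILE, one hostname per allocated CPU slot; the message above calls it PBS_NODES) and the charmrun ++nodelist file format; the function name and file paths are invented for illustration and untested on a real cluster.

```shell
# Build a charmrun nodelist from a PBS allocation file and print the
# number of allocated CPU slots (the value to pass as +pN).
make_nodelist() {
    nodefile=$1    # e.g. "$PBS_NODEFILE": one hostname per CPU slot
    out=$2         # nodelist file to hand to charmrun ++nodelist
    echo "group main" > "$out"
    # List each host once, preserving order of first appearance.
    awk '!seen[$1]++ {print "host " $1}' "$nodefile" >> "$out"
    wc -l < "$nodefile" | tr -d ' '
}
```

The job script would then run something like `NP=$(make_nodelist "$PBS_NODEFILE" nodelist)` followed by `charmrun namd2 ++nodelist nodelist +p"$NP" mytask.namd` (again, a guess, not a verified recipe).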

-----Original Message-----
From: "zhong wenyu" <wyzhong78 at msn.com>
Date: Thu, 11 Dec 2003 23:27:39
To: npaci-rocks-discussion at sdsc.edu
Subject: [Rocks-Discuss]3.0.0 problem:Does my namd job allocate to each node?

I have built a rocks cluster with four dual-Xeon computers to run namd: one frontend and the other three as compute nodes. With intel's hyperthreading technology I have 16 cpus in all. Now I have some troubles; maybe someone can help me. I created the pbs script below, named mytask:

#!/bin/csh
#PBS -N NAMD
#PBS -m be
#PBS -l ncpus=8
#PBS -l nodes=2
#
cd $PBS_O_WORKDIR/
charmrun namd2 +p8 mytask.namd

i typed:
qsub mytask
qrun N

then i use qstat -f N

the message feedback showed (i'm sorry, i can't copy the original message, just the meaning):

host: compute-0-0/0+compute-0-0/1+compute-0-1/0+compute-0-1/1
cpu used: 8

it's strange why 4 hosts and 8 cpus used? but when i looked at ganglia for the cluster status, it showed me only one node used (for example, compute-0-0); both of the other two were idle. i want to know whether the job was being done by one node or two. so i created a new task assigned specifically to compute-0-1, and the message feedback showed no resource available. when the task ended, i checked the information and found that the cpu time per step was half that of 4 cpus (1 node), but the whole time (including wall time) was equal. Does my namd job get allocated to each node? please help me! thanks


Sent via BlackBerry - a service from AT&T Wireless.

From jlkaiser at fnal.gov Thu Dec 11 08:28:08 2003
From: jlkaiser at fnal.gov (Joe Kaiser)
Date: Thu, 11 Dec 2003 10:28:08 -0600
Subject: [Rocks-Discuss]a name for pain ... modules/kernels/ethernets ...
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi,

I'm sorry, I thought I sent email to the list reporting how I did this.

You have not said what motherboard you are using or what the error exactly is. The instructions below are for the X5DPA-GG and the error isn't reported as an error, I just get prompted to insert my driver.

If it is the X5DPA-GG then 3.0.0 will support the e1000 but you have to make a change to the pcitable on the initrd.img. The current pcitable on the initrd.img does NOT have the proper deviceId for the e1000 for this board. If you look in /etc/sysconfig/hwconf and search for the e1000, you will find this:

class: NETWORK
bus: PCI
detached: 0
device: eth
driver: e1000
desc: "Unknown vendor|Generic e1000 device"
vendorId: 8086
deviceId: 1013
subVendorId: 8086
subDeviceId: 1213
pciType: 1

The device ID is 1013. If you look in the pcitable that comes off of the initrd.img you will see that the highest the e1000 device ids go is 1012. Just add in the proper line to the initrd.img in your /tftpboot directory and it should work. Instructions are below.

Here are the instructions:

This should be done on the frontend:

cd /tftpboot/X86PC/UNDI/pxelinux/
cp initrd.img initrd.img.orig
cp initrd.img /tmp
cd /tmp
mv initrd.img initrd.gz
gunzip initrd.gz
mkdir /mnt/loop
mount -o loop initrd /mnt/loop
cd /mnt/loop/modules/
vi pcitable

Search for the e1000 drivers and add the following line:

0x8086  0x1013  "e1000" "Intel Corp.|82546EB Gigabit Ethernet Controller"

write the file

cd /tmp
umount /mnt/loop
gzip initrd
mv initrd.gz initrd.img
mv initrd.img /tftpboot/X86PC/UNDI/pxelinux/

Then boot the node.
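The append itself can be rehearsed on a scratch copy before touching the real file (a sketch we added for illustration; the real pcitable lives at /mnt/loop/modules/pcitable once the initrd is loop-mounted, and the "existing entry" line below is just a stand-in for the stock table contents):

```shell
#!/bin/sh
# Dry run of the pcitable edit on a scratch copy in /tmp.
PT=/tmp/pcitable.demo.$$
# Stand-in for the stock table, whose e1000 entries stop at 0x1012:
printf '0x8086\t0x1012\t"e1000"\t"existing entry"\n' > "$PT"
# The line the instructions above say to add:
printf '0x8086\t0x1013\t"e1000"\t"Intel Corp.|82546EB Gigabit Ethernet Controller"\n' >> "$PT"
# Confirm the new device id is now present:
grep -c '0x1013' "$PT"
```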

Hope this helps.

Thanks,

Joe

On Tue, 2003-12-09 at 15:59, Joe Landman wrote:
> Folks:
> 
>   As indicated previously, I am wrestling with a Supermicro based

> cluster. None of the RH distributions come with the correct E1000
> driver, so a new kernel is needed (in the boot CD, and for
> installation).
> 
> The problem I am running into is that it isn't at all obvious/easy how
> to install a new kernel/modules into ROCKS (3.0 or otherwise) to enable
> this thing to work. Following the examples in the documentation have
> not met with success. Running "rocks-dist cdrom" with the new kernels
> (2.4.23 works nicely on the nodes) in the force/RPMS directory generates
> a bootable CD with the original 2.4.18BOOT kernel.
> 
> What I (and I think others) need, is a simple/easy to follow method
> that will generate a bootable CD with the correct linux kernel, and the
> correct modules.
> 
> Is this in process somewhere? What would be tremendously helpful is
> if we can generate a binary module, and put that into the boot process
> by placing it into the force/modules/binary directory (assuming one
> exists) with the appropriate entry of a similar name in the
> force/modules/meta directory as a simple XML document giving pci-ids,
> description, name, etc.
> 
> Anything close to this coming? Modules are killing future ROCKS
> installs, the inability to easily inject a new module in there has
> created a problem whereby ROCKS does not function (as the underlying RH
> does not function).

-- 
===================================================================
Joe Kaiser - Systems Administrator

Fermi Lab CD/OSS-SCS          Never laugh at live dragons.
630-840-6444
jlkaiser at fnal.gov
===================================================================

From jghobrial at uh.edu Thu Dec 11 08:41:42 2003
From: jghobrial at uh.edu (Joseph)
Date: Thu, 11 Dec 2003 10:41:42 -0600 (CST)
Subject: [Rocks-Discuss]Re: Rocks Pythone Error with rocks.file
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

On Thu, 11 Dec 2003, Terrence Martin wrote:

> I am having the exact same error that you reported to the list on my
> cluster when I try to install rocks 3.0.0.
> 
> X tries to start, fails, then just before the HPC roll is supposed to
> start I get the python error about not being able to load the rocks.file.
> 
> The thing is that my system is a dual Xeon supermicro not AMD, so it
> must not be an AMD specific issue.
> 
> Did you ever find a resolution to the problem?
> 
> Thanks,
> 
> Terrence

Yes, I guess you should check your memory as Greg suggests, but my solution was to install the frontend on a different machine and then take the HD back to the original frontend. The only problem I had was that the build box was a single-processor setup, so when I went back to the dual-AMD box, pvfs failed because it was built against a non-SMP kernel. I installed the SMP kernel and noticed this problem.

It seems the problem may be related to an SMP issue, due to the fact that we both have SMP setups. I did not check the frontend's memory, so this may still be a factor, but I have had no trouble with the box since the installation.

My initial problem was a booting problem on the frontend due to a cdrom issue. All my other attempts at installing failed with the error you mentioned, but as I posted earlier, I tried 3 different AMD single-processor boxes and they all failed. The boxes are up all the time and stressed pretty hard, so I don't believe it is a memory issue.

This is some very strange behaviour.

Thanks,
Joseph

From shewa at inel.gov Thu Dec 11 10:02:59 2003
From: shewa at inel.gov (Andrew Shewmaker)
Date: Thu, 11 Dec 2003 11:02:59 -0700
Subject: [Rocks-Discuss]ssh_known_hosts and ganglia
Message-ID: <[email protected]>

"Mason J. Katz" <mjk at sdsc.edu> wrote:

> We've also moved from this method to a single cluster-wide ssh key for > Rocks 3.1.

How does a single key work? I have successfully set up ssh hostbased authentication for some non-Rocks systems using

http://www.omega.telia.net/vici/openssh/

(Note that OpenSSH_3.7.1p2 requires one more setting in addition
to those mentioned in the above url.

In <dir-of-ssh-conf-files>/ssh_config:
EnableSSHKeysign yes)

But I thought it still requires that each host in the cluster has a key... am I wrong? Do you do it differently?

Thanks,


Andrew

-- 
Andrew Shewmaker, Associate Engineer
Phone: 1-208-526-1415
Idaho National Eng. and Environmental Lab.
P.O. Box 1625, M.S. 3605
Idaho Falls, Idaho 83415-3605

From tmartin at physics.ucsd.edu Thu Dec 11 11:13:16 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Thu, 11 Dec 2003 11:13:16 -0800
Subject: [Rocks-Discuss]a name for pain ... modules/kernels/ethernets ...
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Hi Joe,

Do you know if 2.3.2 can also benefit from the same small change?

Terrence

Joe Kaiser wrote:
> Hi,
> 
> I'm sorry, I thought I sent email to the list reporting how I did this.
> 
> You have not said what motherboard you are using or what the error
> exactly is. The instructions below are for the X5DPA-GG and the error
> isn't reported as an error, I just get prompted to insert my driver.

From tmartin at physics.ucsd.edu Thu Dec 11 11:19:55 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Thu, 11 Dec 2003 11:19:55 -0800
Subject: [Rocks-Discuss]Re: Rocks Pythone Error with rocks.file
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

I am fairly certain it is not the memory even without memtest86. I have in my office the same Supermicro 613A-Xi (SB-613A-Xi-B) with a SUPER X5DPA-GG motherboard as the ones at the SDSC but it is from a different vendor and completely different ram from another manufacturer.

When I put rocks 3.0.0 on it I get the crash of the installer in the same spot: right after the system attempts to start X windows and fails (either because X just fails to start, or because a mouse is not present), a python error comes up complaining that the rocks.file could not be found.

On the exact same system rocks 2.3.2 installs fine.

Terrence

Joseph wrote:
> Yes, I guess you should check your memory as Greg suggests, but my
> solution was to install the frontend on a different machine and then
> take the HD back to the original frontend.

From landman at scalableinformatics.com Thu Dec 11 11:42:14 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 11 Dec 2003 14:42:14 -0500
Subject: [Rocks-Discuss]a name for pain ... modules/kernels/ethernets ...
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Hi Terrence and Joe:

These are indeed X5DPA-GG. I am working on a device driver disk for 3.0 ROCKS. If this works, it is a weak hack, but it might be fine. More later (testing it now as we speak)..

Joe

On Thu, 2003-12-11 at 14:13, Terrence Martin wrote:
> Hi Joe,
> 
> Do you know if 2.3.2 can also benefit from the same small change?
> 
> Terrence

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

From jlkaiser at fnal.gov Thu Dec 11 11:33:03 2003
From: jlkaiser at fnal.gov (Joe Kaiser)
Date: Thu, 11 Dec 2003 13:33:03 -0600
Subject: [Rocks-Discuss]a name for pain ... modules/kernels/ethernets ...
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>


I am not sure. Presumably, yes....

On Thu, 2003-12-11 at 13:13, Terrence Martin wrote:
> Hi Joe,
> 
> Do you know if 2.3.2 can also benefit from the same small change?
> 
> Terrence

-- 
===================================================================
Joe Kaiser - Systems Administrator

Fermi Lab CD/OSS-SCS          Never laugh at live dragons.
630-840-6444
jlkaiser at fnal.gov
===================================================================

From landman at scalableinformatics.com Thu Dec 11 11:51:51 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 11 Dec 2003 14:51:51 -0500
Subject: [Rocks-Discuss]driver disk for e1000 for rocks 3.0.0
Message-ID: <[email protected]>

Folks:

I have built a slightly modified RedHat 7.3 driver disk with the updated 5.2.22 e1000 driver. I verified that this does indeed work on my systems (during the initial portion of the ROCKS install, I can now insmod e1000 in the shell window and see the ethernet... this is a big change from before). If you want the driver disk, grab it from http://scalableinformatics.com/downloads/newdrv.img . To use it while installing a front end, type

frontend dd

at the boot prompt (not just frontend). I believe it should work for the compute nodes as well (i will test it soon). Now it is time to work around the rest of the Supermicro "features".

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

From dtwright at uiuc.edu Thu Dec 11 12:32:54 2003
From: dtwright at uiuc.edu (Dan Wright)
Date: Thu, 11 Dec 2003 14:32:54 -0600
Subject: [Rocks-Discuss]3.0.0 problem:Does my namd job allocate to each node?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

NAMD2 needs some more information to be started on multiple nodes like that. You need to give it a nodelist, in particular, so it knows where to run itself. We run namd2 on several clusters here (UIUC chemistry department).

Below is a script used to exec namd2 with the right options, etc, on a cluster. Below that is a script that automates the PBS job submission. Hope this helps!

- Dan Wright
(dtwright at uiuc.edu)
(http://www.scs.uiuc.edu/)
(UNIX Systems Administrator, School of Chemical Sciences)
(333-1728)


-- namd2.csh --

#!/bin/csh
# Script to run NAMD2 on the cluster automatically.
# Courtesy of Jim Phillips.

setenv CONV_RSH ssh
setenv TMPDIR /tmp
setenv BINDIR /home/NAMD

if ( $?PBS_JOBID ) then
    if ( $?PBS_NODEFILE ) then
        set nodes = `cat $PBS_NODEFILE`
    else
        set nodes = localhost
    endif
    set nodefile = $TMPDIR/namd2.nodelist.$PBS_JOBID
    echo group main >! $nodefile
    foreach node ( $nodes )
        echo host $node >> $nodefile
    end
    $BINDIR/charmrun $BINDIR/namd2 +p$#nodes ++nodelist $nodefile $*
else
    $BINDIR/charmrun $BINDIR/namd2 ++local $*
endif

-------------

Here's an example script using this to start namd2 on 8 uniprocessor nodes; you'd just run it as "namd2-8p <jobfile>" to automatically do the PBS job submission and everything.

-- namd2-8p --

#!/bin/bash
# This script runs namd2 on 8 nodes.
#

echo
echo "Please remember to specify the FULL PATH to your namd2 job file."
echo "If you haven't done that, please press ctrl-c now and re-run"
echo "this command with the full path."
echo
sleep 10

export SCRIPTFILE=/tmp/namd2-script.$USER.`date "+%s"`
export NAMD_SCRIPT=/usr/local/bin/namd2.csh

NAMD_CMD="$NAMD_SCRIPT $* > $HOME/namd2.out.`date '+%d%b%Y-%H:%M:%S'` 2>&1"

cat >$SCRIPTFILE <<EOF
#!/bin/bash
#PBS -l nodes=8

EOF
echo $NAMD_CMD >> $SCRIPTFILE
echo "exit" >> $SCRIPTFILE
/usr/apps/pbs/bin/qsub -V $SCRIPTFILE


sleep 5

rm -f $SCRIPTFILE

--------------

zhong wenyu said:
> Does my namd job allocate to each node?
> please help me!

- Dan Wright
(dtwright at uiuc.edu)
(http://www.uiuc.edu/~dtwright)

-] ------------------------------ [-] -------------------------------- [-
``Weave a circle round him thrice, / And close your eyes with holy dread,
For he on honeydew hath fed, / and drunk the milk of Paradise.''
    Samuel Taylor Coleridge, Kubla Khan


-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031211/417e39b4/attachment-0001.bin

From mjk at sdsc.edu Thu Dec 11 13:16:45 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Thu, 11 Dec 2003 13:16:45 -0800
Subject: [Rocks-Discuss]ssh_known_hosts and ganglia
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Download 3.1 (out very soon now) and poke around. Basically there is a single SSH host key, and all the nodes have a copy. This kills the "man in the middle" warning every time you reinstall.
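As an illustration of why one shared key suffices (this is our sketch of the idea, not the actual Rocks 3.1 implementation; the key string and node names below are fake placeholders), a single ssh_known_hosts line can then cover every node:

```shell
#!/bin/sh
# When all nodes present the same host key, the client side needs only
# one known_hosts entry listing every hostname against that one key.
PUBKEY='ssh-rsa AAAAB3FAKEKEYFORDEMO root@cluster'   # fake placeholder key
NODES='frontend compute-0-0 compute-0-1 compute-0-2'  # placeholder names
HOSTLIST=`echo "$NODES" | tr ' ' ','`
KH=/tmp/ssh_known_hosts.demo
echo "$HOSTLIST $PUBKEY" > "$KH"
cat "$KH"
```

Reinstalling a node then changes nothing the client can see, which is why the man-in-the-middle warning goes away.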

-mjk

On Dec 11, 2003, at 10:02 AM, Andrew Shewmaker wrote:

> How does a single key work? I have successfully set up ssh host
> based authentication for some non-Rocks systems using
> 
> http://www.omega.telia.net/vici/openssh/
> 
> But I thought it still requires that each host has a key...
> am I wrong? Do you do it differently?
> 
> Andrew

From landman at scalableinformatics.com Thu Dec 11 13:36:44 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 11 Dec 2003 16:36:44 -0500
Subject: [Rocks-Discuss]ssh_known_hosts and ganglia
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Hi Mason:

ETA? I have a non-functional cluster I think I can make function with 3.1. I would be happy to be a real-world beta/gamma tester for it (immediately, e.g. today). Please send me a URL. ...

Joe

On Thu, 2003-12-11 at 16:16, Mason J. Katz wrote:
> Download 3.1 (out very soon now) and poke around. Basically there is a
> single SSH host key, and all the nodes have a copy. This kills the
> "man in the middle" warning every time you reinstall.
> 
> -mjk

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

From mjk at sdsc.edu Thu Dec 11 13:34:30 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Thu, 11 Dec 2003 13:34:30 -0800
Subject: [Rocks-Discuss]ssh_known_hosts and ganglia
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

We're too close to send out more betas right now, but if something bad happens before Friday we'll reconsider. We are shooting for next week - but absolutely before the holidays. ho ho ho. We recognize that our delay in getting a current release out there is hurting new clusters, and just having the latest redhat kernel is going to fix most of these issues.

-mjk

On Dec 11, 2003, at 1:36 PM, Joe Landman wrote:

> Hi Mason:
> 
>   Eta? I have a non-functional cluster I think I can make function with
> 3.1. I would be happy to be a real world beta/gamma tester for it
> (immediately, eg. today). Please send me a URL. ...
> 
> Joe

From purikk at hotmail.com Thu Dec 11 15:06:17 2003
From: purikk at hotmail.com (Purushotham Komaravolu)
Date: Thu, 11 Dec 2003 18:06:17 -0500
Subject: [Rocks-Discuss]Kernal of Rocks 3.0
References: <[email protected]>
Message-ID: <[email protected]>

Hi, I am a newbie to Rocks and have a few questions. I would appreciate help
with those.

1) What kernel does the latest Rocks use? If it's not the latest, can I use
the latest kernel, and how?
2) Is there any way to have more than one frontend node for failover
redundancy?
3) Did anybody install the penguin compilers over the cluster?

Thanks,
Regards,
Puru

From bruno at rocksclusters.org Thu Dec 11 15:42:27 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Thu, 11 Dec 2003 15:42:27 -0800
Subject: [Rocks-Discuss]Kernal of Rocks 3.0
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> 1) what kernel does latest rocks use, if its not latest can I use latest
> kernal and how?


our upcoming release (scheduled to release next week) has kernel version 2.4.21. additionally, the new release includes documentation on how to build your own kernel RPM from a kernel.org tarball.
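The kernel-rebuild procedure gb alludes to is documented in the 3.1 release itself; as a rough, untested outline it might look like the following. Version numbers and paths are placeholders, and the `make rpm` target is a 2.4-era kbuild feature, so fall back to `rpmbuild` with a spec file if your tree lacks it.

```shell
# Untested outline: roll a kernel RPM from a kernel.org tarball.
cd /usr/src
wget http://www.kernel.org/pub/linux/kernel/v2.4/linux-2.4.21.tar.bz2
tar xjf linux-2.4.21.tar.bz2
cd linux-2.4.21
cp /boot/config-$(uname -r) .config   # start from the running kernel's config
make oldconfig
make dep
make rpm        # builds a binary kernel RPM via scripts/mkspec
# The RPM lands under /usr/src/redhat/RPMS/<arch>/; copy it into the
# local distribution and rerun rocks-dist so compute nodes pick it up.
```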

> 2) is there any way to have more than 1 fronend nodes for failover
> redundancy?

no, that has not yet been implemented.

> 3) did anybody install penguin compilers over the cluster

i apologize, but i'm not familiar with the penguin compiler. we do have experience with gnu compilers, intel compilers and the portland group compilers. additionally, some folks in the rocks community have also successfully deployed the lahey compiler.

- gb

From oconnor at ucsd.edu Thu Dec 11 14:29:46 2003
From: oconnor at ucsd.edu (Edward O'Connor)
Date: Thu, 11 Dec 2003 14:29:46 -0800
Subject: [Rocks-Discuss]ia64 compute nodes with ia32 frontends?
In-Reply-To: <[email protected]> (Edward O'Connor's message of "Fri, 22 Aug 2003 15:39:05 -0700")
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Hi everybody,

I'm trying to bring up some ia64 compute nodes in a cluster with an ia32
frontend. Normally, `cd /home/install; rocks-dist mirror dist` only sets up
the frontend to handle ia32 compute nodes. I tried to manhandle
`rocks-dist mirror` into mirroring the ia64 stuff from ftp.rocksclusters.org
by giving it the --arch=ia64 option, but that didn't work, so I went ahead
and did the mirroring step by hand.

After having done so, `rocks-dist dist` still doesn't do the right thing.
So, adding --arch=ia64 to that command yields this error output:

,----
| # rocks-dist --arch=ia64 dist
| Cleaning distribution
| Resolving versions (RPMs)
| Resolving versions (SRPMs)
| Adding support for rebuild distribution from source
| Creating files (symbolic links - fast)
| Creating symlinks to kickstart files
| Fixing Comps Database
| error - comps file is missing, skipping this step
| Generating hdlist (rpm database)
| error - could not find rpm anaconda-runtime
| error - could not find genhdlist
| Patching second stage loader (eKV, partioning, ...)
| error - could not find second stage, skipping this step
`----


So my question is, what do I need to do to the ia32 frontend to enable it
to kickstart an ia64 compute node? Thanks.
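For the record, the by-hand mirroring step can be approximated as below. This is a sketch under assumptions, not a confirmed fix: the URL layout is guessed from the paths quoted above, and the error output suggests the real blocker is that the ia32 frontend lacks the ia64 anaconda-runtime/comps pieces that `rocks-dist dist` needs to assemble a kickstartable tree.

```shell
# Untested sketch: pull the ia64 tree down next to the existing ia32 one.
cd /home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0
wget --mirror --no-parent --no-host-directories --cut-dirs=3 \
    http://ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/ia64/

# Then ask rocks-dist for an ia64 distribution explicitly:
cd /home/install
rocks-dist --arch=ia64 dist
```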

Ted

--
Edward O'Connor
oconnor at ucsd.edu

From gotero at linuxprophet.com Thu Dec 11 21:14:33 2003
From: gotero at linuxprophet.com (Glen Otero)
Date: Thu, 11 Dec 2003 21:14:33 -0800
Subject: Fwd: [Rocks-Discuss]RE: Have anyone successfully build a set of grid compute nodes using Rocks?
Message-ID: <[email protected]>

> We put two Itanium clusters and an x86 cluster together on a grid at
> SC2003 using Rocks 3.1 beta and the Grid Roll. Simple CA is installed
> on the cluster frontends for you, so all one has to do is create and
> exchange certificates and update the grid-mapfiles. This grid was a
> joint collaboration between SDSC, Promicro Systems and Callident.
>
> On Dec 11, 2003, at 12:08 AM, Nai Hong Hwa Francis wrote:
>
>> Hi,
>>
>> Have anyone successfully build a set of grid compute nodes using Rocks
>> 3?
>> Anyone care to share?
>>
>> Nai Hong Hwa Francis
>> Institute of Molecular and Cell Biology (A*STAR)
>> 30 Medical Drive
>> Singapore 117609.
>> DID: (65) 6874-6196
>>
>> -----Original Message-----
>> From: npaci-rocks-discussion-request at sdsc.edu
>> [mailto:npaci-rocks-discussion-request at sdsc.edu]
>> Sent: Thursday, December 11, 2003 11:54 AM
>> To: npaci-rocks-discussion at sdsc.edu
>> Subject: npaci-rocks-discussion digest, Vol 1 #642 - 4 msgs
>>
>> Send npaci-rocks-discussion mailing list submissions to
>> npaci-rocks-discussion at sdsc.edu
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion
>> or, via email, send a message with subject or body 'help' to


>> npaci-rocks-discussion-request at sdsc.edu>>>> You can reach the person managing the list at>> npaci-rocks-discussion-admin at sdsc.edu>>>> When replying, please edit your Subject line so it is more specific>> than "Re: Contents of npaci-rocks-discussion digest...">>>>>> Today's Topics:>>>> 1. RE: Do you have a list of the various models of Gigabit Ethernet>> Interfaces compatible to Rocks 3? (Nai Hong Hwa Francis)>> 2. Rocks 3.0.0 (Terrence Martin)>> 3. Re: "TypeError: loop over non-sequence" when trying>> to build CD distro (V. Rowley)>>>> --__--__-->>>> Message: 1>> Date: Thu, 11 Dec 2003 09:45:18 +0800>> From: "Nai Hong Hwa Francis" <naihh at imcb.a-star.edu.sg>>> To: <npaci-rocks-discussion at sdsc.edu>>> Subject: [Rocks-Discuss]RE: Do you have a list of the various models >> of>> Gigabit Ethernet Interfaces compatible to Rocks 3?>>>>>>>> Hi All,>>>> Do you have a list of the various gigabit Ethernet interfaces that are>> compatible to Rocks 3?>>>> I am changing my nodes connectivity from 10/100 to 1000.>>>> Have anyone done that and how are the differences in performance or>> turnaround time?>>>>>>>> Thanks and Regards>>>> Nai Hong Hwa Francis>> Institute of Molecular and Cell Biology (A*STAR)>> 30 Medical Drive>> Singapore 117609.>> DID: (65) 6874-6196>>>> -----Original Message----->> From: npaci-rocks-discussion-request at sdsc.edu>> [mailto:npaci-rocks-discussion-request at sdsc.edu]=20>> Sent: Thursday, December 11, 2003 9:25 AM>> To: npaci-rocks-discussion at sdsc.edu>> Subject: npaci-rocks-discussion digest, Vol 1 #641 - 13 msgs>>>> Send npaci-rocks-discussion mailing list submissions to>> npaci-rocks-discussion at sdsc.edu>>

Page 163: 2003 December

>> To subscribe or unsubscribe via the World Wide Web, visit>> =09>> http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion>> or, via email, send a message with subject or body 'help' to>> npaci-rocks-discussion-request at sdsc.edu>>>> You can reach the person managing the list at>> npaci-rocks-discussion-admin at sdsc.edu>>>> When replying, please edit your Subject line so it is more specific>> than "Re: Contents of npaci-rocks-discussion digest...">>>>>> Today's Topics:>>>> 1. Non-homogenous legacy hardware (Chris Dwan (CCGB))>> 2. Error during Make when building a new install floppy (Terrence>> Martin)>> 3. Re: Error during Make when building a new install floppy (Tim>> Carlson)>> 4. Re: Non-homogenous legacy hardware (Tim Carlson)>> 5. ssh_known_hosts and ganglia (Jag)>> 6. Re: ssh_known_hosts and ganglia (Mason J. Katz)>> 7. "TypeError: loop over non-sequence" when trying to build CD>> distro (V. Rowley)>> 8. Re: one node short in "labels" (Greg Bruno)>> 9. Re: "TypeError: loop over non-sequence" when trying to build CD>> distro (Mason J. Katz)>> 10. Re: "TypeError: loop over non-sequence" when trying>> to build CD distro (V. Rowley)>> 11. Re: "TypeError: loop over non-sequence" when trying to>> build CD distro (Tim Carlson)>>>> -- __--__-- >> Message: 1>> Date: Wed, 10 Dec 2003 14:04:53 -0600 (CST)>> From: "Chris Dwan (CCGB)" <cdwan at mail.ahc.umn.edu>>> To: npaci-rocks-discussion at sdsc.edu>> Subject: [Rocks-Discuss]Non-homogenous legacy hardware>>>>>> I am integrating legacy systems into a ROCKS cluster, and have hit a>> snag with the auto-partition configuration: The new (old) systems >> have>> SCSI disks, while old (new) ones contain IDE. This is a non-issue so>> long as the initial install does its default partitioning. 
However, I
>> have a "replace-auto-partition.xml" file which is unworkable for the SCSI
>> based systems since it makes specific reference to "hda" rather than
>> "sda."
>>
>> I would like to have a site-nodes/replace-auto-partition.xml file with a
>> conditional such that "hda" or "sda" is used, based on the name of the
>> node (or some other criterion).
>>
>> Is this possible?
>>
>> Thanks, in advance. If this is out there on the mailing list


>> archives,>> a>> pointer would be greatly appreciated.>>>> -Chris Dwan>> The University of Minnesota>>>> -- __--__-- >> Message: 2>> Date: Wed, 10 Dec 2003 12:09:11 -0800>> From: Terrence Martin <tmartin at physics.ucsd.edu>>> To: npaci-rocks-discussion <npaci-rocks-discussion at sdsc.edu>>> Subject: [Rocks-Discuss]Error during Make when building a new install>> floppy>>>> I get the following error when I try to rebuild a boot floppy for >> rocks.>>>> This is with the default CVS checkout with an update today according>> to=20>> the rocks userguide. I have not actually attempted to make any >> changes.>>>> make[3]: Leaving directory=20>> `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3/loader'>> make[2]: Leaving directory=20>> `/home/install/rocks/src/rocks/boot/7.3/loader/anaconda-7.3'>> strip -o loader anaconda-7.3/loader/loader>> strip: anaconda-7.3/loader/loader: No such file or directory>> make[1]: *** [loader] Error 1>> make[1]: Leaving directory>> `/home/install/rocks/src/rocks/boot/7.3/loader'>> make: *** [loader] Error 2>>>> Of course I could avoid all of this together and just put my binary=20>> module into the appropriate location in the boot image.>>>> Would it be correct to modify the following image file with my>> changes=20>> and then write it to a floppy via dd?>>>> /home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/ >> 7.3>> /en/os/i386/images/bootnet.img>>>> Basically I am injecting an updated e1000 driver with changes to=20>> pcitable to support the address of my gigabit cards.>>>> Terrence>>>>>> -- __--__-->> Message: 3>> Date: Wed, 10 Dec 2003 12:40:41 -0800 (PST)>> From: Tim Carlson <tim.carlson at pnl.gov>>> Subject: Re: [Rocks-Discuss]Error during Make when building a new>> install floppy>> To: Terrence Martin <tmartin at physics.ucsd.edu>>> Cc: npaci-rocks-discussion <npaci-rocks-discussion at sdsc.edu>


>> Reply-to: Tim Carlson <tim.carlson at pnl.gov>>>>> On Wed, 10 Dec 2003, Terrence Martin wrote:>>>>> I get the following error when I try to rebuild a boot floppy for>> rocks.>>>>>>> You can't make a boot floppy with Rocks 3.0. That isn't supported. Or >> at>> least it wasn't the last time I checked>>>>> Of course I could avoid all of this together and just put my binary>>> module into the appropriate location in the boot image.>>>>>> Would it be correct to modify the following image file with my >>> changes>>> and then write it to a floppy via dd?>>>>>>>> /home/install/ftp.rocksclusters.org/pub/rocks/rocks-3.0.0/rocks-dist/ >> 7.3>> /en/os/i386/images/bootnet.img>>>>>> Basically I am injecting an updated e1000 driver with changes to>>> pcitable to support the address of my gigabit cards.>>>> Modifiying the bootnet.img is about 1/3 of what you need to do if you >> go>> down that path. You also need to work on netstg1.img and you'll need >> to>> update the drive in the kernel rpm that gets installed on the box. >> None>> of>> this is trivial.>>>> If it were me, I would go down the same path I took for updating the>> AIC79XX driver>>>> https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-October/ >> 003>> 533.html>>>> Tim>>>> Tim Carlson>> Voice: (509) 376 3423>> Email: Tim.Carlson at pnl.gov>> EMSL UNIX System Support>>>>>> -- __--__-->> Message: 4>> Date: Wed, 10 Dec 2003 12:52:38 -0800 (PST)>> From: Tim Carlson <tim.carlson at pnl.gov>>> Subject: Re: [Rocks-Discuss]Non-homogenous legacy hardware>> To: "Chris Dwan (CCGB)" <cdwan at mail.ahc.umn.edu>>> Cc: npaci-rocks-discussion at sdsc.edu>> Reply-to: Tim Carlson <tim.carlson at pnl.gov>


>>
>> On Wed, 10 Dec 2003, Chris Dwan (CCGB) wrote:
>>
>>> I am integrating legacy systems into a ROCKS cluster, and have hit a
>>> snag with the auto-partition configuration: The new (old) systems have
>>> SCSI disks, while old (new) ones contain IDE. This is a non-issue so
>>> long as the initial install does its default partitioning. However, I
>>> have a "replace-auto-partition.xml" file which is unworkable for the SCSI
>>> based systems since it makes specific reference to "hda" rather than
>>> "sda."
>>
>> If you have just a single drive, then you should be able to skip the
>> "--ondisk" bits of your "part" command
>>
>> Otherwise, you would have first to do something ugly like the following:
>>
>> http://penguin.epfl.ch/slides/kickstart/ks.cfg
>>
>> You could probably (maybe) wrap most of that in an
>> <eval sh="bash">
>> </eval>
>> block in the <main> block.
>>
>> Just guessing.. haven't tried this.
>>
>> Tim
>>
>> Tim Carlson
>> Voice: (509) 376 3423
>> Email: Tim.Carlson at pnl.gov
>> EMSL UNIX System Support
>>
>>
>> --__--__--
>> Message: 5
>> From: Jag <agrajag at dragaera.net>
>> To: npaci-rocks-discussion at sdsc.edu
>> Date: Wed, 10 Dec 2003 13:21:07 -0500
>> Subject: [Rocks-Discuss]ssh_known_hosts and ganglia
>>
>> I noticed a previous post on this list
>> (https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html)
>> indicating that Rocks distributes ssh keys for all the nodes over
>> ganglia. Can anyone enlighten me as to how this is done?
>>
>> I looked through the ganglia docs and didn't see anything indicating how
>> to do this, so I'm assuming Rocks made some changes. Unfortunately the
>> rocks iso images don't seem to contain srpms, so I'm now coming here.
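Tim's single-drive suggestion above, dropping the "--ondisk" argument so the installer picks hda or sda itself, would look roughly like this as a site-nodes/replace-auto-partition.xml. The element layout follows the usual Rocks node-file convention and the sizes are placeholders, so compare against a stock profile before relying on it:

```xml
<?xml version="1.0" standalone="no"?>
<kickstart>
  <main>
    <!-- No "--ondisk hda/sda": the installer chooses the first drive,
         so the same file works on both IDE and SCSI nodes. -->
    <part> /     --size 4096 </part>
    <part> swap  --size 1024 </part>
    <part> /state/partition1 --size 1 --grow </part>
  </main>
</kickstart>
```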


>> What did Rocks do to ganglia to make the distribution of ssh keys
>> work?
>>
>> Also, does anyone know where Rocks SRPMs can be found? I've done quite
>> a bit of searching, but haven't found them anywhere.
>>
>>
>> --__--__--
>> Message: 6
>> Cc: npaci-rocks-discussion at sdsc.edu
>> From: "Mason J. Katz" <mjk at sdsc.edu>
>> Subject: Re: [Rocks-Discuss]ssh_known_hosts and ganglia
>> Date: Wed, 10 Dec 2003 14:39:15 -0800
>> To: Jag <agrajag at dragaera.net>
>>
>> Most of the SRPMS are on our FTP site, but we've screwed this up
>> before. The SRPMS are entirely Rocks specific so they are of little
>> value outside of Rocks. You can also check out our CVS tree
>> (cvs.rocksclusters.org) where rocks/src/ganglia shows what we add. We
>> have a ganglia-python package we created to allow us to write our own
>> metrics at a higher level than the provided gmetric application. We've
>> also moved from this method to a single cluster-wide ssh key for Rocks
>> 3.1.
>>
>> -mjk
>>
>> On Dec 10, 2003, at 10:21 AM, Jag wrote:
>>
>>> I noticed a previous post on this list
>>> (https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001934.html)
>>> indicating that Rocks distributes ssh keys for all the nodes over
>>> ganglia. Can anyone enlighten me as to how this is done?
>>>
>>> I looked through the ganglia docs and didn't see anything indicating how
>>> to do this, so I'm assuming Rocks made some changes. Unfortunately the
>>> rocks iso images don't seem to contain srpms, so I'm now coming here.
>>> What did Rocks do to ganglia to make the distribution of ssh keys
>>> work?
>>>
>>> Also, does anyone know where Rocks SRPMs can be found? I've done quite
>>> a bit of searching, but haven't found them anywhere.


>> To: npaci-rocks-discussion at sdsc.edu>> Subject: [Rocks-Discuss]"TypeError: loop over non-sequence" when >> trying>> to build CD distro>>>> When I run this:>>>> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ; >> rocks-dist>>>> --dist=3Dcdrom cdrom>>>> on a server installed with ROCKS 3.0.0, I eventually get this:>>>>> Cleaning distribution>>> Resolving versions (RPMs)>>> Resolving versions (SRPMs)>>> Adding support for rebuild distribution from source>>> Creating files (symbolic links - fast)>>> Creating symlinks to kickstart files>>> Fixing Comps Database>>> Generating hdlist (rpm database)>>> Patching second stage loader (eKV, partioning, ...)>>> patching "rocks-ekv" into distribution ...>>> patching "rocks-piece-pipe" into distribution ...>>> patching "PyXML" into distribution ...>>> patching "expat" into distribution ...>>> patching "rocks-pylib" into distribution ...>>> patching "MySQL-python" into distribution ...>>> patching "rocks-kickstart" into distribution ...>>> patching "rocks-kickstart-profiles" into distribution ...>>> patching "rocks-kickstart-dtds" into distribution ...>>> building CRAM filesystem ...>>> Cleaning distribution>>> Resolving versions (RPMs)>>> Resolving versions (SRPMs)>>> Creating symlinks to kickstart files>>> Generating hdlist (rpm database)>>> Segregating RPMs (rocks, non-rocks)>>> sh: ./kickstart.cgi: No such file or directory>>> sh: ./kickstart.cgi: No such file or directory>>> Traceback (innermost last):>>> File "/opt/rocks/bin/rocks-dist", line 807, in ?>>> app.run()>>> File "/opt/rocks/bin/rocks-dist", line 623, in run>>> eval('self.command_%s()' % (command))>>> File "<string>", line 0, in ?>>> File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom>>> builder.build()>>> File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build>>> (rocks, nonrocks) =3D self.segregateRPMS()>>> File "/opt/rocks/lib/python/rocks/build.py", line 1107, in>> segregateRPMS>>> for pkg in ks.getSection('packages'):>>> 
TypeError: loop over non-sequence>>>> Any ideas?>>>> --=20
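Given the diagnosis later in the thread (a profiles directory renamed to profiles.orig, `sh: ./kickstart.cgi: No such file or directory`, and a possibly missing /usr/bin/python), a quick pre-flight check before running the `rocks-dist --dist=cdrom cdrom` step might look like this. The checks are assumptions about the Rocks 3.0 /home/install layout, not an official tool:

```shell
# Sanity-check the /home/install layout that rocks-dist's cdrom build
# shells out to; each missing piece is reported here instead of
# surfacing later as "TypeError: loop over non-sequence".
check_install_dir() {
    dir=$1
    [ -x "$dir/kickstart.cgi" ] || echo "kickstart.cgi missing or not executable"
    [ -d "$dir/profiles" ]      || echo "profiles/ missing (renamed to profiles.orig?)"
}
check_install_dir /home/install   # path used on a real Rocks 3.0 frontend
```

The root cause is that `ks.getSection('packages')` returns nothing when kickstart.cgi cannot run, and the old Python 1.5-era code loops over that result without checking it first.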


>> Vicky Rowley email: vrowley at ucsd.edu>> Biomedical Informatics Research Network work: (858) 536-5980>> University of California, San Diego fax: (858) 822-0828>> 9500 Gilman Drive>> La Jolla, CA 92093-0715>>>>>> See pictures from our trip to China at>> http://www.sagacitech.com/Chinaweb>>>>>> -- __--__-->> Message: 8>> Cc: rocks <npaci-rocks-discussion at sdsc.edu>>> From: Greg Bruno <bruno at rocksclusters.org>>> Subject: Re: [Rocks-Discuss]one node short in "labels">> Date: Wed, 10 Dec 2003 15:12:49 -0800>> To: Vincent Fox <vincent_b_fox at yahoo.com>>>>>> So I go to the "labels" selection on the web page to print out =>> the=3D20>>> pretty labels. What a nice idea by the way!>>> =3DA0>>> EXCEPT....it's one node short! I go up to 0-13 and this stops at=3D20>>> 0-12.=3DA0 Any ideas where I should check to fix this?>>>> yeah, we found this corner case -- it'll be fixed in the next release.>>>> thanks for bug report.>>>> - gb>>>>>> -- __--__-->> Message: 9>> Cc: npaci-rocks-discussion at sdsc.edu>> From: "Mason J. Katz" <mjk at sdsc.edu>>> Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when>> trying to build CD distro>> Date: Wed, 10 Dec 2003 15:16:27 -0800>> To: "V. Rowley" <vrowley at ucsd.edu>>>>> It looks like someone moved the profiles directory to profiles.orig.>>>> -mjk>>>>>> [root at rocks14 install]# ls -l>> total 56>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:16 cdrom>> drwxrwsr-x 5 root wheel 4096 Dec 10 20:38 contrib.orig>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:07=20>> ftp.rocksclusters.org>> drwxr-sr-x 3 root wheel 4096 Dec 10 20:38=20>> ftp.rocksclusters.org.orig>> -r-xrwsr-x 1 root wheel 19254 Sep 3 12:40 kickstart.cgi>> drwxr-xr-x 3 root root 4096 Dec 10 20:38 profiles.orig>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:15 rocks-dist>> drwxrwsr-x 3 root wheel 4096 Dec 10 20:38


>> rocks-dist.orig>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:02 src>> drwxr-sr-x 4 root wheel 4096 Dec 10 20:49 src.foo>> On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:>>>>> When I run this:>>>>>> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;=20>>> rocks-dist --dist=3Dcdrom cdrom>>>>>> on a server installed with ROCKS 3.0.0, I eventually get this:>>>>>>> Cleaning distribution>>>> Resolving versions (RPMs)>>>> Resolving versions (SRPMs)>>>> Adding support for rebuild distribution from source>>>> Creating files (symbolic links - fast)>>>> Creating symlinks to kickstart files>>>> Fixing Comps Database>>>> Generating hdlist (rpm database)>>>> Patching second stage loader (eKV, partioning, ...)>>>> patching "rocks-ekv" into distribution ...>>>> patching "rocks-piece-pipe" into distribution ...>>>> patching "PyXML" into distribution ...>>>> patching "expat" into distribution ...>>>> patching "rocks-pylib" into distribution ...>>>> patching "MySQL-python" into distribution ...>>>> patching "rocks-kickstart" into distribution ...>>>> patching "rocks-kickstart-profiles" into distribution ...>>>> patching "rocks-kickstart-dtds" into distribution ...>>>> building CRAM filesystem ...>>>> Cleaning distribution>>>> Resolving versions (RPMs)>>>> Resolving versions (SRPMs)>>>> Creating symlinks to kickstart files>>>> Generating hdlist (rpm database)>>>> Segregating RPMs (rocks, non-rocks)>>>> sh: ./kickstart.cgi: No such file or directory>>>> sh: ./kickstart.cgi: No such file or directory>>>> Traceback (innermost last):>>>> File "/opt/rocks/bin/rocks-dist", line 807, in ?>>>> app.run()>>>> File "/opt/rocks/bin/rocks-dist", line 623, in run>>>> eval('self.command_%s()' % (command))>>>> File "<string>", line 0, in ?>>>> File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom>>>> builder.build()>>>> File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build>>>> (rocks, nonrocks) =3D self.segregateRPMS()>>>> File "/opt/rocks/lib/python/rocks/build.py", line 
1107, in=20>>>> segregateRPMS>>>> for pkg in ks.getSection('packages'):>>>> TypeError: loop over non-sequence>>>>>> Any ideas?>>>>>> --=20>>> Vicky Rowley email: vrowley at ucsd.edu>>> Biomedical Informatics Research Network work: (858) 536-5980


>>> University of California, San Diego fax: (858) 822-0828>>> 9500 Gilman Drive>>> La Jolla, CA 92093-0715>>>>>>>>> See pictures from our trip to China at=20>>> http://www.sagacitech.com/Chinaweb>>>>>> -- __--__-->> Message: 10>> Date: Wed, 10 Dec 2003 16:50:16 -0800>> From: "V. Rowley" <vrowley at ucsd.edu>>> To: "Mason J. Katz" <mjk at sdsc.edu>>> CC: npaci-rocks-discussion at sdsc.edu>> Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when>> trying>> to build CD distro>>>> Yep, I did that, but only *AFTER* getting the error. [Thought it >> was=20>> generated by the rocks-dist sequence, but apparently not.] Go >> ahead.=20>> Move it back. Same difference.>>>> Vicky>>>> Mason J. Katz wrote:>>> It looks like someone moved the profiles directory to profiles.orig.>>> =20>>> -mjk>>> =20>>> =20>>> [root at rocks14 install]# ls -l>>> total 56>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:16 cdrom>>> drwxrwsr-x 5 root wheel 4096 Dec 10 20:38 contrib.orig>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:07=20>>> ftp.rocksclusters.org>>> drwxr-sr-x 3 root wheel 4096 Dec 10 20:38=20>>> ftp.rocksclusters.org.orig>>> -r-xrwsr-x 1 root wheel 19254 Sep 3 12:40 kickstart.cgi>>> drwxr-xr-x 3 root root 4096 Dec 10 20:38 profiles.orig>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:15 rocks-dist>>> drwxrwsr-x 3 root wheel 4096 Dec 10 20:38>> rocks-dist.orig>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:02 src>>> drwxr-sr-x 4 root wheel 4096 Dec 10 20:49 src.foo>>> On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:>>> =20>>>> When I run this:>>>>>>>> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;=20>>>> rocks-dist --dist=3Dcdrom cdrom>>>>>>>> on a server installed with ROCKS 3.0.0, I eventually get this:>>>>>>>>> Cleaning distribution>>>>> Resolving versions (RPMs)


>>>>> Resolving versions (SRPMs)>>>>> Adding support for rebuild distribution from source>>>>> Creating files (symbolic links - fast)>>>>> Creating symlinks to kickstart files>>>>> Fixing Comps Database>>>>> Generating hdlist (rpm database)>>>>> Patching second stage loader (eKV, partioning, ...)>>>>> patching "rocks-ekv" into distribution ...>>>>> patching "rocks-piece-pipe" into distribution ...>>>>> patching "PyXML" into distribution ...>>>>> patching "expat" into distribution ...>>>>> patching "rocks-pylib" into distribution ...>>>>> patching "MySQL-python" into distribution ...>>>>> patching "rocks-kickstart" into distribution ...>>>>> patching "rocks-kickstart-profiles" into distribution ...>>>>> patching "rocks-kickstart-dtds" into distribution ...>>>>> building CRAM filesystem ...>>>>> Cleaning distribution>>>>> Resolving versions (RPMs)>>>>> Resolving versions (SRPMs)>>>>> Creating symlinks to kickstart files>>>>> Generating hdlist (rpm database)>>>>> Segregating RPMs (rocks, non-rocks)>>>>> sh: ./kickstart.cgi: No such file or directory>>>>> sh: ./kickstart.cgi: No such file or directory>>>>> Traceback (innermost last):>>>>> File "/opt/rocks/bin/rocks-dist", line 807, in ?>>>>> app.run()>>>>> File "/opt/rocks/bin/rocks-dist", line 623, in run>>>>> eval('self.command_%s()' % (command))>>>>> File "<string>", line 0, in ?>>>>> File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom>>>>> builder.build()>>>>> File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build>>>>> (rocks, nonrocks) =3D self.segregateRPMS()>>>>> File "/opt/rocks/lib/python/rocks/build.py", line 1107, in=20>>>>> segregateRPMS>>>>> for pkg in ks.getSection('packages'):>>>>> TypeError: loop over non-sequence>>>>>>>>>>>> Any ideas?>>>>>>>> --=20>>>> Vicky Rowley email: vrowley at ucsd.edu>>>> Biomedical Informatics Research Network work: (858) 536-5980>>>> University of California, San Diego fax: (858) 822-0828>>>> 9500 Gilman Drive>>>> La Jolla, CA 92093-0715>>>>>>>>>>>> See 
pictures from our trip to China at>> http://www.sagacitech.com/Chinaweb>>> =20>>> =20>>> =20>>>> --=20>> Vicky Rowley email: vrowley at ucsd.edu


>> Biomedical Informatics Research Network work: (858) 536-5980>> University of California, San Diego fax: (858) 822-0828>> 9500 Gilman Drive>> La Jolla, CA 92093-0715>>>>>> See pictures from our trip to China at>> http://www.sagacitech.com/Chinaweb>>>>>> -- __--__-->> Message: 11>> Date: Wed, 10 Dec 2003 17:23:25 -0800 (PST)>> From: Tim Carlson <tim.carlson at pnl.gov>>> Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when>> trying to>> build CD distro>> To: "V. Rowley" <vrowley at ucsd.edu>>> Cc: "Mason J. Katz" <mjk at sdsc.edu>, npaci-rocks-discussion at sdsc.edu>> Reply-to: Tim Carlson <tim.carlson at pnl.gov>>>>> On Wed, 10 Dec 2003, V. Rowley wrote:>>>> Did you remove python by chance? kickstart.cgi calls python directly >> in>> /usr/bin/python while rocks-dist does an "env python">>>> Tim>>>>> Yep, I did that, but only *AFTER* getting the error. [Thought it was>>> generated by the rocks-dist sequence, but apparently not.] Go ahead.>>> Move it back. Same difference.>>>>>> Vicky>>>>>> Mason J. Katz wrote:>>>> It looks like someone moved the profiles directory to profiles.orig.>>>>>>>> -mjk>>>>>>>>>>>> [root at rocks14 install]# ls -l>>>> total 56>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:16 cdrom>>>> drwxrwsr-x 5 root wheel 4096 Dec 10 20:38 contrib.orig>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:07>>>> ftp.rocksclusters.org>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 20:38>>>> ftp.rocksclusters.org.orig>>>> -r-xrwsr-x 1 root wheel 19254 Sep 3 12:40>> kickstart.cgi>>>> drwxr-xr-x 3 root root 4096 Dec 10 20:38>> profiles.orig>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:15 rocks-dist>>>> drwxrwsr-x 3 root wheel 4096 Dec 10 20:38>> rocks-dist.orig>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:02 src>>>> drwxr-sr-x 4 root wheel 4096 Dec 10 20:49 src.foo>>>> On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:


>>>>>>>>> When I run this:>>>>>>>>>> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;>>>>> rocks-dist --dist=3Dcdrom cdrom>>>>>>>>>> on a server installed with ROCKS 3.0.0, I eventually get this:>>>>>>>>>>> Cleaning distribution>>>>>> Resolving versions (RPMs)>>>>>> Resolving versions (SRPMs)>>>>>> Adding support for rebuild distribution from source>>>>>> Creating files (symbolic links - fast)>>>>>> Creating symlinks to kickstart files>>>>>> Fixing Comps Database>>>>>> Generating hdlist (rpm database)>>>>>> Patching second stage loader (eKV, partioning, ...)>>>>>> patching "rocks-ekv" into distribution ...>>>>>> patching "rocks-piece-pipe" into distribution ...>>>>>> patching "PyXML" into distribution ...>>>>>> patching "expat" into distribution ...>>>>>> patching "rocks-pylib" into distribution ...>>>>>> patching "MySQL-python" into distribution ...>>>>>> patching "rocks-kickstart" into distribution ...>>>>>> patching "rocks-kickstart-profiles" into distribution ...>>>>>> patching "rocks-kickstart-dtds" into distribution ...>>>>>> building CRAM filesystem ...>>>>>> Cleaning distribution>>>>>> Resolving versions (RPMs)>>>>>> Resolving versions (SRPMs)>>>>>> Creating symlinks to kickstart files>>>>>> Generating hdlist (rpm database)>>>>>> Segregating RPMs (rocks, non-rocks)>>>>>> sh: ./kickstart.cgi: No such file or directory>>>>>> sh: ./kickstart.cgi: No such file or directory>>>>>> Traceback (innermost last):>>>>>> File "/opt/rocks/bin/rocks-dist", line 807, in ?>>>>>> app.run()>>>>>> File "/opt/rocks/bin/rocks-dist", line 623, in run>>>>>> eval('self.command_%s()' % (command))>>>>>> File "<string>", line 0, in ?>>>>>> File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom>>>>>> builder.build()>>>>>> File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build>>>>>> (rocks, nonrocks) =3D self.segregateRPMS()>>>>>> File "/opt/rocks/lib/python/rocks/build.py", line 1107, in>>>>>> segregateRPMS>>>>>> for pkg in 
ks.getSection('packages'):>>>>>> TypeError: loop over non-sequence>>>>>>>>>>>>>>> Any ideas?>>>>>>>>>> -->>>>> Vicky Rowley email: vrowley at ucsd.edu>>>>> Biomedical Informatics Research Network work: (858) 536-5980>>>>> University of California, San Diego fax: (858) 822-0828>>>>> 9500 Gilman Drive>>>>> La Jolla, CA 92093-0715


>>>>>>>>>>>>>>> See pictures from our trip to China at>> http://www.sagacitech.com/Chinaweb>>>>>>>>>>>>>>>>>> -->>> Vicky Rowley email: vrowley at ucsd.edu>>> Biomedical Informatics Research Network work: (858) 536-5980>>> University of California, San Diego fax: (858) 822-0828>>> 9500 Gilman Drive>>> La Jolla, CA 92093-0715>>>>>>>>> See pictures from our trip to China at>> http://www.sagacitech.com/Chinaweb>>>>>>>>>>>>>>>> -- __--__-->> _______________________________________________>> npaci-rocks-discussion mailing list>> npaci-rocks-discussion at sdsc.edu>> http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion>>>>>> End of npaci-rocks-discussion Digest>>>>>> DISCLAIMER:>> This email is confidential and may be privileged. If you are not the =>> intended recipient, please delete it and notify us immediately. >> Please =>> do not copy or use it for any purpose, or disclose its contents to >> any =>> other person as it may be an offence under the Official Secrets Act. =>> Thank you.>>>> --__--__-->>>> Message: 2>> Date: Wed, 10 Dec 2003 18:03:41 -0800>> From: Terrence Martin <tmartin at physics.ucsd.edu>>> To: npaci-rocks-discussion at sdsc.edu>> Subject: [Rocks-Discuss]Rocks 3.0.0>>>> I am having a problem on install of rocks 3.0.0 on my new cluster.>>>> The python error occurs right after anaconda starts and just before >> the>> install asks for the roll CDROM.>>>> The error refers to an inability to find or load rocks.file. The error>> is associated I think with the window that pops up and asks you in put


>> the roll CDROM in.>>>> The process I followed to get to this point is>>>> Put the Rocks 3.0.0 CDROM into the CDROM drive>> Boot the system>> At the prompt type frontend>> Wait till anaconda starts>> Error referring to unable to load rocks.file.>>>> I have successfully installed rocks on a smaller cluster but that has>> different hardware. I used the same CDROM for both installs.>>>> Any thoughts?>>>> Terrence>>>>>>>> --__--__-->>>> Message: 3>> Date: Wed, 10 Dec 2003 19:52:49 -0800>> From: "V. Rowley" <vrowley at ucsd.edu>>> To: npaci-rocks-discussion at sdsc.edu>> Subject: Re: [Rocks-Discuss]"TypeError: loop over non-sequence" when>> trying>> to build CD distro>>>> Looks like python is okay:>>>>> [root at rocks14 birn-oracle1]# which python>>> /usr/bin/python>>> [root at rocks14 birn-oracle1]# python --help>>> Unknown option: -->>> usage: python [option] ... [-c cmd | file | -] [arg] ...>>> Options and arguments (and corresponding environment variables):>>> -d : debug output from parser (also PYTHONDEBUG=x)>>> -i : inspect interactively after running script, (also>> PYTHONINSPECT=x)>>> and force prompts, even if stdin does not appear to be a>> terminal>>> -O : optimize generated bytecode (a tad; also PYTHONOPTIMIZE=x)>>> -OO : remove doc-strings in addition to the -O optimizations>>> -S : don't imply 'import site' on initialization>>> -t : issue warnings about inconsistent tab usage (-tt: issue>> errors)>>> -u : unbuffered binary stdout and stderr (also >>> PYTHONUNBUFFERED=x)>>> -v : verbose (trace import statements) (also PYTHONVERBOSE=x)>>> -x : skip first line of source, allowing use of non-Unix forms of>> #!cmd>>> -X : disable class based built-in exceptions>>> -c cmd : program passed in as string (terminates option list)>>> file : program read from script file>>> - : program read from stdin (default; interactive mode if a tty)>>> arg ...: arguments passed to program in sys.argv[1:]>>> Other environment variables:>>> PYTHONSTARTUP: file executed on 
interactive startup (no default)


>>> PYTHONPATH : ':'-separated list of directories prefixed to the>>> default module search path. The result is sys.path.>>> PYTHONHOME : alternate <prefix> directory (or>> <prefix>:<exec_prefix>).>>> The default module search path uses >>> <prefix>/python1.5.>>> [root at rocks14 birn-oracle1]#>>>>>>>> Tim Carlson wrote:>>> On Wed, 10 Dec 2003, V. Rowley wrote:>>>>>> Did you remove python by chance? kickstart.cgi calls python directly>> in>>> /usr/bin/python while rocks-dist does an "env python">>>>>> Tim>>>>>>>>>> Yep, I did that, but only *AFTER* getting the error. [Thought it >>>> was>>>> generated by the rocks-dist sequence, but apparently not.] Go >>>> ahead.>>>> Move it back. Same difference.>>>>>>>> Vicky>>>>>>>> Mason J. Katz wrote:>>>>>>>>> It looks like someone moved the profiles directory to >>>>> profiles.orig.>>>>>>>>>> -mjk>>>>>>>>>>>>>>> [root at rocks14 install]# ls -l>>>>> total 56>>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:16 cdrom>>>>> drwxrwsr-x 5 root wheel 4096 Dec 10 20:38 >>>>> contrib.orig>>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:07>>>>> ftp.rocksclusters.org>>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 20:38>>>>> ftp.rocksclusters.org.orig>>>>> -r-xrwsr-x 1 root wheel 19254 Sep 3 12:40 >>>>> kickstart.cgi>>>>> drwxr-xr-x 3 root root 4096 Dec 10 20:38 >>>>> profiles.orig>>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:15 rocks-dist>>>>> drwxrwsr-x 3 root wheel 4096 Dec 10 20:38>> rocks-dist.orig>>>>> drwxr-sr-x 3 root wheel 4096 Dec 10 21:02 src>>>>> drwxr-sr-x 4 root wheel 4096 Dec 10 20:49 src.foo>>>>> On Dec 10, 2003, at 2:43 PM, V. Rowley wrote:>>>>>>>>>>>>>>>> When I run this:>>>>>>


>>>>>> [root at rocks14 install]# rocks-dist mirror ; rocks-dist dist ;>>>>>> rocks-dist --dist=cdrom cdrom>>>>>>>>>>>> on a server installed with ROCKS 3.0.0, I eventually get this:>>>>>>>>>>>>>>>>>>> Cleaning distribution>>>>>>> Resolving versions (RPMs)>>>>>>> Resolving versions (SRPMs)>>>>>>> Adding support for rebuild distribution from source>>>>>>> Creating files (symbolic links - fast)>>>>>>> Creating symlinks to kickstart files>>>>>>> Fixing Comps Database>>>>>>> Generating hdlist (rpm database)>>>>>>> Patching second stage loader (eKV, partioning, ...)>>>>>>> patching "rocks-ekv" into distribution ...>>>>>>> patching "rocks-piece-pipe" into distribution ...>>>>>>> patching "PyXML" into distribution ...>>>>>>> patching "expat" into distribution ...>>>>>>> patching "rocks-pylib" into distribution ...>>>>>>> patching "MySQL-python" into distribution ...>>>>>>> patching "rocks-kickstart" into distribution ...>>>>>>> patching "rocks-kickstart-profiles" into distribution ...>>>>>>> patching "rocks-kickstart-dtds" into distribution ...>>>>>>> building CRAM filesystem ...>>>>>>> Cleaning distribution>>>>>>> Resolving versions (RPMs)>>>>>>> Resolving versions (SRPMs)>>>>>>> Creating symlinks to kickstart files>>>>>>> Generating hdlist (rpm database)>>>>>>> Segregating RPMs (rocks, non-rocks)>>>>>>> sh: ./kickstart.cgi: No such file or directory>>>>>>> sh: ./kickstart.cgi: No such file or directory>>>>>>> Traceback (innermost last):>>>>>>> File "/opt/rocks/bin/rocks-dist", line 807, in ?>>>>>>> app.run()>>>>>>> File "/opt/rocks/bin/rocks-dist", line 623, in run>>>>>>> eval('self.command_%s()' % (command))>>>>>>> File "<string>", line 0, in ?>>>>>>> File "/opt/rocks/bin/rocks-dist", line 736, in command_cdrom>>>>>>> builder.build()>>>>>>> File "/opt/rocks/lib/python/rocks/build.py", line 1223, in build>>>>>>> (rocks, nonrocks) = self.segregateRPMS()>>>>>>> File "/opt/rocks/lib/python/rocks/build.py", line 1107, in>>>>>>> segregateRPMS>>>>>>> for pkg in 
ks.getSection('packages'):>>>>>>> TypeError: loop over non-sequence>>>>>>>>>>>>>>>>>> Any ideas?>>>>>>>>>>>> -->>>>>> Vicky Rowley email: vrowley at ucsd.edu>>>>>> Biomedical Informatics Research Network work: (858) 536-5980>>>>>> University of California, San Diego fax: (858) 822-0828>>>>>> 9500 Gilman Drive>>>>>> La Jolla, CA 92093-0715>>>>>>>>>>>>


>>>>>> See pictures from our trip to China at>> http://www.sagacitech.com/Chinaweb>>>>>>>>>>>>>>>>>>> -->>>> Vicky Rowley email: vrowley at ucsd.edu>>>> Biomedical Informatics Research Network work: (858) 536-5980>>>> University of California, San Diego fax: (858) 822-0828>>>> 9500 Gilman Drive>>>> La Jolla, CA 92093-0715>>>>>>>>>>>> See pictures from our trip to China at>> http://www.sagacitech.com/Chinaweb>>>>>>>>>>>>>>>>>>>>>>>> -- >> Vicky Rowley email: vrowley at ucsd.edu>> Biomedical Informatics Research Network work: (858) 536-5980>> University of California, San Diego fax: (858) 822-0828>> 9500 Gilman Drive>> La Jolla, CA 92093-0715>>>>>> See pictures from our trip to China at>> http://www.sagacitech.com/Chinaweb>>>>>>>> --__--__-->>>> _______________________________________________>> npaci-rocks-discussion mailing list>> npaci-rocks-discussion at sdsc.edu>> http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion>>>>>> End of npaci-rocks-discussion Digest>>>>>> DISCLAIMER:>> This email is confidential and may be privileged. If you are not the >> intended recipient, please delete it and notify us immediately. >> Please do not copy or use it for any purpose, or disclose its >> contents to any other person as it may be an offence under the >> Official Secrets Act. Thank you.>>>>> Glen Otero, Ph.D.> Linux Prophet> 619.917.1772>>


Glen Otero, Ph.D.
Linux Prophet
619.917.1772

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/enriched
Size: 35605 bytes
Desc: not available
Url : https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031211/1a0b38fb/attachment-0001.bin

From tmartin at physics.ucsd.edu Fri Dec 12 10:26:58 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Fri, 12 Dec 2003 10:26:58 -0800
Subject: [Rocks-Discuss]ftp.rocksclusters.org mirror?
Message-ID: <[email protected]>

I was wondering, does the command rocks-dist do anything else besides call wget on the correct tree at ftp.rocksclusters.org?

I ask because some firewall restrictions on a system I am hesitant to fiddle are preventing me from running rocks-dist mirror from my head node. I would like to download the mirror of the rocks distro on another system, transfer the tree and then run rocks-dist dist to rebuild the rocks for my compute nodes. Is this reasonable?

Also, am I going to run into any problems with rocks 3.0.0 having installed the head node on a UP system when my compute nodes are SMP? I am making the assumption that once I get all of the packages into rocks (currently there are no smp kernels on the head node) the compute nodes will install the right kernel?

BTW thanks for the help so far, the trick it seems to getting Rocks 3.0.0 on these supermicro systems is to install rocks on the hard drive in a separate computer and then install the hard disk.

Thanks,

Terrence

From mjk at sdsc.edu Fri Dec 12 10:48:17 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Fri, 12 Dec 2003 10:48:17 -0800
Subject: [Rocks-Discuss]ftp.rocksclusters.org mirror?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

- Yes, "rocks-dist mirror" does a python system() call to run the wget application. It does this several times for the various directories it needs.

- No, the compute nodes do not need to be the same SMPness as the frontend. All installations are done with Red Hat Kickstart (plus our pixie dust) so hardware is auto detected for you. This is not disk imaging :)

-mjk

On Dec 12, 2003, at 10:26 AM, Terrence Martin wrote:

> I was wondering, does the command rocks-dist do anything else besides
> call wget on the correct tree at ftp.rocksclusters.org?
>
> I ask because some firewall restrictions on a system I am hesitant to
> fiddle are preventing me from running rocks-dist mirror from my head
> node. I would like to download the mirror of the rocks distro on
> another system, transfer the tree and then run rocks-dist dist to
> rebuild the rocks for my compute nodes. Is this reasonable?
>
> Also am I going to run into any problems with rocks 3.0.0 having
> installed the head node on a UP system but my compute nodes are SMP? I
> am making an assumption that once I get all of the packages into rocks
> (currently there is no smp kernels on the head node) the compute nodes
> will install the right kernel?
>
> BTW thanks for the help so far, the trick it seems to getting Rocks
> 3.0.0 on these supermicro systems is to install rocks on the hard
> drive in a separate computer and then install the hard disk.
>
> Thanks,
>
> Terrence
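Since "rocks-dist mirror" is just wget underneath, Terrence's download-elsewhere-and-transfer plan can be sketched as a short script. Everything here is an assumption, not taken from the Rocks docs: the exact mirror URL, the tarball location, and the unpack directory would all need checking against a real frontend. DRYRUN defaults to on, so the block only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical sketch: mirror the tree on a machine that can reach the
# net, then carry a tarball to the firewalled frontend. URL and paths
# are guesses -- verify against your own /home/install layout.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = 1 ]; then echo "$@"; else "$@"; fi; }

MIRROR=ftp://ftp.rocksclusters.org/pub/rocks   # assumed URL
DEST=/tmp/rocks-mirror

# on the machine with web access:
run wget --mirror --no-parent -nH "$MIRROR" -P "$DEST"
run tar czf /tmp/rocks-mirror.tgz -C "$DEST" .

# after copying the tarball to the frontend:
run tar xzf /tmp/rocks-mirror.tgz -C /home/install
run rocks-dist dist
```

With DRYRUN=1 (the default) the script only echoes the four commands, so it is safe to paste and inspect before committing to a real transfer.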

From mjk at sdsc.edu Fri Dec 12 10:54:03 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Fri, 12 Dec 2003 10:54:03 -0800
Subject: [Rocks-Discuss]ia64 compute nodes with ia32 frontends?
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

We haven't done this for a while, and since our 3.0 release uses different versions of Red Hat for x86 and IA64, cross-building a distribution may not work. 3.1.0 (since you are on campus you'll get a CD set from us next week) uses the same base RH for all architectures, so this should be possible again.

The mirror should have worked:

# rocks-dist --arch=ia64 mirror

This should mirror the ia64 tree from ftp.rocksclusters.org. You can also use your IA64 DVD: mount it on /mnt/cdrom and do a "rocks-dist copycd" to create the IA64 mirror.

If this works you will then need to use the --genhdlist flag w/ rocks-dist.

For example:

# cd /home/install
# rocks-dist dist                  --- build the x86 distribution
# rocks-dist --arch=ia64 --genhdlist=rocks-dist/.../i386/.../genhdlist

You'll need to use find to determine the path of the genhdlist executable in your x86 distribution. This may still fail (since the RH versions differ), but it does work when the versions are the same for both archs.

-mjk

On Dec 11, 2003, at 2:29 PM, Edward O'Connor wrote:

> Hi everybody,
>
> I'm trying to bring up some ia64 compute nodes in a cluster with an
> ia32 frontend. Normally, `cd /home/install; rocks-dist mirror dist`
> only sets up the frontend to handle ia32 compute nodes. I tried to
> manhandle `rocks-dist mirror` into mirroring the ia64 stuff from
> ftp.rocksclusters.org by giving it the --arch=ia64 option, but that
> didn't work, so I went ahead and did the mirroring step by hand.
>
> After having done so, `rocks-dist dist` still doesn't do the right
> thing. So, adding --arch=ia64 to that command yields this error output:
>
> ,----
> | # rocks-dist --arch=ia64 dist
> | Cleaning distribution
> | Resolving versions (RPMs)
> | Resolving versions (SRPMs)
> | Adding support for rebuild distribution from source
> | Creating files (symbolic links - fast)
> | Creating symlinks to kickstart files
> | Fixing Comps Database
> | error - comps file is missing, skipping this step
> | Generating hdlist (rpm database)
> | error - could not find rpm anaconda-runtime
> | error - could not find genhdlist
> | Patching second stage loader (eKV, partioning, ...)
> | error - could not find second stage, skipping this step
> `----
>
> So my question is, what do I need to do to the ia32 frontend to enable
> it to kickstart an ia64 compute node? Thanks.
>
>
> Ted
>
> --
> Edward O'Connor
> oconnor at ucsd.edu
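Mason's "use find to determine the path of the genhdlist executable" step can be illustrated like this. The block builds a throwaway mock of the distribution layout in a temp directory so nothing real is touched; against a live frontend you would run the same find on /home/install/rocks-dist instead.

```shell
# Mock a slice of the distribution tree, then locate genhdlist with find,
# exactly as you would against the real /home/install/rocks-dist tree.
tmp=$(mktemp -d)
mkdir -p "$tmp/rocks-dist/7.3/en/os/i386/usr/lib/anaconda-runtime"
touch "$tmp/rocks-dist/7.3/en/os/i386/usr/lib/anaconda-runtime/genhdlist"

find "$tmp/rocks-dist" -name genhdlist   # prints the full path to the file

rm -rf "$tmp"
```

The path printed by find is what gets passed to the --genhdlist flag.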

From mjk at sdsc.edu Fri Dec 12 11:12:59 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Fri, 12 Dec 2003 11:12:59 -0800
Subject: [Rocks-Discuss]I can't use xpbs in rocks
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Unfortunately we don't have a fix here. We've moved to SGE (you can now use QMon). We do have a PBS roll but we plan to release 3.1 before the PBS roll is complete.

-mjk

On Dec 10, 2003, at 8:44 PM, zhong wenyu wrote:

> Hi,everyone!
> I have installed rocks 2.3.2 and 3.0.0, xpbs can not be used in both of
> them.
> typed: xpbs [enter]
> showed: xpbs: initialization failed! output: invalid command name
> "Pref_Init"
> thanks!
>
> _________________________________________________________________
> ?????????????? MSN Messenger: http://messenger.msn.com/cn

From fparnold at chem.northwestern.edu Fri Dec 12 06:52:45 2003
From: fparnold at chem.northwestern.edu (Fred P. Arnold)
Date: Fri, 12 Dec 2003 08:52:45 -0600 (CST)
Subject: [Rocks-Discuss]Gig E on HP ZX6000
Message-ID: <Pine.GSO.4.33.0312120850030.4235-100000@mercury.chem.northwestern.edu>

Hello,

I know this is a hardware question, not technically a Rocks one, but I can't find the answer in my HP manuals:

On the ZX6000, there are two ethernet ports, a 10/100 basic/management port, and a 1000 which is designated the primary interface. Unfortunately, rocks always identifies the 10/100 as eth0.

Does anyone know how to disable the 10/100 on a ZX6000? On an IA32, I'd go into the bios, but these don't technically have one. We'd like to run ours on a pure Gig network.

Thanks.

-Fred

Frederick P. Arnold, Jr.
NUIT, Northwestern U.
f-arnold at northwestern.edu

From mjk at sdsc.edu Fri Dec 12 11:16:42 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Fri, 12 Dec 2003 11:16:42 -0800
Subject: [Rocks-Discuss]ScalablePBS.
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

hi Roy,

This should become the basis of the PBS roll (currently openpbs). We are seeking developers who would like to help write and maintain this -- I'm not singling you out Roy, although you would be more than welcome, rather I'm taking advantage of your message to solicit other volunteers. Anyone?

-mjk

On Nov 21, 2003, at 2:52 PM, Roy Dragseth wrote:

> Hi folks.
>
> I've been testing ScalablePBS (SPBS) from supercluster.org for a few
> weeks now and it seems like a fairly good replacement for OpenPBS. Only
> a few minor changes to the OpenPBS infrastructure were needed to
> accomplish the necessary changes in the kickstart generation to make
> the nodes switch to SPBS.
>
> SPBS is based on OpenPBS 2.3.12, but incorporates most provided patches
> (sandia etc) and is actively developed by the same maintainers that
> develop maui. It scales better than OpenPBS, to around 2K nodes, has
> better fault tolerance and communicates better with maui. It has, as
> far as I can see, no user visible changes from OpenPBS.
>
> I know, a lot of people are moving away from pbs and into sge, I was
> thinking about making the switch too. The emergence of SPBS seems to
> make the switch unnecessary and I don't have to teach myself (and the
> users) a new queueing interface...
>
> Configuration tested:
> Rocks 3.0.0
> SPBS 1.0.1p0 (should leave beta phase next month)
> Maui 3.2.6p6 (available for "Early Access Production")
>
> SPBS and Maui can be downloaded from http://www.supercluster.org/
>
> Have a nice weekend,
> r.
>
> --
> The Computer Center, University of Tromsø, N-9037 TROMSØ, Norway.
> phone: +47 77 64 41 07, fax: +47 77 64 41 00
> Roy Dragseth, High Performance Computing System Administrator
> Direct call: +47 77 64 62 56. email: royd at cc.uit.no

From jlkaiser at fnal.gov Fri Dec 12 11:25:58 2003
From: jlkaiser at fnal.gov (Joseph L. Kaiser)
Date: Fri, 12 Dec 2003 13:25:58 -0600
Subject: [Rocks-Discuss](no subject)
Message-ID: <[email protected]>

My install of 3.0.0 is crapping out here:

"/usr/src/build/90289-i386/install//usr/lib/anaconda/comps.py", line 153, in __getitem__
KeyError: PyXML

Even though PyXML is in the distribution I have built. Is there anything that can cause this other than the missing RPM?

Thanks,

Joe

From oconnor at soe.ucsd.edu Fri Dec 12 11:36:04 2003
From: oconnor at soe.ucsd.edu (Edward O'Connor)
Date: Fri, 12 Dec 2003 11:36:04 -0800
Subject: [Rocks-Discuss]ia64 compute nodes with ia32 frontends?
In-Reply-To: <[email protected]> (Mason J. Katz's message of "Fri, 12 Dec 2003 10:54:03 -0800")
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> We haven't done this for a while, and since our 3.0 release using
> different version of Red Hat for x86 and IA64 cross-building
> distribution may not work.

Ahh. After further travails (read below), I'm pretty willing to suspect that this indeed does not work in Rocks 3.0.0. I'm looking forward to those 3.1.0 CDs and DVDs next week! :)

> you can also use your IA64 DVD mount it on /mnt/cdrom and do a
> "rocks-dist copycd" to create the IA64 mirror.

Unfortunately, the ia32 frontend machine doesn't have a DVD drive in it. So I mounted the ia64 ISO image on /mnt/cdrom via a loopback device and that worked fine.


However, `rocks-dist copycd` seemed to have nuked the ia32 stuff under /home/install/ftp.rocksclusters.org/, or, if it didn't entirely nuke it, it made the bare `rocks-dist dist` of your next instructions fail:

> If this works you will the to use the --genhdlist flag w/ rocks-dist.
> For example:
>
> # cd /home/install
> # rocks-dist dist --- build the x86 distribution

As this failed, I went ahead and also ran a `rocks-dist mirror`, which proceeded to download a whole lot of stuff from you guys. After it finished, `rocks-dist dist` completed without error. I double-checked and the ia64 mirror from the `rocks-dist copycd` command still appears to be there.

> # rocks-dist --arch=ia64 --genhdlist=rocks-dist/.../i386/.../genhdlist

Should there be a `dist` at the end of that? The above command (with the substitution of the appropriate genhdlist path) appears to be a no-op. So I appended a `dist` as the idea is for it to create the appropriate symlinks for ia64 as well, and it bombs out too, in the same way as before:

,----
| # rocks-dist --arch=ia64 --genhdlist=rocks-dist/7.3/en/os/i386/usr/lib/anaconda-runtime/genhdlist dist
| Cleaning distribution
| Resolving versions (RPMs)
| Resolving versions (SRPMs)
| Adding support for rebuild distribution from source
| Creating files (symbolic links - fast)
| Creating symlinks to kickstart files
| Fixing Comps Database
| error - comps file is missing, skipping this step
| Generating hdlist (rpm database)
| error creating file /home/install/rocks-dist/desktop/7.3/en/os/ia64/RedHat/base/hdlist: No such file or directory
| Patching second stage loader (eKV, partioning, ...)
| error - could not find second stage, skipping this step
`----

> You'll need to use find to determine the path of the genhdlist
> executable in you x86 distribution. This may still fail (since RH
> version differ), but it does work when the version are the same for
> both archs.

I suppose at this point that it's still failing due to the RH version mismatch, and that getting this to work in 3.0.0 is a lost cause.

Ted

--
Edward O'Connor
oconnor at ucsd.edu


From jared_hodge at iat.utexas.edu Fri Dec 12 12:07:32 2003
From: jared_hodge at iat.utexas.edu (Jared Hodge)
Date: Fri, 12 Dec 2003 14:07:32 -0600
Subject: [Rocks-Discuss]I can't use xpbs in rocks
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

OK, I've got a fix for this one. The problem is that xpbs thinks that it's in the directory /var/tmp/OpenPBS-buildroot/opt/OpenPBS/. Anyway, the path is mangled to get to some of the subroutines. The rocks guys can figure out a way to prevent this in future releases, but here's how you can get it working (and pbsmon while we're at it):

First fix the scripts. /opt/OpenPBS/bin/xpbs needs the following changes:

#set libdir /var/tmp/OpenPBS-buildroot/opt/OpenPBS/lib/xpbs
#set appdefdir /var/tmp/OpenPBS-buildroot/opt/OpenPBS/lib/xpbs
set libdir /opt/OpenPBS/lib/xpbs
set appdefdir /opt/OpenPBS/lib/xpbs

/opt/OpenPBS/bin/xpbsmon needs the same thing, plus the first line needs to be changed.

Now do the following:

cd /opt/OpenPBS/lib/xpbs
rm tclIndex
./buildindex `pwd`
cd /opt/OpenPBS/lib/xpbsmon
rm tclIndex
./buildindex `pwd`

That should fix it all up. I tested this on a 2.3.2 cluster, I assume it's the same on 3.0.
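Jared's two hand-edits can also be scripted with sed. This is a sketch only, demonstrated on a throwaway temp file rather than the real scripts; on a live node you would point the sed at /opt/OpenPBS/bin/xpbs and /opt/OpenPBS/bin/xpbsmon after backing them up.

```shell
# Rewrite the buildroot paths in a mock copy of the xpbs script. On a
# real system the targets would be /opt/OpenPBS/bin/xpbs and xpbsmon.
f=$(mktemp)
cat > "$f" <<'EOF'
set libdir /var/tmp/OpenPBS-buildroot/opt/OpenPBS/lib/xpbs
set appdefdir /var/tmp/OpenPBS-buildroot/opt/OpenPBS/lib/xpbs
EOF

sed -i 's|/var/tmp/OpenPBS-buildroot/opt/OpenPBS|/opt/OpenPBS|g' "$f"

cat "$f"    # both paths now start with /opt/OpenPBS
rm -f "$f"
```

After rewriting the real scripts, the tclIndex rebuild steps above still need to be run by hand.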

--
Jared Hodge
The Institute for Advanced Technology
The University of Texas at Austin
3925 W. Braker Lane, Suite 400
Austin, Texas 78759

Phone: 512-232-4460
Fax: 512-471-9096
Email: jared_hodge at iat.utexas.edu

Mason J. Katz wrote:

> Unfortunately we don't have a fix here. We've moved to SGE (your can
> now use QMon). We do have a PBS roll but we plan to release 3.1
> before the PBS roll is complete.
>
> -mjk


>
> On Dec 10, 2003, at 8:44 PM, zhong wenyu wrote:
>
>> Hi,everyone!
>> I have installed rocks 2.3.2 and 3.0.0, xpbs can not be used in both
>> of them.
>> typed: xpbs [enter]
>> showed: xpbs: initialization failed! output: invalid command name
>> "Pref_Init"
>> thanks!
>>
>> _________________________________________________________________
>> ?????????????? MSN Messenger: http://messenger.msn.com/cn

From jlkaiser at fnal.gov Fri Dec 12 14:39:42 2003
From: jlkaiser at fnal.gov (Joe Kaiser)
Date: Fri, 12 Dec 2003 16:39:42 -0600
Subject: [Rocks-Discuss](no subject)
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Sorry, creating extra links where they don't belong. Nevermind.

On Fri, 2003-12-12 at 13:25, Joseph L. Kaiser wrote:
> My install of 3.0.0 is crapping out here:
>
> "/usr/src/build/90289-i386/install//usr/lib/anaconda/comps.py", line 153, in __getitem__
> KeyError: PyXML
>
> Even though PyXML is in the distribution I have built. Is there
> anything that can cause this other than the missing RPM?
>
> Thanks,
>
> Joe
--
===================================================================
Joe Kaiser - Systems Administrator

Fermi Lab CD/OSS-SCS          Never laugh at live dragons.
630-840-6444
jlkaiser at fnal.gov
===================================================================


From jholland at cs.uh.edu Fri Dec 12 14:52:10 2003
From: jholland at cs.uh.edu (Jason Holland)
Date: Fri, 12 Dec 2003 16:52:10 -0600 (CST)
Subject: [Rocks-Discuss]Gig E on HP ZX6000
In-Reply-To: <Pine.GSO.4.33.0312120850030.4235-100000@mercury.chem.northwestern.edu>
References: <Pine.GSO.4.33.0312120850030.4235-100000@mercury.chem.northwestern.edu>
Message-ID: <[email protected]>

Fred,

Try flipping the modules in /etc/modules.conf. Flip eth0 with eth1 so that the gige interface comes up as eth0. Or, just turn off eth0 altogether with 'alias eth0 off'. I think that's the right syntax.

We have 60 zx6000's and I personally have never found a way to disable the port.
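Jason's suggestion would leave /etc/modules.conf looking something like the fragment below. The driver module names (e1000, eepro100) are assumptions for illustration only; check which modules the zx6000 actually loads before editing.

```
# /etc/modules.conf (illustrative fragment -- module names are guesses)
alias eth0 e1000       # gigabit port, now probed first
alias eth1 eepro100    # 10/100 management port
# or, to disable the 10/100 port entirely:
# alias eth1 off
```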

Jason P Holland
Texas Learning and Computation Center
http://www.tlc2.uh.edu
University of Houston
Philip G Hoffman Hall rm 207A
tel: 713-743-4850

On Fri, 12 Dec 2003, Fred P. Arnold wrote:

> Hello,
>
> I know this is a hardware question, not technically a Rocks one, but I
> can't find the answer in my HP manuals:
>
> On the ZX6000, there are two ethernet ports, a 10/100 basic/management
> port, and a 1000 which is designated the primary interface.
> Unfortunately, rocks always identifies the 10/100 as eth0.
>
> Does anyone know how to disable the 10/100 on a ZX6000? On an IA32, I'd
> go into the bios, but these don't technically have one. We'd like to run
> ours on a pure Gig network.
>
> Thanks.
>
> -Fred
>
> Frederick P. Arnold, Jr.
> NUIT, Northwestern U.
> f-arnold at northwestern.edu

From jian at appro.com Fri Dec 12 17:27:51 2003
From: jian at appro.com (Jian Chang)
Date: Fri, 12 Dec 2003 17:27:51 -0800
Subject: [Rocks-Discuss]RE: Rocks-Discuss] AMD Opteron - Contact Appro
Message-ID: <[email protected]>

Hello Mason / Puru, I got your contact information from Bryan Littlefield.


I would like to discuss with you regarding benchmark test systems you might need down the road. We can also share with you our findings as to what is compatible in the Opteron systems. Please reply with your phone number where I can reach you, and I will call promptly.

Bryan,

Thank you for the referral.

Best regards,

Jian Chang
Regional Sales Manager
(408) 941-8100 x 202
(800) 927-5464 x 202
(408) 941-8111 Fax
jian at appro.com
www.appro.com

-----Original Message-----
From: Bryan Littlefield [mailto:bryan at UCLAlumni.net]
Sent: Tuesday, December 09, 2003 12:14 PM
To: npaci-rocks-discussion at sdsc.edu; mjk at sdsc.edu
Cc: Jian Chang
Subject: Rocks-Discuss] AMD Opteron - Contact Appro

Hi Mason,

I suggest contacting Appro. We are using Rocks on our Opteron cluster and Appro would likely love to help. I will contact them as well to see if they could help get an Opteron machine for testing. Contact info below:

Thanks --Bryan

Jian Chang - Regional Sales Manager
(408) 941-8100 x 202
(800) 927-5464 x 202
(408) 941-8111 Fax
jian at appro.com
http://www.appro.com

npaci-rocks-discussion-request at sdsc.edu wrote:

From: "Mason J. Katz" <mjk at sdsc.edu>
Subject: Re: [Rocks-Discuss]AMD Opteron
Date: Tue, 9 Dec 2003 07:28:51 -0800
To: "purushotham komaravolu" <purikk at hotmail.com>

We have a beta right now that we have sent to a few people. We plan on a release this month, and AMD_64 will be part of this release along with the usual x86, IA64 support. If you want to help accelerate this process please talk to your vendor about loaning/giving us some hardware for testing. Having access to a variety of Opteron hardware (we own two boxes) is the only way we can have good support for this chip.

-mjk

On Dec 8, 2003, at 8:23 PM, purushotham komaravolu wrote:

Cc: <npaci-rocks-discussion at sdsc.edu>

Hello,

I am a newbie to ROCKS cluster. I wanted to set up clusters on 32-bit architectures (Intel and AMD) and 64-bit architectures (Intel and AMD). I found the 64-bit download for Intel on the website but not for AMD. Does it work for AMD Opteron? If not, what is the ETA for AMD-64? We are planning to buy AMD-64 bit machines shortly, and I would like to volunteer for the beta testing if needed.

Thanks,
Regards,
Puru

_______________________________________________
npaci-rocks-discussion mailing list
npaci-rocks-discussion at sdsc.edu
http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion

End of npaci-rocks-discussion Digest

-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031212/dec7e41b/attachment-0001.html

From landman at scalableinformatics.com Sat Dec 13 07:50:02 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Sat, 13 Dec 2003 10:50:02 -0500
Subject: [Rocks-Discuss]Trying to integrate a new kernel into 3.0.0
Message-ID: <[email protected]>

Folks:

Finally built the 2.4.23 kernel into an RPM via the RedHat tools. Had to hack up the spec file a bit, but you can see the results at

http://scalableinformatics.com/downloads/kernels/2.4.23/

These are 2.4.23 with the 2.4.24-pre1 patch (e.g. xfs is in there, woo hoo!). I had to strip out most of the previous patches as they were incompatible with .23 (and I don't want to spend time forward porting them). The spec file, the sources, etc are released under the normal licenses (GPL). No warranties, use at your own risk, and these are NOT official Redhat kernels. Don't ask them for support for these, they won't do it, and they will look at you funny.

That said, I had also checked out the cvs tree to start the "Carlson" process :) indicated in the list a few months ago (see https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-October/003533.html) to build a more customized distribution. I got to the

Build the boot RPM

cd rocks/src/rocks/boot
make rpm

point, and lo and behold this is what I see ...

rm version.mk
rm arch
rm -f /local/rocks/src/rocks/boot/.rpmmacros
rm -f /usr/src/redhat/SOURCES/rocks-boot-3.1.0.tar
rm -f /usr/src/redhat/SOURCES/rocks-boot-3.1.0.tar.gz
...

Ok... I wanted to rebuild 3.0.0, as I cannot wait for 3.1.0 (my customer has a strong sense of urgency and little time to wait for an operational cluster). I checked out the system from CVS earlier this week.

Is there any way to switch the build back to 3.0.0? Or am I really out of luck at this moment??? Clues/hints welcome.

These kernels might work, though I don't have a method to try them in the distro yet. They work on the build machine.

[root at head root]# uname -a
Linux head.public 2.4.23-1 #1 SMP Sat Dec 13 14:41:06 GMT 2003 i686 unknown

[root at head root]# rpm -qa | grep -i kernel
kernel-2.4.23-1
kernel-BOOT-2.4.23-1
rocks-kernel-3.0.0-0
pvfs-kernel-1.6.0-1
kernel-doc-2.4.23-1
kernel-source-2.4.23-1
kernel-smp-2.4.23-1

The spec file is in the above download section, along with a .src.rpm and other stuff. If anyone does have a clue as to how to build with 3.0.0 given the current cvs, or if there is a tagged set I needed to get, please let me know.

Joe

--
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
web: http://scalableinformatics.com
phone: +1 734 612 4615


From tim.carlson at pnl.gov Sat Dec 13 08:31:03 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Sat, 13 Dec 2003 08:31:03 -0800 (PST)
Subject: [Rocks-Discuss]Trying to integrate a new kernel into 3.0.0
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Sat, 13 Dec 2003, Joe Landman wrote:

> That said, I had also checked out the cvs tree to start the "Carlson"
> process :) indicated in the list a few months ago (see

yikes.. ! :)

>
> Ok... I wanted to rebuild 3.0.0, as I cannot wait for 3.1.0 (my customer
> has a strong sense of urgency and little time to wait for an operational
> cluster). I checked out the system from CVS earlier this week.

You needed to check out the 3.0.0 tagged version

ROCKS_3_0_0_i386
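Tim's answer amounts to a one-line checkout against that tag. The CVSROOT below is an assumption (substitute the real anonymous Rocks CVS root), so the block only echoes the command rather than running it.

```shell
# Sketch: check out the 3.0.0-tagged tree instead of HEAD. The CVSROOT
# value is hypothetical -- use the real Rocks anonymous CVS root.
CVSROOT=":pserver:anonymous@cvs.rocksclusters.org:/home/cvs"
TAG=ROCKS_3_0_0_i386
echo cvs -d "$CVSROOT" checkout -r "$TAG" rocks
```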

Off thread, but it would seem to me that the numbering scheme for ROCKS got out of whack somewhere. Shouldn't 3.0.0 have been 2.3.3 and the new 3.1 been 3.0? The reasoning being that the current 3.0.0 is still RH 7.3 based and the new 3.1 will be RH 3.0 based. Not that it matters. Just curious.

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From phil at sdsc.edu Sat Dec 13 08:51:29 2003
From: phil at sdsc.edu (Philip Papadopoulos)
Date: Sat, 13 Dec 2003 08:51:29 -0800
Subject: [Rocks-Discuss]Trying to integrate a new kernel into 3.0.0
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Tim Carlson wrote:

>On Sat, 13 Dec 2003, Joe Landman wrote:
>
>>That said, I had also checked out the cvs tree to start the "Carlson"


>>process :) indicated in the list a few months ago (see
>>
>>yikes.. ! :)
>>
>>Ok... I wanted to rebuild 3.0.0, as I cannot wait for 3.1.0 (my customer
>>has a strong sense of urgency and little time to wait for an operational
>>cluster). I checked out the system from CVS earlier this week.
>>
>>You needed to check out the 3.0.0 tagged version
>>ROCKS_3_0_0_i386
>
this is correct.

>Off thread, but it would seem to me that the numbering scheme for ROCKS
>got out of whack somewhere. Shouldn't 3.0.0 have been 2.3.3 and the new
>3.1 been 3.0? The reasoning being that the current 3.0.0 is still RH 7.3
>based and the new 3.1 will be RH 3.0 based. Not that it matters. Just
>curious.

I blame Bruno ... We moved to 3.0 because rolls is very different from the way 2.3.2 was put together -- this wasn't a minor change and so a subminor revision number didn't make sense.

3.0 --> 3.1: change from 7.3 to recompiled RHEL, change from PBS as default to SGE as default. .... OK, you could argue that this is also a major change and shouldn't have a minor version #. We didn't want to go from 3.0 to 4.0 for some non-definable reasons :-), but mostly it's that 3.0 and 3.1 feel pretty similar in terms of the way they are put together (with rolls).

-P

>Tim
>
>Tim Carlson
>Voice: (509) 376 3423
>Email: Tim.Carlson at pnl.gov
>EMSL UNIX System Support
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031213/69aa41fa/attachment-0001.html

From landman at scalableinformatics.com Sat Dec 13 11:14:51 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Sat, 13 Dec 2003 14:14:51 -0500


Subject: [Rocks-Discuss]Trying to integrate a new kernel into 3.0.0
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Thanks. Magic incantations, and I have the "Carlson" process implemented. Ok, next step is the roll-my-own ... more later

On Sat, 2003-12-13 at 11:31, Tim Carlson wrote:
> On Sat, 13 Dec 2003, Joe Landman wrote:
>
> > That said, I had also checked out the cvs tree to start the "Carlson"
> > process :) indicated in the list a few months ago (see
>
> yikes.. ! :)
>
> >
> > Ok... I wanted to rebuild 3.0.0, as I cannot wait for 3.1.0 (my customer
> > has a strong sense of urgency and little time to wait for an operational
> > cluster). I checked out the system from CVS earlier this week.
>
> You needed to check out the 3.0.0 tagged version
>
> ROCKS_3_0_0_i386
>
> Off thread, but it would seem to me that the numbering scheme for ROCKS
> got out of whack somewhere. Shouldn't 3.0.0 have been 2.3.3 and the new
> 3.1 been 3.0? The reasoning being that the current 3.0.0 is still RH 7.3
> based and the new 3.1 will be RH 3.0 based. Not that it matters. Just
> curious.
>
> Tim
>
> Tim Carlson
> Voice: (509) 376 3423
> Email: Tim.Carlson at pnl.gov
> EMSL UNIX System Support

From wyzhong78 at msn.com Mon Dec 15 00:02:15 2003
From: wyzhong78 at msn.com (zhong wenyu)
Date: Mon, 15 Dec 2003 16:02:15 +0800
Subject: [Rocks-Discuss]about add-extra-nic
Message-ID: <[email protected]>

Hi, everyone! My compute node's motherboard is an MSI 9141, which has one 1000M NIC and one 100M NIC. I plan to use the 100M network for control and the 1000M network for applications, so I use a 100M switch to connect the compute nodes to the frontend, and a 1000M switch to connect the compute nodes to each other (not including the frontend). The first time I installed a compute node, it sat at "waiting for dhcp ip information" too long and I could not finish the install. I figured the 1000M NIC must be responsible, so I disabled it in the BIOS. After that the install worked and the compute nodes appeared. Then I wanted to add the extra NIC: I used the add-extra-nic command and shoot-node, the compute node rebooted (during the reboot I enabled the NIC), and it got stuck at "waiting for dhcp ip information" again.


So I disabled it again and restarted; the node reinstalled fine and finished with no trouble. I can even see the boot message "start eth1....[ok]"! But "ifconfig eth1" only gives an error, even after I enable the 1000M NIC again. Thanks and regards!


From Roy.Dragseth at cc.uit.no Mon Dec 15 02:31:51 2003
From: Roy.Dragseth at cc.uit.no (Roy Dragseth)
Date: Mon, 15 Dec 2003 11:31:51 +0100
Subject: [Rocks-Discuss]ia64 compute nodes with ia32 frontends?
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Hi.

I've been running a setup like this for over a year now; it will not (ever?) work right out of the box due to some kernel problems.

rocks-dist --arch ia64 dist

will most likely crash an ia32 frontend. The ia32 kernel doesn't like to mount a cramfs image generated on an ia64 machine; it gives me a kernel panic.

Here is a rough guide to get this kind of setup going.

1. Setup the ia32 as usual, but allow root write access to /export by inserting "no_root_squash" as an option in /etc/exports.

2. create a "fake" ia64 frontend using one of the ia64 nodes, let it configure eth0 by dhcp and let the ia32 frontend think it is a compute node.

3. on the fake frontend you turn off the nis daemons except ypbind.

4. edit /etc/auto.home to mount /home from the ia32 frontend and restart autofs.

5. on the fake frontend you do a rocks-dist copycd to dump the ia64 dvd into /home/install.

6. Now you can do a rocks-dist dist on the ia64 box.

7. At last you need a symlink to make the ia32 frontend happy:
ln -s enterprise/2.1AW/en/os/ia64 rocks-dist/7.3/en/os/ia64

Now you can boot up your ia64 nodes from the ia32 frontend. After you are confident that your ia64 nodes are installed correctly you can reinstall the ia64 frontend as a regular compute node. Subsequent rocks-dist dist runs can be done on any ia64 compute node as long as it has the anaconda-runtime and rocks-dist rpms installed.
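For reference, the distribution-building part of Roy's steps boils down to a short command transcript (a sketch only; it assumes the fake ia64 frontend already mounts /home from the ia32 frontend, and the exportfs step runs on the ia32 frontend after the no_root_squash edit from step 1):

```shell
# On the ia32 frontend: re-export /export after adding no_root_squash (step 1)
exportfs -ra

# On the fake ia64 frontend (steps 5 and 6):
cd /home/install
rocks-dist copycd     # dump the ia64 DVD into /home/install
rocks-dist dist       # build the ia64 distribution tree

# Step 7: symlink so the ia32 frontend finds the ia64 tree
ln -s enterprise/2.1AW/en/os/ia64 rocks-dist/7.3/en/os/ia64
```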

Hope this helps,


r.

--

The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
phone: +47 77 64 41 07, fax: +47 77 64 41 00

Roy Dragseth, High Performance Computing System Administrator Direct call: +47 77 64 62 56. email: royd at cc.uit.no

From Roy.Dragseth at cc.uit.no Mon Dec 15 04:28:15 2003
From: Roy.Dragseth at cc.uit.no (Roy Dragseth)
Date: Mon, 15 Dec 2003 13:28:15 +0100
Subject: [Rocks-Discuss]Gig E on HP ZX6000
In-Reply-To: <[email protected]>
References: <Pine.GSO.4.33.0312120850030.4235-100000@mercury.chem.northwestern.edu> <[email protected]>
Message-ID: <[email protected]>

I had similar problems on our HP rx2600 boxes and found a way to make the kernel ignore the 100Mb/s NIC by adding this append line in elilo.conf:

append="reserve=0xd00,64"

See my post https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-October/003483.html

for details on how to figure out this parameter.

Remark: this has to be modified both in elilo.conf and elilo-ks.conf in /boot/efi/efi/redhat/. The problem is that cluster-kickstart overwrites these files at every reboot, and the setup is hardcoded into the cluster-kickstart executable, so you need to figure out a way to work around this. I grabbed cluster-kickstart.c from cvs, made the necessary mods and installed the new one on every compute node.
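For illustration, the resulting stanza in /boot/efi/efi/redhat/elilo.conf would look something like this (the image, initrd, and root values are made-up placeholders; only the append line comes from Roy's post, and the same line goes into elilo-ks.conf):

```
image=vmlinuz-2.4.18-e.12smp
        label=linux
        initrd=initrd-2.4.18-e.12smp.img
        read-only
        root=/dev/sda2
        append="reserve=0xd00,64"
```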

r.

--

The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
phone: +47 77 64 41 07, fax: +47 77 64 41 00

Roy Dragseth, High Performance Computing System Administrator Direct call: +47 77 64 62 56. email: royd at cc.uit.no

From fds at sdsc.edu Mon Dec 15 11:31:01 2003
From: fds at sdsc.edu (Federico Sacerdoti)
Date: Mon, 15 Dec 2003 11:31:01 -0800
Subject: [Rocks-Discuss]Trying to integrate a new kernel into 3.0.0
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

We did indeed change our versioning scheme. We used to be "Redhat minus 5," so a RH 7.3-based Rocks was called 2.3.x. This became moot when Redhat


quickly went from 8 to 9 to Enterprise 3. So we decided to be selfish and move to 3.0.0 when we made a big internal change (Rolls and the end of monolithic Rocks).

3.1.0 is a minor number revision, which corresponds to how much has changed in the Rocks code, not the underlying Redhat system. A bugfix release would be 3.1.1, etc...

We hope this versioning scheme will be more resilient to linux system changes (which are out of our control), while keeping the focus on the Rocks structure.

On Dec 13, 2003, at 8:31 AM, Tim Carlson wrote:

> Off thread, but it would seem to me that the numbering scheme for ROCKS
> got out of whack somewhere. Shouldn't 3.0.0 have been 2.3.3 and the new
> 3.1 been 3.0? The reasoning being that the current 3.0.0 is still RH 7.3
> based and the new 3.1 will be RH 3.0 based. Not that it matters. Just
> curious.

Federico

Rocks Cluster Group, San Diego Supercomputing Center, CA

From jlkaiser at fnal.gov Mon Dec 15 11:43:43 2003
From: jlkaiser at fnal.gov (Joseph L. Kaiser)
Date: Mon, 15 Dec 2003 13:43:43 -0600
Subject: [Rocks-Discuss]problem forcing a kernel
Message-ID: <[email protected]>

Hi,

I am trying to install this kernel:

kernel-smp-2.4.20-20.XFS1.3.1.i686.rpm and keep getting the following whether I put it in the force directory of my distro or the regular RPMS directory or contrib:

During package installation it gives me this:

/mnt/sysimage/var/tmpkernel-smp-2.4.20-20.9.XFS1.3.1.i686.rpm cannot be opened. This is due to a missing file, a bad package, or bad media. Press <return> to try again.

The file is there. The media is the network. I have installed the package on other systems by hand. Any ideas?

Thanks,

Joe


From tmartin at physics.ucsd.edu Mon Dec 15 15:58:51 2003
From: tmartin at physics.ucsd.edu (Terrence Martin)
Date: Mon, 15 Dec 2003 15:58:51 -0800
Subject: [Rocks-Discuss]removing a node from the cluster
Message-ID: <[email protected]>

How does one go about removing a node from the cluster? Is there a straightforward way to do this?

Terrence

From ebpeele2 at pams.ncsu.edu Mon Dec 15 16:42:47 2003
From: ebpeele2 at pams.ncsu.edu (Elliot Peele)
Date: Mon, 15 Dec 2003 19:42:47 -0500
Subject: [Rocks-Discuss]removing a node from the cluster
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

insert-ethers --replace hostname

Select compute from the menu then exit insert-ethers.

Elliot

On Mon, 2003-12-15 at 18:58, Terrence Martin wrote:
> How does one go about removing a node from the cluster? Is there a
> straight forward way to do this?
>
> Terrence
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031215/ebf9581b/attachment-0001.bin

From phil at sdsc.edu Mon Dec 15 16:44:29 2003
From: phil at sdsc.edu (Philip Papadopoulos)
Date: Mon, 15 Dec 2003 16:44:29 -0800
Subject: [Rocks-Discuss]removing a node from the cluster
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

insert-ethers --replace "compute-0-0"
select "compute" from the menu and then hit f1 to exit.

This will re-create all of the files that have host names and remove the node (you are essentially replacing the node named "compute-0-0" with the empty set).

PBS will likely be unhappy with this change -- If I remember correctly, it has an


additional file that it creates when a node is added to the queuing system -- when the node doesn't appear in the host table, it gets cranky. You should look in /opt/OpenPBS/server_priv/nodes to solve this problem -- suppose you want to delete compute-0-0.

# qmgr -c "delete node compute-0-0"
# insert-ethers --replace "compute-0-0"

-P

Terrence Martin wrote:

> How does one go about removing a node from the cluster? Is there a
> straight forward way to do this?
>
> Terrence

--
== Philip Papadopoulos, Ph.D.        San Diego Supercomputer Center
== Program Director for              9500 Gilman Drive
==   Grid and Cluster Computing      University of California, San Diego
== Ph: (858) 822-3628                La Jolla, CA 92093-0505
== FAX: (858) 822-5407

From gotero at linuxprophet.com Mon Dec 15 16:52:23 2003
From: gotero at linuxprophet.com (Glen Otero)
Date: Mon, 15 Dec 2003 16:52:23 -0800
Subject: [Rocks-Discuss]removing a node from the cluster
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Dec 15, 2003, at 4:42 PM, Elliot Peele wrote:

> insert-ethers --replace hostname
>
> Select compute from the menu then exit insert-ethers.

Then run:

# insert-ethers --update

to update the database

Check the database entries with:


# dbreport hosts

Glen

>
> Elliot
>
> On Mon, 2003-12-15 at 18:58, Terrence Martin wrote:
>> How does one go about removing a node from the cluster? Is there a
>> straight forward way to do this?
>>
>> Terrence
>
Glen Otero, Ph.D.
Linux Prophet
619.917.1772
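Taken together, Elliot's and Glen's replies give this removal sequence (node name illustrative; the first command is interactive):

```shell
insert-ethers --replace compute-0-0   # interactive: select "compute", then exit
insert-ethers --update                # regenerate the database-driven config files
dbreport hosts                        # confirm the node is gone from the host table
```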

From landman at scalableinformatics.com Mon Dec 15 17:13:29 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 15 Dec 2003 20:13:29 -0500
Subject: [Rocks-Discuss]removing a node from the cluster
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Harumph:

rmnode nasty_compute_node
insert-ethers --update

(rmnode at http://scalableinformatics.com/downloads/rmnode.gz).

I thought insert-ethers had a simple version of this. All rmnode is, is a hacked version of one of the other rocks tools.

Joe

Glen Otero wrote:

>
> On Dec 15, 2003, at 4:42 PM, Elliot Peele wrote:
>
>> insert-ethers --replace hostname
>>
>> Select compute from the menu then exit insert-ethers.
>
> Then run:
>
> # insert-ethers --update
>
> to update the database
>


> Check the database entries with:
>
> # dbreport hosts
>
> Glen
>
>>
>> Elliot
>>
>> On Mon, 2003-12-15 at 18:58, Terrence Martin wrote:
>>
>>> How does one go about removing a node from the cluster? Is there a
>>> straight forward way to do this?
>>>
>>> Terrence
>>
> Glen Otero, Ph.D.
> Linux Prophet
> 619.917.1772

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

From csamuel at vpac.org Mon Dec 15 18:06:47 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Tue, 16 Dec 2003 13:06:47 +1100
Subject: [Rocks-Discuss]ScalablePBS.
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, 13 Dec 2003 06:16 am, Mason J. Katz wrote:

> This should become the basis of the PBS roll (currently openpbs). We> are seeking developers who would like to help write and maintain this> -- I'm not singling you out Roy, although you would be more than> welcome, rather I'm taking advantage of your message to solicit other> volunteers. Anyone?

I think we might be interested in getting involved with this, we migrated from OpenPBS to ScalablePBS some time ago and spent quite a bit of time tracking down memory leaks and the like with DJ and friends at SuperCluster.

We've also started using Rocks on a cluster that we manage for one of our member institutions and a colleague of mine is having fun trying to get it to go onto an Itanium cluster at the moment plus we should have some Opteron boxes arriving in a month or so for a mini-cluster which we'd like to run


Rocks on.

Currently we install Rocks on the cluster and then remove PBS and MAUI RPM's and install SPBS and the 3.2.6 version of MAUI we have access to, so a version that came with SPBS ready to go would make life a lot simpler for us. :-)

cheers!
Chris
- --
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/3mi3O2KABBYQAh8RAuSLAJ9Bx/5aCF8kRjHFapUpiASQUJeCTwCcD9y7
Y/ZM38t0J8r5dAYj1MdiUWA=
=bCIS
-----END PGP SIGNATURE-----

From bruno at rocksclusters.org Mon Dec 15 18:30:03 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 15 Dec 2003 18:30:03 -0800
Subject: [Rocks-Discuss]removing a node from the cluster
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> Harumph:
>
> rmnode nasty_compute_node
> insert-ethers --update
>
> (rmnode at http://scalableinformatics.com/downloads/rmnode.gz).
>
> I thought insert-ethers had a simple version of this. All rmnode is,
> is a hacked version of one of the other rocks tools.

actually, since v3.0.0, i think it does:

http://www.rocksclusters.org/rocks-documentation/3.0.0/faq-configuration.html#REMOVE-NODE

- gb

From bruno at rocksclusters.org Mon Dec 15 19:40:49 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 15 Dec 2003 19:40:49 -0800
Subject: [Rocks-Discuss]problem forcing a kernel
In-Reply-To: <[email protected]>


References: <[email protected]>Message-ID: <[email protected]>

> I am trying to install this kernel:
>
> kernel-smp-2.4.20-20.XFS1.3.1.i686.rpm and keep getting the following
> whether I put it in the force directory of my distro or the regular
> RPMS
> directory or contrib:
>
> During package installation it gives me this:
>
>
> /mnt/sysimage/var/tmpkernel-smp-2.4.20-20.9.XFS1.3.1.i686.rpm cannot be
> opened. This is due to a missing file, a bad package, or bad media.
> Press <return> to try again.
>
>
> The file is there. The media is the network. I have installed the
> package on other systems by hand. Any ideas?

just to be sure, do you run the following after you copy the RPM into the force directory:

# cd /home/install
# rocks-dist dist
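After that rebuild, a quick way to confirm the forced package actually landed in the distribution tree is to search for it (a sketch; the path follows the default /home/install layout and the pattern matches the kernel from this thread):

```shell
find /home/install/rocks-dist -name 'kernel-smp-2.4.20-20*XFS*' -print
```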

- gb

From bruno at rocksclusters.org Mon Dec 15 19:56:51 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 15 Dec 2003 19:56:51 -0800
Subject: [Rocks-Discuss]Adding partitions that are not reformatted under hard boots or shoot-node
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

sorry for the late response.

i recently tested the manual partitioning procedure on our upcoming release and there was a bug. a fix has been committed for the next release -- so manual partitioning will work on 3.1.0 as explained in the 3.0.0 documentation.

- gb

On Dec 9, 2003, at 6:55 PM, Jorge L. Rodriguez wrote:

> Hi,
>
> How do I add an extra partition to my compute nodes and retain the
> data on all non / partitions when system hard boots or is shot?
> I tried the suggestion in the documentation under "Customizing your
> ROCKS Installation" where you replace the auto-partition.xml but hard
> boots or shoot-nodes on these reformat all partitions instead of just


> the /. I have also tried to modify the installclass.xml so that an
> extra partition is added into the python code see below. This does
> mostly what I want but now I can't shoot-node even though a hard boot
> reinstalls without reformatting all but /. Is this the right approach?
> I'd rather avoid having to replace installclass since I don't really
> want to partition all nodes this way but if I must I will.
>
> Jorge
>
>         #
>         # set up the root partition
>         #
>         args = [ "/" , "--size" , "4096",
>                  "--fstype", "&fstype;",
>                  "--ondisk", devnames[0] ]
>         KickstartBase.definePartition(self, id, args)
>
>         # ---- Jorge, I added this
>         args = [ "/state/partition1" , "--size" , "55000",
>                  "--fstype", "&fstype;",
>                  "--ondisk", devnames[0] ]
>         KickstartBase.definePartition(self, id, args)
>         # -----
>
>         args = [ "swap" , "--size" , "1000",
>                  "--ondisk", devnames[0] ]
>         KickstartBase.definePartition(self, id, args)
>
>         #
>         # greedy partitioning
>         #
>         # ----- Jorge, I change this from i = 1
>         i = 2
>         # -----
>         for devname in devnames:
>                 partname = "/state/partition%d" % (i)
>                 args = [ partname, "--size", "1",
>                          "--fstype", "&fstype;",
>                          "--grow", "--ondisk", devname ]
>                 KickstartBase.definePartition(self, id, args)
>
>                 i = i + 1

From jlkaiser at fnal.gov Mon Dec 15 20:17:52 2003
From: jlkaiser at fnal.gov (Joseph L. Kaiser)
Date: Mon, 15 Dec 2003 22:17:52 -0600
Subject: [Rocks-Discuss]problem forcing a kernel
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

yup


On Mon, 2003-12-15 at 21:40, Greg Bruno wrote:
> > I am trying to install this kernel:
> >
> > kernel-smp-2.4.20-20.XFS1.3.1.i686.rpm and keep getting the following
> > whether I put it in the force directory of my distro or the regular
> > RPMS
> > directory or contrib:
> >
> > During package installation it gives me this:
> >
> >
> > /mnt/sysimage/var/tmpkernel-smp-2.4.20-20.9.XFS1.3.1.i686.rpm cannot be
> > opened. This is due to a missing file, a bad package, or bad media.
> > Press <return> to try again.
> >
> >
> > The file is there. The media is the network. I have installed the
> > package on other systems by hand. Any ideas?
>
> just to be sure, do you run the following after you copy the RPM into
> the force directory:
>
> # cd /home/install
> # rocks-dist dist
>
> - gb
>

From Roy.Dragseth at cc.uit.no Tue Dec 16 02:13:50 2003
From: Roy.Dragseth at cc.uit.no (Roy Dragseth)
Date: Tue, 16 Dec 2003 11:13:50 +0100
Subject: [Rocks-Discuss]ScalablePBS.
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Friday 12 December 2003 20:16, Mason J. Katz wrote:
> This should become the basis of the PBS roll (currently openpbs). We
> are seeking developers who would like to help write and maintain this
> -- I'm not singling you out Roy, although you would be more than
> welcome, rather I'm taking advantage of your message to solicit other
> volunteers. Anyone?
>

I talked to my boss and he gave me thumbs up, so I'll be glad to take care of the Maui/PBS roll of rocks.

I'd love to see some more hands in the air as maintainers/testers...

r.

--

The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.


phone: +47 77 64 41 07, fax: +47 77 64 41 00
Roy Dragseth, High Performance Computing System Administrator

Direct call: +47 77 64 62 56. email: royd at cc.uit.no

From daniel.kidger at quadrics.com Tue Dec 16 07:08:44 2003
From: daniel.kidger at quadrics.com (Dan Kidger)
Date: Tue, 16 Dec 2003 15:08:44 +0000
Subject: [Rocks-Discuss]custom-kernels : naming conventions ? (Rocks 3.0.0)
In-Reply-To: <20031209180224.24711.h014.c001.wm@mail.linuxprophet.com.criticalpath.net>
References: <20031209180224.24711.h014.c001.wm@mail.linuxprophet.com.criticalpath.net>
Message-ID: <[email protected]>

Glen et al.

>I recently had the same problem when building a quadrics cluster on Rocks 2.3.2
>with the qsnet-RedHat-kernel-2.4.18-27.3.4qsnet.i686.rpms. The problem is
>definitely in the naming of the rpms, in that anaconda running on the compute
>nodes is not going to recognize kernel rpms that begin with 'qsnet' as potential
>boot options. Unfortunately, being under a severe time constraint, I resorted to
>manually installing the qsnet kernel on all nodes of the cluster, which isn't
>the Rocks way. The long term solution is to mangle the kernel makefiles so that
>the qsnet kernel rpms have conventional kernel rpm names, which is what Greg's
>post referred to.

I have been thinking about this.

I reckon that the long term solution is *not* to rename the kernel that we use (nor indeed to change the naming convention of any other kernels that people want to work on). As well as the triplet version numbering and the architecture, the kernel naming that we use includes the kernel source tree (Redhat, Suse, LSY, Vanilla, ..) and our patch level version numbering triplet. Quadrics cannot be the only people who need freedom to include extra information in our naming convention for kernels. The solution must lie either in anaconda itself or, more likely, in a cleaner way to include extra kernel(s) as well as the stock one in the compute node install process. Using extend-nodes.xml this works, apart from niggles about the /boot/grub/menu.lst that our kernel post-install configures getting clobbered by Rocks.

Yours,Daniel.

gotero at linuxprophet.com wrote:

>Daniel-
>

--
Yours,
Daniel.


--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.      daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK          0117 915 5505
----------------------- www.quadrics.com --------------------

From mjk at sdsc.edu Tue Dec 16 07:09:56 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 16 Dec 2003 07:09:56 -0800
Subject: [Rocks-Discuss]ScalablePBS.
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Fantastic! I think this puts us at three people who have volunteered to help out on this. I will follow up on this and help organize, support, and do some of the development also. But I'm going to push this back until after we get 3.1 out, which looks like Monday.

-mjk

On Dec 16, 2003, at 2:13 AM, Roy Dragseth wrote:

> On Friday 12 December 2003 20:16, Mason J. Katz wrote:
>> This should become the basis of the PBS roll (currently openpbs). We
>> are seeking developers who would like to help write and maintain this
>> -- I'm not singling you out Roy, although you would be more than
>> welcome, rather I'm taking advantage of your message to solicit other
>> volunteers. Anyone?
>>
>
> I talked to my boss and he gave me thumbs up, so I'll be glad to take
> care of
> the Maui/PBS roll of rocks.
>
> I'd love to see some more hands in the air as maintainers/testers...
>
> r.
>
>
> --
>
> The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
> phone:+47 77 64 41 07, fax:+47 77 64 41 00
> Roy Dragseth, High Performance Computing System Administrator
> Direct call: +47 77 64 62 56. email: royd at cc.uit.no

From mjk at sdsc.edu Tue Dec 16 07:37:04 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 16 Dec 2003 07:37:04 -0800
Subject: [Rocks-Discuss]custom-kernels : naming conventions ? (Rocks 3.0.0)
In-Reply-To: <[email protected]>
References:


<20031209180224.24711.h014.c001.wm@mail.linuxprophet.com.criticalpath.net> <[email protected]>
Message-ID: <[email protected]>

If you rename the linux kernel to include other arbitrary strings, the RedHat Kickstart installer will not recognize it as a kernel. This means you lose probing for the correct x86 cpu (386/486/586/686) and probing for SMP vs. uni. This implies you would need to re-write the anaconda code to do this for arbitrarily named packages; if you could convince RedHat to do this, great, but it's not worth our development time to do this ourselves when properly named kernel packages work wonderfully. The unfortunate reality is the kernel RPM is not just another package -- it has some special installation logic to optimize for your hardware. Sure, they could have done this better, but they do a darn good job as is.

This is not a Rocks issue, it means you have created a package that does not work with RedHat. I understand why you need to include extra strings in the kernel name, but suggest that there are several alternatives to this that don't break RedHat kickstart. For example, you could:

- Write a kernel version module to report on /proc/qsnet_kernel the same information.

- Have the kernel RPM install a /usr/doc/qsnet/VERSION file

- Have a subpackage of the kernel rpm that includes the extra strings (and extra docs).

- Stop patching the kernel and only use a module. True some things require kernel patches, but almost all driver changes can go into modules only. This was not always true a few years ago, the module system has improved a lot.

We've faced numerous issues like this with RedHat in creating Rocks, and for every issue we have found a work around that keeps us w/in the RedHat way of doing things. This is not always optimal for development but always yields a simpler, and more supportable, system.
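The naming constraint Mason describes can be illustrated with a toy shell function (a sketch that mimics, not reproduces, anaconda's matching: it strips the arch, release, and version fields to recover the RPM package name the installer keys on; the filenames are the ones from this thread):

```shell
# Recover the package name from an RPM filename: drop ".rpm", then the
# architecture suffix, then the trailing version-release pair.
rpm_base_name() {
    echo "$1" | sed -e 's/\.rpm$//' -e 's/\.[^.]*$//' -e 's/-[^-]*-[^-]*$//'
}

# A name of "kernel" or "kernel-smp" is what the installer treats as a kernel;
# "qsnet-RedHat-kernel" is not, even though the package contains a kernel.
rpm_base_name "kernel-smp-2.4.20-20.XFS1.3.1.i686.rpm"
rpm_base_name "qsnet-RedHat-kernel-2.4.18-27.3.4qsnet.i686.rpm"
```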

-mjk

On Dec 16, 2003, at 7:08 AM, Dan Kidger wrote:

> Glen et al.
>
>> I recently had the same problem when building a quadrics cluster on
>> Rocks 2.3.2
>> with the qsnet-RedHat-kernel-2.4.18-27.3.4qsnet.i686.rpms. The
>> problem is
>> definitely in the naming of the rpms, in that anaconda running on the
>> compute
>> nodes is not going to recognize kernel rpms that begin with 'qsnet'
>> as potential
>> boot options. Unfortunately, being under a severe time contraint, I
>> resorted to
>> manually installing the qsnet kernel on all nodes of the cluster,
>> which isn't


>> the Rocks way. The long term solution is to mangle the kernel
>> makefiles so that
>> the qsnet kernel rpms have conventional kernel rpm names, which is
>> what Greg's
>> post referred to.
>
> I have been thinking about this.
>
> I reckon that the long term solution is *not* to rename the kernel
> that we use. (nor indeed to change the naming convention of any other
> kernels that people want to work on). As well as the triplet version
> numbering and the architecture, the kernel naming that we use includes
> the kernel source tree (Redhat, Suse, LSY, Vanilia, ..) and our partch
> level version numering triplet.
> Quadrics cannot be the only people who need freedom to include extra
> information in our naming convention for kernels.
> The solution must lie in either annaconda itself or more likely a
> cleaner way to include extra kernel(s) as well as the stock one in the
> compute node install process. Using extend-nodes.xml this works apart
> from niggles about the /boot/grub/menu.lst that our kernel
> post-instal;l configures getting clobbered by Rocks.
>
> Yours,
> Daniel.
>
>
> gotero at linuxprophet.com wrote:
>
>> Daniel-
>>
>
> --
> Yours,
> Daniel.
>
> --------------------------------------------------------------
> Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com
> One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505
> ----------------------- www.quadrics.com --------------------
>

From dtwright at uiuc.edu Tue Dec 16 11:45:55 2003
From: dtwright at uiuc.edu (Dan Wright)
Date: Tue, 16 Dec 2003 13:45:55 -0600
Subject: [Rocks-Discuss]a minor ganglia question
Message-ID: <[email protected]>

Hello all,

I'm in the process of setting up a 3.0.0 cluster and have a question about the "Physical view" in ganglia. In this view (which is quite cool BTW :) it shows higher-numbered nodes on top and lower-numbered nodes on bottom:

compute-0-12
...
compute-0-2


compute-0-1
compute-0-0

and my cluster is physically reversed from that:

compute-0-0
compute-0-1
compute-0-2
...
compute-0-12

Is there an easy way to switch this display around so it matches the real physical layout? I poked around in ganglia for a few minutes and didn't see anything obvious, so I thought I'd ask before I actually start wasting time on this :)

Thanks,

- Dan Wright
(dtwright at uiuc.edu)
(http://www.scs.uiuc.edu/)
(UNIX Systems Administrator, School of Chemical Sciences, UIUC)
(333-1728)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031216/28f3eb5a/attachment-0001.bin

From purikk at hotmail.com Tue Dec 16 12:34:51 2003
From: purikk at hotmail.com (Purushotham Komaravolu)
Date: Tue, 16 Dec 2003 15:34:51 -0500
Subject: [Rocks-Discuss]hardware-setup for the Rocks cluster
References: <[email protected]>
Message-ID: <[email protected]>

Hi All,
We are trying to setup rocks cluster with 1 front and 20 computing nodes.
Frontend:
1) Dual Pentium Xeon 2.4 GHz PC 533 and 512k L2 Cache
2) Dual port Gigabit Ethernet
3) 1 GB DDR RAM
4) 3 * 200 GB EIDE ULTRA ATA 100

Compute nodes:
1) Pentium Xeon 2.4 GHz PC 533 and 512k L2 Cache
2) Dual port Gigabit Ethernet
3) 1 GB DDR RAM
4) 41 GB UDMA EIDE
1 HP Procurve 24 port switch

Does the setup look ok?

Does Rocks support the following features:

*Remote power monitoring for individual nodes

*Temperature monitoring of individual processors

*Power sequencing on startup to prevent possible power spiking

*Remote power-down and reset of system and nodes

*Serial access to nodes

*Disk cloning

*Plug-In Extensible Architecture

*Image Manager

and also

How should the disks be set up? Do all the disks need to be attached to the frontend, with compute nodes having small 3 or 4 GB disks?

Can someone point me to clustering software which supports all the above features if Rocks doesn't support them?

thanks a lot

Regards,

Puru

From purikk at hotmail.com Tue Dec 16 12:39:19 2003
From: purikk at hotmail.com (Purushotham Komaravolu)
Date: Tue, 16 Dec 2003 15:39:19 -0500
Subject: [Rocks-Discuss]Java Rocks cluster
Message-ID: <[email protected]>

I am a newbie to ROCKS.
I have a question about running Java on a Rockster. Is it possible that I can start only one JVM on one machine and have the task run distributed on the cluster? It is a multi-threaded application. Say I have an application with 100 threads; can I have 50 threads run on one machine and 50 on another by launching the application (JVM) on one machine (similar to SUN Firebird)? I don't want to use MPI or any special code.
Thanks
Sincerely
Puru
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031216/ee12ac80/attachment-0001.html

From mjk at sdsc.edu Tue Dec 16 13:20:24 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 16 Dec 2003 13:20:24 -0800
Subject: [Rocks-Discuss]Java Rocks cluster
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

There are a few research projects that do map Java threads onto cluster compute node processes. At the IEEE Cluster '03 conference a couple weeks ago in Hong Kong there were a few interesting Java talks on this subject. You can see the schedule at the following link and do some google research for more info. I think the papers will be online soon...

http://www.csis.hku.hk/cluster2003/advance-program.html

Rocks 3.1 will include a Java Roll, but this is nothing more than Sun's Java sdk/rte and doesn't do any cluster magic for you.

-mjk

On Dec 16, 2003, at 12:39 PM, Purushotham Komaravolu wrote:

> I am a newbie to ROCKS
> I have a question about running Java on a Rockster.
> Is it possible that I can start only one JVM on one machine and the
> task be run distributed on the cluster? It is a multi-threaded
> application.
> Like say, I have an application with 100 threads. Can I have 50
> threads run on one machine and 50 on another by launching the
> application(jvm) on one machine?(similar to SUN Firebird) I dont want
> to use MPI or any special code.
> Thanks
> Sincerely
> Puru

From phil at sdsc.edu Tue Dec 16 13:38:48 2003
From: phil at sdsc.edu (Philip Papadopoulos)
Date: Tue, 16 Dec 2003 13:38:48 -0800
Subject: [Rocks-Discuss]hardware-setup for the Rocks cluster
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Purushotham Komaravolu wrote:

> Hi All,
> We are trying to setup rocks cluster with 1 front and 20 computing
> nodes.
> Frontend:
> 1) Dual Pentium Xeon 2.4 GHz PC 533 and 512lk L2 Cache
> 2) Dual port Gigabit Ethernet
> 3) 1 GB DDR RAM
> 4) 3* 200 GB EIDE ULTRA ATA 100

> Compute nodes:
> 1) Pentium Xeon 2.4 GHz PC 533 and 512k L2 Cache
> 2) Dual port Gigabit Ethernet
> 3) 1 GB DDR RAM
> 4) 41 GB UDMA EIDE
> 1 HP Procurve 24 port switch
>
> Does the setup look ok?

Setup looks fine.

> Does Rocks support the following features
> Remote power monitoring for individual nodes
>
> *Temperature monitoring of individual processors

Not directly -- there isn't a completely general solution to this -- though lm_sensors is good for non-server boards. However, nothing prevents you from adding the proper software. It's fairly easy to add metrics to ganglia if you have the baseline drivers for your particular temp monitoring software.
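As a sketch of what "adding metrics to ganglia" can look like: gmetric is Ganglia's command-line tool for injecting custom metrics. The metric name, the lm_sensors output format, and the exact gmetric flags below are illustrative assumptions; check `gmetric --help` and your own `sensors` output before relying on them.

```python
import re

def parse_cpu_temp(sensors_output):
    """Pull the first Celsius reading out of lm_sensors-style output,
    e.g. 'CPU Temp: +47.0 C'. The format varies by motherboard, so
    this regex is only a starting point."""
    m = re.search(r"([+-]?\d+(?:\.\d+)?)\s*C\b", sensors_output)
    return float(m.group(1)) if m else None

def gmetric_cmd(name, value, units):
    """Build an argv list for gmetric to publish one custom metric."""
    return ["gmetric", "--name", name, "--value", str(value),
            "--type", "float", "--units", units]

temp = parse_cpu_temp("CPU Temp:  +47.0 C  (limit = +60 C)")
if temp is not None:
    # In a cron job on each node you would exec this command; here we
    # only print it, since gmetric may not be installed everywhere.
    print(" ".join(gmetric_cmd("cpu_temp", temp, "Celsius")))
```

Run periodically on each node, the published value then shows up alongside the stock ganglia metrics for that host.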

> *Power sequencing on startup to prevent possible power spiking
>
> *Remote power-down and reset of system and nodes
>
> *Serial access to nodes

All of these generally require another network (serial, lights-out management, etc). We don't assume any of these extra networks exist. Again, layering that functionality atop Rocks is very straightforward. See the FAQ for how to add packages to nodes.

> *Disk cloning

No. Emphatically no. Disk cloning is not anywhere in the Rocks vocabulary. We have distributions (Redhat + Rocks + cluster tools + your own software) and a way to generate a kickstart file in a programmatic way. Disk cloning assumes homogeneity of hardware (we don't), requires a custom after-market installer to fix up a node after an image is put on it (we use Redhat as the installer), and requires a completely different image for every different functional type of node (frontend, compute, nfs, web, pvfs, etc).

> *Plug-In Extensible Architecture

Uh. Yeah. That's the whole point. Again, see the FAQ for how you add packages. Rolls are an additional extension mechanism that allows you to add larger chunks of functionality at cluster build time. We extend base Rocks with Grid software, schedulers, Java, and community-specific software stacks. You should wait (about 5 days) for the final release of 3.1.0 to see how rolls work.

> *Image Manager

Definitely no. There are no images in Rocks. We have distributions and appliance types. A graph description of appliances is melded with distributions to define a complete node. Shared configuration is truly shared. None of that happens with images -- the base software and the configuration are locked together.

> and also
>
> How should be the disk setup, does all the disks need to be attached to
> frontend and compute nodes have small 3 or 4 GB disks?

Nodes must be diskful, of any type and size (8GB is probably minimal given the size of Linux these days). You can put as many disks as you want on your frontend and have it double as an NFS server for your cluster (the default). You can build other NFS servers easily (and manage them as easily as you do a compute node).

> Can someone point me to a clustering software which supports all above
> features if Rocks does'nt support them.

Sorry, that doesn't exist. Pick the things that you can live without today (but would want to add tomorrow).

-P

> thanks a lot
>
> Regards,
>
> Puru

--
== Philip Papadopoulos, Ph.D.     San Diego Supercomputer Center
== Program Director for           9500 Gilman Drive
== Grid and Cluster Computing     University of California, San Diego
== Ph: (858) 822-3628             La Jolla, CA 92093-0505
== FAX: (858) 822-5407

From mjk at sdsc.edu Tue Dec 16 13:38:59 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 16 Dec 2003 13:38:59 -0800
Subject: [Rocks-Discuss]hardware-setup for the Rocks cluster
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Dec 16, 2003, at 12:34 PM, Purushotham Komaravolu wrote:

> Hi All,
> We are trying to setup rocks cluster with 1 front and 20
> computing nodes.
> Frontend:
> 1) Dual Pentium Xeon 2.4 GHz PC 533 and 512lk L2 Cache
> 2) Dual port Gigabit Ethernet
> 3) 1 GB DDR RAM
> 4) 3* 200 GB EIDE ULTRA ATA 100
>
> Compute nodes:
> 1) Pentium Xeon 2.4 GHz PC 533 and 512k L2 Cache
> 2) Dual port Gigabit Ethernet
> 3) 1 GB DDR RAM
> 4) 41 GB UDMA EIDE
> 1 HP Procurve 24 port switch
>
> Does the setup look ok?

Sounds good. If you have device driver issues, just wait until next week when 3.1 comes out; it will have a new kernel and more supported hardware.

> Does Rocks support the following features> Remote power monitoring for individual nodes

Ethernet addressable power strips can be used for this.

> *Temperature monitoring of individual processors

No, although a ganglia module can be created to do this. The problem is there isn't a common standard out there for *all* hardware right now.

> *Power sequencing on startup to prevent possible power spiking

Ethernet addressable power strips can be used for this.

> *Remote power-down and reset of system and nodes

Yes (using sw). For hw control you would need a remote management board in every node, or ethernet addressable power strips.

> *Serial access to nodes

No, Rocks uses ssh and ethernet for this. But you can add your own serial port concentrator if you need one.

> *Disk cloning

Nope, this doesn't scale in either system or people time. Rocks uses RedHat's Kickstart to build the disk image on each node in a cluster programmatically. This is extremely fast -- in fact a 128 node cluster can be built from scratch (including hardware integration) in under 2 hours, and the entire cluster can be reinstalled in around 12 minutes. We did this as a demonstration of Rocks' scalability at SC'03 (we even have a movie of it).

> *Plug-In Extensible Architecture

Yes. You can add to the cluster database and extend our utilities. Everything is open.

> *Image Manager

Rocks does not do system imaging. We have a utility called rocks-dist that builds distributions for you. This combined with the XML profile graph gives you what you want here.

> How should be the disk setup, does all the disks need to be attached to
> frontend and compute nodes have small 3 or 4 GB disks?

Buy the smallest modern HD you can for the compute node (4 GB is fine). By default the frontend serves user directories over NFS so you should have more storage on the frontend node.

-mjk

From landman at scalableinformatics.com Tue Dec 16 13:43:51 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 16 Dec 2003 16:43:51 -0500
Subject: [Rocks-Discuss]Java Rocks cluster
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi Puru:

Java threads are shared memory objects at this moment. You would need to look at thread-migration schemes to layer atop the process, and a distributed shared memory model to handle memory issues. I don't think Java natively supports this, so you will likely have to appeal to some other method.

Moreover, shared memory across slower cluster network fabrics is painful at best. If you are going to work on a single system image machine with shared memory, you want the fastest/best fabric you can get.

If it is easier to re-architect your code as independent worker processes, you could write it using JVMs and simple sockets or similar. If it is threaded, you may have problems parallelizing it on a cluster.

Joe

On Tue, 2003-12-16 at 15:39, Purushotham Komaravolu wrote:
> I am a newbie to ROCKS
> I have a question about running Java on a Rockster.
> Is it possible that I can start only one JVM on one machine and the
> task be run distributed on the cluster? It is a multi-threaded
> application.
> Like say, I have an application with 100 threads. Can I have 50
> threads run on one machine and 50 on another by launching the
> application(jvm) on one machine?(similar to SUN Firebird) I dont want
> to use MPI or any special code.
> Thanks
> Sincerely
> Puru
--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

From rscarce at caci.com Tue Dec 16 10:56:18 2003
From: rscarce at caci.com (Reed Scarce)
Date: Tue, 16 Dec 2003 13:56:18 -0500
Subject: [Rocks-Discuss]grub / boot / fdisk problem
Message-ID: <OF2C6AD168.EB3D778E-ON85256DFE.0067CF1C-85256DFE.006812B4@caci.com>

I installed Rocks on a primary master hard drive. It became necessary to re-install, so I took an identical hd and made it primary master. The first drive, which boots fine, was left off the system to act as an archive, to mount after the new system was up and running. The new system was installed and works great; now to correctly install the old drive as primary slave, reboot, mount and copy the scripts and configs to the new system!

There the problem began.

When I boot either drive as primary master and only primary drive, they boot fine. When I connect either drive, correctly configured and recognized by the BIOS, as primary or secondary slave, grub gives a GRUB prompt and won't boot. Something interesting: when booted from a floppy (mkbootdisk) from the new disk, in /var/log/dmesg both drives are visible but fdisk reports the partition table is empty, so I can't mount the drive from a floppy boot.

dmesg is like this: (my comments)

hda: ST34321A, ... (pri master)
hdb: ST34321A, ... (pri slave)
hdc: FX4010M, ATAPI CD/DVD-ROM drive (secnd master)
hdd: ST320420A, ... (secnd slave)
ide0 at ... (ide pri chain)
ide1 at ... (ide secnd chain)
hda: 8404830 sectors ... (good)
hdb: 8404830 sectors ... (good)
hdd: 39851760 sectors ... (good)
ide-floppy driver ... (ok)
Partition check: (<---<<< this is where it gets interesting)
 hda:
 hdb:
 hdd: hdd1 hdd2 hdd3 (<---<<< that's right, hdd is now the boot drive. Even if I boot without the floppy, hdd is the boot drive.)

Any suggestions?

Reed Scarce
Systems Engineer
CACI, Inc.
1100 N. Glebe Rd
Arlington, VA 22201
(703) 841-3045
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031216/498124c7/attachment-0001.html

From ShiYi.Yue at astrazeneca.com Tue Dec 16 14:05:46 2003
From: ShiYi.Yue at astrazeneca.com (ShiYi.Yue at astrazeneca.com)
Date: Tue, 16 Dec 2003 23:05:46 +0100
Subject: [Rocks-Discuss]hardware compability check wirh Rocks 3.00
Message-ID: <D2A2B86E8730D711B8560008028AC980257A2E@camrd9.camrd.astrazeneca.net>

hi,

I was wondering if there is a way to set up a hardware compatibility check in the kickstart of Rocks, to give us an opportunity to add the drivers once incompatible hardware is detected.

I have some PCs with Broadcom Gbit 10/100/1000 network cards, and it looks like Rocks 3.0 was not happy with these network cards. The only thing I can do now (without rebuilding the distribution) is to replace these cards. I am afraid this type of situation will happen again and again since RH7.3 is getting older and older. I hope I am wrong and someone can point me to a solution.

Shi-Yi
shiyi.yue at astrazeneca.com

From mjk at sdsc.edu Tue Dec 16 14:55:38 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 16 Dec 2003 14:55:38 -0800
Subject: [Rocks-Discuss]hardware compability check wirh Rocks 3.00
In-Reply-To: <D2A2B86E8730D711B8560008028AC980257A2E@camrd9.camrd.astrazeneca.net>
References: <D2A2B86E8730D711B8560008028AC980257A2E@camrd9.camrd.astrazeneca.net>
Message-ID: <[email protected]>

We've been thinking about this off and on for over a year -- it's a pretty hard problem. The real trick to supporting all hardware is keeping the boot kernel current. We've let our releases get old and more and more people are seeing hardware support issues.

Rocks 3.1 (out next week) will include the latest RedHat kernel from RHEL 3.0. This will fix most of the hardware support issues out there. When we release, please download 3.1 and try it with your hardware; if it still fails, please let us know. Thanks.

-mjk

On Dec 16, 2003, at 2:05 PM, ShiYi.Yue at astrazeneca.com wrote:

> hi,
>
> I was wondering if there is a way to set a hardware compability check
> in the kickstart of Rocks, and give us an oppotunity to add the drvers
> once the uncompatible hardware was detected.
>
> I have some PCs with Broadcom Gbit 10/100/1000 network cards, It looks
> Rocks 3.0 was not happy with these network cards. The only way I can
> do now (without rebuild the distribution) is to replace these cards.
> I am afraid this type of situation will happen again and again since
> RH7.3 is getting older and older.
> I hope I were wrong and someone can point me a solution.
> Shi-Yi
> shiyi.yue at astrazeneca.com

From msherman at informaticscenter.info Tue Dec 16 16:25:45 2003
From: msherman at informaticscenter.info (Mark Sherman)
Date: Tue, 16 Dec 2003 17:25:45 -0700
Subject: [Rocks-Discuss]RE: Rocks-Discuss] AMD Opteron - Contact Appro
Message-ID: <[email protected]>

Hello, I'm an administrator on a pure i386 cluster under Rocks 3.0.0, and our clients are pushing us to include some Opteron nodes. I'm trying to find out the feasibility of such an addition. I know there's been a lot of talk about Opterons on the rocks list, so I'm wondering if someone can give a boiled-down can-do / can't-do / maybe-but-we-haven't-tested-it-yet kind of status. With that, I'd say I'm probably willing to be a pseudo-beta site and give feedback on how the system works.

Thank you very much, and keep up the good work. I love the Rocks system.

~M

______________________________________________
Mark Sherman
Computing Systems Administrator
Informatics Center
Massachusetts Biomedical Initiatives
Worcester MA 01605
508-797-4200
msherman at informaticscenter.info
----------------------~-----------------------

> -------- Original Message --------
> Subject: [Rocks-Discuss]RE: Rocks-Discuss] AMD Opteron - Contact Appro
> From: "Jian Chang" <jian at appro.com>
> Date: Fri, December 12, 2003 6:27 pm
> To: "Bryan Littlefield" <bryan at UCLAlumni.net>,
> npaci-rocks-discussion at sdsc.edu, mjk at sdsc.edu
>
> Hello Mason / Puru,
>
> I got your contact information from Bryan Littlefield.
> I would like to discuss with you regarding benchmark test systems you
> might need down the road.
> We can also share with you our findings as to what is compatible in the
> Opteron systems.
> Please reply with your phone number where I can reach you, and I will
> call promptly.
>
> Bryan,
>
> Thank you for the referral.
>
> Best regards,
>
> Jian Chang
> Regional Sales Manager
> (408) 941-8100 x 202
> (800) 927-5464 x 202
> (408) 941-8111 Fax
> jian at appro.com
> www.appro.com
>
> -----Original Message-----
> From: Bryan Littlefield [mailto:bryan at UCLAlumni.net]
> Sent: Tuesday, December 09, 2003 12:14 PM
> To: npaci-rocks-discussion at sdsc.edu; mjk at sdsc.edu
> Cc: Jian Chang
> Subject: Rocks-Discuss] AMD Opteron - Contact Appro
>
> Hi Mason,
>
> I suggest contacting Appro. We are using Rocks on our Opteron cluster
> and Appro would likely love to help. I will contact them as well to see
> if they could help getting a opteron machine for testing. Contact info
> below:
>
> Thanks --Bryan
>
> Jian Chang - Regional Sales Manager
> (408) 941-8100 x 202
> (800) 927-5464 x 202
> (408) 941-8111 Fax
> jian at appro.com
> http://www.appro.com
>
> npaci-rocks-discussion-request at sdsc.edu wrote:
>
> From: "Mason J. Katz" <mjk at sdsc.edu>
> Subject: Re: [Rocks-Discuss]AMD Opteron
> Date: Tue, 9 Dec 2003 07:28:51 -0800
> To: "purushotham komaravolu" <purikk at hotmail.com>
>
> We have a beta right now that we have sent to a few people. We plan on
> a release this month, and AMD_64 will be part of this release along
> with the usual x86, IA64 support.
>
> If you want to help accelerate this process please talk to your vendor
> about loaning/giving us some hardware for testing. Having access to a
> variety of Opteron hardware (we own two boxes) is the only way we can
> have good support for this chip.
>
> -mjk
>
> On Dec 8, 2003, at 8:23 PM, purushotham komaravolu wrote:
>
> Cc: <npaci-rocks-discussion at sdsc.edu>
>
> Hello,
> I am a newbie to ROCKS cluster. I wanted to setup clusters on
> 32-bit Architectures( Intel and AMD) and 64-bit Architecture( Intel
> and AMD).
> I found the 64-bit download for Intel on the website but not for AMD.
> Does it work for AMD opteron? if not what is the ETA for AMD-64.
> We are planning to buy AMD-64 bit machines shortly, and I would like
> to volunteer for the beta testing if needed.
> Thanks
> Regards,
> Puru
>
> _______________________________________________
> npaci-rocks-discussion mailing list
> npaci-rocks-discussion at sdsc.edu
> http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion
>
> End of npaci-rocks-discussion Digest

From fds at sdsc.edu Tue Dec 16 18:04:47 2003
From: fds at sdsc.edu (Federico Sacerdoti)
Date: Tue, 16 Dec 2003 18:04:47 -0800
Subject: [Rocks-Discuss]a minor ganglia question
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Dan,

Good question. Unfortunately this behavior is hardwired into stock Ganglia, not the Rocks-specific pages that we have more control over.

The good news is that I wrote the code for this page :) It's easy to fix if you would like to do it yourself.

Edit the file /var/www/html/ganglia/functions.php. On line 386, you should see:

krsort($racks[$rack]);

To get the ordering you desire, change this to:

ksort($racks[$rack]);

That's it. You should see the high-numbered compute nodes at the bottom of the rack. I will see if we can get a config file button on the page to give this option for a later release of Ganglia.
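Federico's one-line change can also be applied with a small script rather than by hand. This is only a sketch: the path and the exact krsort line are as he describes for Rocks 3.0.0, so verify them against your own functions.php (and keep a backup) before running it.

```python
def patch_ganglia_sort(path):
    """Swap krsort for ksort in Ganglia's physical-view code so
    high-numbered compute nodes render at the bottom of the rack.
    Returns True if the file was changed, False otherwise."""
    with open(path) as f:
        source = f.read()
    patched = source.replace("krsort($racks[$rack]);",
                             "ksort($racks[$rack]);")
    if patched != source:
        with open(path, "w") as f:
            f.write(patched)
    return patched != source
```

Usage would be `patch_ganglia_sort("/var/www/html/ganglia/functions.php")`; a False return means the expected krsort line was not found and the file should be inspected manually.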

-Federico

On Dec 16, 2003, at 11:45 AM, Dan Wright wrote:

> Hello all,
>
> I'm in the process of setting up a 3.0.0 cluster and have a question
> about the "Physical view" in ganglia. In this view (which is quite
> cool BTW :) is shows higher-numbered nodes on top and lower-numbered
> nodes on bottom:
>
> compute-0-12
> ...
> compute-0-2
> compute-0-1
> compute-0-0
>
> and my cluster is physically reversed from that:
>
> compute-0-0
> compute-0-1
> compute-0-2
> ...
> compute-0-12
>
> Is there an easy way to switch this display around so it matches the
> real physical layout? I poked around and ganglia for a few minutes and
> didn't see anything obvious, so I thought I'd ask before I actually
> start wasting time on this :)
>
> Thanks,
>
> - Dan Wright
> (dtwright at uiuc.edu)
> (http://www.scs.uiuc.edu/)
> (UNIX Systems Administrator, School of Chemical Sciences, UIUC)
> (333-1728)

Federico

Rocks Cluster Group, San Diego Supercomputing Center, CA

From csamuel at vpac.org Tue Dec 16 18:49:22 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Wed, 17 Dec 2003 13:49:22 +1100
Subject: [Rocks-Discuss]a minor ganglia question
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 17 Dec 2003 06:45 am, Dan Wright wrote:

> Is there an easy way to switch this display around so it matches the real
> physical layout?

I think this is why they tell you to install the compute nodes from the bottom of the rack. :-)

cheers,
Chris
- --
Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/38QyO2KABBYQAh8RAo+vAJ0XcP6tBJpwjxYnicEQkysRslWmmQCcDpeb
K8bNCLgiF5umMiJ/59ICN70=
=57YJ
-----END PGP SIGNATURE-----

From hermanns at tupi.dmt.upm.es Wed Dec 17 00:08:19 2003
From: hermanns at tupi.dmt.upm.es (Miguel Hermanns)
Date: Wed, 17 Dec 2003 09:08:19 +0100
Subject: [Rocks-Discuss]Creation of a hardware compatibility list?
Message-ID: <[email protected]>

Since one of the strong features of Rocks is the possibility of fast deployment of clusters, wouldn't it be of interest to create a hardware compatibility list on the web page of Rocks? This list could be filled in by the users of Rocks with their experience and the hardware they have. In this way somebody interested in building a cluster as fast as possible could check the list and buy something absolutely 100% compatible with Rocks.

I know that in principle one could check the compatibility list of RH, but my own experience was negative in that aspect (I installed an Adaptec IDE RAID controller, supported by RH7.3, but Rocks 2.3 was unable to recognize it).

Miguel

From mjk at sdsc.edu Wed Dec 17 09:03:00 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Wed, 17 Dec 2003 09:03:00 -0800
Subject: [Rocks-Discuss]Creation of a hardware compatibility list?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

We have thought about this, and have some ideas on how to setup a useful page. Something like the old Linux laptop hardware list but simpler to mine for data. It's been on our long list of things to do for a while now :)

-mjk

On Dec 17, 2003, at 12:08 AM, Miguel Hermanns wrote:

> Since one of the strong features of Rocks is the posibility of fast
> deployment of clusters, wouldn't it be of interest to create a
> hardware compatibility list on the web page of Rocks? This list could
> be filled in by the users of Rocks with their experience and the
> hardware they have. In this way somebody interested in building a
> cluster as fast as possible could check the list and buy something
> absolutely 100% compatible with Rocks.
>
> I know that in principle one could check the compatibility list of RH,
> but my own experience was negative in that aspect (I installed an
> Adaptec IDE RAID controller, supported by RH7.3, but Rocks 2.3 was
> unable to recognize it).
>
> Miguel

From junkscarce at hotmail.com Wed Dec 17 09:31:21 2003
From: junkscarce at hotmail.com (Reed Scarce)
Date: Wed, 17 Dec 2003 17:31:21 +0000
Subject: [Rocks-Discuss]fidsk reports all zeros, need actual
Message-ID: <[email protected]>

Good ol' fdisk "print" on my compute node gives me a line:

Device Boot Start End Blocks Id System

but no data.

Extra Functionality's "print" reports:

Nr AF  Hd Sec Cyl  Hd Sec Cyl     Start      Size ID
 1 00   0   0   0   0   0   0         0         0  0
 2 00   0   0   0   0   0   0         0         0  0
 3 00   0   0   0   0   0   0         0         0  0
 4 00   0   0   0   0   0   0         0         0  0

How can I retrieve the information necessary for scripted information at node installation time?

TIA
--RRS


From dtwright at uiuc.edu Wed Dec 17 11:49:53 2003
From: dtwright at uiuc.edu (Dan Wright)
Date: Wed, 17 Dec 2003 13:49:53 -0600
Subject: [Rocks-Discuss]a minor ganglia question
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Eh... whatever ;-) I started using rocks with 2.2.1 (when there was no physical layout display) and haven't read the manual again since :)

Chris Samuel said:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Wed, 17 Dec 2003 06:45 am, Dan Wright wrote:
>
> > Is there an easy way to switch this display around so it matches the real
> > physical layout?
>
> I think this is why they tell you to install the compute nodes from the bottom
> of the rack. :-)
>
> cheers,
> Chris
> - --
> Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
> Victorian Partnership for Advanced Computing http://www.vpac.org/
> Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.2 (GNU/Linux)
>
> iD8DBQE/38QyO2KABBYQAh8RAo+vAJ0XcP6tBJpwjxYnicEQkysRslWmmQCcDpeb
> K8bNCLgiF5umMiJ/59ICN70=
> =57YJ
> -----END PGP SIGNATURE-----

- Dan Wright
(dtwright at uiuc.edu)
(http://www.uiuc.edu/~dtwright)

-] ------------------------------ [-] -------------------------------- [-
``Weave a circle round him thrice, / And close your eyes with holy dread,
For he on honeydew hath fed, / and drunk the milk of Paradise.''
Samuel Taylor Coleridge, Kubla Khan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031217/a3718aef/attachment-0001.bin

From dtwright at uiuc.edu Wed Dec 17 11:51:00 2003
From: dtwright at uiuc.edu (Dan Wright)
Date: Wed, 17 Dec 2003 13:51:00 -0600
Subject: [Rocks-Discuss]a minor ganglia question
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Federico,

Thanks! That'll make this easy enough... maybe next time I'll read the manual and install the machines in the rocks-recommended order as another poster suggested :)

Federico Sacerdoti said:
> Dan,
>
> Good question. Unfortunately this behavior is hardwired into stock
> Ganglia, not the Rocks-specific pages that we have more control over.
>
> The good news is that I wrote the code for this page :) Its easy to fix
> if you would like to do it yourself.
>
> Edit the file /var/www/html/ganglia/functions.php. On line 386, you
> should see:
>
> krsort($racks[$rack]);
>
> To get the ordering you desire, change this to:
>
> ksort($racks[$rack]);
>
> Thats it. You should see the high-numbered compute nodes at the bottom
> of the rack. I will see if we can get a config file button on the page
> to give this option for a later release of Ganglia.
>
> -Federico
>
> On Dec 16, 2003, at 11:45 AM, Dan Wright wrote:
>
> > Hello all,
> >
> > I'm in the process of setting up a 3.0.0 cluster and have a question
> > about the "Physical view" in ganglia. In this view (which is quite
> > cool BTW :) is shows higher-numbered nodes on top and lower-numbered
> > nodes on bottom:
> >
> > compute-0-12
> > ...
> > compute-0-2
> > compute-0-1
> > compute-0-0
> >
> > and my cluster is physically reversed from that:
> >
> > compute-0-0
> > compute-0-1
> > compute-0-2
> > ...
> > compute-0-12
> >
> > Is there an easy way to switch this display around so it matches the
> > real physical layout? I poked around and ganglia for a few minutes and
> > didn't see anything obvious, so I thought I'd ask before I actually
> > start wasting time on this :)
> >
> > Thanks,
> >
> > - Dan Wright
> > (dtwright at uiuc.edu)
> > (http://www.scs.uiuc.edu/)
> > (UNIX Systems Administrator, School of Chemical Sciences, UIUC)
> > (333-1728)
>
> Federico
>
> Rocks Cluster Group, San Diego Supercomputing Center, CA

- Dan Wright
(dtwright at uiuc.edu)
(http://www.uiuc.edu/~dtwright)

-] ------------------------------ [-] -------------------------------- [-
``Weave a circle round him thrice, / And close your eyes with holy dread,
For he on honeydew hath fed, / and drunk the milk of Paradise.''
Samuel Taylor Coleridge, Kubla Khan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20031217/620937b3/attachment-0001.bin

From bruno at rocksclusters.org Wed Dec 17 12:52:30 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 17 Dec 2003 12:52:30 -0800
Subject: [Rocks-Discuss]fidsk reports all zeros, need actual
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> Good ol' fdisk "print" on my compute node give me a line:
> Device Boot Start End Blocks Id System
>
> but no data.
>
> Extra Functionality's "print" reports
> Nr AF Hd Sec Cyl Hd Sec Cyl Start Size ID
> 1 00 0 0 0 0 0 0 0 0 0
> 2 00 0 0 0 0 0 0 0 0 0
> 3 00 0 0 0 0 0 0 0 0 0
> 4 00 0 0 0 0 0 0 0 0 0
>
> How can I retrieve the information necessary for scripted information
> at node installation time?

this should answer your question:

https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-February/001388.html

- gb

From anand at novaglobal.com.sg  Wed Dec 17 20:14:45 2003
From: anand at novaglobal.com.sg (Anand Vaidya)
Date: Wed, 17 Dec 2003 23:14:45 -0500
Subject: [Rocks-Discuss]Creation of a hardware compatibility list?
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Why not create a Wiki? A wiki is easy enough to install (60 seconds?) and just the right tool for user-driven projects like Rocks.

Nice examples of wiki webs are http://en.wikipedia.org/ or my favourite, the GentooServer project, which has a very nice wiki at http://www.subverted.net/wakka/wakka.php?wakka=MainPage (though not related to clustering).

Regards,
Anand


On Wednesday 17 December 2003 12:03, Mason J. Katz wrote:
> We have thought about this, and have some ideas on how to setup a
> useful page. Something like the old Linux laptop hardware list but
> simpler to mine for data. It's been on our long list of things to do
> for a while now :)
>
> -mjk
>
> On Dec 17, 2003, at 12:08 AM, Miguel Hermanns wrote:
> > Since one of the strong features of Rocks is the possibility of fast
> > deployment of clusters, wouldn't it be of interest to create a
> > hardware compatibility list on the web page of Rocks? This list could
> > be filled in by the users of Rocks with their experience and the
> > hardware they have. In this way somebody interested in building a
> > cluster as fast as possible could check the list and buy something
> > absolutely 100% compatible with Rocks.
> >
> > I know that in principle one could check the compatibility list of RH,
> > but my own experience was negative in that aspect (I installed an
> > Adaptec IDE RAID controller, supported by RH7.3, but Rocks 2.3 was
> > unable to recognize it).
> >
> > Miguel

-

From mjk at sdsc.edu  Thu Dec 18 08:02:14 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Thu, 18 Dec 2003 08:02:14 -0800
Subject: [Rocks-Discuss]Creation of a hardware compatibility list?
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

I've been thinking about a rocks wiki for a few months now, but I'm a bit paranoid about the lack of authentication for updates (basically anyone can modify your site).

If there is interest out there, we could just set one up, leave it alone, and let our users worry about the content. Done well this could have information on:

- hardware issues
- bug reports
- feature requests
- contributed documentation (to be moved into our users manual)
- etc.

Basically a simple version of sourceforge (we have no plans to move to sourceforge -- the interface and bandwidth both stink). Ideas....?

-mjk

On Dec 17, 2003, at 8:14 PM, Anand Vaidya wrote:


> Why not create a Wiki? A wiki is easy enough to install (60 seconds?) and just
> the right tool for user-driven projects like Rocks.
>
> Nice examples of wiki webs are http://en.wikipedia.org/ or my favourite,
> the GentooServer project, which has a very nice wiki at
> http://www.subverted.net/wakka/wakka.php?wakka=MainPage (though not related to
> clustering)
>
> Regards,
> Anand
>
> On Wednesday 17 December 2003 12:03, Mason J. Katz wrote:
>> We have thought about this, and have some ideas on how to setup a
>> useful page. Something like the old Linux laptop hardware list but
>> simpler to mine for data. It's been on our long list of things to do
>> for a while now :)
>>
>> -mjk
>>
>> On Dec 17, 2003, at 12:08 AM, Miguel Hermanns wrote:
>>> Since one of the strong features of Rocks is the possibility of fast
>>> deployment of clusters, wouldn't it be of interest to create a
>>> hardware compatibility list on the web page of Rocks? This list could
>>> be filled in by the users of Rocks with their experience and the
>>> hardware they have. In this way somebody interested in building a
>>> cluster as fast as possible could check the list and buy something
>>> absolutely 100% compatible with Rocks.
>>>
>>> I know that in principle one could check the compatibility list of RH,
>>> but my own experience was negative in that aspect (I installed an
>>> Adaptec IDE RAID controller, supported by RH7.3, but Rocks 2.3 was
>>> unable to recognize it).
>>>
>>> Miguel

From hermanns at tupi.dmt.upm.es  Fri Dec 19 00:47:11 2003
From: hermanns at tupi.dmt.upm.es (Miguel Hermanns)
Date: Fri, 19 Dec 2003 09:47:11 +0100
Subject: [Rocks-Discuss]Creation of a hardware compatibility list?
Message-ID: <[email protected]>

>> I've been thinking about a rocks wiki for a few months now, but I'm a
>> bit paranoid about the lack of authentication for updates (basically
>> anyone can modify your site).

One possible filter could be that only the users of the registered clusters can modify the wiki (so that when you submit the data of the cluster you also include a user and a password), although in that case I would be excluded, since our cluster has not been able to work with Rocks yet :-(.

>> - hardware issues


>> - bug reports
>> - feature requests
>> - contributed documentation (to be moved into our users manual)
>> - etc

So for example the cluster register could be editable by the registered users (each one only its own entry) and could include a description of the installed hardware (not just the processor, but also the motherboard model, the hard disks, NICs, etc.). Then everybody interested in building a cluster could go to the register, have a look, and click on the different clusters that are similar to the one in mind. After that, with just a click, the user could review the hardware configuration and the encountered problems.

This would also be great when Rocks clusters get updated, because then their builders could go and update their entry without needing to submit an email to the Rocks team, hence avoiding giving them extra work.

In order to include the not-yet-working Rocks clusters, the database of clusters (with the corresponding users and passwords) could be extended with them, but their entries would not be shown on the Rocks register until they are fully working. In this way information on hardware incompatibilities can be collected and shown on a different part of www.rocksclusters.org.

The feature requests would still be handled through the mailing list, and for the contributed documentation I would place the source files in read-only mode on the ftp server; if somebody makes modifications to them, then the new version should be emailed to the persons in charge of the docs to give their approval.

Miguel

From jkreuzig at uci.edu  Fri Dec 19 16:58:58 2003
From: jkreuzig at uci.edu (James Kreuziger)
Date: Fri, 19 Dec 2003 16:58:58 -0800 (PST)
Subject: [Rocks-Discuss]Dell Power Connect 5224
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Ok, I need some help here. I've managed to setup my frontend node, and it is up and running. I have my 8 nodes all connected up to a Dell Power Connect 5224. I can access the switch through a serial terminal and get a command line interface. The little lights on the front of the switch are blinking, so that's good.

However, I can't get the switch recognized by insert-ethers. I've even managed to change the IP of the switch through the CLI, but I can't see the switch from the frontend node. I can't telnet, get the web interface or anything. I haven't saved the configuration, so a reboot of the switch will reset the values.

I'm grasping at straws here. I'm not a network engineer, so I could use some help getting this thing configured.


If anybody can help me out, contact me by email.

Thanks,

-Jim

*************************************************
Jim Kreuziger
jkreuzig at uci.edu
949-824-4474
*************************************************

From tim.carlson at pnl.gov  Fri Dec 19 17:24:22 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Fri, 19 Dec 2003 17:24:22 -0800 (PST)
Subject: [Rocks-Discuss]Dell Power Connect 5224
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Fri, 19 Dec 2003, James Kreuziger wrote:

I think we need a Rocks FAQ

https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-August/002762.html

You need to turn on fast-link.

> Ok, I need some help here. I've managed to setup my frontend node, and it is up and running. I have my 8 nodes all connected up to a Dell Power Connect 5224. I can access the switch through a serial terminal and get a command line interface. The little lights on the front of the switch are blinking, so that's good.
>
> However, I can't get the switch recognized by insert-ethers. I've even managed to change the IP of the switch through the CLI, but I can't see the switch from the frontend node. I can't telnet, get the web interface or anything. I haven't saved the configuration, so a reboot of the switch will reset the values.
>
> I'm grasping at straws here. I'm not a network engineer, so I could use some help getting this thing configured.
>
> If anybody can help me out, contact me by email.
>
> Thanks,
>
> -Jim
>
> *************************************************
> Jim Kreuziger
> jkreuzig at uci.edu
> 949-824-4474
> *************************************************



Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From Georgi.Kostov at umich.edu  Fri Dec 19 17:34:15 2003
From: Georgi.Kostov at umich.edu (Georgi Kostov)
Date: Fri, 19 Dec 2003 20:34:15 -0500
Subject: [Rocks-Discuss]Dell Power Connect 5224
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Jim,

I have a 5224 here. What are your config settings on the switch? I.e. the IP, subnet mask, and gateway settings, for both the switch and the interface of the head-node to which the 5224 is connected (I assume it's on the private subnet, so the subnet is something like 10.0.0.0/255.0.0.0 with the frontend internal interface (eth0) as 10.0.1.1, right?)

One thing to try on the head node is to run (as root) "tcpdump -i eth0" and watch for packets. To avoid clutter, I would either turn the rest (compute nodes, etc.) off, or filter them out with settings on tcpdump.

With some more info we should be able to tease this out.

--Georgi

Michigan Center for Biological Information (MCBI)
University of Michigan
3600 Green Court, Suite 700
Ann Arbor, MI 48105-1570
Phone/Fax: (734) 998-9236/8571
kostov at umich.edu
www.ctaalliance.org

Quoting James Kreuziger <jkreuzig at uci.edu>:

> Ok, I need some help here. I've managed to setup my frontend node, and it is up and running. I have my 8 nodes all connected up to a Dell Power Connect 5224. I can access the switch through a serial terminal and get a command line interface. The little lights on the front of the switch are blinking, so that's good.
>
> However, I can't get the switch recognized by insert-ethers. I've even managed to change the IP of the switch through the CLI, but I can't see the switch from the frontend node. I can't telnet, get the web interface or anything. I haven't saved the configuration, so a reboot of the switch will reset the values.
>
> I'm grasping at straws here. I'm not a network engineer, so I could use some help getting this thing configured.
>
> If anybody can help me out, contact me by email.
>
> Thanks,
>
> -Jim
>
> *************************************************
> Jim Kreuziger
> jkreuzig at uci.edu
> 949-824-4474
> *************************************************

From daniel.kidger at quadrics.com  Mon Dec 22 01:45:47 2003
From: daniel.kidger at quadrics.com (Dan Kidger)
Date: Mon, 22 Dec 2003 09:45:47 +0000
Subject: Fwd: Re: [Rocks-Discuss]Dell Power Connect 5224
Message-ID: <[email protected]>

---------- Forwarded Message ----------

Subject: Re: [Rocks-Discuss]Dell Power Connect 5224
Date: Mon, 22 Dec 2003 09:38:41 +0000
From: Dan Kidger <daniel.kidger at quadrics.com>
To: Georgi Kostov <Georgi.Kostov at umich.edu>
Cc: paci-rocks-discussion at sdsc.edu

> Quoting James Kreuziger <jkreuzig at uci.edu>:
> > Ok, I need some help here. I've managed to setup my frontend node, and it is up and running. I have my 8 nodes all connected up to a Dell Power Connect 5224. I can access the switch through a serial terminal and get a command line interface. The little lights on the front of the switch are blinking, so that's good.
> >
> > However, I can't get the switch recognized by insert-ethers. I've even managed to change the IP of the switch through the CLI, but I can't see the switch from the frontend node. I can't telnet, get the web interface or anything. I haven't saved the configuration, so a reboot of the switch will reset the values.

I don't know much about the 5224 per se, but I do know that much of the time embedded devices *have* to be rebooted to pick up new settings for their IP.

Once done, I would try pinging the switch's IP and then doing 'arp -a' to see its MAC address (which should match the one on the white sticky label on the back).


Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.          daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK            0117 915 5505
----------------------- www.quadrics.com --------------------

-------------------------------------------------------

--
Yours,
Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.          daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK            0117 915 5505
----------------------- www.quadrics.com --------------------

From daniel.kidger at quadrics.com  Mon Dec 22 09:03:56 2003
From: daniel.kidger at quadrics.com (daniel.kidger at quadrics.com)
Date: Mon, 22 Dec 2003 17:03:56 -0000
Subject: [Rocks-Discuss]RE:Writing a Roll ?
Message-ID: <[email protected]>

Folks, I have made good headway in adding software and its configuration using extend-compute.xml and now have a robust system. (the head node install is still rather manual though :-( )

I would now like to move to doing this as a Roll. However I am not sure of the best way of proceeding - there appears to be little documentation, either HOWTO-style or on the underlying concepts.

I have mounted the HPC_roll.iso and browsed around:
- the image seems to consist of 2 subdirectories, in the same style as RedHat CDs
- as expected, ./SRPMS contains the source RPMs, and ./RedHat/RPMS contains binary RPMs (the latter contains many more RPMs than there are SRPMs for)

There is no obvious configuration information until you explore roll-hpc-kickstart-3.0.0-0.noarch.rpm. This seems to contain lots of XML which at first glance is hard to decipher.

So my question is: Should we be writing our own rolls, and if so how? (examples?)

Yours,
Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.          daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK            0117 915 5505
----------------------- www.quadrics.com --------------------


From daniel.kidger at quadrics.com  Mon Dec 22 09:08:21 2003
From: daniel.kidger at quadrics.com (daniel.kidger at quadrics.com)
Date: Mon, 22 Dec 2003 17:08:21 -0000
Subject: [Rocks-Discuss]shucks.
Message-ID: <[email protected]>

# rpm -ql roll-hpc-kickstart |xargs -l grep -inH sucks

/export/home/install/profiles/current/nodes/force-smp.xml:21: IBM sucks
/export/home/install/profiles/current/nodes/ganglia-server.xml:134: perl sucks
/export/home/install/profiles/current/nodes/ganglia-server.xml:148: Switch from ISC to RedHat's pump. Pump sucks but it is standard so
/export/home/install/profiles/current/nodes/sendmail-masq.xml:31: m4 sucks

:-)

Have a good Christmas,
Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.          daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK            0117 915 5505
----------------------- www.quadrics.com --------------------

From fds at sdsc.edu  Mon Dec 22 10:22:54 2003
From: fds at sdsc.edu (Federico Sacerdoti)
Date: Mon, 22 Dec 2003 10:22:54 -0800
Subject: [Rocks-Discuss]RE:Writing a Roll ?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

You are right, we have little documentation on creating new rolls. I have lamented to Greg about this, and he has done the same to me. Basically we have been so busy trying to get the 3.1.0 release out that we haven't put our nose to the grindstone about the Developer docs.

Here is a little primer since it sounds like you are indeed ready.

1. The first thing to realize is that rolls are not built from "scratch", but from the safe confines of our build environment. This environment is the directory:

[your local rocks CVS sandbox]/src/roll/

You must check out the Rocks CVS tree to get this. Instructions on how to do this (anonymously) are at http://cvs.rocksclusters.org/.

Once you have this build environment on your frontend system, you are ready for the next step to building your roll. You should make a new directory here called "quadrics" - the name matters as it will be the identifier for your roll from now on.


2. Now the best thing I can tell you is to look at the "hpc" and "sge" rolls (two of our most mature) for the directory structure in "quadrics". It's fairly straightforward, and mirrors what we do for the base. The "nodes" directory will hold your "extend-compute.xml", etc. (more on this later). The "roll-quadrics-kickstart.noarch.rpm" is made automatically for you from information in these directories.

3. The "src" dir holds anything you need to compile. Anything in src should deposit an RPM package in the "RPMS" directory when its build is finished.

4. You type "make roll" to start the build process. It will take a bit of study for you to get things correct, but suffice it to say that you will have an iso file suitable for burning when you are done. Thank bruno for this sweet fact - everything is automatic except your intellectual property :)

One more word on your XML files. Our philosophy of rolls is not to use the "extend/replace" strategy that we advocate for customization. As a roll builder, you are at the grass-roots level, and can rise above simple customization techniques.

Your roll should define a "quadrics.xml" node in the kickstart graph. You define the node in the file "roll/quadrics/nodes/quadrics.xml" and the edges in the file "roll/quadrics/graphs/default/quadrics.xml". Look at the SGE roll for a good example of this. By defining your configuration this way, you have more power to do complex tasks (different configuration for different appliance types), and to leave room for future growth.
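The node/graph layout described above can be sketched as a directory skeleton. This is a throwaway demonstration built in a temp directory; the "quadrics" name and file names follow the conventions in this thread, but check a real roll such as sge for the authoritative layout:

```shell
# Sketch of a minimal "quadrics" roll skeleton as described above.
# Built under mktemp so it is safe to run anywhere; a real roll lives
# in [your CVS sandbox]/src/roll/quadrics.
roll="$(mktemp -d)/quadrics"

mkdir -p "$roll/nodes" "$roll/graphs/default" "$roll/src" "$roll/RPMS"
touch "$roll/nodes/quadrics.xml"           # the kickstart node definition
touch "$roll/graphs/default/quadrics.xml"  # edges wiring the node into the graph

ls -R "$roll"
# a real build would then run "make roll" at the top of this directory
```

The point of the split is that the node file holds the configuration itself, while the graph file decides which appliance types receive it.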

Good luck, and we hope and pray for a good technical writer that will do this process justice.

-Federico

On Dec 22, 2003, at 9:03 AM, daniel.kidger at quadrics.com wrote:

> Folks,
> I have made good headway in adding software and its configuration using extend-compute.xml and now have a robust system. (the head node install is still rather manual though :-( )
>
> I would now like to move to doing this as a Roll. However I am not sure of the best way of proceeding - there appears to be little documentation, either HOWTO-style or on the underlying concepts.
>
> I have mounted the HPC_roll.iso and browsed around:
> - the image seems to consist of 2 subdirectories, in the same style as RedHat CDs
> - as expected, ./SRPMS contains the source RPMs, and ./RedHat/RPMS contains binary RPMs
> (the latter contains many more RPMs than there are SRPMs for)
>
> There is no obvious configuration information until you explore:
> roll-hpc-kickstart-3.0.0-0.noarch.rpm
> This seems to contain lots of XML which at first glance is hard to decipher.
>
> So my question is:
> Should we be writing our own rolls, and if so how? (examples?)
>
> Yours,
> Daniel.
>
> --------------------------------------------------------------
> Dr. Dan Kidger, Quadrics Ltd.          daniel.kidger at quadrics.com
> One Bridewell St., Bristol, BS1 2AA, UK            0117 915 5505
> ----------------------- www.quadrics.com --------------------

Federico

Rocks Cluster Group, San Diego Supercomputing Center, CA

From mjk at sdsc.edu  Mon Dec 22 11:07:32 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Mon, 22 Dec 2003 11:07:32 -0800
Subject: [Rocks-Discuss]shucks.
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

If these are the worst CVS log comments you've found, you aren't looking very hard. The only one here I'm compelled to clarify is IBM. There are around 3-5 ways of probing the chipset to determine if the box is SMP; RedHat supports the most common ones, which everyone in the world except IBM uses. This forced us to patch anaconda to detect SMP for IBM hardware (or in this case just force it) -- didn't these guys invent the PC?

-mjk

On Dec 22, 2003, at 9:08 AM, daniel.kidger at quadrics.com wrote:

> # rpm -ql roll-hpc-kickstart |xargs -l grep -inH sucks
>
> /export/home/install/profiles/current/nodes/force-smp.xml:21: IBM sucks
> /export/home/install/profiles/current/nodes/ganglia-server.xml:134: perl sucks
> /export/home/install/profiles/current/nodes/ganglia-server.xml:148: Switch from ISC to RedHat's pump. Pump sucks but it is standard so
> /export/home/install/profiles/current/nodes/sendmail-masq.xml:31: m4 sucks
>
> :-)
>
> Have a good Christmas,
> Daniel.
>
> --------------------------------------------------------------
> Dr. Dan Kidger, Quadrics Ltd.          daniel.kidger at quadrics.com
> One Bridewell St., Bristol, BS1 2AA, UK            0117 915 5505
> ----------------------- www.quadrics.com --------------------

From mjk at sdsc.edu  Mon Dec 22 11:13:30 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Mon, 22 Dec 2003 11:13:30 -0800
Subject: [Rocks-Discuss]RE:Writing a Roll ?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

http://cvs.rocksclusters.org

In the rocks/src/roll directory you can see several roll examples, all of which are built by typing "make roll". The roll-*-kickstart.*.noarch.rpm is the real magic: it includes the XML profiles that are grafted onto the base kickstart graph.

-mjk

On Dec 22, 2003, at 9:03 AM, daniel.kidger at quadrics.com wrote:

> Folks,
> I have made good headway in adding software and its configuration using extend-compute.xml and now have a robust system. (the head node install is still rather manual though :-( )
>
> I would now like to move to doing this as a Roll. However I am not sure of the best way of proceeding - there appears to be little documentation, either HOWTO-style or on the underlying concepts.
>
> I have mounted the HPC_roll.iso and browsed around:
> - the image seems to consist of 2 subdirectories, in the same style as RedHat CDs
> - as expected, ./SRPMS contains the source RPMs, and ./RedHat/RPMS contains binary RPMs
> (the latter contains many more RPMs than there are SRPMs for)
>
> There is no obvious configuration information until you explore:
> roll-hpc-kickstart-3.0.0-0.noarch.rpm
> This seems to contain lots of XML which at first glance is hard to decipher.
>
> So my question is:
> Should we be writing our own rolls, and if so how? (examples?)
>
> Yours,
> Daniel.
>
> --------------------------------------------------------------
> Dr. Dan Kidger, Quadrics Ltd.          daniel.kidger at quadrics.com
> One Bridewell St., Bristol, BS1 2AA, UK            0117 915 5505
> ----------------------- www.quadrics.com --------------------


From daniel.kidger at quadrics.com  Mon Dec 22 11:12:17 2003
From: daniel.kidger at quadrics.com (daniel.kidger at quadrics.com)
Date: Mon, 22 Dec 2003 19:12:17 -0000
Subject: [Rocks-Discuss]RE:Writing a Roll ?
Message-ID: <[email protected]>

Federico,

> Here is a little primer since it sounds like you are indeed ready.
> --- many very informative lines deleted ---

Thanks for that long reply. :-)
I am currently pulling a copy of the source tree from cvs.rocksclusters.org (194MB of rocks/doc alone!)

Just a couple of questions for now:

1. Do rolls have to be CD-based?
(During development I would probably get through a lot of CDROMs - but more importantly it would get a bit fiddly to keep walking round to the CD-writer, then nipping off to the room with the cluster in every time.)

2. Do I have to reinstall the headnode from scratch each time I want to test a roll?
(Even if the roll only affects RPMs that get installed on compute nodes.)

3. Can a CD contain multiple rolls?
(Once mature, a cluster may have quite a few rolls: pbs, sge, gm, IB, etc., and Quadrics would probably have two - the (open-source) hardware drivers, MPI, etc. and also RMS - our (closed-source) cluster Resource Manager.)

4. What subset of the cvs tree does a Roll developer need? The whole tree is clearly rather excessive.

5. I am a little concerned about the amount of bloat needed to install our five RPMs as a Roll. (The RPMs are already prebuilt by our own internal build procedures.)
So taking another case - let's say the Intel Compilers. These have 4 RPMs (plus a little sed-ery of their config files and pasting in the license file). Would these be best installed as a Roll, or as a simple extend-compute.xml as I have currently?

Yours,
Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.          daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK            0117 915 5505
----------------------- www.quadrics.com --------------------

From sjenks at uci.edu  Mon Dec 22 11:17:07 2003
From: sjenks at uci.edu (Stephen Jenks)
Date: Mon, 22 Dec 2003 11:17:07 -0800
Subject: [Rocks-Discuss]rocks-dist suggestion
Message-ID: <[email protected]>


Hi ROCKS folks,

Just a suggestion for when you guys are bored after the 3.1 release 8-)

I ran into some trouble installing some updates to a ROCKS 3.0 cluster that could easily be solved with some checking in rocks-dist:

I put the openssh and other updates in the proper contrib directory under /home/install and ran "rocks-dist dist", which properly updated the distribution.

The problem occurred when I tried to reload the compute nodes - the install failed when it hit any of the RPMs in the contrib directory. It turns out the permissions on those RPMs were set to 600 because I had copied them out of root's home directory, so they couldn't be read by the server to send them down to the compute nodes. After fixing the permissions, all was well.

So rocks-dist should check (and possibly fix) permissions on files that will be included in the kickstart distribution. I realize that the mistake was entirely mine, but I'm probably not the only one to ever forget to set permissions correctly and the tool could easily catch such mistakes.
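The check being suggested here can be sketched in a few lines of shell. The demo below builds a throwaway "contrib" tree with one unreadable RPM, detects it, and fixes it; in real use you would point the find commands at your kickstart distribution (e.g. the contrib directory under /home/install). The package filename is made up for illustration:

```shell
# Demonstration of the permissions check rocks-dist could perform itself.
contrib=$(mktemp -d)                      # stand-in for the contrib directory
touch "$contrib/openssh-update-1.0-1.i386.rpm"
chmod 600 "$contrib/openssh-update-1.0-1.i386.rpm"

# flag RPMs that are not world-readable (apache cannot serve these)
find "$contrib" -name '*.rpm' ! -perm -004 -print

# repair: make every such RPM mode 644, then re-check (prints nothing)
find "$contrib" -name '*.rpm' ! -perm -004 -exec chmod 644 {} \;
find "$contrib" -name '*.rpm' ! -perm -004 -print

rm -rf "$contrib"
```

`-perm -004` matches files with the others-read bit set, so the negated test reports exactly the packages the HTTP server would fail on.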

Thanks for putting together such a useful cluster distribution!

Steve Jenks

From msherman at informaticscenter.info  Mon Dec 22 11:50:03 2003
From: msherman at informaticscenter.info (Mark Sherman)
Date: Mon, 22 Dec 2003 12:50:03 -0700
Subject: [Rocks-Discuss]MPI and memory + node rescue
Message-ID: <[email protected]>

Just for future consideration... any time I need to look at a system without booting it, or check its ability to boot, I just throw in the Knoppix CD: www.knoppix.org
______________________________________________
Mark Sherman
Computing Systems Administrator
Informatics Center
Massachusetts Biomedical Initiatives
Worcester MA 01605
508-797-4200
msherman at informaticscenter.info
----------------------~-----------------------

> -------- Original Message --------
> Subject: Re: [Rocks-Discuss]MPI and memory + node rescue
> From: "Trond SAUE" <saue at quantix.u-strasbg.fr>
> Date: Thu, November 27, 2003 1:38 am
> To: "Stephen P. Lebeau" <lebeau at openbiosystems.com>
> Cc: npaci-rocks-discussion at sdsc.edu
>
> On 2003.11.26 16:52, Stephen P. Lebeau wrote:


> > If you go here, they talk about creating a Linux floppy
> > repair disk. Make sure to read the README file... they
> > require that you make a 1.68MB floppy (README explains how)
> >
> > http://www.tux.org/pub/people/kent-robotti/looplinux/rip/
> >
> > If that doesn't work...
> >
> > http://www.toms.net/rb/download.html
> >
> > I've actually used this one before.
> >
> > -S
>
> In order to have a look at the disk of my crashed node, I downloaded RIP-2.2-1680.bin from the first site, but I was not able to boot properly. However, tomsrtbt-2.0.103 from the second site worked very well and allowed me to reboot the node as well as mount its disk to look at messages. Unfortunately, they did not really tell me anything more... However, it might be an idea for a future release of ROCKS to include a second "standalone" boot option for the compute nodes, so that one can access them independent of the frontend....
> All the best,
> Trond Saue
> --
> Trond SAUE (DIRAC: http://dirac.chem.sdu.dk/)
> Laboratoire de Chimie Quantique et Modélisation Moléculaire
> Universite Louis Pasteur ; 4, rue Blaise Pascal ; F-67000 STRASBOURG
> tél: 03 90 24 13 01  fax: 03 90 24 15 89  email: saue at quantix.u-strasbg.fr

From daniel.kidger at quadrics.com  Mon Dec 22 11:51:16 2003
From: daniel.kidger at quadrics.com (daniel.kidger at quadrics.com)
Date: Mon, 22 Dec 2003 19:51:16 -0000
Subject: [Rocks-Discuss]rocks-dist suggestion
Message-ID: <[email protected]>

> Just a suggestion for when you guys are bored after the 3.1 release 8-)

> The problem occurred when I tried to reload the compute nodes - the install failed when it hit any of the RPMs in the contrib directory. It turns out the permissions on those RPMs were set to 600 because I had copied them out of root's home directory, so they couldn't be read by the server to send them down to the compute nodes. After fixing the permissions, all was well.

This is a 'me-too' reply.

Rocks reads the RPMs using HTTP, hence they need to be readable by user apache. With symlinks it is all too easy, even if the RPMs themselves are 644, for the directory tree to be somewhere not walkable by a third-party userid like apache.
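A quick demonstration of this trap: the RPM itself can be 644 and still be unservable, because a parent directory lacks the execute bit for "others". The fixture below is a throwaway temp tree; in practice you would point the find at the real tree under your install directory:

```shell
# A 644 RPM hidden behind a directory apache cannot traverse.
top=$(mktemp -d)
mkdir -p "$top/private/RPMS"
touch "$top/private/RPMS/pkg-1.0-1.i386.rpm"
chmod 644 "$top/private/RPMS/pkg-1.0-1.i386.rpm"
chmod 755 "$top" "$top/private/RPMS"
chmod 700 "$top/private"          # the culprit: no o+x for apache

# report any directory a third-party uid cannot descend through
find "$top" -type d ! -perm -001 -print

rm -rf "$top"
```

The missing others-execute bit on any single directory in the path is enough to break the HTTP fetch, which is why checking only the RPM file modes is not sufficient.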

Yours,
Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.          daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK            0117 915 5505
----------------------- www.quadrics.com --------------------


From fds at sdsc.edu  Mon Dec 22 15:26:01 2003
From: fds at sdsc.edu (Federico Sacerdoti)
Date: Mon, 22 Dec 2003 15:26:01 -0800
Subject: [Rocks-Discuss]RE:Writing a Roll ?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

On Dec 22, 2003, at 11:12 AM, daniel.kidger at quadrics.com wrote:

> Federico,
>
>> Here is a little primer since it sounds like you are indeed ready.
>> --- many very informative lines deleted ---
>
> Just a couple of questions for now:
> 1. Do rolls have to be CD-based?
> (during development I would probably get through a lot of CDROMs - but more importantly it would get a bit fiddly to keep walking round to the CD-writer, then nipping off to the room with the cluster in every time)

For distribution, the rolls should probably be CD-based. For development, however, that is not necessary. There is a make target which will compile your source and "install" the roll into your local distribution. This is "make intodist", and it assumes you are building on a frontend node. You would follow this call with a call to "rocks-dist dist" in the "/home/install" directory.

Of course, this makes most sense for rolls that affect compute nodes. To test parts of your roll that affect frontend functionality, you still need to use the CDs.
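Putting the steps above together, a development loop might look like the sketch below. The ROLL_DIR path is an assumption (wherever your CVS sandbox lives), and the block is guarded so the Rocks-specific commands only run when they actually exist on the machine:

```shell
# Hypothetical edit/test cycle for compute-node-only roll changes,
# following the "make intodist" route described above.
ROLL_DIR=${ROLL_DIR:-"$HOME/rocks/src/roll/quadrics"}   # assumed sandbox path

if command -v rocks-dist >/dev/null 2>&1 && [ -d "$ROLL_DIR" ]; then
    (cd "$ROLL_DIR" && make intodist)     # build the roll into the local distro
    (cd /home/install && rocks-dist dist) # regenerate the kickstart tree
    # then reinstall a compute node to verify the change took effect
else
    echo "rocks-dist not found: run this on a Rocks frontend"
fi
```

The guard also makes the sketch safe to paste on a non-Rocks machine while reading along.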

> 2. Do I have to reinstall the headnode from scratch each time I want to test a roll?
> (even if the roll only affects RPMs that get installed on compute nodes)

See comment above. We're working on a way to fully install frontends over the network, but it will not make it into the new release.

> 3. Can a CD contain multiple rolls?
> (Once mature - a cluster may have quite a few rolls: pbs, sge, gm,
> IB, etc.
> and Quadrics would probably have two - the (open-source) hardware
> drivers, MPI, etc. and also RMS - our (closed-source) cluster Resource


> Manager.)

There is some support for this; we call them "metarolls". We know they are important, and we have some support for them now. The build process for them is a bit different, and won't arrive for this release but soon after.

> 4. What subset of the cvs tree does a Roll developer need? The whole
> tree is clearly rather excessive.

There are definitely areas of the tree not necessary for roll building. It's always safest to have everything, but you're welcome to crop and test.

> 5. I am a little concerned about the amount of bloat needed to
> install our five RPMs as a Roll. (The RPMs are already prebuilt by our
> own internal build procedures).
> So taking another case - let's say the Intel Compilers - These have 4
> RPMs (plus a little sed-ery of their config files and pasting in the
> license file). Would these be best installed as a Roll or as a simple
> extend-compute.xml as I have currently?

It is better to put them in a roll. We have ways to combine, distribute, sort, etc. these rolls, and they form a nice capsule of software to introduce into the system. I understand that pulling the whole source tree seems a bit excessive, but it is rather standard practice for working on an open project.

Plus only the developer needs the source, the consumer does not.

Good luck, and we're glad someone is asking the questions. Rolls are intended for outside construction, and we need to document the process. :)

-Federico

> Yours,
> Daniel.
>
> --------------------------------------------------------------
> Dr. Dan Kidger, Quadrics Ltd.      daniel.kidger at quadrics.com
> One Bridewell St., Bristol, BS1 2AA, UK          0117 915 5505
> ----------------------- www.quadrics.com --------------------

Federico

Rocks Cluster Group, San Diego Supercomputing Center, CA

From tlinden at pcu.helsinki.fi Tue Dec 23 05:28:35 2003
From: tlinden at pcu.helsinki.fi (Tomas Lindén)
Date: Tue, 23 Dec 2003 15:28:35 +0200 (EET)
Subject: [Rocks-Discuss]Lost nodes during cluster-kickstart?
Message-ID: <[email protected]>

To reinstall a cluster I use the command


cluster-fork /boot/kickstart/cluster-kickstart

Now since all 32 nodes have been PXE installed this means that the reinstallation is performed by first doing a PXE-boot to load the installation kernel. My problem is that sometimes a few nodes fail during this reinstallation process. The failing nodes seem to be different whenever this problem occurs. The really strange thing is that after more than a day or so some nodes somehow manage to finish the reinstallation process!

Sometimes the whole cluster comes up fine without any lost node.

The problematic nodes _seem_ to get the installation kernel with PXE, so it might not be a PXE problem but something odd that happens later?

Has anyone seen anything like this before?

I'm aware of a bug in the RedHat installation kernel on Athlon systems when trying to run with a serial console: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001988.html This is why I run the installation kernel without a serial console, but this makes debugging difficult because the serial console only shows output during the PXE boot process. No output is generated by the installation kernel itself. The next output is generated when the node has finished the installation and loads the final kernel, which runs fine with a serial console.

This is using Rocks 2.3.2 on a 32 node cluster with Tyan Tiger MPX S2466N-4M motherboards and dual Athlon MP CPUs with no graphics adapters, so the system has a 32 port serial console switch. The motherboards have integrated 100 Mb/s 3Com 3C920 NICs (in practice a 3C905 NIC). The switch is made by Enterasys. The frontend private NIC is also running at 100 Mb/s. When doing the cluster reinstallation the network bandwidth over the frontend NIC saturates at 12.5 MB/s. Maybe some packets are lost because of this?

The frontend private ethernet connection will be upgraded to Gb/s. Hopefully this will solve this reinstallation problem.

Do you have any other ideas how to solve this problem?

Best regards, Tomas Lindén
--------------------------------------------------------------------------
Tomas Linden                   Helsinki Institute of Physics (HIP)
Tomas.Linden at Helsinki.FI    P.O. Box 64 (Gustaf Hällströmin katu 2)
phone: +358-9-191 505 63       FIN-00014 UNIVERSITY OF HELSINKI
fax: +358-9-191 505 53         Finland
WWW: http://www.physics.helsinki.fi/~tlinden/eindex.html
--------------------------------------------------------------------------

From kjcruz at ece.uprm.edu Tue Dec 23 05:31:26 2003
From: kjcruz at ece.uprm.edu (Kennie Cruz)
Date: Tue, 23 Dec 2003 09:31:26 -0400 (AST)
Subject: [Rocks-Discuss]Error installing the compute node
Message-ID: <[email protected]>

Hi,


I am trying to kickstart the compute nodes with Rocks 3.0.0; the frontend is already working. I reviewed FAQ question 7.1.2, and the services (dhcpd, httpd, mysqld and autofs) are running, but running kickstart.cgi from the command line gives an error:

error - cannot kickstart external nodes

I made a quick search on the list, but without any success.

The compute node gets the assigned IP and insert-ethers detects the appliance without any trouble, but it fails to run kickstart.cgi from the frontend. The web server error log says something like this:

[Tue Dec 23 09:10:08 2003] [error] [client 10.255.255.254] malformed header from script. Bad header=# @Copyright@: /var/www/html/install/kickstart.cgi

While the access log says this:

10.255.255.254 - - [23/Dec/2003:09:10:08 -0400] "GET /install/kickstart.cgi?arch=i386&np=2&if=eth0&project=rocks HTTP/1.0" 500 587 "-" "-"

I ran insert-ethers with the Ethernet Switches option. My nodes are connected via 3 managed ethernet switches.

Any help will be appreciated.

Thanks in advance.

--
Kennie J. Cruz Gutierrez, System Administrator
Department of Electrical and Computer Engineering
University of Puerto Rico, Mayaguez Campus
Work Phone: (787) 832-4040 x 3798
Email: Kennie.Cruz at ece.uprm.edu
Web: http://ece.uprm.edu/~kennie/

[2003-12-23/09:21]
Black holes are created when God divides by zero!

From bruno at rocksclusters.org Tue Dec 23 08:33:39 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Tue, 23 Dec 2003 08:33:39 -0800
Subject: [Rocks-Discuss]Error installing the compute node
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

just to be clear, did you execute:

# cd /home/install
# ./kickstart.cgi --client compute-0-0

- gb


On Dec 23, 2003, at 5:31 AM, Kennie Cruz wrote:

> Hi,
>
> I am trying to kickstart the compute nodes with Rocks 3.0.0, the
> frontend is already working. I revised the FAQ question 7.1.2, the
> services (dhcpd, httpd, mysqld and autofs) are running, but running
> kickstart.cgi from the command line gives an error:
>
> error - cannot kickstart external nodes
>
> I made a quick search on the list, but without any success.
>
> The compute node gets the assigned IP and insert-ethers detects the
> appliance without any trouble, but fails to run the kickstart.cgi from
> the frontend. The web server error log says something like this:
>
> [Tue Dec 23 09:10:08 2003] [error] [client 10.255.255.254] malformed
> header from script. Bad header=# @Copyright@:
> /var/www/html/install/kickstart.cgi
>
> While the access log says this:
>
> 10.255.255.254 - - [23/Dec/2003:09:10:08 -0400] "GET
> /install/kickstart.cgi?arch=i386&np=2&if=eth0&project=rocks HTTP/1.0"
> 500 587 "-" "-"
>
> I ran insert-ethers with the Ethernet Switches option. My nodes are
> connected via 3 managed ethernet switches.
>
> Any help will be appreciated.
>
> Thanks in advance.
>
> --
> Kennie J. Cruz Gutierrez, System Administrator
> Department of Electrical and Computer Engineering
> University of Puerto Rico, Mayaguez Campus
> Work Phone: (787) 832-4040 x 3798
> Email: Kennie.Cruz at ece.uprm.edu
> Web: http://ece.uprm.edu/~kennie/
>
> [2003-12-23/09:21]
> Black holes are created when God divides by zero!

From daniel.kidger at quadrics.com Tue Dec 23 09:03:49 2003
From: daniel.kidger at quadrics.com (Daniel Kidger)
Date: Tue, 23 Dec 2003 17:03:49 +0000
Subject: [Rocks-Discuss]Lost nodes during cluster-kickstart?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Tomas Lindén wrote:

> To reinstall a cluster I use the command
>   cluster-fork /boot/kickstart/cluster-kickstart
> Now since all 32 nodes have been PXE installed this means that the
> reinstallation is performed by first doing a PXE-boot to load the
> installation kernel. My problem is that sometimes a few nodes fail
> during this reinstallation process.

Although I haven't PXE installed a Rocks cluster of this size, I have done PXE-based installs of (larger) RedHat clusters using a customised kickstart file. What can go wrong is that I have seen timeouts if too many nodes dhcp/tftp for their installer kernel simultaneously. You could try to increase the timeout, or better, not do too many at once - say start 8 at a time every 30 seconds. There is plenty of precedent for this in, say, the automated installer of the AlphaServer SC Tru64 clusters. Also, outside of Rocks, I have seen folk use multiple 'sub-master' nodes to act as tftp/http fileservers during the install process. It would be interesting to see what the Rocks developers' vision is for the scalable installation of large clusters.

--
Yours,
Daniel.

--------------------------------------------------------------
Dr. Dan Kidger, Quadrics Ltd.      daniel.kidger at quadrics.com
One Bridewell St., Bristol, BS1 2AA, UK          0117 915 5505
----------------------- www.quadrics.com --------------------

From mjk at sdsc.edu Tue Dec 23 09:44:14 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 23 Dec 2003 09:44:14 -0800
Subject: [Rocks-Discuss]Lost nodes during cluster-kickstart?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

The problem is PXE has an extremely short timeout, and once it fails it does not retry. Since this is a BIOS thing, there isn't a lot to do. If you boot your compute nodes off of CDs (and avoid PXE), the problem goes away. This is because even if DHCP times out, we've modified our installation to be extremely aggressive in its DHCP requests, and the entire installation process will actually watchdog-timeout and restart if needed. Unfortunately, the PXE timeout cannot be fixed in the same way.

Our experience shows PXE scaling to 128 nodes for a mass re-install using current hardware. Older CPUs may show issues. The only answer right now is to stage your re-install so the PXE server can handle the load. This load is actually very low, but the PXE server for Linux is still maturing.
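The staging described above (and Daniel's "8 at a time every 30 seconds" suggestion) could be scripted along these lines (a sketch; batch size, delay, and node names are illustrative):

```shell
# Kick off reinstalls in batches of 8, pausing 30 seconds between batches
# so the PXE/DHCP server never sees all 32 nodes at once.
for i in $(seq 0 31); do
    ssh compute-0-$i /boot/kickstart/cluster-kickstart &
    if [ $(( (i + 1) % 8 )) -eq 0 ]; then
        sleep 30
    fi
done
wait
```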

-mjk


On Dec 23, 2003, at 5:28 AM, Tomas Lindén wrote:

> To reinstall a cluster I use the command
>   cluster-fork /boot/kickstart/cluster-kickstart
> Now since all 32 nodes have been PXE installed this means that the
> reinstallation is performed by first doing a PXE-boot to load the
> installation kernel. My problem is that sometimes a few nodes fail
> during this reinstallation process. The failing nodes seem to be different
> whenever this problem occurs. The really strange thing is that after
> more than a day or so some nodes somehow manage to finish the
> reinstallation process!
>
> Sometimes the whole cluster comes up fine without any lost node.
>
> The problematic nodes _seem_ to get the installation kernel with PXE, so
> it might be not a PXE problem but something odd that happens later?
>
> Has anyone seen anything like this before?
>
> I'm aware of a bug in the RedHat installation kernel
> on Athlon systems when trying to run with a serial console.
> https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2003-May/001988.html
> This is why I run the installation kernel without a serial console, but
> this makes debugging difficult because the serial console only shows
> output during the PXE boot process. No output is generated by the
> installation kernel itself. The next output is generated when
> the node has finished the installation and loads the final kernel which
> runs fine with a serial console.
>
> This is using Rocks 2.3.2 on a 32 node cluster with Tyan Tiger MPX
> S2466N-4M motherboards and dual Athlon MP CPUs with no graphics
> adapters, so the system has a 32 port serial console switch. The
> motherboards have integrated 100 Mb/s 3Com 3C920 NICs (in practice a
> 3C905 NIC). The switch is made by Enterasys. The frontend private NIC is
> also running at 100 Mb/s. When doing the cluster reinstallation the
> network bandwidth over the frontend NIC saturates at 12,5 MB/s. Maybe
> some packets are lost because of this?
>
> The frontend private ethernet connection will be upgraded to Gb/s.
> Hopefully this will solve this reinstallation problem.
>
> Do you have any other ideas how to solve this problem?
>
> Best regards, Tomas Lindén
> --------------------------------------------------------------------------
> Tomas Linden                   Helsinki Institute of Physics (HIP)
> Tomas.Linden at Helsinki.FI    P.O. Box 64 (Gustaf Hällströmin katu 2)
> phone: +358-9-191 505 63       FIN-00014 UNIVERSITY OF HELSINKI
> fax: +358-9-191 505 53         Finland
> WWW: http://www.physics.helsinki.fi/~tlinden/eindex.html
> --------------------------------------------------------------------------

From Timothy.Carlson at pnl.gov Tue Dec 23 08:57:07 2003
From: Timothy.Carlson at pnl.gov (Carlson, Timothy S)
Date: Tue, 23 Dec 2003 08:57:07 -0800
Subject: [Rocks-Discuss]Error installing the compute node
Message-ID: <[email protected]>

The problem he is having is that he chose "ethernet switches" when running insert-ethers. He should have chosen "Compute nodes".

Only choose "ethernet switches" when you are assigning an IP address to an ethernet switch with DHCP. If your managed switches already have IP addresses, then just install "compute nodes".

Tim

-----Original Message-----
From: Greg Bruno [mailto:bruno at rocksclusters.org]
Sent: Tuesday, December 23, 2003 8:34 AM
To: Kennie Cruz
Cc: npaci-rocks-discussion at sdsc.edu
Subject: Re: [Rocks-Discuss]Error installing the compute node

just to be clear, did you execute:

# cd /home/install
# ./kickstart.cgi --client compute-0-0

- gb

On Dec 23, 2003, at 5:31 AM, Kennie Cruz wrote:

> Hi,
>
> I am trying to kickstart the compute nodes with Rocks 3.0.0, the
> frontend is already working. I revised the FAQ question 7.1.2, the
> services (dhcpd, httpd, mysqld and autofs) are running, but running
> kickstart.cgi from the command line gives an error:
>
> error - cannot kickstart external nodes
>
> I made a quick search on the list, but without any success.
>
> The compute node gets the assigned IP and insert-ethers detects the
> appliance without any trouble, but fails to run the kickstart.cgi from
> the frontend. The web server error log says something like this:
>
> [Tue Dec 23 09:10:08 2003] [error] [client 10.255.255.254] malformed
> header from script. Bad header=# @Copyright@:
> /var/www/html/install/kickstart.cgi
>
> While the access log says this:
>
> 10.255.255.254 - - [23/Dec/2003:09:10:08 -0400] "GET
> /install/kickstart.cgi?arch=i386&np=2&if=eth0&project=rocks HTTP/1.0"
> 500 587 "-" "-"
>
> I ran insert-ethers with the Ethernet Switches option. My nodes are
> connected via 3 managed ethernet switches.
>
> Any help will be appreciated.
>
> Thanks in advance.
>
> --
> Kennie J. Cruz Gutierrez, System Administrator
> Department of Electrical and Computer Engineering
> University of Puerto Rico, Mayaguez Campus
> Work Phone: (787) 832-4040 x 3798
> Email: Kennie.Cruz at ece.uprm.edu
> Web: http://ece.uprm.edu/~kennie/
>
> [2003-12-23/09:21]
> Black holes are created when God divides by zero!

From purikk at hotmail.com Tue Dec 23 12:48:30 2003
From: purikk at hotmail.com (Purushotham Komaravolu)
Date: Tue, 23 Dec 2003 15:48:30 -0500
Subject: [Rocks-Discuss]beowulf and rocks
Message-ID: <[email protected]>

Hi,
I keep hearing people mention beowulf and Rocks; can somebody point me to
the difference between them? Are they just two different solutions for
clusters?
Thanks
Regards,
Puru

From tim.carlson at pnl.gov Tue Dec 23 13:19:39 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Tue, 23 Dec 2003 13:19:39 -0800 (PST)
Subject: [Rocks-Discuss]beowulf and rocks
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>


On Tue, 23 Dec 2003, Purushotham Komaravolu wrote:

> I keep hearing people mention beowulf and Rocks; can somebody point me to
> the difference between them? Are they just two different solutions for
> clusters?

Beowulf is a loose definition for a cluster of machines (typically off-the-shelf hardware). Beowulf is not software.

Rocks is a software solution to manage your beowulf.

You can compare rocks/oscar/scyld as software systems for your beowulf cluster.

Read Robert Brown's book on beowulfs at this URL

http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book/beowulf_book/index.html

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From dlane at ap.stmarys.ca Tue Dec 23 14:53:51 2003
From: dlane at ap.stmarys.ca (Dave Lane)
Date: Tue, 23 Dec 2003 18:53:51 -0400
Subject: [Rocks-Discuss]beowulf and rocks
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

At 03:48 PM 12/23/2003 -0500, Purushotham Komaravolu wrote:
> Hi,
> I keep people mentioning about beowulf and Rocks, can somebody point me
> the difference between them. Are they just two different solutions for
> clusters?

Beowulf is a loosely-defined generic term (that I won't attempt to define now!), while Rocks is one of the several software distributions that implement a beowulf cluster.

... Dave

From junkscarce at hotmail.com Tue Dec 23 15:43:05 2003
From: junkscarce at hotmail.com (Reed Scarce)
Date: Tue, 23 Dec 2003 23:43:05 +0000
Subject: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
Message-ID: <[email protected]>

Within /export/home/install/profiles/2.3.2/site-nodes/extend-compute.xml lies code like this (comments mine):

<post>
/bin/mkdir /mnt/plc/                                  <-- works -->
/bin/mkdir /mnt/plc/plc_data                          <-- works -->
/bin/ln -s /mnt/plc_data /data1                       <-- works -->
/bin/ln /etc/rc.d/init.d/gpm /etc/rc.d/rc3.d/S15gpm   <-- fails to ln, source exists -->
</post>

I don't understand why the ln to a directory succeeds but an ln to a script fails.

BTW, Dr. Landman, I've attempted to use your build.pl but it seems to fail with:

Can't stat `/usr/src/redhat/RPMS/noarch//finishing-server-"3.0"-1.noarch.rpm .

(my note: the path ends at RPMS) I swear I thought I saw a solution to this once but I can't find it again. Upon reinstallation with the file your tool created (/usr/src/RedHat/RPMS/i386/finishing-scripts-3.00-1.i386.rpm) anaconda threw back the exception:

Traceback (innermost last):
  file "/usr/bin/anaconda.real", line 633, in ?
    intf.run(id, dispatch, configFileData)
  File "/usr/src/build.90289-i386/install//usr/lib/anaconda/text.py", line 427 in run
ok save debug

TIA Reed Scarce


From landman at scalableinformatics.com Tue Dec 23 16:17:58 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 23 Dec 2003 19:17:58 -0500
Subject: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi Reed:

Which version of the finishing server fails on which version of ROCKS? It looks like 3.0. I am up to 3.1.0 now. With a little bit of modification I could make it work with 2.3.2. Likely just a single line to point to the right path.

Let me know and I'll see what I can do. I would recommend using the 3.1.0 environment, as it is a significant (read as massive) improvement over previous versions. If you (and others) need it to work with older (pre-3.0) versions of ROCKS, I think I can handle that. Let me know.

Joe

On Tue, 2003-12-23 at 18:43, Reed Scarce wrote:
> Within /export/home/install/profiles/2.3.2/site-nodes extend-compute.xml
> lies code like this commented code:
> <post>
> /bin/mkdir /mnt/plc/ <-- works -->
> /bin/mkdir /mnt/plc/plc_data <-- works -->
> /bin/ln -s /mnt/plc_data /data1 <-- works -->
> /bin/ln /etc/rc.d/init.d/gpm /etc/rc.d/rc3.d/S15gpm <-- fails to ln, source exists -->
> </post>
>
> I don't understand why the ln to a directory succeeds but a ln to a script
> fails.
>
> BTW, Dr. Landman, I've attempted to use your build.pl but it seems to fail
> with:
> Can't stat `/usr/src/redhat/RPMS/noarch//finishing-server-"3.0"-1.noarch.rpm

From mjk at sdsc.edu Tue Dec 23 16:35:13 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 23 Dec 2003 16:35:13 -0800
Subject: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

"man chkconfig"

If you use chkconfig you do not need to create the rc*.d/* files; they are put in place for you.
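In other words, the hard link in the <post> section could be replaced by something like this (a sketch; gpm is the service from the example, and the runlevels come from the "# chkconfig:" header inside the init script itself):

```shell
# Register the init script; chkconfig creates the rc*.d symlinks itself,
# based on the "# chkconfig:" header in /etc/rc.d/init.d/gpm.
/sbin/chkconfig --add gpm
/sbin/chkconfig gpm on    # enable at the script's default runlevels
```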

-mjk

On Dec 23, 2003, at 3:43 PM, Reed Scarce wrote:

> Within /export/home/install/profiles/2.3.2/site-nodes
> extend-compute.xml lies code like this commented code:
> <post>
> /bin/mkdir /mnt/plc/ <-- works -->
> /bin/mkdir /mnt/plc/plc_data <-- works -->
> /bin/ln -s /mnt/plc_data /data1 <-- works -->
> /bin/ln /etc/rc.d/init.d/gpm /etc/rc.d/rc3.d/S15gpm <-- fails to ln,
> source exists -->
> </post>
>
> I don't understand why the ln to a directory succeeds but a ln to a
> script fails.
>
> BTW, Dr. Landman, I've attempted to use your build.pl but it seems to
> fail with:
> Can't stat
> `/usr/src/redhat/RPMS/noarch//finishing-server-"3.0"-1.noarch.rpm .
> (my note: the path ends at RPMS) I swear I thought I saw a solution
> to this once but I can't find it again.
> Upon reinstallation with the file your tool created
> (/usr/src/RedHat/RPMS/i386/finishing-scripts-3.00-1.i386.rpm) anaconda
> threw back the exception: Traceback (innermost last): file
> "/usr/bin/anaconda.real", line 633, in ? intf.run(id, dispatch,
> configFileData) File
> "/usr/src/build.90289-i386/install//usr/lib/anaconda/text.py", line
> 427 in run
> ok save debug
>
> TIA Reed Scarce

From jkreuzig at uci.edu Tue Dec 23 19:53:16 2003
From: jkreuzig at uci.edu (James Kreuziger)
Date: Tue, 23 Dec 2003 19:53:16 -0800 (PST)
Subject: [Rocks-Discuss]Dell Power Connect 5224
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Thanks everybody for the info. I was aware of the fast-link issue; however, after enabling it, we still were unable to see the switch from the frontend. We had a laptop hooked up to the switch via serial and ethernet and were able to turn on the fast-link and assign an IP address. After that, the web-based interface came up on the laptop. Still, no response on the switch from the frontend.

So after great gnashing of teeth, and dozens of re-installs of the frontend, success! The problem? The extra NIC card on the frontend. We had bought the frontend with a dual 1 Gb card and a single 100 Mb card. Whenever the single NIC card is installed, the system always takes this as eth0. This is something that was staring us right in the face, so that's why it probably took so long to figure out.

After 3 years of trying to find the money, we finally have our first 8 node cluster up!

-Jim

*************************************************
Jim Kreuziger
jkreuzig at uci.edu
949-824-4474
*************************************************

From landman at scalableinformatics.com Tue Dec 23 20:23:35 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 23 Dec 2003 23:23:35 -0500
Subject: [Rocks-Discuss]Dell Power Connect 5224
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Hi James:

One of the things I do first time I boot up a new head node is to map


the ethernet ports. I take out all but one of the network wires, and make sure there is real network traffic. A ping on the subnet is fine. Then I tcpdump the network port. What is surprising to me is how many times the assumed network eth0 is mapped differently. Then by hand, after mapping the rest of the ports, I manually modify the /etc/modules.conf file to reflect what I need.

Just a suggestion. Having been bitten enough, I find simple sanity checks help reduce the size or dimensionality of the space of possible problems. This usually makes these debugging sessions faster, and allows for better characterization of the issue.
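Joe's mapping procedure could be sketched like this (illustrative; the interface count and packet count are arbitrary, and a silent port will sit waiting until interrupted):

```shell
# With only one network cable plugged in and some real traffic on that
# subnet, only the physically connected port shows packets.
# Interrupt with Ctrl-C if a port stays silent.
for i in 0 1 2; do
    echo "=== eth$i ==="
    tcpdump -i eth$i -c 5    # capture a few packets on this interface
done
```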

Joe

James Kreuziger wrote:

> Thanks everybody for the info. I was aware of the fast-link issue;
> However, after enabling it, we still were unable to see the switch
> from the frontend. We had a laptop hooked up to the switch via serial
> and ethernet and was able to turn on the fast-link, and assign an
> IP address. After that, the web-based interface came up on the laptop.
> Still, no response on the switch from the frontend.
>
> So after great gnashing of teeth, and dozens of re-installs of the
> frontend, success! The problem? The extra nic card on the frontend.
> We had bought the frontend with a dual 1GB card and a single 100MB card.
> Whenever the single nic card is installed, the system always takes this
> as eth0. This is something that was staring us right in the face, so
> that's why it probably took so long to figure out.
>
> After 3 years of trying to find the money, we finally have our first
> 8 node cluster up!
>
> -Jim
>
> *************************************************
> Jim Kreuziger
> jkreuzig at uci.edu
> 949-824-4474
> *************************************************

--
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

From bruno at rocksclusters.org Tue Dec 23 21:26:08 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Tue, 23 Dec 2003 21:26:08 -0800
Subject: [Rocks-Discuss]Rocks 3.1.0 is released for x86, ia64 and x86-64
Message-ID: <[email protected]>


Version 3.1.0 (Matterhorn) of the Rocks cluster distribution is released and now supports three processor families: Intel IA-32, Intel Itanium Processor Family, and AMD Opteron. This is the released version of the software that was used to build a fully-functioning 128-node grid-enabled cluster in under 2 hours on opening night last month at SC2003 in Phoenix, AZ. Rocks is developed by the Grid and Cluster Computing Group at SDSC and by partners at the University of California, Berkeley, Scalable Systems in Singapore, and individual open-source software developers.

This is a co-release for x86 (Pentium, Athlon, and others), Itanium2 (IA-64) and Opteron (x86-64) based clusters. Software is freely available for download to burn onto a bootable CD set for x86 and x86-64 or a single DVD for Itanium2. Versions for all processor families are available at http://www.rocksclusters.org/.

Introduced in Version 3.0.0, this version enhances the "roll" mechanism to enable users, communities and others to easily add on optional software and configuration. These optional "Roll CDs" extend the system by integrating seamlessly and automatically into the management and packaging mechanisms used by the base software. For all intents and purposes, rolls appear as if they are part of the original CD distribution. A number of defined extension rolls are freely available and include HPC, Sun Grid Engine, Grid (based on NMI), Java and Intel Compiler. An important feature is that new rolls can be created or updated independently of the core distribution. This fundamentally enables science teams and communities to add on domain-specific software packages, define a particular grid configuration, or simply modify any of the default configuration or package settings.

New features in NPACI Rocks 3.1.0 include:

- Opteron support
- Sun Grid Engine as default queuing system
- Upgraded Ganglia server and client, used for collecting and visualizing cluster-wide monitoring metrics
- Upgraded MPICH-GM and Myrinet GM 2.0 for the latest Rev D cards
- Rocks-developed 411 information system to replace Network Information Service (NIS)
- Updated SSH version 3.7.1 with no login delay
- Several optional software rolls, including:

  - NSF Middleware Initiative version R4 grid distribution
  - Java 2
  - Intel Compilers for x86 and ia64

Rocks 3.1.0 is derived from Red Hat's publicly available source packages (SRPMS) used in portions of their Enterprise Linux 3.0 product line. All SRPMs have been recompiled to enable redistribution. All available updates for these packages have been pre-applied. Rocks-specific software and standard cluster and grid community software is then added to create a complete clustering toolkit. All Rocks source code is available in a public CVS repository.

From angel at miami.edu Wed Dec 24 13:14:59 2003
From: angel at miami.edu (Angel Li)
Date: Wed, 24 Dec 2003 16:14:59 -0500
Subject: [Rocks-Discuss]Rocks 3.1.0 is released for x86, ia64 and x86-64
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi,

I currently have a cluster running Rocks 3.0 and I'm considering upgrading to 3.1. Now that SGE is the default batch queue, is maui working? Also, the Intel compiler roll is included. What licensing issues will I encounter? We currently have a license for version 7.

Thanks,

Angel

From bruno at rocksclusters.org Wed Dec 24 14:14:46 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 24 Dec 2003 14:14:46 -0800
Subject: [Rocks-Discuss]Rocks 3.1.0 is released for x86, ia64 and x86-64
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> I currently have a cluster running Rocks 3.0 and I'm considering
> upgrading to 3.1. Now that SGE is the default batch queue, is maui
> working?

maui and pbs are currently not available in rocks 3.1, but they will be soon.

maui and pbs will be included in their own roll -- that effort will be driven by roy dragseth from the University of Tromsø.

> Also, the Intel compiler roll is included. What licensing issues will
> I encounter? We currently have a license for version 7.

i'm not sure how the licenses transfer between versions.

after you bring up a frontend with the intel roll, the following link is available on the frontend's home page:

http://www.intel.com/software/products/distributors/rock_cluster.htm

after you purchase a license, you just need to copy the license into the appropriate directory and then start compiling.

for fortran, the appropriate directory is:

/opt/intel_fc_80/licenses

and for C, the appropriate directory is:

/opt/intel_cc_80/licenses

also, the intel roll contains a pre-built MPICH environment -- it is found under /opt/mpich/intel.
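Using that MPICH build might look like this (a sketch; the program and machine-file names are illustrative, not from the message):

```shell
# Compile and run an MPI program with the Intel-compiled MPICH
# mentioned above.
/opt/mpich/intel/bin/mpicc -o hello hello.c
/opt/mpich/intel/bin/mpirun -np 4 -machinefile machines ./hello
```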

- gb

From cdwan at mail.ahc.umn.edu Wed Dec 24 14:17:28 2003
From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB))
Date: Wed, 24 Dec 2003 16:17:28 -0600 (CST)
Subject: [Rocks-Discuss]Dell Power Connect 5224
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Once upon a time, I decided to install a third interface in a rocks head node (Dell SC1400, and a Syskonnect 98x gig NIC for the interested) for a data network. At boot time *everything* was broken.

To make a long story less long, the system had remapped itself with the new gig card as eth0, and the other two shifted up by one. That was really close to "no fun at all."

Happy holidays! I'm burning the new release right now!

-C

From michal at harddata.com Wed Dec 24 15:05:43 2003
From: michal at harddata.com (Michal Jaegermann)
Date: Wed, 24 Dec 2003 16:05:43 -0700
Subject: [Rocks-Discuss]Dell Power Connect 5224
In-Reply-To: <[email protected]>; from [email protected] on Wed, Dec 24, 2003 at 04:17:28PM -0600
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Wed, Dec 24, 2003 at 04:17:28PM -0600, Chris Dwan (CCGB) wrote:
>
> Once upon a time, I decided to install a third interface in a rocks head
> node (Dell SC1400, and a Syskonnect 98x Gig NIC for the interested) for a
> data network. At boot time *everything* was broken.

I still cannot understand why people insist on NOT using the 'nameif' utility. All network interfaces can be named whichever way you want and they will not move regardless of how many NICs you add or remove, as long as the MACs are not changed. If you replace a card with a different one then /etc/mactab needs to be edited to reflect your new configuration. On client nodes with an automatic reinstall this indeed is not practical, but for your front end machine this is another story.

It is indeed the case that the default startup scripts from Red Hat 7.3 need some simple additions, as interface (re)naming needs to be done before NICs are brought up for the first time. In RH9 and FC1, 'nameif' will be used "automagically" if the HWADDR variable is defined (and with a correct value).

Of course if you have different drivers for different NICs, and they are loaded as modules, then names can be assigned by editing /etc/modules.conf

Michal
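A minimal sketch of the two mechanisms Michal describes; the MAC addresses, interface names, and driver modules below are made up for illustration:

```
# /etc/mactab -- consumed by 'nameif'; pins each interface name to a MAC
eth0  00:0A:5E:11:22:33    # onboard NIC, private cluster network
eth1  00:0A:5E:44:55:66    # second onboard NIC, public network
gig0  00:00:5A:77:88:99    # add-in gigabit card, data network

# /etc/modules.conf -- per-driver naming when each NIC uses its own module
alias eth0 e100
alias gig0 sk98lin
```

With the mactab approach the names survive adding or removing cards, since they follow MACs rather than PCI probe order.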

From bruno at rocksclusters.org Wed Dec 24 15:41:25 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 24 Dec 2003 15:41:25 -0800
Subject: [Rocks-Discuss]Dell Power Connect 5224
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

>> Once upon a time, I decided to install a third interface in a rocks
>> head
>> node (Dell SC1400, and a Syskonnect 98x Gig NIC for the interested)
>> for a
>> data network. At boot time *everything* was broken.
>
> I still cannot understand why people insists on NOT using 'nameif'
> utility. All network interfaces can be named whichever way you want
> and they will not move regardless how many NICs you will add or
> remove as long as MACs are not changed. If you replace a card with
> a different one then /etc/mactab needs to be edited to reflect your
> new configuration. On clients nodes with an automatic reinstall
> this indeed is not practical but for your front end machine this is
> another story.
>
> It is indeed the case that default startup scripts from Red Hat 7.3
> need some simple additions as interface (re)naming need to be done
> before NICs are brought up for the first time. In RH9 and FC1
> 'nameif' will be used "automagically" if HWADDR variable is defined
> (and with a correct value).

michal,

for this release, we looked at your suggestion of using nameif -- we did a quick prototype and it looks like it will be the right thing to do. we sketched out a design and found that the full solution will require many pieces (database changes, installer changes and the obvious XML file changes). we left this out of 3.1.0 but it is towards the top of our list for the next release.

thanks for the suggestion of nameif -- it is suggestions like that which help us to define the direction of rocks.

- gb


From landman at scalableinformatics.com Wed Dec 24 16:08:54 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Wed, 24 Dec 2003 19:08:54 -0500
Subject: [Rocks-Discuss]Dell Power Connect 5224
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Michal Jaegermann wrote:

> On Wed, Dec 24, 2003 at 04:17:28PM -0600, Chris Dwan (CCGB) wrote:
>
>> Once upon a time, I decided to install a third interface in a rocks head
>> node (Dell SC1400, and a Syskonnect 98x Gig NIC for the interested) for a
>> data network. At boot time *everything* was broken.
>
> I still cannot understand why people insists on NOT using 'nameif'
> utility. All network interfaces can be named whichever way you want
> and they will not move regardless how many NICs you will add or
> remove as long as MACs are not changed. If you replace a card with
> a different one then /etc/mactab needs to be edited to reflect your
> new configuration. On clients nodes with an automatic reinstall
> this indeed is not practical but for your front end machine this is
> another story.

Agreed, though as far as I can tell, nameif is not used in the /etc/init.d scripts. It is used by ifup, so you would have to set HWADDR on each interface in the /etc/sysconfig/.../ifcfg-eth* files (the ... refers to the fact that RH9 and RHEL3 have moved where these things sit from what we were used to in RH7.x). You still need to map the interfaces, though, to see which physical port corresponds to which device/MAC address. With that in hand, you can set up the HWADDR or just swap cables. With the advent of folks making exactly the right length cables (e.g. not giving any play, and placing them under tension while plugged in...) the cable swap doesn't work well for mapping on some systems. Moreover, on a fair number of systems I have played with, the BIOS is set up so that if they PXE boot, they are doing so from the address that the installed version of ROCKS would see as eth1. Annoying.

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615
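The HWADDR pinning described above looks roughly like this in an ifcfg file; the MAC and IP values here are invented for illustration:

```
# /etc/sysconfig/network-scripts/ifcfg-eth0 -- sketch; values are made up.
# With HWADDR set, ifup will only bring this configuration up on the
# device whose MAC matches, so the name stays pinned across reboots
# even if the kernel probes the cards in a different order.
DEVICE=eth0
HWADDR=00:0A:5E:11:22:33
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.1.1.1
NETMASK=255.255.255.0
```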


From junkscarce at hotmail.com Fri Dec 26 15:35:57 2003
From: junkscarce at hotmail.com (Reed Scarce)
Date: Fri, 26 Dec 2003 23:35:57 +0000
Subject: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
Message-ID: <[email protected]>

The line:

chkconfig --level 3 gpm on

works great from the command line, not in extend-compute.xml. Thanks for the new tool though, always glad. The line above is in a block without <eval shell="bash"> tags. I'll keep trying and rtm. Is it possible this is a 2.6.2 issue? The live environment restricts me from using a more recent version.

>From: "Mason J. Katz" <mjk at sdsc.edu>
>To: "Reed Scarce" <junkscarce at hotmail.com>
>CC: npaci-rocks-discussion at sdsc.edu
>Subject: Re: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
>Date: Tue, 23 Dec 2003 16:35:13 -0800
>
>"man chkconfig"
>
>If you use chkconfig you do not need to create the rc*.d/* files and they
>are put in place for you.
>
> -mjk
>
>On Dec 23, 2003, at 3:43 PM, Reed Scarce wrote:
>
>>Within /export/home/install/profiles/2.3.2/site-nodes extend-compute.xml
>>lies code like this commented code:
>><post>
>>/bin/mkdir /mnt/plc/ <-- works -->
>>/bin/mkdir /mnt/plc/plc_data <-- works -->
>>/bin/ln -s /mnt/plc_data /data1 <-- works -->
>>/bin/ln /etc/rc.d/init.d/gpm /etc/rc.d/rc3.d/S15gpm <-- fails to ln,
>>source exists -->
>></post>
>>
>>I don't understand why the ln to a directory succeeds but a ln to a script
>>fails.
>>
>>BTW, Dr. Landman, I've attempted to use your build.pl but it seems to
>>faill with:
>>Can't stat
>>`/usr/src/redhat/RPMS/noarch//finishing-server-"3.0"-1.noarch.rpm .
>>(my note: the path ends at RPMS) I swear I thought I saw a solution to
>>this once but I can't find it again.
>>Upon reinstallation with the file your tool created
>>(/usr/src/RedHat/RPMS/i386/finishing-scripts-3.00-1.i386.rpm) anaconda
>>threw back the exception: Traceback (innermost last): file
>>"/usr/bin/anaconda.real", line 633, in ? intf.run(id, dispatch,
>>configFileData) File
>>"/usr/src/build.90289-i386/install//usr/lib/anaconda/text.py", line 427 in
>>run
>>ok save debug
>>
>>
>>TIA Reed Scarce
>


From mjk at sdsc.edu Fri Dec 26 16:46:22 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Fri, 26 Dec 2003 16:46:22 -0800
Subject: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Not sure if this answers your question. But..

The <eval></eval> blocks are for code to be run on the kickstart server (the one that generates the kickstart file). Code outside of the eval blocks is run on the kickstarting host.

-mjk
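Putting Mason's two execution contexts side by side, a minimal extend-compute.xml sketch might look like this. The gpm line comes from this thread; the eval body and its output are invented for illustration:

```
<post>
<!-- Outside <eval>: runs on the kickstarting compute node itself,
     after the packages have been installed. -->
/sbin/chkconfig --level 3 gpm on

<eval shell="bash">
<!-- Inside <eval>: runs on the frontend while it generates the
     kickstart file; whatever this prints is spliced into that file. -->
echo "# kickstart file generated $(date)"
</eval>
</post>
```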

On Dec 26, 2003, at 3:35 PM, Reed Scarce wrote:

> The line:
>
> chkconfig --level 3 gpm on
>
> works great from the command line, not in extend-compute.xml. Thanks
> for the new tool though, always glad. The line above is in a block
> without <eval shell="bash"> tags. I'll keep trying and rtm. Is it
> possible this is a 2.6.2 issue? The live environment restricts me
> from using a more recent version.
>
>> From: "Mason J. Katz" <mjk at sdsc.edu>
>> To: "Reed Scarce" <junkscarce at hotmail.com>
>> CC: npaci-rocks-discussion at sdsc.edu
>> Subject: Re: [Rocks-Discuss]Extend-compute.xml issue, ln creation
>> fails
>> Date: Tue, 23 Dec 2003 16:35:13 -0800
>>
>> "man chkconfig"
>>
>> If you use chkconfig you do not need to create the rc*.d/* files and
>> they are put in place for you.
>>
>> -mjk
>>
>> On Dec 23, 2003, at 3:43 PM, Reed Scarce wrote:
>>
>>> Within /export/home/install/profiles/2.3.2/site-nodes
>>> extend-compute.xml lies code like this commented code:
>>> <post>
>>> /bin/mkdir /mnt/plc/ <-- works -->
>>> /bin/mkdir /mnt/plc/plc_data <-- works -->
>>> /bin/ln -s /mnt/plc_data /data1 <-- works -->
>>> /bin/ln /etc/rc.d/init.d/gpm /etc/rc.d/rc3.d/S15gpm <-- fails to ln,
>>> source exists -->
>>> </post>
>>>
>>> I don't understand why the ln to a directory succeeds but a ln to a
>>> script fails.
>>>
>>> BTW, Dr. Landman, I've attempted to use your build.pl but it seems
>>> to faill with:
>>> Can't stat
>>> `/usr/src/redhat/RPMS/noarch//finishing-server-"3.0"-1.noarch.rpm .
>>> (my note: the path ends at RPMS) I swear I thought I saw a solution
>>> to this once but I can't find it again.
>>> Upon reinstallation with the file your tool created
>>> (/usr/src/RedHat/RPMS/i386/finishing-scripts-3.00-1.i386.rpm)
>>> anaconda threw back the exception: Traceback (innermost last): file
>>> "/usr/bin/anaconda.real", line 633, in ? intf.run(id, dispatch,
>>> configFileData) File
>>> "/usr/src/build.90289-i386/install//usr/lib/anaconda/text.py", line
>>> 427 in run
>>> ok save debug
>>>
>>>
>>> TIA Reed Scarce

From apseyed at bu.edu Sat Dec 27 12:32:40 2003
From: apseyed at bu.edu (apseyed at bu.edu)
Date: Sat, 27 Dec 2003 15:32:40 -0500
Subject: [Rocks-Discuss]Re: npaci-rocks-discussion digest, Vol 1 #663 - 2 msgs
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

For what it's worth,

Why don't you try specifying the absolute path (/sbin/chkconfig) and echoing debug output to a file? (If you can confirm /sbin is in $PATH for the life of the script, never mind the first suggestion.)

echo "got to chkconfig beginning" > /tmp/ks.log
/sbin/chkconfig --level 3 gpm on
echo "got to chkconfig end" >> /tmp/ks.log
/sbin/chkconfig --list | grep gpm >> /tmp/ks.log

-Patrice

Quoting npaci-rocks-discussion-request at sdsc.edu:

> Send npaci-rocks-discussion mailing list submissions to> npaci-rocks-discussion at sdsc.edu> > To subscribe or unsubscribe via the World Wide Web, visit> http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion> or, via email, send a message with subject or body 'help' to> npaci-rocks-discussion-request at sdsc.edu> > You can reach the person managing the list at> npaci-rocks-discussion-admin at sdsc.edu> > When replying, please edit your Subject line so it is more specific> than "Re: Contents of npaci-rocks-discussion digest..."> > > Today's Topics:> > 1. Re: Extend-compute.xml issue, ln creation fails (Reed Scarce)> 2. Re: Extend-compute.xml issue, ln creation fails (Mason J.> Katz)> > --__--__--> > Message: 1> From: "Reed Scarce" <junkscarce at hotmail.com>> To: mjk at sdsc.edu> Cc: npaci-rocks-discussion at sdsc.edu> Subject: Re: [Rocks-Discuss]Extend-compute.xml issue, ln creation> fails> Date: Fri, 26 Dec 2003 23:35:57 +0000> > The line:> > chkconfig --level 3 gpm on> > works great from the command line, not in extend-compute.xml. Thanks> for > the new tool though, always glad. The line above is in a block> without > <eval shell="bash"> tags. I'll keep trying and rtm. Is it possible> this is > a 2.6.2 issue? The live environment restricts me from using a more> recent > version.> > > >From: "Mason J. Katz" <mjk at sdsc.edu>> >To: "Reed Scarce" <junkscarce at hotmail.com>> >CC: npaci-rocks-discussion at sdsc.edu> >Subject: Re: [Rocks-Discuss]Extend-compute.xml issue, ln creation


> fails> >Date: Tue, 23 Dec 2003 16:35:13 -0800> >> >"man chkconfig"> >> >If you use chkconfig you do not need to create the rc*.d/* files and> they > >are put in place for you.> >> > -mjk> >> >On Dec 23, 2003, at 3:43 PM, Reed Scarce wrote:> >> >>Within /export/home/install/profiles/2.3.2/site-nodes> extend-compute.xml > >>lies code like this commented code:> >><post>> >>/bin/mkdir /mnt/plc/ <-- works -->> >>/bin/mkdir /mnt/plc/plc_data <-- works -->> >>/bin/ln -s /mnt/plc_data /data1 <-- works -->> >>/bin/ln /etc/rc.d/init.d/gpm /etc/rc.d/rc3.d/S15gpm <-- fails to> ln, > >>source exists -->> >></post>> >>> >>I don't understand why the ln to a directory succeeds but a ln to a> script > >>fails.> >>> >>BTW, Dr. Landman, I've attempted to use your build.pl but it seems> to > >>faill with:> >>Can't stat > >>`/usr/src/redhat/RPMS/noarch//finishing-server-"3.0"-1.noarch.rpm> .> >>(my note: the path ends at RPMS) I swear I thought I saw a> solution to > >>this once but I can't find it again.> >>Upon reinstallation with the file your tool created > >>(/usr/src/RedHat/RPMS/i386/finishing-scripts-3.00-1.i386.rpm)> anaconda > >>threw back the exception: Traceback (innermost last): file > >>"/usr/bin/anaconda.real", line 633, in ? intf.run(id, dispatch, > >>configFileData) File > >>"/usr/src/build.90289-i386/install//usr/lib/anaconda/text.py", line> 427 in > >>run> >>ok save debug> >>> >>> >>TIA Reed Scarce> >>> >>_________________________________________________________________> >>Tired of slow downloads? Compare online deals from your local> high-speed > >>providers now. https://broadband.msn.com> >> > _________________________________________________________________


> Worried about inbox overload? Get MSN Extra Storage now! > http://join.msn.com/?PAGE=features/es> > > --__--__--> > Message: 2> Cc: npaci-rocks-discussion at sdsc.edu> From: "Mason J. Katz" <mjk at sdsc.edu>> Subject: Re: [Rocks-Discuss]Extend-compute.xml issue, ln creation> fails> Date: Fri, 26 Dec 2003 16:46:22 -0800> To: "Reed Scarce" <junkscarce at hotmail.com>> > Not sure if this answers your question. But..> > The <eval></eval> blocks are for code to be run on the kickstart> server > (the one the generates the kickstart file). Code outside of the eval> > blocks is run on the kickstarting host.> > -mjk> > > On Dec 26, 2003, at 3:35 PM, Reed Scarce wrote:> > > The line:> >> > chkconfig --level 3 gpm on> >> > works great from the command line, not in extend-compute.xml. > Thanks > > for the new tool though, always glad. The line above is in a block> > > without <eval shell="bash"> tags. I'll keep trying and rtm. Is it> > > possible this is a 2.6.2 issue? The live environment restricts me> > > from using a more recent version.> >> >> >> From: "Mason J. Katz" <mjk at sdsc.edu>> >> To: "Reed Scarce" <junkscarce at hotmail.com>> >> CC: npaci-rocks-discussion at sdsc.edu> >> Subject: Re: [Rocks-Discuss]Extend-compute.xml issue, ln creation> > >> fails> >> Date: Tue, 23 Dec 2003 16:35:13 -0800> >>> >> "man chkconfig"> >>> >> If you use chkconfig you do not need to create the rc*.d/* files> and > >> they are put in place for you.> >>> >> -mjk> >>> >> On Dec 23, 2003, at 3:43 PM, Reed Scarce wrote:


> >>> >>> Within /export/home/install/profiles/2.3.2/site-nodes > >>> extend-compute.xml lies code like this commented code:> >>> <post>> >>> /bin/mkdir /mnt/plc/ <-- works -->> >>> /bin/mkdir /mnt/plc/plc_data <-- works -->> >>> /bin/ln -s /mnt/plc_data /data1 <-- works -->> >>> /bin/ln /etc/rc.d/init.d/gpm /etc/rc.d/rc3.d/S15gpm <-- fails to> ln, > >>> source exists -->> >>> </post>> >>>> >>> I don't understand why the ln to a directory succeeds but a ln to> a > >>> script fails.> >>>> >>> BTW, Dr. Landman, I've attempted to use your build.pl but it> seems > >>> to faill with:> >>> Can't stat > >>> `/usr/src/redhat/RPMS/noarch//finishing-server-"3.0"-1.noarch.rpm> .> >>> (my note: the path ends at RPMS) I swear I thought I saw a> solution > >>> to this once but I can't find it again.> >>> Upon reinstallation with the file your tool created > >>> (/usr/src/RedHat/RPMS/i386/finishing-scripts-3.00-1.i386.rpm) > >>> anaconda threw back the exception: Traceback (innermost last):> file > >>> "/usr/bin/anaconda.real", line 633, in ? intf.run(id, dispatch,> > >>> configFileData) File > >>> "/usr/src/build.90289-i386/install//usr/lib/anaconda/text.py",> line > >>> 427 in run> >>> ok save debug> >>>> >>>> >>> TIA Reed Scarce> >>>> >>>> _________________________________________________________________> >>> Tired of slow downloads? Compare online deals from your local > >>> high-speed providers now. https://broadband.msn.com> >>> >> > _________________________________________________________________> > Worried about inbox overload? Get MSN Extra Storage now! > > http://join.msn.com/?PAGE=features/es> > > > --__--__--> > _______________________________________________> npaci-rocks-discussion mailing list> npaci-rocks-discussion at sdsc.edu> http://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion>


> > End of npaci-rocks-discussion Digest>

From rocks_india at yahoo.co.in Sat Dec 27 20:20:40 2003
From: rocks_india at yahoo.co.in (Rocks India)
Date: Sun, 28 Dec 2003 04:20:40 +0000 (GMT)
Subject: [Rocks-Discuss]Rocks 3.0 Newbeeeeeeee
Message-ID: <[email protected]>

Hello All, I am new to Rocks. I was able to download and install Rocks 3.0. I am not sure if Globus 3.0 gets installed during the installation process. I tried to use simple CA commands and get a "command not found" error. Do I need to download the Globus Tool Kit and install it, or would it be installed along with Rocks?

Or can anyone direct me to a site, or give me the steps that need to be taken after installing Rocks, for working with Globus?

Rocks-India


From bruno at rocksclusters.org Sat Dec 27 21:35:28 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Sat, 27 Dec 2003 21:35:28 -0800
Subject: [Rocks-Discuss]Rocks 3.0 Newbeeeeeeee
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> I am new to Rocks, i was able to download
> and
> install Rocks 3.0. I am not sure if Globus 3.0 gets
> installed during the installation process.I tried to
> use simple ca commands and get command not found
> error.
> Do i need to download Globus Tool Kit and
> install it or would it be installed along with rocks.
>
> Or can any one direct me to a site or give me steps
> that
> need to be taken after installing rocks what need to
> be done for manipulating globus

here are the steps, but note it requires reinstalling your frontend:

go to:


http://www.rocksclusters.org/rocks-documentation/3.1.0/iso-images.html

and download:

Rocks Base, HPC Roll, SGE Roll and the Grid Roll

then burn them all to CD.

then follow the directions at:

http://www.rocksclusters.org/rocks-documentation/3.1.0/install-frontend.html

but, before you get started, you should consult this page too:

http://rocks.npaci.edu/roll-documentation/grid/3.0/adding-the-roll.html

at the end of the process, your frontend will be configured with globus.

- gb

From ramonjt at ucia.gov Mon Dec 29 09:08:45 2003
From: ramonjt at ucia.gov (ramonjt)
Date: Mon, 29 Dec 2003 12:08:45 -0500
Subject: [Rocks-Discuss]Rocks 3.1.0
Message-ID: <[email protected]>

Folks,

Which set of Rocks 3.1.0 downloads support Xeon Processors, "Pentium and Athlon" or "Itanium"?

Thanks,
Ramon

From bruno at rocksclusters.org Mon Dec 29 09:31:56 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 29 Dec 2003 09:31:56 -0800
Subject: [Rocks-Discuss]Rocks 3.1.0
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> Which set of Rocks 3.1.0 downloads support Xeon Processors, "Pentium
> and Athlon" or "Itanium"?

xeons are x86 processors -- so you want the ISO images found under the section:

Software for x86 (Pentium and Athlon)

- gb


From landman at scalableinformatics.com Mon Dec 29 10:49:49 2003
From: landman at scalableinformatics.com (landman)
Date: Mon, 29 Dec 2003 13:49:49 -0500
Subject: [Rocks-Discuss]3.1.0 surprises
Message-ID: <[email protected]>

Pulled the distro. Burned it after checking md5's. Ok. Booted/installed test cluster, completely vanilla, just defaults.

SSH is too slow. Wow. 5-10 seconds to log in.

Ok, now out at a customer site with the disks.

Unhappily discovered that the following are missing:

a) md (e.g. Software RAID): Just try to build one. Anaconda will happily let you do this ... though it will die in the formatting stages. Dropping into the shell (Alt-F2) and looking for the md module (lsmod) shows nothing. Insmodding the md module also doesn't do anything. Catting /proc/devices shows no md as a character or block device.

If md is really not there anymore, it should be removed from anaconda, just like ...

b) ext3. There is no ext3 available for the install.

Also discovered how incredibly fragile anaconda is. In order to install, you have to wipe the disks. It will not install if there is an md (software raid) device, choosing instead to crap out after you have entered all the information. To say that this is annoying is a slight understatement. This is an anaconda issue, not a ROCKS issue, though as a result of this issue, ROCKS is less functional than it could be.

I also noted that there is no xfs option. This means that I will need to hack new kernels later on after the install. Moreover, I will also need to turn on the ext3 journaling features later on (post install).

Hopefully 3.1.1 or 3.2 will fix some of these things.

Joe

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615

From junkscarce at hotmail.com Mon Dec 29 15:15:52 2003
From: junkscarce at hotmail.com (Reed Scarce)
Date: Mon, 29 Dec 2003 23:15:52 +0000
Subject: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
Message-ID: <[email protected]>


Are there any examples of Rocks 2.3.2 extend-compute.xml scripts that work? I need to know the limitations of the distribution. As far as I can tell the commands are available (`which command` locates the commands fine) but they don't necessarily perform the job as expected. I had seen the `eval...` clarification in the archives.

As it stands I plan to mkdir, ln and echo in the extend-c... but then run the heart of the customization (scripted) once the nodes are up. It just doesn't seem to be what was intended.

As always, thanks for your help
--Reed

>From: "Mason J. Katz" <mjk at sdsc.edu>>To: "Reed Scarce" <junkscarce at hotmail.com>>CC: npaci-rocks-discussion at sdsc.edu>Subject: Re: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails>Date: Fri, 26 Dec 2003 16:46:22 -0800>>Not sure if this answers your question. But..>>The <eval></eval> blocks are for code to be run on the kickstart server >(the one the generates the kickstart file). Code outside of the eval >blocks is run on the kickstarting host.>> -mjk>>>On Dec 26, 2003, at 3:35 PM, Reed Scarce wrote:>>>The line:>>>>chkconfig --level 3 gpm on>>>>works great from the command line, not in extend-compute.xml. Thanks for >>the new tool though, always glad. The line above is in a block without >><eval shell="bash"> tags. I'll keep trying and rtm. Is it possible this >>is a 2.6.2 issue? The live environment restricts me from using a more >>recent version.>>>>>>>From: "Mason J. Katz" <mjk at sdsc.edu>>>>To: "Reed Scarce" <junkscarce at hotmail.com>>>>CC: npaci-rocks-discussion at sdsc.edu>>>Subject: Re: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails>>>Date: Tue, 23 Dec 2003 16:35:13 -0800>>>>>>"man chkconfig">>>>>>If you use chkconfig you do not need to create the rc*.d/* files and they >>>are put in place for you.>>>>>> -mjk>>>>>>On Dec 23, 2003, at 3:43 PM, Reed Scarce wrote:>>>>>>>Within /export/home/install/profiles/2.3.2/site-nodes extend-compute.xml >>>>lies code like this commented code:


>>>><post>>>>>/bin/mkdir /mnt/plc/ <-- works -->>>>>/bin/mkdir /mnt/plc/plc_data <-- works -->>>>>/bin/ln -s /mnt/plc_data /data1 <-- works -->>>>>/bin/ln /etc/rc.d/init.d/gpm /etc/rc.d/rc3.d/S15gpm <-- fails to ln, >>>>source exists -->>>>></post>>>>>>>>>I don't understand why the ln to a directory succeeds but a ln to a >>>>script fails.>>>>>>>>BTW, Dr. Landman, I've attempted to use your build.pl but it seems to >>>>faill with:>>>>Can't stat >>>>`/usr/src/redhat/RPMS/noarch//finishing-server-"3.0"-1.noarch.rpm .>>>>(my note: the path ends at RPMS) I swear I thought I saw a solution to >>>>this once but I can't find it again.>>>>Upon reinstallation with the file your tool created >>>>(/usr/src/RedHat/RPMS/i386/finishing-scripts-3.00-1.i386.rpm) anaconda >>>>threw back the exception: Traceback (innermost last): file >>>>"/usr/bin/anaconda.real", line 633, in ? intf.run(id, dispatch, >>>>configFileData) File >>>>"/usr/src/build.90289-i386/install//usr/lib/anaconda/text.py", line 427 >>>>in run>>>>ok save debug>>>>>>>>>>>>TIA Reed Scarce>>>>>>>>_________________________________________________________________>>>>Tired of slow downloads? Compare online deals from your local high-speed >>>>providers now. https://broadband.msn.com>>>>>>>_________________________________________________________________>>Worried about inbox overload? Get MSN Extra Storage now! >>http://join.msn.com/?PAGE=features/es>


From dlane at ap.stmarys.ca Mon Dec 29 15:44:23 2003
From: dlane at ap.stmarys.ca (Dave Lane)
Date: Mon, 29 Dec 2003 19:44:23 -0400
Subject: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

At 11:15 PM 12/29/2003 +0000, Reed Scarce wrote:
>Are there any examples of Rocks 2.3.2 extend-compute.xml scripts that work?

Reed,

Below is a script that worked fine for me (with 2.3.2). What it does should be fairly self-explanatory.
Dave


--->>>

<post>
<!-- Insert your post installation script here. This code will be
executed on the destination node after the packages have been
installed. Typically configuration files are built and services
setup in this section. -->

mv /usr/local /usr/local-old
ln -s /home/local /usr/local
ln -s /home/opt/intel /opt/intel
ln -s /home/disc15 /disc15
mkdir /scratch/tmp
chmod 1777 /scratch/tmp
echo '#!/bin/bash' > /etc/init.d/wait
echo 'sleep 60' >> /etc/init.d/wait
chmod +x /etc/init.d/wait
ln -s /etc/init.d/wait /etc/rc3.d/S11wait
ln -s /etc/init.d/wait /etc/rc4.d/S11wait
ln -s /etc/init.d/wait /etc/rc5.d/S11wait

<eval sh="python">
<!-- This is python code that will be executed on the frontend node
during kickstart generation. You may contact the database, make
network queries, etc. These sections are generally used to help
build more complex configuration files. The 'sh' attribute may
point to any language interpreter such as "bash", "perl", "ruby",
etc. -->
</eval>
</post>

From bruno at rocksclusters.org Mon Dec 29 19:03:25 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 29 Dec 2003 19:03:25 -0800
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> Pulled the distro. Burned it after checking md5's. Ok.
> Booted/installed test
> cluster, completely vanilla, just defaults.

i'm assuming this is an x86 installation, yes?

> SSH is too slow. Wow. 5-10 seconds to log in.

that is not the case on our clusters. in fact, we tested this on all three architectures and all three are 'fast'.

> Ok, now out at a customer site with the disks.
>
> Unhappily discovered that the following are missing:
>


> a) md (e.g. Software RAID): Just try to build one. Anaconda will
> happily let
> you do this ... though it will die in the formatting stages. Dropping
> into the
> shell (Alt-F2) and looking for the md module (lsmod) shows nothing.
> Insmod the
> md also doesn't do anything. Catting /proc/devices shows no md as a
> character
> or block device.
>
> If md is really not there anymore, it should be removed from anaconda,
> just like ...
>
> b) ext3. There is no ext3 available for the install.
>
> Also discovered how incredibly fragile anaconda is. In order to
> install, you
> have to wipe the disks. It will not install if there is an md
> (software raid)
> device, chosing instead to crap out after you have entered in all the
> information. To say that this is annoying is a slight understatement.
> This is
> an anaconda issue, not a ROCKS issue, though as a result of this
> issue, ROCKS is
> less functional than it could be.

we'll look into the above two issues.

> I also noted that there is no xfs option. This means that I will need
> to hack
> new kernels later on after the install.

just curious, is xfs offered as an option on other redhat supported products?

also (and i'm assuming this will be no consolation to you, but it may be to others), building a new kernel RPM is straightforward in rocks:

http://www.rocksclusters.org/rocks-documentation/3.1.0/customization-kernel.html

- gb

From landman at scalableinformatics.com Mon Dec 29 19:44:16 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Mon, 29 Dec 2003 22:44:16 -0500
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Mon, 2003-12-29 at 22:03, Greg Bruno wrote:
> > Pulled the distro. Burned it after checking md5's. Ok.
> > Booted/installed test
> > cluster, completely vanilla, just defaults.
>
> i'm assuming this is an x86 installation, yes?

Yes.

> > SSH is too slow. Wow. 5-10 seconds to log in.
>
> that is not the case on our clusters. in fact, we tested this on all
> three architectures and all three are 'fast'.

2 different clusters exhibited the same results. Fixed one by applying dnsmasq to one of them.

> > Ok, now out at a customer site with the disks.
> >
> > Unhappily discovered that the following are missing:
> >
> > a) md (e.g. Software RAID): Just try to build one. Anaconda will
> > happily let you do this ... though it will die in the formatting
> > stages. Dropping into the shell (Alt-F2) and looking for the md
> > module (lsmod) shows nothing. Insmod the md also doesn't do anything.
> > Catting /proc/devices shows no md as a character or block device.
> >
> > If md is really not there anymore, it should be removed from
> > anaconda, just like ...
> >
> > b) ext3. There is no ext3 available for the install.
> >
> > Also discovered how incredibly fragile anaconda is. In order to
> > install, you have to wipe the disks. It will not install if there is
> > an md (software raid) device, choosing instead to crap out after you
> > have entered in all the information. To say that this is annoying is
> > a slight understatement. This is an anaconda issue, not a ROCKS
> > issue, though as a result of this issue, ROCKS is less functional
> > than it could be.
>
> we'll look into the above two issues.

Thanks

> > I also noted that there is no xfs option. This means that I will
> > need to hack new kernels later on after the install.
>
> just curious, is xfs offered as an option on other redhat supported
> products?

Nope, nor will Redhat likely do this in the near/mid term. This is
fairly common knowledge. All the other major distros do offer it. I
hope that the defense of the current state isn't that "Redhat doesn't
support it". I might have misunderstood you, but Redhat is almost
completely disinterested in clusters, so Redhat supporting/not
supporting it is really not relevant.

Curiously, cAos, which is doing some of the similar things ROCKS is
doing in terms of recompiling packages sans Redhat trademarks, has XFS
and a number of other useful things in there.

Regardless, having ext2 or vfat as your only fs options simply is not
reasonable, as neither of these is really appropriate for very large
disks or big file systems.

> also (and i'm assuming this will be no consolation to you, but it may
> be to others), building a new kernel RPM is straightforward in rocks:
>
> http://www.rocksclusters.org/rocks-documentation/3.1.0/customization-kernel.html

I had been planning to use a similar approach to this. I was/am simply
quite surprised that the two options for ROCKS file systems are really
not very good, and the good choices are unavailable. In all fairness
this is more likely a constraint of anaconda than of ROCKS.

I fixed the ext2/ext3 by a reboot after a quick tune2fs session and some
fixup of the /etc/fstab.
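[For anyone hitting the same thing, the ext2-to-ext3 fixup Joe describes can be sketched roughly like this. It is an illustration, not his exact commands: /dev/sda1 is an assumed device name, and the fstab rewrite is demonstrated against a sample line rather than the live file.]

```shell
# Step 1 (shown as a comment since it needs a real partition):
#   tune2fs -j /dev/sda1    # add a journal, turning ext2 into ext3
# Step 2: flip the matching fstab entry from ext2 to ext3.
fix_fstab() {
    # rewrite "device ... ext2" to "device ... ext3" in fstab-style lines
    sed 's|^\(/dev/[a-z0-9]\{1,\}[[:space:]]\{1,\}[^[:space:]]\{1,\}[[:space:]]\{1,\}\)ext2|\1ext3|'
}
echo '/dev/sda1  /  ext2  defaults  1 1' | fix_fstab
# -> /dev/sda1  /  ext3  defaults  1 1
```

[On a real node you would run the sed in-place against /etc/fstab, after taking a backup, and then reboot.]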

I have to say that I get less and less impressed with anaconda as time
goes on.

I fixed the partitioning problem (anaconda dies when it runs in an md'ed
set of partitions) by wiping the disk and using knoppix to fdisk the
disks. Autopartitioning is not an option, as the default choices are
not all that good (another anaconda-ism).

> > - gb

From cdwan at mail.ahc.umn.edu Mon Dec 29 20:58:20 2003
From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB))
Date: Mon, 29 Dec 2003 22:58:20 -0600 (CST)
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

I also encountered the Software RAID problem today. It made upgrading
an existing ROCKS cluster a little tricky.

Another behavior I noticed was that the CDs were not ejecting as the
node installs finished. It was manageable, but required watching to
prevent the endless reinstall cycle.

-Chris Dwan

From bruno at rocksclusters.org Mon Dec 29 21:48:22 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 29 Dec 2003 21:48:22 -0800
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> Another behavior I noticed was that the CDs were not ejecting as the
> node installs finished. It was manageable, but required watching to
> prevent the endless reinstall cycle.

actually, it isn't a problem as the last CD in the frontend will be a roll and rolls are not bootable.

- gb

From cdwan at mail.ahc.umn.edu Mon Dec 29 21:51:13 2003
From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB))
Date: Mon, 29 Dec 2003 23:51:13 -0600 (CST)
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> > Another behavior I noticed was that the CDs were not ejecting as the
> > node installs finished. It was manageable, but required watching to
> > prevent the endless reinstall cycle.
>
> actually, it isn't a problem as the last CD in the frontend will be a
> roll and rolls are not bootable.

You're right about the frontend. It was the compute nodes where it gave
me trouble. Roll disks never go in those.

-Chris Dwan

From landman at scalableinformatics.com Mon Dec 29 22:03:06 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 30 Dec 2003 01:03:06 -0500
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

What I had noticed is that some CD hardware does not eject when
prompting for swapping in the roll. I swapped hardware and that fixed
it. Rather odd. Seen this in 3 different systems. Worked ok with
previous ROCKS.

Is it possible to do something like a

frontend askmethod

akin to the "linux askmethod" and specifically have the ISO's online in
a directory somewhere? Just curious... I find it interesting that 10
years after swapping floppies for OS installs, I am now swapping CDs...
There is irony here somewhere.

On Tue, 2003-12-30 at 00:48, Greg Bruno wrote:
> > Another behavior I noticed was that the CDs were not ejecting as the
> > node installs finished. It was manageable, but required watching to
> > prevent the endless reinstall cycle.
>
> actually, it isn't a problem as the last CD in the frontend will be a
> roll and rolls are not bootable.
>
> - gb

From bruno at rocksclusters.org Mon Dec 29 22:28:45 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 29 Dec 2003 22:28:45 -0800
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> Is it possible to do something like a
>
> frontend askmethod
>
> akin to the "linux askmethod" and specifically have the ISO's online in
> a directory somewhere? Just curious...

the ability to install frontends remotely is at the top of our priority list for the next release.


> I find it interesting that 10 years after swapping floppies for OS
> installs, I am now swapping CDs... There is irony here somewhere.

sorry, i'm going to have to evangelize rolls a bit.

joe, do you not have just a bit of appreciation for rolls and what is going on under the sheets? we now have a formal way for you, that's right you, to augment the installation of a cluster. you get to programmatically interact with the installer at virtually any level. you get to tell the installer what bits you want it to lay down and how to configure them. and this is done completely independently of the core. the core has no idea of your bits, yet, it installs it and configures it to your specification.

for you, this could be having the 'scalable informatics' roll that contains all your RPMS and XML configuration files. this ISO image could be completely proprietary, yet, the installer installs it. you could ship your roll worldwide and every one of your customers would, within 2 hours, have a scalable informatics cluster online running the applications you sold them. and, you know it would be running because you embedded the correct configuration into the roll.

or, perhaps rolls work so smoothly, it just looks like CD swapping. :-)

- gb

From landman at scalableinformatics.com Mon Dec 29 22:50:30 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 30 Dec 2003 01:50:30 -0500
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Tue, 2003-12-30 at 01:28, Greg Bruno wrote:

> > There is irony here somewhere.
>
> sorry, i'm going to have to evangelize rolls a bit.
>
> joe, do you not have just a bit of appreciation for rolls and what is
> going on under the sheets? we now have a formal way for you, that's
> right you, to augment the installation of a cluster. you get to
> programmatically interact with the installer at virtually any level.
> you get to tell the installer what bits you want it to lay down and how
> to configure them. and this is done completely independently of the
> core. the core has no idea of your bits, yet, it installs it and
> configures it to your specification.


Actually I do have a pretty good appreciation for them. I see that they
are a different way of solving the problems I have been solving for a
while using "other methods"
(http://scalableinformatics.com/downloads/finishing/finishing-v3.1.0.tar.gz).
What I don't see is how to build them (yes, I did see the "source"
messages, and "cvs", ...).

The major issue for me is going to be anaconda, all its joy and bugs,
and what directions its use forces ROCKS to follow (vis-a-vis file
systems, etc).

> for you, this could be having the 'scalable informatics' roll that
> contains all your RPMS and XML configuration files. this ISO image
> could be completely proprietary, yet, the installer installs it. you
> could ship your roll worldwide and every one of your customers would,
> within 2 hours, have a scalable informatics cluster online running the
> applications you sold them. and, you know it would be running because
> you embedded the correct configuration into the roll.

This is a nice vision, though it is unfortunately a vision. The
customer would have to re-install the cluster head node when a new
version of the bits comes out. Right? This is simply not tenable for a
production cycle facility that needs to upgrade a package. Please let
me know if my understanding is incorrect, I would be quite happy to hear
this.

The "other method" that I developed doesn't have this as a problem. Just re-install the compute nodes, and load the RPM on the head nodes. In fact I built some tools which simplify both the "other method" andthe ROCKS method. As I have to worry about multiple different clusterdistros (not just ROCKS, sorry, customers get what they need/want), Ihave to worry about interfacing with that distro. So I have some tools(the auto-build scripts) which simplify adding/removing packages intothe extend-compute.xml.

What I am hoping for rolls are two things: 1) insertable/removable from
a live cluster without forcing a re-install of the head node (compute
nodes, that's fine, not the head nodes) and 2) simple documentation on
how to build them. If they are really quite simple, I see no reason I
could not have the same tool I use to automate the building of
installable RPMS for the other method actually emit a ROCKS roll. But I
need to know how to do this. I am not sure I have sufficient time to
"read the source, Luke" for this. I would be happy to do this given
time, and customer demand/need. The other method had that, hence its
development.

> > or, perhaps rolls work so smoothly, it just looks like CD swapping. :-)

My point was that after inserting the SGE roll, I had to get up from the
console, walk over to the unit, swap in the next roll, iterate....

Felt like CD swapping to me.

Rolls won't solve other problems which are anaconda specific (file
systems, partitioning, formatting, RAID, network detection, etc). As
there are multiple similar RHEL de-redhatifying efforts, some of which
are drastically improving the installation process (by not using
anaconda), are you folks looking to move away from anaconda any time
soon?

> > - gb
--

From bruno at rocksclusters.org Mon Dec 29 23:45:52 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Mon, 29 Dec 2003 23:45:52 -0800
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> This is a nice vision, though it is unfortunately a vision. The
> customer would have to re-install the cluster head node when a new
> version of the bits comes out. Right? This is simply not tenable for a
> production cycle facility that needs to upgrade a package. Please let
> me know if my understanding is incorrect, I would be quite happy to
> hear this.

we've talked about this on the list and we've talked with you about this in person. you know the above statement is true. you also know it is a future direction for rolls.

> What I am hoping for rolls are two things: 1) insertable/removable from
> a live cluster without forcing a re-install of the head node (compute
> nodes, thats fine, not the head nodes) 2) simple documentation on how
> to build. If they are really quite simple, I see no reason I could not
> take the same tool I use to automate the building of installable RPMS
> for the other method actually emit a ROCKS roll. But I need to know
> how to do this. I am not sure I have sufficient time to "read the
> source, Luke" for this. I would be happy to do this given time, and
> customer demand/need. The other method had that, hence its development.

a roll developer's guide is in progress. and, as stated above, adding rolls to a live frontend is on our roadmap.

> Rolls wont solve other problems which are anaconda specific (file
> systems, partitioning, formatting, RAID, network detection, etc).

not true. if you wish to get deeply involved with the red hat installer, you can develop a 'patch' roll that will change the installer to do as you wish.

> As there are multiple similar RHEL de-redhatifying efforts, some of
> which are drastically improving the installation process (by not using
> anaconda), are you folks looking to move away from anaconda any time
> soon?

please educate us -- where can we download these installers and find the developer guides that describe how to interact with the installer.

as for moving away from anaconda, i don't think that will happen anytime
soon. anaconda has served us well. we have all had issues with the
installer, but i would rather work with anaconda than reinvent it. the
boys and girls at redhat have a vested interest in detecting and
configuring the latest hardware and i plan on leveraging that.

of the issues you mention above, the only one we don't know how to control yet is file system selection (but, we will look into it per your earlier request). we already manipulate anaconda to partition and format the drives to our specifications, and we have ideas on how to handle RAID and network naming (which is what i think you mean by network detection).

- gb

From landman at scalableinformatics.com Tue Dec 30 00:55:37 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 30 Dec 2003 03:55:37 -0500
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Tue, 2003-12-30 at 02:45, Greg Bruno wrote:
> > This is a nice vision, though it is unfortunately a vision. The
> > customer would have to re-install the cluster head node when a new
> > version of the bits comes out. Right? This is simply not tenable for
> > a production cycle facility that needs to upgrade a package. Please
> > let me know if my understanding is incorrect, I would be quite happy
> > to hear this.
>
> we've talked about this on the list and we've talked with you about
> this in person. you know the above statement is true. you also know it
> is a future direction for rolls.

I was simply responding to the evangelism which seemed to imply the
functionality existed today. It doesn't, and we both agree that it is
necessary. Although the vision will provide innumerable benefits ...
ROCKS is not there yet, and won't be for a while.


That's ok though, as I have a reasonable workaround for some of these
issues. And when I can insert and delete rolls live into a cluster,
I'll modify my tools to emit rolls. Until then, it is as you said, a
vision for the future.

[...]

> a roll developer's guide is in progress. and, as stated above, adding
> rolls to a live frontend is on our roadmap.

Adding and removing are needed as we have discussed.

> > Rolls wont solve other problems which are anaconda specific (file
> > systems, partitioning, formatting, RAID, network detection, etc).
>
> not true. if you wish to get deeply involved with the red hat
> installer, you can develop a 'patch' roll that will change the
> installer to do as you wish.

I guess I am at a loss to understand what it is you are doing then. If
you are telling me I can hack around anaconda to my heart's content, why
do you tell me later on that ROCKS is deeply wedded to anaconda and will
not change soon? I will assume I am missing something here. Can I
replace anaconda? This is what I think you are saying. If you are
instead saying, no don't replace, just hack it, I am not sure I want to
do that. It is a very large and complex beast, with one system doing
the job of many. Jack of all trades.

More than half of the pain I have experienced deploying ROCKS is
directly attributable to anaconda. I would like to work around it. If
I can completely replace it under ROCKS this could be of interest. If I
cannot, and ROCKS will always remain closely tied to RedHat specific
technology (e.g. anaconda), that is also important to know.

> > As there are multiple similar RHEL de-redhatifying efforts, some of
> > which are drastically improving the installation process (by not
> > using anaconda), are you folks looking to move away from anaconda any
> > time soon?
>
> please educate us -- where can we download these installers and find
> the developer guides that describe how to interact with the installer.

If you are serious about this, I would be happy to help you find more
development info and help make introductions to some of the people doing
this stuff. If you are not serious about this, that's fine too.

> as for moving away from anaconda, i don't think that will happen
> anytime soon. anaconda has served us well. we have all had issues with
> the installer, but i would rather work with anaconda rather than
> reinvent it. the boys and girls at redhat have a vested interest in
> detecting and configuring the latest hardware and i plan on leveraging
> that.

Knoppix makes good use of the anaconda detection routines without using
anaconda. You do not need anaconda in its entirety for the detection
routines.


While Redhat has a vested interest in making sure it detects hardware
well, the software that does its installation has been getting more and
more fragile compared to other installation systems. Simple failures of
one item or the other in the SUSE YAST tool, or the Mandrake installer,
or for that matter, most of the non-anaconda based installers do not
force you to start over from the beginning. Stack traces are not given,
and you are not asked to debug an arcane and complex python program from
a highly limited command window. You are brought back to a well known
and well defined state, and you have a finite and non-zero chance of
recovering from the failure. This is different than the anaconda
experience, where the slightest hiccup, which would be trivially
correctable given the opportunity, results in a complete failure of the
process.

This has resulted in our discovery of the RH9/RHEL fragility and
sensitivity (and lack of ability) with respect to software raid,
partitioning, and related. This has wasted many hours of our collective
time, and cost those of us with software RAID systems the ability to use
the upgrade option.

As ROCKS depends critically upon this bit of technology that you
indicate later on is so important, ROCKS happens to share in its
pitfalls, even though these are not ROCKS problems. I am not sure if
you understand how much time I have to spend explaining to customers and
end users why what they are seeing are not ROCKS problems but Redhat
artifacts. Part of the reason I am raising this issue in this forum is
that I have spent altogether too much time trying to explain this to
various users.

> of the issues you mention above, the only one we don't know how to
> control yet is file system selection (but, we will look into it per
> your earlier request). we already manipulate anaconda to partition and
> format the drives to our specifications, and we have ideas on how to
> handle RAID and network naming (which is what i think you mean by
> network detection).

Network detection is

a) getting the right network driver config
   1) by detection
   2) from floppy/usb/whatever

b) getting the correct network interface ordering (what you call naming)

The point you (somewhat whimsically) made was that I could create
Scalable Informatics rolls and ship them around the world for people to
use in 2 hours. Great. Good vision, and that is something like what I
am looking at. I have that now with my tools, but I can always expand
their functionality.

Now the problem is, if after shipping out my roll, when my end users
install it, anaconda barfs in some new and exciting manner (has happened
already with the finishing scripts, and I have worked hard to try to
figure out what is broken in anaconda to work around its bugs), who are
the customers going to blame?

My experience thus far is that ROCKS is taking more than its fair share
of heat over bugs that it has nothing to do with.


From fds at sdsc.edu Tue Dec 30 05:53:48 2003
From: fds at sdsc.edu (fds at sdsc.edu)
Date: Tue, 30 Dec 2003 05:53:48 -0800 (PST)
Subject: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Code in the <post> section of an xml file (extend-compute or otherwise)
can be almost anything. When the script is run, the environment is not
as full as usual, which is why we always recommend specifying the full
path to commands. As you saw, /bin and /usr/bin are in the path, so
certain things like "which sed" will work, for example.

Remember that everything in the eval tags gets run at kickstart
generation time (on the frontend). Everything else (the naked commands
in the post section) is run by the node being installed.
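[To make the timing split concrete, here is a minimal extend-compute.xml sketch; the commands are invented for illustration, not taken from any poster's actual file. The naked commands run on the installing node, the <eval> body runs on the frontend.]

```xml
<post>
<!-- runs on the compute node after packages are installed;
     note the full paths, since the environment is sparse -->
/bin/mkdir -p /scratch/tmp
/bin/chmod 1777 /scratch/tmp

<eval sh="bash">
<!-- runs on the frontend during kickstart generation -->
echo "# kickstart generated on $(/bin/hostname)"
</eval>
</post>
```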

We do intend for the heart of the customization to be performed at
kickstart time. I would be surprised if you had to postpone many tasks
until the node was up, although this does happen occasionally. The
globus and condor post configuration contain tasks that cannot be done
at install time.

Send us the scripts in question and we will take a look.

-Federico

> Are there any examples of Rocks 2.3.2 extend-compute.xml scripts that
> work? I need to know the limitations of the distribution. As far as I
> can tell the commands are available (`which command` locates the
> commands fine) but they don't necessarily perform the job as expected.
> I had seen the `eval...` clarification in the archives.
>
> As it stands I plan to mkdir, ln and echo in the extend-c... but then
> run the heart of the customization (scripted) once the nodes are up.
> It just doesn't seem to be what was intended.
>
> As always, thanks for your help
> --Reed

From purikk at hotmail.com Tue Dec 30 06:03:02 2003
From: purikk at hotmail.com (Purushotham Komaravolu)
Date: Tue, 30 Dec 2003 09:03:02 -0500
Subject: [Rocks-Discuss]Licensing
References: <[email protected]>
Message-ID: <[email protected]>

Hi All,
I would like to know the list of the components that have to be licensed
when we install ROCKS as a commercial solution.
Thanks
Happy Holidays
Puru

From doug at seismo.berkeley.edu Tue Dec 30 10:53:36 2003
From: doug at seismo.berkeley.edu (Doug Neuhauser)
Date: Tue, 30 Dec 2003 10:53:36 -0800 (PST)
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
Message-ID: <[email protected]>

I am having a problem upgrading Rocks 2.3.2 to 3.1.0.
Both my head node and compute nodes are dual XEON 2.4 GHz boxes.

We burned the CDs from the following images:
rocks-base-3.1.0.i386.iso
roll-hpc-3.1.0-0.i386.iso
roll-grid-3.1.0-0.any.iso
roll-intel-3.1.0-0.any.iso
roll-sge-3.1.0-0.any.iso

I verified the md5s both on the downloaded images from the rocks
web site and the md5s on the burned cds. They are fine.
I have run the upgrade several times -- at least once with all of the
rolls, and once with just the rocks base and hpc roll.

The head node installs with no problem using the command
frontend upgrade

I can login and run insert-ethers, telling it to look for compute nodes.

When I power on a compute node, it boots grub, selects the only
kernel on its local disk

Rocks Reinstall
and runs through the /sbin/loader.
The blue screen comes up, the compute node requests and receives a
dynamic IP address from the head node, but then within a few seconds
aborts with the messages:

install exited abnormally - received signal 11
sending termination signals ... done
sending kill signals ... done
disabling swap ...
unmounting filesystems ...
        /proc/bus/usb done
        /proc done
        /dev/pts done
You may safely reboot your system

It appears that the "Rocks Reinstall" kernel on the disk is not
compatible with Rocks 3.1.0. When I changed the compute node boot order
to perform a PXE boot before the hard disks, it properly downloads the
3.1.0 kernel from the head node, reformats the disk, and installs 3.1.0
properly. I have to catch it in the reboot, and change the boot order
to use the disk before PXE, or I get into an infinite loop.

Is there any better way to address this problem? The procedure of:
    set PXE boot first
    boot from net, install rocks 3.1.0 on disk
    reboot
    catch node during reboot, change boot order to floppy,disk,net
    reboot
for each node is tedious.

Did I do something wrong in how I shut my 2.3.2 cluster down before the
upgrade? If so, some notes about this in the install instructions would
be useful.

- Doug N

------------------------------------------------------------------------
Doug Neuhauser                    University of California, Berkeley
doug at seismo.berkeley.edu       Berkeley Seismological Laboratory
Phone: 510-642-0931               215 McCone Hall # 4760
Fax: 510-643-5811                 Berkeley, CA 94720-4760

From bruno at rocksclusters.org Tue Dec 30 11:29:14 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Tue, 30 Dec 2003 11:29:14 -0800
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

On Dec 30, 2003, at 10:53 AM, Doug Neuhauser wrote:

> I am having a problem upgrading Rocks 2.3.2 to 3.1.0.
> Both my head node and compute nodes are dual XEON 2.4 GHz boxes.
>
> We burned the CDs from the following images:
> rocks-base-3.1.0.i386.iso
> roll-hpc-3.1.0-0.i386.iso
> roll-grid-3.1.0-0.any.iso
> roll-intel-3.1.0-0.any.iso
> roll-sge-3.1.0-0.any.iso
> I verified the md5s both on the downloaded images from the rocks
> web site and the md5s on the burned cds. They are fine.
> I have run the upgrade several times -- at least once with all of the
> rolls, and once with just the rocks base and hpc roll.
>
> The head node installs with no problem using the command
> frontend upgrade
> I can login and run insert-ethers, telling it to look for compute
> nodes.
>
> When I power on a compute node, it boots grub, selects the only
> kernel on its local disk
> Rocks Reinstall
> and runs through the /sbin/loader.
> The blue screen comes up, the compute node requests and receives a
> dynamic IP address from the head node, but then within a few seconds
> aborts with the messages:
> install exited abnormally - received signal 11
> sending termination signals ... done
> sending kill signals ... done
> disabling swap ...
> unmounting filesystems ...
> /proc/bus/usb done
> /proc done
> /dev/pts done
> You may safely reboot your system
>
> It appears that the "Rocks Reinstall" kernel on the disk is not
> compatible with Rocks 3.1.0. When I changed the compute node boot
> order to perform a PXE boot before the hard disks, it properly
> downloads the 3.1.0 kernel from the head node, reformats the disk, and
> installs 3.1.0 properly. I have to catch it in the reboot, and change
> the boot order to use the disk before PXE, or I get into an infinite
> loop.
>
> Is there any better way to address this problem? The procedure of:
> set PXE boot first
> boot from net, install rocks 3.1.0 on disk
> reboot
> catch node during reboot, change boot order to floppy,disk,net
> reboot
> for each node is tedious.
>
> Did I do something wrong in how I shut my 2.3.2 cluster down before the
> upgrade? If so, some notes about this in the install instructions
> would be useful.

you're right, the 2.3.2 installer (anaconda from redhat's version 7.3)
is not compatible with the installer on rocks 3.1 (anaconda from
redhat's enterprise linux 3.0).

the way you will have to reinstall your cluster is one of two ways:

1) if your compute nodes support PXE that is enabled from the keyboard -- that is, when you boot the node, in BIOS you see a message that says "Press F12 for Network Boot (PXE)". if your nodes have that, then you'll have to boot the nodes, one by one and, when you see the message, press the F12 key, then move to the next node.

2) use the rocks base CD to boot each compute node. when insert-ethers reports that it discovered the node, take the CD out and put it in the next compute node.

but, if your compute nodes were initially installed with PXE, the
fastest way to upgrade the compute nodes is to simply turn all the
compute nodes off, upgrade the frontend, run insert-ethers, then turn
the compute nodes on one by one. the compute nodes should be set for
PXE boot, which will pull the installer from the frontend and therefore
use the updated installer.

as you state above, we need to document this.

thanks for the bug report.

- gb


From doug at seismo.berkeley.edu Tue Dec 30 11:45:59 2003
From: doug at seismo.berkeley.edu (Doug Neuhauser)
Date: Tue, 30 Dec 2003 11:45:59 -0800 (PST)
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
Message-ID: <[email protected]>

Greg,

1. I don't have cdroms on my compute nodes, only floppy. :(
2. My boot order on the compute nodes is normally:
   floppy, disk, PXE
3. I don't have a hot-key override to force PXE boot. I have to change
   the BIOS boot order to enable PXE boot.

> but, if your compute nodes were initially installed with PXE, the
> fastest way to upgrade the compute nodes is to simply turn all the
> compute nodes off, upgrade the frontend, run insert-ethers, then turn
> the compute nodes on one by one. the compute nodes should be set for
> PXE boot which will pull the installer from the frontend and therefore
> be updated installer.

I don't understand this.

I can't leave the compute nodes with PXE boot first, or it will create
an endless loop. The compute node will boot via PXE, install rocks
3.1.0, and then reboot via PXE and repeat the process ad nauseam.

Can I use the old floppy boot image found at:
ftp://rocksclusters.org/pub/rocks/current/i386/bootnet.img

to force a network boot?

The 3.1.0 online manual has a link in the section
"1.3 Install your Compute Nodes"
to ftp://www.rocksclusters.org/pub/rocks/bootnet.img
but this does not exist.

- Doug N
------------------------------------------------------------------------
Doug Neuhauser                University of California, Berkeley
doug at seismo.berkeley.edu   Berkeley Seismological Laboratory
Phone: 510-642-0931           215 McCone Hall # 4760
Fax: 510-643-5811             Berkeley, CA 94720-4760

From junkscarce at hotmail.com Tue Dec 30 11:57:16 2003
From: junkscarce at hotmail.com (Reed Scarce)
Date: Tue, 30 Dec 2003 19:57:16 +0000
Subject: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
Message-ID: <[email protected]>

I tested your echo ... wait and ln wait... S11wait lines. They worked perfectly. Then I tried the same with gpm and left wait in the script. Wait worked as before, and gpm didn't work - like before. I've given up on doing anything very fancy and have started to make a script to run the first time it boots, with hand removal.

Thanks for the perspective,--Reed

>From: Dave Lane <dlane at ap.stmarys.ca>
>To: "Reed Scarce" <junkscarce at hotmail.com>
>CC: npaci-rocks-discussion at sdsc.edu
>Subject: Re: [Rocks-Discuss]Extend-compute.xml issue, ln creation fails
>Date: Mon, 29 Dec 2003 19:44:23 -0400
>
>At 11:15 PM 12/29/2003 +0000, Reed Scarce wrote:
>>Are there any examples of Rocks 2.3.2 extend-compute.xml scripts that
>>work?
>
>Reed,
>
>Below is a script that worked fine for me (with 2.3.2). What it does should
>be fairly explanatory...Dave
>
>---
>
><post>
> <!-- Insert your post installation script here. This
> code will be executed on the destination node after the
> packages have been installed. Typically configuration files
> are built and services setup in this section. -->
>
>mv /usr/local /usr/local-old
>ln -s /home/local /usr/local
>ln -s /home/opt/intel /opt/intel
>ln -s /home/disc15 /disc15
>mkdir /scratch/tmp
>chmod 1777 /scratch/tmp
>echo '#!/bin/bash' > /etc/init.d/wait
>echo 'sleep 60' >> /etc/init.d/wait
>chmod +x /etc/init.d/wait
>ln -s /etc/init.d/wait /etc/rc3.d/S11wait
>ln -s /etc/init.d/wait /etc/rc4.d/S11wait
>ln -s /etc/init.d/wait /etc/rc5.d/S11wait
>
> <eval sh="python">
> <!-- This is python code that will be executed on
> the frontend node during kickstart generation. You
> may contact the database, make network queries, etc.
> These sections are generally used to help build
> more complex configuration files.
> The 'sh' attribute may point to any language interpreter
> such as "bash", "perl", "ruby", etc.
> -->
> </eval>
></post>

From landman at scalableinformatics.com Tue Dec 30 12:01:44 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue, 30 Dec 2003 15:01:44 -0500
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

Hi Doug:

As long as pxe is in there, you should be able to do this
(semi-)automatically. All you need to do is to wipe the partition
tables and boot sectors of the compute nodes. I seem to remember a
really simple single-floppy tool that did this.

See http://paud.sourceforge.net/ and http://dban.sourceforge.net/

I think dban is the right one. After that (only on compute nodes) you
should be able to pxe boot.

Joe

On Tue, 2003-12-30 at 14:45, Doug Neuhauser wrote:
> Greg,
>
> 1. I don't have cdroms on my compute nodes, only floppy. :(
> 2. My boot order on the compute nodes is normally:
>    floppy, disk, PXE
> 3. I don't have a hot-key override to force PXE boot.
>    I have to change the BIOS boot order to enable PXE boot.
>
> > but, if your compute nodes were initially installed with PXE, the
> > fastest way to upgrade the compute nodes is to simply turn all the
> > compute nodes off, upgrade the frontend, run insert-ethers, then turn
> > the compute nodes on one by one. the compute nodes should be set for
> > PXE boot which will pull the installer from the frontend and therefore
> > be updated installer.
>
> I don't understand this.
>
> I can't leave the compute nodes with PXE boot first, or it will create an
> endless loop. The compute node will boot via PXE, install rocks 3.1.0,
> and then reboot via PXE and repeat the process ad-nauseum.
>
> Can I use the old floppy boot image found at:
> ftp://rocksclusters.org/pub/rocks/current/i386/bootnet.img
> to force a network boot?
>
> The 3.1.0 online manual has a link in the section
> 1.3 Install your Compute Nodes
> to ftp://www.rocksclusters.org/pub/rocks/bootnet.img
> but this does not exist.
>
> - Doug N
> ------------------------------------------------------------------------
> Doug Neuhauser                University of California, Berkeley
> doug at seismo.berkeley.edu   Berkeley Seismological Laboratory
> Phone: 510-642-0931           215 McCone Hall # 4760
> Fax: 510-643-5811             Berkeley, CA 94720-4760

From bruno at rocksclusters.org Tue Dec 30 12:07:34 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Tue, 30 Dec 2003 12:07:34 -0800
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

On Dec 30, 2003, at 11:45 AM, Doug Neuhauser wrote:

> Greg,
>
> 1. I don't have cdroms on my compute nodes, only floppy. :(
> 2. My boot order on the compute nodes is normally:
>    floppy, disk, PXE
> 3. I don't have a hot-key override to force PXE boot.
>    I have to change the BIOS boot order to enable PXE boot.
>
>> but, if your compute nodes were initially installed with PXE, the
>> fastest way to upgrade the compute nodes is to simply turn all the
>> compute nodes off, upgrade the frontend, run insert-ethers, then turn
>> the compute nodes on one by one. the compute nodes should be set for
>> PXE boot which will pull the installer from the frontend and therefore
>> be updated installer.
>
> I don't understand this.

i'll try to give a better explanation.

when compute nodes are installed via PXE, rocks detects this and manipulates the boot sector of the disk drive on the compute node to make the disk non-bootable. that way, if the compute node is reset, it will try to PXE boot. it will PXE boot even if your boot order is: hard disk, cd/floppy, PXE. this occurs because the hard disk is non-bootable, so the BIOS boot loader will skip the hard disk and move on to the other boot devices.
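[Editor's note: the mechanism gb describes can be illustrated with a small sketch. A PC BIOS checks the two-byte 0x55 0xAA signature at the end of sector 0 before treating a disk as bootable, so clearing sector 0 forces the fall-through to the next boot device such as PXE. The sketch below operates on a scratch file rather than a real disk; it is an illustration of the general idea, not the exact code Rocks runs.]

```shell
# Illustration only: uses a file-backed "disk" instead of e.g. /dev/hda.
img=$(mktemp)

# Build a fake MBR: 510 data bytes plus the 0x55 0xAA boot signature
# (octal \125 \252) in the last two bytes of the sector.
dd if=/dev/zero of="$img" bs=510 count=1 2>/dev/null
printf '\125\252' >> "$img"
tail -c 2 "$img" | od -An -tx1     # a BIOS would treat this disk as bootable

# Clear sector 0; with no signature, the BIOS skips the disk and
# falls through to the next boot device (e.g. PXE).
dd if=/dev/zero of="$img" bs=512 count=1 conv=notrunc 2>/dev/null
tail -c 2 "$img" | od -An -tx1     # signature gone: disk is now "non-bootable"

rm -f "$img"
```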

> I can't leave the compute nodes with PXE boot first, or it will create an
> endless loop. The compute node will boot via PXE, install rocks 3.1.0,
> and then reboot via PXE and repeat the process ad-nauseum.
>
> Can I use the old floppy boot image found at:
> ftp://rocksclusters.org/pub/rocks/current/i386/bootnet.img
> to force a network boot?
>
> The 3.1.0 online manual has a link in the section
> 1.3 Install your Compute Nodes
> to ftp://www.rocksclusters.org/pub/rocks/bootnet.img
> but this does not exist.

we are no longer supporting the boot floppy as it was problematic to make one that contained the appropriate device drivers that worked on most compute nodes.

- gb

From doug at seismo.berkeley.edu Tue Dec 30 12:28:46 2003
From: doug at seismo.berkeley.edu (Doug Neuhauser)
Date: Tue, 30 Dec 2003 12:28:46 -0800 (PST)
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
Message-ID: <[email protected]>

Greg,

Thanks for the detailed boot/reboot explanation. My problem dates
back to my initial rocks 2.3.2 installation. My compute node
motherboards have 3 ethernet interfaces (1 100Mb, 2 1Gb), but initially
only the 100 Mb supported PXE. When I used that for PXE boot, Linux
would then remap the interfaces so that it tried to use one of the Gbit
interfaces on the next reboot. Needless to say, the head node did not
respond to DHCP because the MAC address was unknown to it.

My solution was to get a new BIOS from Tyan that supported PXE on
all interfaces. However, since my cluster was initially installed using
the boot floppy, my compute nodes have the vestiges of floppy boot config,
not PXE boot config.

I'll try Joe Landman's suggestion of a scrub floppy to scrub the boot
sector of the boot disk on the compute nodes. If I can't do that, I
CAN go through the manual process of setting and resetting the boot
order on each compute node, but it is a slow and sequential process.

- Doug N
------------------------------------------------------------------------
Doug Neuhauser                University of California, Berkeley
doug at seismo.berkeley.edu   Berkeley Seismological Laboratory
Phone: 510-642-0931           215 McCone Hall # 4760
Fax: 510-643-5811             Berkeley, CA 94720-4760

From sjenks at uci.edu Tue Dec 30 12:37:26 2003
From: sjenks at uci.edu (Stephen Jenks)
Date: Tue, 30 Dec 2003 12:37:26 -0800
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Dec 30, 2003, at 12:07 PM, Greg Bruno wrote:
> when compute nodes are installed via PXE, rocks detects this and
> manipulates the boot sector of the disk drive on the compute node that
> makes the disk non-bootable. that way, if the compute node is reset,
> it will try to PXE boot. it will PXE boot even if your boot order is:
> hard disk, cd/floppy, PXE. this occurs because the hard disk is
> non-bootable so the BIOS boot loader will skip the hard disk and move
> on to the other boot devices.

Hi Greg, et al.

Is there any way to force this behavior even if I initially used a CD
to install the compute nodes? My nodes are capable of PXE boot, but
since I didn't use that, I presume they didn't do the non-bootable disk
trick upon install. Now that I'm clear about how the PXE install works,
I'd prefer to move to that, but don't really want to have to corrupt
the disks to cause the PXE install.

The nodes are currently loaded with 3.0, so perhaps that will work with 3.1's kickstart, but I'm curious about the PXE issue.

Thanks,

Steve Jenks

From bruno at rocksclusters.org Tue Dec 30 12:48:08 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Tue, 30 Dec 2003 12:48:08 -0800
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Dec 30, 2003, at 12:37 PM, Stephen Jenks wrote:

> On Dec 30, 2003, at 12:07 PM, Greg Bruno wrote:
>> when compute nodes are installed via PXE, rocks detects this and
>> manipulates the boot sector of the disk drive on the compute node
>> that makes the disk non-bootable. that way, if the compute node is
>> reset, it will try to PXE boot. it will PXE boot even if your boot
>> order is: hard disk, cd/floppy, PXE. this occurs because the hard
>> disk is non-bootable so the BIOS boot loader will skip the hard disk
>> and move on to the other boot devices.
>
> Hi Greg, et al.
>
> Is there any way to force this behavior even if I initially used a CD
> to install the compute nodes? My nodes are capable of PXE boot, but
> since I didn't use that, I presume they didn't do the non-bootable
> disk trick upon install. Now that I'm clear about how the PXE install
> works, I'd prefer to move to that, but don't really want to have to
> corrupt the disks to cause the PXE install.
>
> The nodes are currently loaded with 3.0, so perhaps that will work
> with 3.1's kickstart, but I'm curious about the PXE issue.

3.0 is based on redhat 7.3 and 3.1 is based on redhat enterprise linux 3.0 -- so you'll hit a similar problem as doug did when you perform an upgrade.

give me a bit of time to cook up a procedure for forcing your compute nodes to PXE boot.

- gb

From cdwan at mail.ahc.umn.edu Tue Dec 30 14:22:18 2003
From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB))
Date: Tue, 30 Dec 2003 16:22:18 -0600 (CST)
Subject: [Rocks-Discuss]NIS outside, 411 inside?
Message-ID: <[email protected]>

Is there a preferred way to have the 411 server on the head node replicate
information (passwd and auto.whatever) from an external NIS server to the
compute nodes? It seems to me that a cron job like the one below does the
trick, but it feels crufty to me:

    ypcat passwd > yp.passwd
    cat /etc/passwd yp.passwd > 411.passwd
    ** build the 411 distributed passwd from the file above instead of
    ** /etc/passwd.

I'd love to hear suggestions for a more elegant solution.

-Chris Dwan The University of Minnesota

From bruno at rocksclusters.org Tue Dec 30 15:16:36 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Tue, 30 Dec 2003 15:16:36 -0800
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Dec 30, 2003, at 12:37 PM, Stephen Jenks wrote:

> On Dec 30, 2003, at 12:07 PM, Greg Bruno wrote:
>> when compute nodes are installed via PXE, rocks detects this and
>> manipulates the boot sector of the disk drive on the compute node
>> that makes the disk non-bootable. that way, if the compute node is
>> reset, it will try to PXE boot. it will PXE boot even if your boot
>> order is: hard disk, cd/floppy, PXE. this occurs because the hard
>> disk is non-bootable so the BIOS boot loader will skip the hard disk
>> and move on to the other boot devices.
>
> Hi Greg, et al.
>
> Is there any way to force this behavior even if I initially used a CD
> to install the compute nodes? My nodes are capable of PXE boot, but
> since I didn't use that, I presume they didn't do the non-bootable
> disk trick upon install. Now that I'm clear about how the PXE install
> works, I'd prefer to move to that, but don't really want to have to
> corrupt the disks to cause the PXE install.
>
> The nodes are currently loaded with 3.0, so perhaps that will work
> with 3.1's kickstart, but I'm curious about the PXE issue.

here's a procedure to ensure that your non-3.1.0 compute nodes PXE
install after a frontend upgrade.

this assumes your compute nodes support PXE installs.

before you upgrade the frontend, login to the frontend and execute:

# ssh-agent $SHELL
# ssh-add

# cluster-fork 'touch /boot/grub/pxe-install'

# cluster-fork '/boot/kickstart/cluster-kickstart --start'

# cluster-fork '/sbin/chkconfig --del rocks-grub'

now you can shutdown your compute nodes.

then upgrade your frontend.

after you login to your new frontend, run insert-ethers, then reset each compute node, one at a time.

doug, you'll have a bit harder time.

if you can find a bootable floppy, after the compute node boots, you can chroot to the root partition on the disk and run the three cluster-fork commands above.

i apologize for making this procedure tough on you.

- gb

From mjk at sdsc.edu Tue Dec 30 15:32:20 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 30 Dec 2003 15:32:20 -0800
Subject: [Rocks-Discuss]Licensing
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

Nothing!

Rocks is entirely open source with various GNU, BSD, Artistic, etc. open source licenses attached. The underlying RedHat OS (as of Rocks 3.1.0 -- available now) is recompiled from RedHat's publicly available SRPMS. You are of course welcome to send us money and hardware to help further the cause. Several vendors do in fact do this, and it helps us support them.

-mjk

On Dec 30, 2003, at 6:03 AM, Purushotham Komaravolu wrote:

> Hi All,

> I would like to know the list of the components that have
> to be licensed, when we install ROCKS as a commercial solution.
> Thanks
> Happy Holidays
> Puru

From mjk at sdsc.edu Tue Dec 30 15:35:39 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 30 Dec 2003 15:35:39 -0800
Subject: [Rocks-Discuss]NIS outside, 411 inside?
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

As of Rocks 3.1.0 we no longer use NIS "inside" the cluster. So in some ways this job is simpler now, although no one has done this yet. A simple ypcat like you have will do most of the right thing, and 411 will pick up the changes and send them around the cluster. But you need to figure out how to merge the cluster information with the external NIS information. This will include things like the IP addresses for the cluster compute nodes.

-mjk

On Dec 30, 2003, at 2:22 PM, Chris Dwan (CCGB) wrote:

> Is there a preferred way to have the 411 server on the head node replicate
> information (passwd and auto.whatever) from an external NIS server to the
> compute nodes? It seems to me that a cron job like the one below does the
> trick, but it feels crufty to me:
>
>     ypcat passwd > yp.passwd
>     cat /etc/passwd yp.passwd > 411.passwd
>     ** build the 411 distributed passwd from the file above instead of
>     ** /etc/passwd.
>
> I'd love to hear suggestions for a more elegant solution.
>
> -Chris Dwan
>  The University of Minnesota

From mitchskin at comcast.net Tue Dec 30 17:13:44 2003
From: mitchskin at comcast.net (Mitchell Skinner)
Date: Tue, 30 Dec 2003 17:13:44 -0800
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <1072833146.8645.1114.camel@zeitgeist>

On Tue, 2003-12-30 at 12:28, Doug Neuhauser wrote:
> I'll try Joe Landman's suggestion of a scrub floppy to scrub the boot
> sector of the boot disk on the compute nodes. If I can't do that, I
> CAN go through the manual process of setting and resetting the boot
> order on each compute node, but it is a slow and sequential process.

Something I'm going to try and implement at our site is support for the
pxelinux 'localboot' option. If the hard drives have a valid boot
sector, I can leave the BIOS set to PXE boot before the hard drive, and
by changing the pxelinux configuration on the head node, I can set a
particular node to boot from the network or from the local disk. In
other words, when a node PXE boots, it might get either the kickstart
instructions or the 'boot from hard drive' instructions.

That will take some fiddling, I think, because the head node then has to
maintain some more state for all of the compute nodes. I really want to
avoid going through the BIOS setup on all my nodes more than once,
though.
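[Editor's note: the scheme above maps onto pxelinux's per-node configuration lookup: pxelinux searches pxelinux.cfg/ for a file named after the node's IP address in hex before falling back to "default", so the frontend can hand each node either a localboot entry or an install entry. A hedged sketch of the two configurations follows; the paths and kickstart arguments are illustrative, not the actual Rocks layout.]

```
# /tftpboot/pxelinux.cfg/default -- normal operation: hand control
# back to the local disk's boot sector.
DEFAULT localdisk
LABEL localdisk
    LOCALBOOT 0

# A per-node file (e.g. pxelinux.cfg/C0A80102 for 192.168.1.2),
# dropped in place only when that node should reinstall:
DEFAULT install
LABEL install
    KERNEL vmlinuz
    APPEND initrd=initrd.img ks
```

Switching a node between "reinstall" and "run" then only requires creating or removing its per-node file on the head node, with no BIOS visits.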

Is this something that the ROCKS mainline would be interested in?

Mitch

From doug at seismo.berkeley.edu Tue Dec 30 17:51:49 2003
From: doug at seismo.berkeley.edu (Doug Neuhauser)
Date: Tue, 30 Dec 2003 17:51:49 -0800 (PST)
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
Message-ID: <[email protected]>

My solution to force PXE boot is outlined below.

1. Boot dban floppy (floppy image at http://dban.sourceforge.net/ ).

2. Run "quick" purge of disks on system (I only have 1 disk on compute nodes). I let the disk purge get far enough into the disk to overwrite the boot sectors and filesystem -- I didn't wait for it to completely erase the entire disk.

3. Reset the system, and CYCLE POWER on the compute node.

NOTE: If you don't cycle power, the BIOS sees the disk, but reports
that it has a fatal error reading from it. This caused the following
problems:

   a. PXE boot worked, but the Rocks install also did not see the disk.
      It asked whether you want to manually configure the disk, but the
      configuration failed immediately regardless of whether I answered
      yes or no. The Rocks developers may want to look into this bug.

   b. By the time that I figured out that I needed to cycle power, the
      BIOS had already removed the disk from the boot order. My boot
      order was now: floppy, PXE, disk. Rocks installed properly once,
      twice, .... until I reset the boot order to: floppy, disk, PXE.

4. Compute node will now perform PXE boot, install Rocks 3.1.0, and
   subsequent "controlled reboots" will boot from disk. If the node
   is powered down or reset with the reset button, no boot block is
   left on disk, and the system will perform PXE boot and reinstall
   Rocks.

------------------------------------------------------------------------
Doug Neuhauser                University of California, Berkeley
doug at seismo.berkeley.edu   Berkeley Seismological Laboratory
Phone: 510-642-0931           215 McCone Hall # 4760
Fax: 510-643-5811             Berkeley, CA 94720-4760

From tim.carlson at pnl.gov Tue Dec 30 19:17:11 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Tue, 30 Dec 2003 19:17:11 -0800 (PST)
Subject: [Rocks-Discuss]Rocks 3.1.0 install problems
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Tue, 30 Dec 2003, Doug Neuhauser wrote:

> 2. Run "quick" purge of disks on system (I only have 1 disk on compute nodes).> I let the disk purge get far enough into the disk to overwrite the boot> sectors and filesystem -- I didn't wait for it to completely erase the> entire disk.

Here is something that is a bit quicker

cluster-fork dd if=/dev/zero of=/dev/hda bs=1k count=512

Then either power cycle or

cluster-fork reboot

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From cdwan at mail.ahc.umn.edu Tue Dec 30 19:44:11 2003
From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB))
Date: Tue, 30 Dec 2003 21:44:11 -0600 (CST)
Subject: [Rocks-Discuss]NIS outside, 411 inside?
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> As of Rocks 3.1.0 we no longer use NIS "inside" the cluster. So in
> some ways this job is simpler now, although no one has done this yet.
> A simple ypcat like you have will do most of the right thing and 411
> will pick up the changed and send them around the cluster. But, you
> need to figure out how to merge the cluster information with the
> external NIS information. This will include things like the IP address
> for the cluster compute nodes.

The shuffling below would work, I think, but it still gives me the
willies to be mucking with the passwd file every hour:

    mv /etc/passwd /etc/passwd.local
    ypcat passwd > /etc/passwd.nis
    cat /etc/passwd.local /etc/passwd.nis > /etc/passwd
    service 411 commit
    cp /etc/passwd.local /etc/passwd

Am I missing the simple way? I seem to have an affinity for finding the
maximally complex way to do things...

-Chris Dwan The University of Minnesota

From mjk at sdsc.edu Tue Dec 30 19:58:43 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Tue, 30 Dec 2003 19:58:43 -0800
Subject: [Rocks-Discuss]NIS outside, 411 inside?
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

This sounds reasonable, but you still have a chance of conflicting UIDs in your password file. If you only issue accounts from your LAN NIS server then you should be fine. I'd suggest adding the accounts created by Rocks into your server (just look at the initial passwd file). The SGE roll creates an SGE user; others may also exist.

You can also try setting up your frontend as an NIS client of your external server, with the same UID issues above.

The bad news is we don't have a canned answer, and need someone to give us one. The good news is that with 411 in place only the frontend need be changed, and the compute nodes will still function as stock Rocks.

-mjk

On Dec 30, 2003, at 7:44 PM, Chris Dwan (CCGB) wrote:

>> As of Rocks 3.1.0 we no longer use NIS "inside" the cluster. So in
>> some ways this job is simpler now, although no one has done this yet.
>> A simple ypcat like you have will do most of the right thing and 411
>> will pick up the changed and send them around the cluster. But, you
>> need to figure out how to merge the cluster information with the
>> external NIS information. This will include things like the IP address
>> for the cluster compute nodes.
>
> The shuffling below would work, I think, but it still gives me the
> willies to be mucking with the passwd file every hour:
>
>     mv /etc/passwd /etc/passwd.local
>     ypcat passwd > /etc/passwd.nis
>     cat /etc/passwd.local /etc/passwd.nis > /etc/passwd
>     service 411 commit
>     cp /etc/passwd.local /etc/passwd
>
> Am I missing the simple way? I seem to have an affinity for finding the
> maximally complex way to do things...
>
> -Chris Dwan
>  The University of Minnesota

From csamuel at vpac.org Tue Dec 30 19:59:51 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Wed, 31 Dec 2003 14:59:51 +1100
Subject: [Rocks-Discuss]NIS outside, 411 inside?
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Wed, 31 Dec 2003 02:44 pm, Chris Dwan (CCGB) wrote:

> mv /etc/passwd /etc/passwd.local
> ypcat passwd > /etc/passwd.nis
> cat /etc/passwd.local /etc/passwd.nis > /etc/passwd
> service 411 commit
> cp /etc/passwd.local /etc/passwd

Hmm, how about:

    ypcat passwd > /etc/passwd.nis
    cat /etc/passwd /etc/passwd.nis > /etc/passwd.tmp
    cp /etc/passwd /etc/passwd.local
    mv /etc/passwd.tmp /etc/passwd
    service 411 commit
    mv /etc/passwd.local /etc/passwd

That should mean that you're never operating without a password file and the overwrites should be approaching atomic (I hope).

Of course, it'd be nice if you could do whatever the 411 init file does on something other than /etc/passwd :-)

Disclaimer: I have not tried this myself & don't (yet) have a 3.1 system to test with, caveat emptor, batteries not included, IANAL, etc.

cheers!
Chris
- --
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

From csamuel at vpac.org Tue Dec 30 20:01:39 2003
From: csamuel at vpac.org (Chris Samuel)
Date: Wed, 31 Dec 2003 15:01:39 +1100
Subject: [Rocks-Discuss]NIS outside, 411 inside?
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

On Wed, 31 Dec 2003 02:59 pm, Chris Samuel wrote:

> cp /etc/passwd /etc/passwd.local

should be:

cp -p /etc/passwd /etc/passwd.local

Oh, and what happens if users overlap ? :-)
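[Editor's note: one way to make overlapping users harmless when merging the two passwd files is to let the first file win, keyed on the username. The sketch below is an illustration with made-up demo entries, assuming local accounts should take precedence over NIS entries; on a real frontend the inputs would be /etc/passwd and the ypcat output.]

```shell
# Demo inputs: a local passwd fragment and an NIS fragment that both
# define a user "sge" with different UIDs.
cat > passwd.local <<'EOF'
root:x:0:0:root:/root:/bin/bash
sge:x:400:400:SGE user:/opt/sge:/bin/bash
EOF
cat > passwd.nis <<'EOF'
sge:x:1000:1000:conflicting NIS entry:/home/sge:/bin/bash
alice:x:501:501:Alice:/home/alice:/bin/bash
EOF

# Keep only the first line seen for each username (field 1), so the
# local "sge" (UID 400) shadows the NIS duplicate.
awk -F: '!seen[$1]++' passwd.local passwd.nis > passwd.merged
cat passwd.merged

rm -f passwd.local passwd.nis passwd.merged
```

Note this only guards against duplicate usernames; two different names sharing one numeric UID would still need manual attention.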

cheers,
Chris
- --
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

From cdwan at mail.ahc.umn.edu Tue Dec 30 20:12:34 2003
From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB))
Date: Tue, 30 Dec 2003 22:12:34 -0600 (CST)
Subject: [Rocks-Discuss]NIS outside, 411 inside?
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]>
Message-ID: <[email protected]>

> Of course, it'd be nice if you could do whatever the 411 init file does on
> something else than /etc/passwd :-)

That would be a really big step. I'm deeply wary of cron jobs that
overwrite my passwd file.

The next step might be to put this functionality into 411 itself. It
would be truly cool to have an automatic, non-NIS way to make the passwd,
group, autofs, and host lookup stuff be consistent and static across the
cluster nodes.

On the other hand, I appreciate that this is probably a complex enough
system without trying to reinvent NIS but leave out the brittle server
bits. We can work around for the time being.

-Chris Dwan

From doug at seismo.berkeley.edu Tue Dec 30 20:34:25 2003
From: doug at seismo.berkeley.edu (Doug Neuhauser)
Date: Tue, 30 Dec 2003 20:34:25 -0800 (PST)
Subject: [Rocks-Discuss]Mozilla / ssh DISPLAY problem with Rocks 3.1.0
Message-ID: <[email protected]>

I am having a problem using mozilla with the default Rocks monitor web page
over an ssh session to my headnode from a Sun workstation with a 24-bit
display. My workstation is a Sun Blade 150 running Solaris 8, and I am
using SSH Secure Shell 3.2.5 (non-commercial version).

When I ssh to my frontend and run mozilla, I get an empty Mozilla frame.
Running mozilla with the debugging option "--g-fatal-warnings" I get:

Gdk-WARNING **: Attempt to draw a drawable with depth 24 to a drawable with depth 8
aborting...

xwininfo shows the following window characteristics:

xwininfo: Window id: 0x9400034 "GCLCluster Cluster - Mozilla"

  Absolute upper-left X:  175
  Absolute upper-left Y:  150
  Relative upper-left X:  0
  Relative upper-left Y:  0
  Width: 1021
  Height: 738
  Depth: 8
  Visual Class: PseudoColor
  Border width: 0
  Class: InputOutput
  Colormap: 0x22 (installed)
  Bit Gravity State: NorthWestGravity
  Window Gravity State: NorthWestGravity
  Backing Store State: NotUseful
  Save Under State: no
  Map State: IsViewable
  Override Redirect State: no
  Corners:  +175+150  -84+150  -84-136  +175-136
  -geometry 1021x738-78+125

Is there a way to configure mozilla to use only an 8-bit drawable?

If I ssh from a workstation with an 8-bit display, mozilla starts up
OK, and creates an 8-bit window.

- Doug N
------------------------------------------------------------------------
Doug Neuhauser                University of California, Berkeley
doug at seismo.berkeley.edu   Berkeley Seismological Laboratory
Phone: 510-642-0931           215 McCone Hall # 4760
Fax: 510-643-5811             Berkeley, CA 94720-4760

From qian1129 at yahoo.com Tue Dec 30 22:47:57 2003
From: qian1129 at yahoo.com (li lee)
Date: Tue, 30 Dec 2003 22:47:57 -0800 (PST)
Subject: [Rocks-Discuss]How to install Roll CDs in Rocks 3.1.0
Message-ID: <[email protected]>

Hi,

I want to install Rocks v3.1.0 on PCs, but I do not
want to use so many CDs:

roll-grid-3.1.0-0.any.iso
roll-intel-3.1.0-0.any.iso
roll-sge-3.1.0-0.any.iso

......
So, how do I install all of these after the Rocks and HPC
installation on clusters?

Thanks

Li

From bruno at rocksclusters.org Tue Dec 30 23:35:28 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Tue, 30 Dec 2003 23:35:28 -0800
Subject: [Rocks-Discuss]How to install Roll CDs in Rocks 3.1.0
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

> I want to install Rocks v3.1.0 in PCs, but I do not
> want to so many CDs:
> roll-grid-3.1.0-0.any.iso
> roll-intel-3.1.0-0.any.iso
> roll-sge-3.1.0-0.any.iso
> ......
> So, how to install all these after Rocks and HPC
> installation on clusters?

for now, we do not have a systematic way in which to incorporate rolls after the frontend is up. this is on our 'todo' list.

- gb

From tim.carlson at pnl.gov Wed Dec 31 07:29:21 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Wed, 31 Dec 2003 07:29:21 -0800 (PST)
Subject: [Rocks-Discuss]Mozilla / ssh DISPLAY problem with Rocks 3.1.0
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Tue, 30 Dec 2003, Doug Neuhauser wrote:

> I am having a problem using mozilla with the default Rocks monitor web page
> over an ssh session to my headnode from a Sun workstation with a 24-bit
> display. My workstation is Sun Blade 150 running Solaris 8, and I am
> using SSH Secure Shell 3.2.5 (non-commercial version).
>
> When I ssh to my frontend and to run mozilla, I get an empty Mozilla frame.
> Running mozilla with debugging options "--g-fatal-warnings" I get:

This sounds like an X tunnel problem. I see X tunnel errors all the time
(OpenGL, colormap, etc). What happens if you just set the DISPLAY
variable back to your Sun box and do the proper xhost command on the Sun?

Tim

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From mjk at sdsc.edu Wed Dec 31 09:45:49 2003
From: mjk at sdsc.edu (Mason J. Katz)
Date: Wed, 31 Dec 2003 09:45:49 -0800
Subject: [Rocks-Discuss]How to install Roll CDs in Rocks 3.1.0
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

For this release you need all these CDs (if you want this functionality). Think of Rolls as add-on packs for Rocks, and remember that software belongs on a CD (not a tarball or ftp site). CDs are the accepted commercial way of releasing software, and they are very nice. But we have some issues with this that we are addressing right now:

- Meta-Rolls. That is, how do you merge multiple Rolls into a single CD image? This is actually very easy to do, and we have some early code for this; it will be there in the next release. For IA64 we merge the HPC Roll onto the base DVD, so we have a proof of concept here.

- Rolls cannot be added after a cluster is installed, and must be used during installation.

- Rolls cannot be uninstalled.

Rolls are maturing pretty quickly, and we know where they need to go.

-mjk

On Dec 30, 2003, at 10:47 PM, li lee wrote:

> Hi,
>
> I want to install Rocks v3.1.0 in PCs, but I do not
> want to so many CDs:
> roll-grid-3.1.0-0.any.iso
> roll-intel-3.1.0-0.any.iso
> roll-sge-3.1.0-0.any.iso
> ......
> So, how to install all these after Rocks and HPC
> installation on clusters?
>
> Thanks
>
> Li
>
> __________________________________
> Do you Yahoo!?
> Find out what made the Top Yahoo! Searches of 2003
> http://search.yahoo.com/top2003

From michal at harddata.com Wed Dec 31 10:05:26 2003
From: michal at harddata.com (Michal Jaegermann)
Date: Wed, 31 Dec 2003 11:05:26 -0700
Subject: [Rocks-Discuss]NIS outside, 411 inside?
In-Reply-To: <[email protected]>; from [email protected] on Tue, Dec 30, 2003 at 09:44:11PM -0600
References: <[email protected]>
	<[email protected]>
	<[email protected]>
Message-ID: <[email protected]>

On Tue, Dec 30, 2003 at 09:44:11PM -0600, Chris Dwan (CCGB) wrote:
>
> The shuffling below would work, I think, but it still gives me the
> willies to be mucking with the passwd file every hour:
>
> mv /etc/passwd /etc/passwd.local
> ypcat /etc/passwd > /etc/passwd.nis
> cat /etc/passwd.local /etc/passwd.nis > /etc/passwd
> service 411 commit
> cp /etc/passwd.local /etc/passwd
>
> Am I missing the simple way?


cp -p /etc/passwd /etc/passwd.local
ypcat passwd >> /etc/passwd
service 411 commit
mv /etc/passwd.local /etc/passwd

unless 'service 411' can be told to use another file. That way you minimize the time gap when you are without /etc/passwd, you make sure that the file attributes on /etc/passwd remain intact, and you are not left with extra files.

You can also play with (symbolic) links but I am not sure if every possible /etc/passwd reader will indeed follow a link.

Michal

From michal at harddata.com Wed Dec 31 10:16:18 2003
From: michal at harddata.com (Michal Jaegermann)
Date: Wed, 31 Dec 2003 11:16:18 -0700
Subject: [Rocks-Discuss]NIS outside, 411 inside?
In-Reply-To: <[email protected]>; from [email protected] on Wed, Dec 31, 2003 at 03:01:39PM +1100
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
Message-ID: <[email protected]>

On Wed, Dec 31, 2003 at 03:01:39PM +1100, Chris Samuel wrote:
> should be:
>
> cp -p /etc/passwd /etc/passwd.local
>
> Oh, and what happens if users overlap ? :-)

'sort -u' over relevant fields after replacing ':'s with blanks? But this is getting a tad more involved, and an "automatic conflict resolution" still may screw up. A bit of coordination between whomever maintains NIS and the local user data, like reserving some names and uid ranges for one or the other, is likely more effective in practice.

Michal
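The coordination check Michal describes could be sketched roughly like this: list the login names that appear in both the local passwd file and the NIS dump so a human can resolve them before the two are concatenated. The function name and file paths are illustrative, not from the thread.

```shell
#!/bin/sh
# Print login names present in BOTH a local passwd file and an NIS dump.
# Anything this prints needs manual resolution before concatenating.
passwd_conflicts() {
    # $1: local passwd file, $2: output of `ypcat passwd`
    cut -d: -f1 "$1" | sort -u > /tmp/_local.$$
    cut -d: -f1 "$2" | sort -u > /tmp/_nis.$$
    comm -12 /tmp/_local.$$ /tmp/_nis.$$   # names common to both files
    rm -f /tmp/_local.$$ /tmp/_nis.$$
}
# usage: passwd_conflicts /etc/passwd.local /etc/passwd.nis
```

The same two-file comparison works for uid collisions by cutting field 3 instead of field 1.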

From bruno at rocksclusters.org Wed Dec 31 10:42:21 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 31 Dec 2003 10:42:21 -0800
Subject: [Rocks-Discuss]Roll Documentation posted on the web site
Message-ID: <[email protected]>

just posted documentation for some of the rolls on the web site -- see the left-hand side of the web page:

http://www.rocksclusters.org/Rocks/

and here are the links to the roll documentation:


HPC Roll: http://www.rocksclusters.org/rocks-documentation/3.1.0/

SGE Roll: http://www.rocksclusters.org/roll-documentation/sge/3.1.0/

Grid Roll: http://www.rocksclusters.org/roll-documentation/grid/3.1.0/

Intel Roll: http://www.rocksclusters.org/roll-documentation/intel/3.1.0/

as a side note, for every one of the rolls you install above, the documentation will be available on your frontend at:

http://localhost/roll-documentation/

- gb

From cdwan at mail.ahc.umn.edu Wed Dec 31 11:07:37 2003
From: cdwan at mail.ahc.umn.edu (Chris Dwan (CCGB))
Date: Wed, 31 Dec 2003 13:07:37 -0600 (CST)
Subject: [Rocks-Discuss]NIS outside, 411 inside?
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
Message-ID: <[email protected]>

> this is getting somewhat tad more involved and an "automatic
> conflict resolution" still may screw up.

I agree with this assessment. The key is to keep the local passwd file as small as possible, and remove redundant accounts on the frontend node. Since it consists mostly of non-login accounts anyway, this shouldn't be too difficult... and it's a one-time task anyway.

I've settled on the hourly cron job below. I'll report any weirdness as appropriate. Thanks for all the suggestions and discussion.

#!/bin/sh
ypcat auto.master > /etc/auto.master
ypcat auto.home > /etc/auto.home
ypcat auto.net > /etc/auto.net
ypcat auto.web > /etc/auto.web

ypcat passwd > /etc/passwd.nis
cat /etc/passwd.local /etc/passwd.nis > /etc/passwd.combined
cp /etc/passwd.combined /etc/passwd

ypcat group > /etc/group.nis
cat /etc/group.local /etc/group.nis > /etc/group.combined
cp /etc/group.combined /etc/group

-Chris Dwan
 The University of Minnesota


From maz at tempestcomputers.com Wed Dec 31 11:37:09 2003
From: maz at tempestcomputers.com (John Mazza)
Date: Wed, 31 Dec 2003 14:37:09 -0500
Subject: [Rocks-Discuss]Rocks 3.1.0 with Adaptec I2O RAID
Message-ID: <[email protected]>

Does anyone know of a way to make the 3.1.0 (x86-64) version work with an Adaptec 2100S SCSI RAID card? My master node needs to use this card, but it doesn't appear to be in the kernel on the CD. Also, does it support the SysKonnect SK-9821 (Ver 2.0) Gig cards?

Thanks!

From tim.carlson at pnl.gov Wed Dec 31 12:49:25 2003
From: tim.carlson at pnl.gov (Tim Carlson)
Date: Wed, 31 Dec 2003 12:49:25 -0800 (PST)
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

On Mon, 29 Dec 2003, landman wrote:

> SSH is too slow. Wow. 5-10 seconds to log in.

Just getting around to this. I did a clean install on our test cluster (Dell 1550 and 1750 boxes). No delays with ssh. As root or a normal user, a "cluster-fork date" command on 4 nodes took under .6 seconds.

Sounds like you have some type of DNS issue. Did you get a bad /etc/resolv.conf file on the nodes for some reason?

> a) md (e.g. Software RAID): Just try to build one. Anaconda will
> happily let you do this ... though it will die in the formatting stages.
> Dropping into the shell (Alt-F2) and looking for the md module (lsmod)
> shows nothing. Insmod the md also doesn't do anything. Catting
> /proc/devices shows no md as a character or block device.

The odd bit here is that you can do a

modprobe raid0

on a running frontend and it gets installed, but there is no associated "md" module. Was "md" built directly into the kernel? Very odd.

> b) ext3. There is no ext3 available for the install.

This is a bit annoying. Nobody really uses ext2 anymore, do they? :) Not having ext3 as an install option isn't a show stopper for me since I can do a tune2fs after the fact. But ext3 should be there.

Having version 2.0.8 of the myrinet drivers up and running is a big + in my book. SGE 5.3p5 is also nice to see.

It will be some time before I upgrade any production clusters given the differences between RH 7.3 and WS 3.0. Too big of a jump for me right now. We first need to convert a couple hundred desktop boxes :)

Tim Carlson
Voice: (509) 376 3423
Email: Tim.Carlson at pnl.gov
EMSL UNIX System Support

From James_ODell at Brown.edu Wed Dec 31 13:09:25 2003
From: James_ODell at Brown.edu (James O'Dell)
Date: Wed, 31 Dec 2003 16:09:25 -0500
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

For whatever it's worth, MPICH works MUCH better when run over rsh than ssh. It seems as if ssh doesn't pass along signals nearly as well as rsh. Since enabling rsh and configuring MPICH to use it, we have had no zombie jobs on our compute nodes. When using ssh they were a common occurrence. In fact, if you look at the MPICH implementation for myrinet, you'll see the contortions that they use to try and clean up compute nodes when using ssh.

Jim

On Dec 31, 2003, at 3:49 PM, Tim Carlson wrote:

> On Mon, 29 Dec 2003, landman wrote:
>
>> SSH is too slow. Wow. 5-10 seconds to log in.
>
> Just getting around to this. I did a clean install on our test cluster
> (Dell 1550 and 1750 boxes). No delays with ssh. As root or a normal
> user, a "cluster-fork date" command on 4 nodes took under .6 seconds
>
> Sounds like you have some type of DNS issue. Did you get a bad
> /etc/resolv.conf file on the nodes for some reason?
>
>> a) md (e.g. Software RAID): Just try to build one. Anaconda will
>> happily let you do this ... though it will die in the formatting stages.
>> Dropping into the shell (Alt-F2) and looking for the md module (lsmod)
>> shows nothing. Insmod the md also doesn't do anything. Catting
>> /proc/devices shows no md as a character or block device.
>
> The odd bit here is that you can do a
>
> modprobe raid0
>
> on a running frontend and it gets installed but there is no associated
> "md" module. Was "md" built directly into the kernel? very odd.
>
>> b) ext3. There is no ext3 available for the install.
>
> This is a bit annoying. Nobody really uses ext2 anymore do they? :) Not
> having ext3 as an install option isn't a show stopper for me since I can
> do a tune2fs after the fact. But ext3 should be there.
>
> Having version 2.0.8 of the myrinet drivers up and running is a big +
> in my book. SGE 5.3p5 is also nice to see.
>
> It will be some time before I upgrade any production clusters given the
> differences between Rh 7.3 and WS 3.0. Too big of a jump for me right
> now. We first need to convert a couple hundred desktop boxes :)
>
> Tim Carlson
> Voice: (509) 376 3423
> Email: Tim.Carlson at pnl.gov
> EMSL UNIX System Support

From landman at scalableinformatics.com Wed Dec 31 14:46:22 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Wed, 31 Dec 2003 17:46:22 -0500
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <[email protected]>

On Wed, 2003-12-31 at 15:49, Tim Carlson wrote:
> On Mon, 29 Dec 2003, landman wrote:
>
>> SSH is too slow. Wow. 5-10 seconds to log in.
>
> Just getting around to this. I did a clean install on our test cluster
> (Dell 1550 and 1750 boxes). No delays with ssh. As root or a normal
> user, a "cluster-fork date" command on 4 nodes took under .6 seconds

Yeah, some weirdness in DNS. A re-load on one cluster head took care of it; on the other, applying dnsmasq helped.

> Sounds like you have some type of DNS issue. Did you get a bad
> /etc/resolv.conf file on the nodes for some reason?
>
>> a) md (e.g. Software RAID): Just try to build one. Anaconda will
>> happily let you do this ... though it will die in the formatting stages.
>> Dropping into the shell (Alt-F2) and looking for the md module (lsmod)
>> shows nothing. Insmod the md also doesn't do anything. Catting
>> /proc/devices shows no md as a character or block device.
>
> The odd bit here is that you can do a
>
> modprobe raid0
>
> on a running frontend and it gets installed but there is no associated
> "md" module. Was "md" built directly into the kernel? very odd.

True, but I wanted to do a raid 1. I tried the insmod raid1 but it didn't work; from what I can see the module was not in the build. This is ok, as some of it can be done later.

>> b) ext3. There is no ext3 available for the install.
>
> This is a bit annoying. Nobody really uses ext2 anymore do they? :) Not
> having ext3 as an install option isn't a show stopper for me since I can
> do a tune2fs after the fact. But ext3 should be there.

That's what I did. I'll post a quick set of instructions for this a little later.
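For reference, the after-the-fact conversion Tim and Joe both mention comes down to adding a journal with tune2fs and then switching the filesystem type in /etc/fstab. A hedged sketch — the device name is hypothetical and the fstab edit is left as a comment, since the exact entry is site-specific:

```shell
#!/bin/sh
# Sketch: convert an existing ext2 filesystem to ext3 after install.
to_ext3() {
    tune2fs -j "$1" || return 1   # add an ext3 journal to the filesystem
    # Then change the fs type from ext2 to ext3 for "$1" in /etc/fstab
    # and remount (or reboot) so the journal is actually used.
}
# usage (hypothetical device): to_ext3 /dev/hda1
```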

> Having version 2.0.8 of the myrinet drivers up and running is a big + in
> my book. SGE 5.3p5 is also nice to see.

I agree, though I would like to see people do a

cluster-fork "/etc/init.d/rcsge stop"
cluster-fork "chown -R root:root /opt/gridengine/bin /opt/gridengine/utilbin"
cluster-fork "/etc/init.d/rcsge start"

to fix the compute node sge permissions. Some of the utils don't work otherwise.

> It will be some time before I upgrade any production clusters given the
> differences between Rh 7.3 and WS 3.0. Too big of a jump for me right now.
> We first need to convert a couple hundred desktop boxes :)

:)

> Tim Carlson
> Voice: (509) 376 3423
> Email: Tim.Carlson at pnl.gov
> EMSL UNIX System Support

From landman at scalableinformatics.com Wed Dec 31 14:48:08 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Wed, 31 Dec 2003 17:48:08 -0500
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
Message-ID: <[email protected]>

Hi James:

Did you rebuild MPICH for this? I noticed the signal handling bit using mpiBLAST. Lots of zombies to deal with.

Joe

On Wed, 2003-12-31 at 16:09, James O'Dell wrote:
> For whatever its worth, MPICH works MUCH better when run over rsh that
> ssh. It seems as if ssh doesn't pass along signals nearly as well as
> rsh. Since enabling rsh and configuring MPICH to use it, we have had no
> Zombie jobs on our compute nodes. When using SSH they were a common
> occurrence. In fact, if you look at the MPICH implementation for
> myrinet, you'll see the contortions that they use to try and clean up
> compute nodes when using ssh.
>
> Jim
>
> On Dec 31, 2003, at 3:49 PM, Tim Carlson wrote:
>
>> On Mon, 29 Dec 2003, landman wrote:
>>
>>> SSH is too slow. Wow. 5-10 seconds to log in.
>>
>> Just getting around to this. I did a clean install on our test cluster
>> (Dell 1550 and 1750 boxes). No delays with ssh. As root or a normal
>> user, a "cluster-fork date" command on 4 nodes took under .6 seconds
>>
>> Sounds like you have some type of DNS issue. Did you get a bad
>> /etc/resolv.conf file on the nodes for some reason?
>>
>>> a) md (e.g. Software RAID): Just try to build one. Anaconda will
>>> happily let you do this ... though it will die in the formatting
>>> stages.
>>> Dropping into the shell (Alt-F2) and looking for the md module (lsmod)
>>> shows nothing. Insmod the md also doesn't do anything. Catting
>>> /proc/devices shows no md as a character or block device.
>>
>> The odd bit here is that you can do a
>>
>> modprobe raid0
>>
>> on a running frontend and it gets installed but there is no associated
>> "md" module. Was "md" built directly into the kernel? very odd.
>>
>>> b) ext3. There is no ext3 available for the install.
>>
>> This is a bit annoying. Nobody really uses ext2 anymore do they? :) Not
>> having ext3 as an install option isn't a show stopper for me since I
>> can do a tune2fs after the fact. But ext3 should be there.
>>
>> Having version 2.0.8 of the myrinet drivers up and running is a big +
>> in my book. SGE 5.3p5 is also nice to see.
>>
>> It will be some time before I upgrade any production clusters given the
>> differences between Rh 7.3 and WS 3.0. Too big of a jump for me right
>> now. We first need to convert a couple hundred desktop boxes :)
>>
>> Tim Carlson
>> Voice: (509) 376 3423
>> Email: Tim.Carlson at pnl.gov
>> EMSL UNIX System Support

From James_ODell at Brown.edu Wed Dec 31 15:12:59 2003
From: James_ODell at Brown.edu (James O'Dell)
Date: Wed, 31 Dec 2003 18:12:59 -0500
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
Message-ID: <[email protected]>

The cheap way to do it is to grep the bin directory and look for SSH in the execution scripts. You can change them to RSH and MPICH will use RSH to execute.

An alternative is to set RSHCOMMAND=rsh during a rebuild. I'm pretty sure that this method accomplishes precisely the same thing as simply editing the execution scripts.

Jim
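The "cheap way" Jim describes could be scripted roughly as below. The MPICH bin directory is an assumption (point it at your own installation), a backup of each wrapper is kept since the scripts are edited in place, and the substitution is deliberately crude — it swaps every occurrence of "ssh":

```shell
#!/bin/sh
# Rewrite ssh -> rsh in MPICH's mpirun wrapper scripts (the "cheap way").
mpich_use_rsh() {
    # $1: directory holding the mpirun wrapper scripts (hypothetical)
    for f in "$1"/mpirun*; do
        [ -f "$f" ] || continue
        cp "$f" "$f.bak"                  # keep a backup of each script
        sed 's/ssh/rsh/g' "$f.bak" > "$f" # crude: swaps every "ssh"
    done
}
# usage: mpich_use_rsh /usr/local/mpich/bin   # hypothetical path
```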

On Dec 31, 2003, at 5:48 PM, Joe Landman wrote:

> Hi James:
>
> Did you rebuild MPICH for this? I noticed the signal handling bit
> using mpiBLAST. Lots of zombies to deal with.
>
> Joe
>
> On Wed, 2003-12-31 at 16:09, James O'Dell wrote:
>> For whatever its worth, MPICH works MUCH better when run over rsh that
>> ssh. It seems as if ssh doesn't pass along signals nearly as well as
>> rsh. Since enabling rsh and configuring MPICH to use it, we have had
>> no Zombie jobs on our compute nodes. When using SSH they were a common
>> occurrence. In fact, if you look at the MPICH implementation for
>> myrinet, you'll see the contortions that they use to try and clean up
>> compute nodes when using ssh.
>>
>> Jim
>>
>> On Dec 31, 2003, at 3:49 PM, Tim Carlson wrote:
>>
>>> On Mon, 29 Dec 2003, landman wrote:
>>>
>>>> SSH is too slow. Wow. 5-10 seconds to log in.
>>>
>>> Just getting around to this. I did a clean install on our test
>>> cluster (Dell 1550 and 1750 boxes). No delays with ssh. As root or a
>>> normal user, a "cluster-fork date" command on 4 nodes took under .6
>>> seconds
>>>
>>> Sounds like you have some type of DNS issue. Did you get a bad
>>> /etc/resolv.conf file on the nodes for some reason?
>>>
>>>> a) md (e.g. Software RAID): Just try to build one. Anaconda will
>>>> happily let you do this ... though it will die in the formatting
>>>> stages.
>>>> Dropping into the shell (Alt-F2) and looking for the md module
>>>> (lsmod) shows nothing. Insmod the md also doesn't do anything.
>>>> Catting /proc/devices shows no md as a character or block device.
>>>
>>> The odd bit here is that you can do a
>>>
>>> modprobe raid0
>>>
>>> on a running frontend and it gets installed but there is no
>>> associated "md" module. Was "md" built directly into the kernel?
>>> very odd.
>>>
>>>> b) ext3. There is no ext3 available for the install.
>>>
>>> This is a bit annoying. Nobody really uses ext2 anymore do they? :)
>>> Not having ext3 as an install option isn't a show stopper for me
>>> since I can do a tune2fs after the fact. But ext3 should be there.
>>>
>>> Having version 2.0.8 of the myrinet drivers up and running is a big +
>>> in my book. SGE 5.3p5 is also nice to see.
>>>
>>> It will be some time before I upgrade any production clusters given
>>> the differences between Rh 7.3 and WS 3.0. Too big of a jump for me
>>> right now. We first need to convert a couple hundred desktop boxes :)
>>>
>>> Tim Carlson
>>> Voice: (509) 376 3423
>>> Email: Tim.Carlson at pnl.gov
>>> EMSL UNIX System Support

From bruno at rocksclusters.org Wed Dec 31 15:46:23 2003
From: bruno at rocksclusters.org (Greg Bruno)
Date: Wed, 31 Dec 2003 15:46:23 -0800
Subject: [Rocks-Discuss]3.1.0 surprises
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
Message-ID: <[email protected]>


>> Having version 2.0.8 of the myrinet drivers up and running is a big +
>> in my book. SGE 5.3p5 is also nice to see.
>
> I agree, though I would like to see people do a
>
> cluster-fork "/etc/init.d/rcsge stop"
> cluster-fork "chown -R root:root /opt/gridengine/bin /opt/gridengine/utilbin"
> cluster-fork "/etc/init.d/rcsge start"
>
> to fix the compute node sge permissions. Some of the utils don't work
> otherwise.

so we can test the fixes, what utilities need the above changes?

- gb

From landman at scalableinformatics.com Wed Dec 31 21:04:14 2003
From: landman at scalableinformatics.com (Joe Landman)
Date: Thu, 01 Jan 2004 00:04:14 -0500
Subject: [Rocks-Discuss]looking for a work-around
Message-ID: <[email protected]>

Ok, this one is weird. On two different clusters using the same replace-auto-partition.xml I get two completely different behaviors. I am positive this is an anaconda issue, but it could be something else.

Both systems have IDE hard disks. I made the second one (my office system) match the other system, so the IDE hard disks are hda and hdb. Yes, I know this is not ideal, and I know that this should be changed. I am simply trying to match their system.

First the partitioning:

<main>
  <clearpart>--all</clearpart>
  <part> / --size 4096 --ondisk hda </part>
  <part> swap --size 1024 --ondisk hda </part>
  <part> raid.00 --size 1 --grow --ondisk hda </part>
  <part> /tmp --size 4096 --ondisk hdb </part>
  <part> swap --size 1024 --ondisk hdb </part>
  <part> raid.01 --size 1 --grow --ondisk hdb </part>
</main>

On one cluster (my office), this works perfectly.

On the other cluster, it fails with:

An unhandled exception has occurred. This is most likely a bug. Please
copy the full text of this exception or save the crash dump to a floppy
then file a detailed bug report against anaconda at
http://bugzilla.redhat.com/bugzilla/

Traceback (most recent call last):
  File "/usr/bin/anaconda.real", line 1081, in ?
    intf.run(id, dispatch, configFileData)
  File "/var/tmp/anaconda-9.1//usr/lib/anaconda/text.py", line 448, in run
  File "/tmp/ksclass.py", line 799, in __call__
KeyError: swap

  [ OK ]   [ Save ]   [ Debug ]

(sorry about the garbled dialog-box characters). It appears that this is a python KeyError, which occurs when the element being sought has not been found.

Any ideas?

Joe

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
web: http://scalableinformatics.com
phone: +1 734 612 4615