
Sparc cluster installation experience: The "Sparcgate"@doshisha.ac.jp - Chapter 2
Doing a TFTP Boot on SPARC


2.1 Installing on SparcStation 5/10/20

Installing on the SparcStation 5/10/20 (often abbreviated to SS5, SS10, SS20) was a breeze. First, power the machine on. At the beginning of the boot it displays its MAC address; note it down, then stop the boot process by pressing Stop-A.

Now, go back to the TFTP server, and use the RARP command like this:

rarp -s host-IP-address MAC-address

For example:

rarp -s 192.168.1.16 20:0:9:a8:2:fe

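You must already have decided which IP address to give the machine. Then, in the TFTP directory, /home/tftpduser, do a ln -s boot.img ip-address-in-hex.architecture, where the hex part is the four octets of the IP address written in order as two uppercase hex digits each. For 192.168.1.16 that is C0A80110, e.g. ln -s boot.2.2.1.img C0A80110.SUN4M[4]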
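Working out the hex name by hand is error-prone, so here is a minimal sketch of the conversion in POSIX shell. The function name ip_to_hex is made up for this illustration; it only assumes that the name is the four octets in order, two uppercase hex digits each.

```shell
#!/bin/sh
# ip_to_hex (hypothetical helper): print a dotted-quad IPv4 address as
# eight uppercase hex digits, the form used for the TFTP link name.
ip_to_hex() {
    # Split the argument on dots into the four octets.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    # Two hex digits per octet, in the same order as the address.
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_hex 192.168.1.16    # prints C0A80110
```

With such a helper, the link for a SUN4M machine could be made with ln -s boot.img "$(ip_to_hex 192.168.1.16).SUN4M".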

With all this done, type boot net on the SPARC console, and hopefully the machine will load the image and start. If it does not respond, the RARP setting or the filename is probably wrong. If it stops halfway through the loading process, press Stop-A and type boot net again; TFTP does not seem to recover when it has errors.[5]
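When the machine does not respond, the filename is one of the two suspects, and it can be checked on the server. The following is only a sketch: check_boot_link is a hypothetical helper, and it tests nothing more than whether the name is a symbolic link that resolves to a readable file.

```shell
#!/bin/sh
# check_boot_link (hypothetical helper): report whether a TFTP link name
# exists as a symbolic link and points at a readable boot image.
check_boot_link() {
    link=$1
    if [ ! -L "$link" ]; then
        echo "missing: $link is not a symbolic link"
        return 1
    fi
    if [ ! -r "$link" ]; then
        echo "broken: $link does not resolve to a readable file"
        return 1
    fi
    echo "ok: $link"
}

# Example call, using the directory and link name from the text:
#   check_boot_link /home/tftpduser/C0A80110.SUN4M
```

This does not check the RARP entry; it only rules out a mistyped or dangling link name.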

If you see the letters TILO, you have succeeded. Wait a bit more for the installer to begin.[6]


2.2 Installing on SparcIPX

The SparcIPX is a SUN4C machine, so the symbolic link in the TFTP directory needs a SUN4C name. After you press Stop-A, the prompt wants b net instead of boot net. Everything else is the same.
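With a mix of SUN4M and SUN4C machines, the per-machine commands differ only in the address and the suffix. As a sketch, the loop below reads one "IP MAC ARCH" line per machine and prints (but does not run) the matching rarp and ln commands. The function name print_boot_setup, the second MAC address and the image name boot.img are placeholders for illustration, not values from the actual cluster.

```shell
#!/bin/sh
# print_boot_setup (hypothetical helper): for each "IP MAC ARCH" line on
# stdin, print the rarp and ln commands for that machine (dry run only).
print_boot_setup() {
    while read -r ip mac arch; do
        # Skip blank lines.
        [ -n "$ip" ] || continue
        # Split the IP on dots and build the 8-digit uppercase hex name.
        old_ifs=$IFS
        IFS=.
        set -- $ip
        IFS=$old_ifs
        hex=$(printf '%02X%02X%02X%02X' "$1" "$2" "$3" "$4")
        echo "rarp -s $ip $mac"
        echo "ln -s boot.img $hex.$arch"
    done
}

# The second MAC address below is invented for the example.
print_boot_setup <<'EOF'
192.168.1.16 20:0:9:a8:2:fe SUN4M
192.168.1.17 20:0:9:a8:3:1 SUN4C
EOF
```

Printing the commands instead of running them lets you review the list before touching the RARP table or the TFTP directory.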


2.3 Installing on SparcServer1000

The SparcServer1000 is a SUN4D machine with multiple CPUs. I tried the Slink boot disks, but the boot died with several weird messages. watchdog-reset was one of them, and it was probably caused by the kernel choking on something.

I then tried the potato boot disks, version 2.2.8, and they worked fine. Installation was quite smooth, up until the point where it wanted to reboot the machine: it didn't. The SparcServer apparently doesn't reboot.

Anyway, this increased the number of machines available, which is quite nice. However, the multi-CPU feature of this machine could not be taken advantage of, because SuperSparc SMP is not supported by the kernel. It's a shame really.

Another note here concerns using an unstable version of Debian. Potato was, at the time of testing, not unstable but frozen, yet it still behaved unstably at times. It had a glitch with libstdc++, which had to be upgraded manually or apt-get would not even start. That was a serious problem.[7]


2.4 Running the machines

It is important to consider heat and electricity. Lots of machines produce a lot of heat and consume a lot of electricity. I have run into both electricity problems and heat problems.

Machines seem to have a variable electricity consumption, and it probably increases when the hard disk is accessed. This is quite a critical problem, because if the power runs short while a machine is writing to the hard disk, that machine is guaranteed to crash.

Heat is also an important problem. The first thing worth noting is that closing the case matters. People often work with the case open while configuring machines, but that sometimes causes unexplainable failures in unexpected places and random reboots. Also, when the 20 or so machines were put together in one rack here, they went berserk with the heat and stopped responding to pings, or to anything else. Putting a fan in front of all the machines fixed it. These machines were probably never designed to be stacked in a tower, so close together. So, when designing a beowulf cluster, make sure that air flows smoothly. Air-cooled machines (as most of the popular machines are) are cooled by air.


Sparc cluster installation experience: The "Sparcgate"@doshisha.ac.jp
Junichi Uekawa dancer@netfort.gr.jp