I got a new Dell PowerEdge 2900 delivered at a customer site thousands of miles away. I installed it remotely after instructing the people onsite to plug in the network interfaces and power. That worked, but it would have been a whole lot easier if it weren’t for these problems:
Problem 1: stupid default firmware settings.
The PE2900 came with a DRAC5 remote management card. First problem: this card defaults to a fixed IP address of 192.168.0.120. Whose idea was that? What’s wrong with defaulting to DHCP? Anyway, I got in by tweaking my network config a bit – but I’m glad I avoid 192.168.0.0/24 subnets, because if you were in 192.168.0.0/24 and already had a system on 192.168.0.120…
At that point I could ssh into the DRAC with the default username and password. Good. The ‘connect com2’ command connects to the serial port to which the BIOS can redirect console output, allowing full remote text-based interaction. Perfectly sufficient if you run a Unix-based operating system. Except… that redirection is disabled by default. Seriously – this is supposed to be enterprise hardware?
Here’s how to fix this:
$ racadm config -g cfgSerial -o cfgSerialCom2RedirEnable 1
$ racadm getconfig -g cfgSerial
cfgSerialBaudRate=115200
cfgSerialConsoleEnable=1
cfgSerialConsoleQuitKey=^\
cfgSerialConsoleIdleTimeout=300
cfgSerialConsoleNoAuth=0
cfgSerialConsoleCommand=
cfgSerialHistorySize=8192
cfgSerialCom2RedirEnable=1
cfgSerialTelnetEnable=0
cfgSerialSshEnable=1
Here’s another one: all serial settings default to 115200 baud – except for the one setting that you need to get serial after the bootloader hands off to the kernel. That one is set to 57600. Why?
$ racadm getconfig -g cfgIpmiSol
cfgIpmiSolEnable=0
cfgIpmiSolBaudRate=57600
cfgIpmiSolMinPrivilege=4
cfgIpmiSolAccumulateInterval=10
cfgIpmiSolSendThreshold=220
$ racadm config -g cfgIpmiSol -o cfgIpmiSolBaudRate 115200
Object value modified successfully
$ racadm getconfig -g cfgIpmiSol
cfgIpmiSolEnable=0
cfgIpmiSolBaudRate=115200
cfgIpmiSolMinPrivilege=4
cfgIpmiSolAccumulateInterval=10
cfgIpmiSolSendThreshold=220
So if you get proper output from the BIOS and grub, but only garbage from the kernel and your getty, this is why. Note: even setting (m)getty to 57600 produced garbage; it seems cfgIpmiSolBaudRate really needs to match the speed of the other serial parameters.
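A mismatch like this is easy to spot mechanically. What follows is a minimal sketch, not anything Dell ships: it assumes you have pasted the output of the two getconfig commands into shell variables, and the helper names get_value and baud_rates_match are made up for illustration.

```shell
#!/bin/sh
# Sketch only: get_value and baud_rates_match are hypothetical helpers,
# not part of racadm.

# Print the value of the first "key=value" line on stdin whose key is $1.
get_value() {
    awk -F= -v key="$1" '$1 == key { print $2; exit }'
}

# Succeed only if both dumps carry the same, non-empty baud rate.
#   $1: text output of "racadm getconfig -g cfgSerial"
#   $2: text output of "racadm getconfig -g cfgIpmiSol"
baud_rates_match() {
    serial_rate=$(printf '%s\n' "$1" | get_value cfgSerialBaudRate)
    sol_rate=$(printf '%s\n' "$2" | get_value cfgIpmiSolBaudRate)
    [ -n "$serial_rate" ] && [ "$serial_rate" = "$sol_rate" ]
}

# Factory values taken from the getconfig output in this post,
# trimmed to a couple of lines each.
serial_cfg='cfgSerialBaudRate=115200
cfgSerialCom2RedirEnable=1'
sol_cfg='cfgIpmiSolEnable=0
cfgIpmiSolBaudRate=57600'

if baud_rates_match "$serial_cfg" "$sol_cfg"; then
    echo "baud rates agree"
else
    echo "baud rate mismatch: fix cfgIpmiSolBaudRate"
fi
```

With the factory values this reports a mismatch; once cfgIpmiSolBaudRate is set to 115200 the same check passes.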
Problem 2: crappy QA
I was too lazy to figure out how to get the Hardy installer not to use its graphical splash screen, which meant I needed the web-based graphical console redirection. As with my experience of HP’s iLO console redirection, this is *painful* if you run GNU/Linux on your client. Eventually I came across this blog post, which worked for me with a few caveats: I had to start from a clean profile (i.e. mv ~/.mozilla ~/.moz-backup), and the firefox-2 packages in Ubuntu just do not work; I really had to download the Mozilla binary (I used 2.0.0.18).
Why does this have to be so difficult? Is it really that hard to keep the Firefox modules more up to date, Dell? If you find that is the case, perhaps you should consider outsourcing this task to the free software community. Release the sources for the Firefox plugin under a free software license, and watch a community develop around them to keep them up to date. Of course, that will also require cooperation from the DRAC side of things. How about opening up the source for the DRAC controller? Doing that would allow the community to fix bugs more rapidly and add functionality. Since we’ve proven that we can do a free software system BIOS and mp3 player firmware, DRAC controller firmware should not be a problem.
Besides, the network interface and netstat output on the DRAC5 looks eerily familiar:
$ racadm ifconfig
eth0      Link encap:Ethernet  HWaddr 00:22:19:98:AF:AF
          inet addr:192.168.0.120  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:110880 errors:0 dropped:0 overruns:0 frame:0
          TX packets:74405 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:10413478 (9.9 MiB)  TX bytes:40059892 (38.2 MiB)
          Interrupt:27

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1020 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1020 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:739308 (721.9 KiB)  TX bytes:739308 (721.9 KiB)

$ racadm netstat
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
0.0.0.0         192.168.0.1     0.0.0.0         UG        0 0          0 eth0

Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.168.0.120:5900      192.168.1.1:3007        ESTABLISHED
tcp        0      0 192.168.0.120:22        192.168.1.1:1257        ESTABLISHED
tcp        0      0 192.168.0.120:5901      192.168.1.1:2091        ESTABLISHED
The DRAC5 wouldn’t be running the Linux kernel now, would it? A little poking at a firmware update image confirms that yes, it is – I see Linux kernel strings, Busybox strings, etc. In fact, they didn’t even try to hide this:
$ sh --help
BusyBox v1.00 (2008.08.22-17:37+0000) multi-call binary
No help available.
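Poking at a firmware image for this kind of evidence needs nothing fancier than grep. Here is a rough sketch, assuming GNU grep and an image file you have already downloaded; the function name find_linux_markers and the path in the usage comment are placeholders, and the two patterns are only the most obvious markers to look for.

```shell
#!/bin/sh
# Sketch only: find_linux_markers is a hypothetical helper.
# Prints the distinct BusyBox and Linux kernel version banners found in a
# binary file. -a makes grep treat binary data as text, -o prints only the
# matching part of each line, and LC_ALL=C avoids multibyte decoding issues.
find_linux_markers() {
    LC_ALL=C grep -a -o -E 'BusyBox v[0-9.]+|Linux version [0-9.]+' "$1" | sort -u
}

# Usage (the path is a placeholder for wherever you saved the image):
#   find_linux_markers path/to/firmware-image
```

If the image is compressed, nothing will show up until it has been unpacked, so an empty result proves very little on its own.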
So, where is the source? A recent thread on the linux-poweredge list at Dell suggests you can ask for SKU 420-3178. I think I might try that.