HP DL20 Gen 9 and CentOS 7.4

File this under "I hate you HP".

In 2016 I set up an HP DL20 Gen 9 server with CentOS 7. Everything worked great through multiple upgrades until September 2017, when I upgraded CentOS 7.3 to 7.4 and immediately found that the machine would not boot with the latest kernel. Rolling back to the previous kernel allowed me to boot, so I moved on for the time being. Fast forward a couple of months and I decided it was time to tackle the issue. I ran the latest updates and was greeted with the same kernel panic on boot.

kernel-panic1.jpg
Kernel Panic

First thing I did was update the BIOS. I didn't think that would work but what the heck. It was old anyway...

I did a little googling and found that this was a known problem with HP servers and CentOS: https://bugs.centos.org/view.php?id=13943&nbn=1

HP's solution is to disable RAID and turn on AHCI: https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c04655546

I turned on AHCI and rebooted. Good news... The kernel panic is gone. Bad news... now it doesn't see the partitions.

cant-boot.jpg
My drives? Where are my drives?!?!

After a bit I realized that the "Recovery" kernel in the boot menu almost worked. All my data was there but I couldn't see /boot/grub2/grubenv. I think that's because the recovery Initramfs didn't have vfat support. I proceeded to follow the CentOS instructions to Create a New Initramfs. Since I had a partially working environment, I didn't need to mount any disks first. I issued the following command to generate a new Initramfs for the latest kernel installed on my machine:
dracut -f /boot/initramfs-3.10.0-693.5.2.el7.x86_64.img 3.10.0-693.5.2.el7.x86_64
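
The kernel version appears twice in that command, once in the image path and once as the version argument, and the two have to match. As far as I know dracut defaults to the running kernel when no version is given, which here would be the recovery kernel, so the version has to be spelled out. A minimal sketch of one way to avoid typos (the helper name initramfs_path is made up for illustration):

```shell
#!/bin/bash
# initramfs_path KVER: print the conventional CentOS image path for a kernel version.
# (Hypothetical helper, not part of dracut.)
initramfs_path() {
    printf '/boot/initramfs-%s.img\n' "$1"
}

kver=3.10.0-693.5.2.el7.x86_64

# On the affected machine (as root) the rebuild would then be:
#   dracut -f "$(initramfs_path "$kver")" "$kver"
```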

I rebooted and selected this kernel. Nope. Still no go. What was wrong? My drives were there, I could see all my partitions but I still couldn't boot. I stumbled on this post and realized that my Initramfs must be missing something crucial. I followed the instructions to add AHCI:
dracut --add-drivers ahci -f /boot/initramfs-3.10.0-693.5.2.el7.x86_64.img 3.10.0-693.5.2.el7.x86_64
Reboot... No luck and no improvement. It's not the lack of AHCI or at least not just AHCI.
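
One way to confirm that a driver really made it into the image is to list the image contents with lsinitrd (shipped with dracut) and look for the module file. The sketch below is a hypothetical helper, not something from the CentOS instructions; it assumes module files end in .ko or a compressed variant such as .ko.xz.

```shell
#!/bin/bash
# listing_has_module MODULE: read an `lsinitrd` file listing on stdin and
# succeed if it contains MODULE.ko (optionally compressed, e.g. MODULE.ko.xz).
# (Hypothetical helper name, used here only for illustration.)
listing_has_module() {
    local mod="$1"
    # Require a "/" before the name so "ahci" does not match "libahci".
    grep -Eq "/${mod}\.ko(\.[a-z0-9]+)?\$"
}

# On the affected machine (as root) this would be used like:
#   lsinitrd /boot/initramfs-3.10.0-693.5.2.el7.x86_64.img | listing_has_module ahci \
#       && echo "ahci is in the image"
```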

After a lot of flailing around I found that the solution was to set "hostonly=no" for dracut. I went into /etc/dracut.conf.d/ and found hpdsa.conf. This was left over from the broken HP RAID driver, so I deleted the file. Next I created /etc/dracut.conf.d/all.conf with the following contents:
hostonly=no
And generated a new Initramfs:
dracut -f /boot/initramfs-3.10.0-693.5.2.el7.x86_64.img 3.10.0-693.5.2.el7.x86_64
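
For anyone who wants to script those two steps, here is a rough sketch. The function name is made up, and the directory is a parameter only so the helper can be tried against a scratch directory before touching /etc.

```shell
#!/bin/bash
# write_hostonly_dropin DIR: write DIR/all.conf so dracut builds a generic
# (non host-only) initramfs. DIR is /etc/dracut.conf.d on the real system.
# (Hypothetical helper name, used here only for illustration.)
write_hostonly_dropin() {
    local dir="$1"
    printf 'hostonly=no\n' > "${dir}/all.conf"
}

# On the affected machine (as root):
#   write_hostonly_dropin /etc/dracut.conf.d
#   dracut -f /boot/initramfs-3.10.0-693.5.2.el7.x86_64.img 3.10.0-693.5.2.el7.x86_64
#
# dracut also has a -N / --no-hostonly switch that should give the same result
# for a single run, but without the config file a later kernel update would
# presumably go back to building a host-only image.
```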

This created a new 52 MB Initramfs file rather than the normal 23 MB file. With hostonly=no, dracut no longer limits the image to the drivers it thinks this particular machine needs, so the file includes everything but the kitchen sink. And it boots!

I should probably dig a little deeper and figure out what driver was missing but I'll leave that for another day.
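
If I do get around to it, one approach would be to compare the module lists of a host-only image and the full one. This is an untested sketch with made-up helper names. It assumes lsinitrd listings have been saved to text files, and that dracut's -H (--hostonly) switch on the command line takes precedence over all.conf when building the comparison image.

```shell
#!/bin/bash
# module_names: read an `lsinitrd` listing on stdin and print the sorted,
# unique kernel module names (file name without .ko or a compression suffix).
module_names() {
    grep -Eo '[^/ ]+\.ko(\.[a-z0-9]+)?$' | sed -E 's/\.ko(\.[a-z0-9]+)?$//' | sort -u
}

# modules_only_in_full HOSTONLY_LISTING FULL_LISTING: print the modules that
# appear in the full image's listing but not in the host-only one.
modules_only_in_full() {
    local a b
    a=$(mktemp)
    b=$(mktemp)
    module_names < "$1" > "$a"
    module_names < "$2" > "$b"
    comm -13 "$a" "$b"    # lines unique to the second (full) list
    rm -f "$a" "$b"
}

# On the affected machine (as root), something like:
#   dracut -H /tmp/hostonly.img 3.10.0-693.5.2.el7.x86_64
#   lsinitrd /tmp/hostonly.img > /tmp/hostonly.txt
#   lsinitrd /boot/initramfs-3.10.0-693.5.2.el7.x86_64.img > /tmp/full.txt
#   modules_only_in_full /tmp/hostonly.txt /tmp/full.txt
```

The result will be a long list, since the full image carries drivers for hardware this server doesn't have, so the storage drivers in it are the ones worth testing first with --add-drivers.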
Topic revision: r1 - 06 Dec 2017, BobWicksall