
Empty /boot/ - wrong partition being mounted


ug2215


Howdy,

 

This problem was triggered when I added a PCIe network card, requiring a full reboot.

 

The issue appears to be that my boot drive has moved from /dev/sda to /dev/sdd, yet unRAID keeps trying to mount /dev/sda1 to /boot even though it successfully boots, apparently using /dev/sdd1. I cannot figure out how to resolve this.

 

After a mostly successful boot, the array is mounted and the license key is recognized. However, VMs that rely on libvirt will not start, and I see this error message on the VM panel of the web interface:

Warning: parse_ini_file(/boot/config/domain.cfg): failed to open stream: No such file or directory in /usr/local/emhttp/plugins/dynamix.vm.manager/classes/libvirt_helpers.php on line 441 

 

If I log on via SSH, I can see that /boot/ is empty:

root@Chimera:~# ls /boot/ -l
total 0

 

And /etc/mtab reflects the misunderstanding of the boot device's location (*** for emphasis, not present in actual file):

root@Chimera:~# cat /etc/mtab
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
tmpfs /var/log tmpfs rw,size=128m,mode=0755 0 0
******/dev/sda1 /boot vfat rw,noatime,nodiratime,umask=0,shortname=mixed 0 1*******
/mnt /mnt none rw,bind 0 0
/dev/md1 /mnt/disk1 btrfs rw,noatime,nodiratime 0 0
/dev/nvme0n1p1 /mnt/cache btrfs rw,noatime,nodiratime 0 0
shfs /mnt/user0 fuse.shfs rw,nosuid,nodev,noatime,allow_other 0 0
shfs /mnt/user fuse.shfs rw,nosuid,nodev,noatime,allow_other 0 0
/dev/loop0 /var/lib/docker btrfs rw 0 0
/dev/loop1 /etc/libvirt btrfs rw 0 0

 

In fact, there is no /dev/sda:

root@Chimera:~# ls /dev/sd*
/dev/sdb  /dev/sdb1  /dev/sdc  /dev/sdc1  /dev/sdd  /dev/sdd1

 

The correct configuration is, to some degree, reflected in /etc/fstab:

root@Chimera:~# cat /etc/fstab
/dev/disk/by-label/UNRAID  /boot     vfat   auto,rw,exec,noatime,nodiratime,umask=0,shortname=mixed  0  1

root@Chimera:~# ls -l /dev/disk/by-label/UNRAID
lrwxrwxrwx 1 root root 10 Oct 24 18:51 /dev/disk/by-label/UNRAID -> ../../sdd1
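
The mismatch between what /etc/mtab records and what the UNRAID label resolves to can be checked in one step. This is only a rough sketch: the helper names `mtab_source` and `check_boot` are made up for illustration, and it assumes a POSIX shell with awk available.

```shell
#!/bin/sh
# Hypothetical helpers, not part of unRAID.

# Print the device that an mtab-style file records for a mount point.
# Usage: mtab_source <mtab-file> <mount-point>
mtab_source() {
    awk -v mp="$2" '$2 == mp { src = $1 } END { if (src != "") print src }' "$1"
}

# Compare the recorded device with the one you expect.
# Usage: check_boot <mtab-file> <mount-point> <expected-device>
check_boot() {
    recorded=$(mtab_source "$1" "$2")
    if [ -z "$recorded" ]; then
        echo "not-mounted"
    elif [ "$recorded" != "$3" ]; then
        echo "mismatch: $recorded recorded, $3 expected"
    else
        echo "ok: $recorded"
    fi
}
```

On the system from this post, the call would presumably be `check_boot /etc/mtab /boot "$(readlink -f /dev/disk/by-label/UNRAID)"`, which should report a mismatch between /dev/sda1 and /dev/sdd1.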

 

I can successfully mount /dev/sdd1 and see my still-intact configuration files:

root@Chimera:~# cd /tmp/
root@Chimera:/tmp# mkdir boot
root@Chimera:/tmp# mount /dev/sdd1 boot/
root@Chimera:/tmp# ls boot/
System\ Volume\ Information/  changes.txt*  license.txt*        packages/
bzimage*                      config/       make_bootable.bat*  previous/
bzroot*                       ldlinux.c32*  make_bootable_mac*  syslinux/
bzroot-gui*                   ldlinux.sys*  memtest*            syslog.txt*

 

If I stop the array, the web interface begins to complain that I am not registered, because it cannot find the license key in /boot/.

 

With the array stopped, I can unmount /dev/sda1 from /boot/ and mount /dev/sdd1 to /boot/:

root@Chimera:/tmp# umount /boot/
root@Chimera:/tmp# mount /dev/sdd1 /boot/
root@Chimera:/tmp# ls /boot/
System\ Volume\ Information/  bzroot*      changes.txt*  ldlinux.c32*  license.txt*        make_bootable_mac*  packages/  syslinux/
bzimage*                      bzroot-gui*  config/       ldlinux.sys*  make_bootable.bat*  memtest*            previous/  syslog.txt*

 

After doing this, the web interface stops complaining that there is no valid license key; it can now verify my licensed status because /boot/ is intact.

Unfortunately, if I restart the array, /boot/ goes empty again:

root@Chimera:/tmp# ls /boot/ -l
total 0

 

This is particularly odd because mtab still reflects a correct mounting: (*** for emphasis, not present in actual file)

root@Chimera:/tmp# cat /etc/mtab
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
tmpfs /var/log tmpfs rw,size=128m,mode=0755 0 0
/mnt /mnt none rw,bind 0 0
******/dev/sdd1 /boot vfat rw 0 0******
/dev/md1 /mnt/disk1 btrfs rw,noatime,nodiratime 0 0
/dev/nvme0n1p1 /mnt/cache btrfs rw,noatime,nodiratime 0 0
shfs /mnt/user0 fuse.shfs rw,nosuid,nodev,noatime,allow_other 0 0
shfs /mnt/user fuse.shfs rw,nosuid,nodev,noatime,allow_other 0 0
/dev/loop0 /var/lib/docker btrfs rw 0 0
/dev/loop1 /etc/libvirt btrfs rw 0 0
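
One possible way to spot an entry like this, where the mount table still names a device that may have disappeared, is to test whether each /dev source it lists still exists. A minimal sketch, assuming a POSIX shell; `stale_sources` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: list mtab entries whose /dev source node no longer exists.
# Usage: stale_sources <mtab-file>
stale_sources() {
    while read -r src mnt _rest; do
        case "$src" in
            # Only real device paths are checked; proc, tmpfs, shfs etc. are skipped.
            /dev/*) [ -e "$src" ] || echo "$src $mnt" ;;
        esac
    done < "$1"
}
```

Running `stale_sources /etc/mtab` right after /boot/ goes empty would show whether /dev/sdd1 has vanished, which would suggest the drive was detached and re-enumerated under a different name.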

 

I have, of course, tried running filesystem repairs using both Windows and fsck, to no avail.

 

The problem seems to be unRAID mounting the wrong location to /boot/, but I can't figure out how to change its mind.

 

Please advise.

Link to comment

I tried a different port to no avail, but in doing so I think I realized what is happening.

I have my VMs set up for passthrough of USB controllers. I think that when I added a new PCIe device, the NIC, it was enumerated ahead of some existing devices and shifted the PCI addresses of the devices I already pass through.

 

So, new question: how can I change VMs to not autostart from the command-line? I cannot find their XML files on-disk.
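
For reference, stock libvirt can do this with `virsh`. The sketch below assumes that unRAID's VM manager uses libvirt's own autostart flag, which this thread does not confirm, and `disable_all_autostart` is a made-up function name. The `VIRSH` variable exists only so the command can be swapped out for a dry run.

```shell
#!/bin/sh
# Assumption: VM autostart is controlled by libvirt's autostart flag.
VIRSH=${VIRSH:-virsh}

# Turn off autostart for every domain that currently has it enabled.
disable_all_autostart() {
    "$VIRSH" list --all --autostart --name | while IFS= read -r dom; do
        # virsh prints a trailing blank line; skip it.
        [ -n "$dom" ] || continue
        "$VIRSH" autostart --disable "$dom" </dev/null
    done
}
```

To disable a single VM instead, `virsh autostart --disable <name>` should suffice, and `virsh dumpxml <name>` prints a domain's XML without needing to locate the file on disk.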

 

Note: My hypothesis is that unRAID boots successfully but then passes through the USB controller hosting the boot device. The VM that receives it then fails to start up because its PCI devices are not as expected, and it releases the controller. In the meantime /boot "blanks out" because the drive was pulled, and the drive comes back as a different device in /dev/sd*.

Link to comment

I was unable to find them in the plugins directory.

It turns out that even if /boot/ is gone, you can still change the "autostart" preference, so I was able to prevent the VMs from coming up on boot and breaking things.

 

I was able to resolve this issue by just changing the PCI address of a passed-through USB controller. The one I wanted to pass through is adjacent to the one that hosts the boot media. I just needed to increment the address by one to resume grabbing the correct controller. The new NIC came in at 4, pushing everything above it up by one. The two USB controllers had been 7 and 8 but became 8 and 9.
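
To confirm which controller hosts the boot media before picking one to pass through, the drive's sysfs path can be read back, since it contains the PCI address of every bridge and controller above the disk. A rough sketch, assuming a grep that supports `-o` (GNU and BSD grep do); `pci_chain` and `boot_controller` are hypothetical names:

```shell
#!/bin/sh
# Print every PCI address (domain:bus:device.function) found in a sysfs
# device path, one per line, in order from root to leaf.
pci_chain() {
    printf '%s\n' "$1" |
        grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]'
}

# The last address in the chain is the controller the disk hangs off.
boot_controller() {
    pci_chain "$1" | tail -n 1
}
```

On a live system this would be called as `boot_controller "$(readlink -f /sys/block/sdd)"`, with sdd replaced by whatever the flash drive is currently named. Whatever address it prints is the one to keep out of the VM's passthrough list.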

Thank you to trurl; that suggestion triggered the thought about USB passthrough. While doing that I thought: well, it's a little tricky to move this USB device because I pass through so many ports; if I pick the wrong one it will... do about what I'm seeing... Aha!

Link to comment
