I have had this Unraid server running for 3-4 years without any issues. The config and hardware have been static since it was initially built and set up. Early this morning, I ran into an "Unmountable: No file system" error on disk 1 of a 2-disk cache pool...
Restarted the array in maintenance mode and attempted a btrfs check, but got the following in the Web UI:
bad key ordering 126 127
bad key ordering 126 127
bad key ordering 126 127
bad key ordering 126 127
ERROR: failed to read block groups: Operation not permitted
ERROR: cannot open file system
Opening filesystem to check...
Attempted mounting the filesystem read-only per the FAQ post, but got this error:
root@Homeserver:/# mount -o ro,notreelog,nologreplay /dev/sdb1 /x
mount: /x: wrong fs type, bad option, bad superblock on /dev/sdb1, missing codepage or helper program, or other error.
Attempted an xfs_repair, but it could not find a superblock:
root@Homeserver:/# xfs_repair -v /dev/sdb1
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!
attempting to find secondary superblock...
...........Sorry, could not find valid secondary superblock
Exiting now.
root@Homeserver:/#
Rebooted server and ran diagnostics (attached). The following is from syslog after reboot:
Aug 30 10:20:42 Homeserver emhttpd: shcmd (40): mkdir -p /mnt/cache
Aug 30 10:20:42 Homeserver emhttpd: cache uuid: b4197f0b-68d1-4496-9837-6e480fcf1bbc
Aug 30 10:20:42 Homeserver emhttpd: cache TotDevices: 2
Aug 30 10:20:42 Homeserver emhttpd: cache NumDevices: 2
Aug 30 10:20:42 Homeserver emhttpd: cache NumFound: 2
Aug 30 10:20:42 Homeserver emhttpd: cache NumMissing: 0
Aug 30 10:20:42 Homeserver emhttpd: cache NumMisplaced: 0
Aug 30 10:20:42 Homeserver emhttpd: cache NumExtra: 0
Aug 30 10:20:42 Homeserver emhttpd: cache LuksState: 0
Aug 30 10:20:42 Homeserver emhttpd: shcmd (41): mount -t btrfs -o noatime,nodiratime -U b4197f0b-68d1-4496-9837-6e480fcf1bbc /mnt/cache
Aug 30 10:20:42 Homeserver kernel: BTRFS info (device sdb1): disk space caching is enabled
Aug 30 10:20:42 Homeserver kernel: BTRFS info (device sdb1): has skinny extents
Aug 30 10:20:42 Homeserver kernel: BTRFS critical (device sdb1): corrupt leaf: root=2 block=243354976256 slot=127, bad key order, prev (576460982172839936 168 4096) current (229869420544 168 4096)
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Aug 30 10:20:42 Homeserver kernel: BTRFS error (device sdb1): failed to read block groups: -5
Aug 30 10:20:42 Homeserver root: mount: /mnt/cache: wrong fs type, bad option, bad superblock on /dev/sdc1, missing codepage or helper program, or other error.
Aug 30 10:20:42 Homeserver emhttpd: shcmd (41): exit status: 32
Aug 30 10:20:42 Homeserver emhttpd: /mnt/cache mount error: No file system
Aug 30 10:20:42 Homeserver emhttpd: shcmd (42): umount /mnt/cache
Aug 30 10:20:42 Homeserver kernel: BTRFS error (device sdb1): open_ctree failed
Aug 30 10:20:42 Homeserver root: umount: /mnt/cache: not mounted.
Aug 30 10:20:42 Homeserver emhttpd: shcmd (42): exit status: 32
Aug 30 10:20:42 Homeserver emhttpd: shcmd (43): rmdir /mnt/cache
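As a side check on the "bad key order" line above, here is a rough sketch that compares the two key values from the syslog bit by bit. The two numbers are copied from the log; the idea that the previous key should sit exactly one 4096-byte extent below the current one is purely my assumption, and the variable names are just illustrative.

```shell
#!/bin/bash
# Key objectids copied from the "corrupt leaf ... bad key order" syslog line.
prev=576460982172839936
current=229869420544

# ASSUMPTION (not stated anywhere in the log): the previous key should be
# exactly one 4096-byte extent below the current key.
expected_prev=$(( current - 4096 ))

# XOR leaves a 1 in every bit position where the two values differ.
diff=$(( prev ^ expected_prev ))

# Count how many bit positions differ.
flipped_bits=0
bits=$diff
while [ "$bits" -ne 0 ]; do
    flipped_bits=$(( flipped_bits + (bits & 1) ))
    bits=$(( bits >> 1 ))
done

printf 'differing bits: %d (xor = 0x%x)\n' "$flipped_bits" "$diff"
```

Under that assumption the logged value differs from the expected one in a single bit, which might point to a bit flip (for example bad RAM) rather than a failing disk, but I can't confirm that from this alone.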
Guessing the disk is beyond repair and I would need to do something like:
remove bad disk from pool
attempt to wipe/format/salvage disk
attempt to use what's on cache pool disk 2 to restore content to disk 1?
rejoin disks into pool?
Before I start going down any of these paths, I thought it would be best to consult the sages on the forums and avoid any (more) bad moves I could make.
Thanks in advance!
Mark.
homeserver-diagnostics-20200830-1044.zip