
Lost ALL array drives instantly, unmountable. (6.12.8)


Seems I'm having a bad weekend.

I decided to overhaul my tower server into a rackmount Supermicro 36-slot SAS chassis so I can add more disks.

On my first startup, all my array drives got the status unmountable: Unsupported partition layout.

All drives were XFS before this happened.

 

image.thumb.png.355df30724588afc5e2133405041fb6a.png

 

I have no clue why this happened. I didn't change the mainboard or controller; it was a 1-to-1 transfer of all hardware into a new case, plus some added disks (not formatted yet).

 

The only difference is that before I used 3 mini-SAS to 4x SATA cables to connect my hard drives, and now I'm using a single SAS cable to the chassis expander. This shouldn't make that much of a difference?

The disks have no SMART errors and are recognised by Unraid.

 

In maintenance mode, I clicked on a random disk to run an XFS repair check using the default -n option.

This is the output:

Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!

attempting to find secondary superblock...
.found candidate secondary superblock...
verified secondary superblock...
would write modified primary superblock
Primary superblock would have been modified.
Cannot proceed further in no_modify mode.
Exiting now.
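
For readers wondering what "bad magic number" refers to: an XFS filesystem starts with the four ASCII bytes XFSB, and xfs_repair complains when it does not find them where the primary superblock should be. A minimal read-only sketch of that check follows. The function name has_xfs_magic is made up for illustration, and it only looks at the signature, which is far less than xfs_repair verifies.

```shell
#!/bin/bash
# Sketch: succeed if a block device or image file starts with the XFS
# superblock magic "XFSB". Read-only; nothing is written to the target.
# has_xfs_magic is an illustrative name, not an existing tool.
has_xfs_magic() {
    local target="$1"
    local magic
    # The primary XFS superblock begins at byte 0 of the filesystem.
    # tr strips NUL bytes so the shell does not warn about them.
    magic=$(head -c 4 "$target" 2>/dev/null | tr -d '\0')
    [ "$magic" = "XFSB" ]
}
```

Example: has_xfs_magic /dev/sdX1 && echo "XFS signature found", where /dev/sdX1 is a placeholder for the partition you want to inspect.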

 

 

Any clues what happened, and more importantly, can this be fixed without losing the data?

I've read something about running xfs_repair -V on each disk, but before doing that, wanted to consult this forum first ;)

 

In my diag file, you will also see 12 Seagate disks, those are new and unrelated to the array. I already tried to boot without those new disks, it made no difference.

 

 

 

parodius-diagnostics-20240315-2035.zip

Solved by JorgeB

Maybe the expander remapped the partitions?

 

@JorgeB may have a better option, but I seem to remember the solution was to unassign one drive at a time, start the array, stop the array, reassign the drive, let it rebuild. Several hours for each drive.

 

Whatever you do, don't remove more than one drive at a time. To test my solution, unassign one drive, start the array, and see if the emulated drive mounts properly.

  • Author
3 hours ago, JonathanM said:

Maybe the expander remapped the partitions?

 

@JorgeB may have a better option, but I seem to remember the solution was to unassign one drive at a time, start the array, stop the array, reassign the drive, let it rebuild. Several hours for each drive.

 

Whatever you do, don't remove more than one drive at a time. To test my solution, unassign one drive, start the array, and see if the emulated drive mounts properly.

 

Yeah, I also think it has something to do with the expander, because that's the only component that really changed.

I have swapped disks in the past between direct-to-controller and expanders and never had such an issue, although that was on HW RAID. To my knowledge, expanders are just SAS switches; they don't do anything with the data on the drives.

Unless this was a problem that started before the HW changes and only showed up on the next boot, coincidentally after I moved to a new chassis.

 

Tried to unassign one disk, but it won't allow me to start the array due to the missing disk. Or do I need to remove the drive physically?

image.png.8f3280578b6e5dcd5f74cf084b08bc02.png

 

 

Edited by Deler7

Is there not a checkbox to allow you start? Post a full screenshot of the main page.

  • Author

Ok.

When the array is stopped, i mark 1 disk as no device

image.thumb.png.74d747e500ba54d1503e067195581d88.png

...

...

image.thumb.png.9e5e763d4e4bb38a6399e6abc68fd48d.png

 

This way I can't start the array, as it's missing a disk.

Note: disk7 is unassigned, but that was to replace a defective drive a few weeks (and reboots) ago.

Note2: The Seagate 4TB drives are

 

When i start the array with all the correct disks assigned:

image.thumb.png.3a3feda74191f67e18d7205659310377.png

...

...

image.thumb.png.61c302e59828371d9fa8f67848cfd9d5.png

 

It wants me to format the WDC drives. Of course I'm not doing so, as they contain data I would love to get back.

 

3 minutes ago, Deler7 said:

This way I can't start the array, as it's missing a disk.

 

56 minutes ago, JonathanM said:

Is there not a checkbox to allow you start?

image.png.96bea389bc52a2878c745f0d21bde822.png

  • Author

Ah, when I checked the box, START remained grey. I closed all browser tabs, cleared the cache, and logged back into the server; now it turns orange, so I can proceed.

Removed the disk, checked the box, start array.

 

image.thumb.png.e4ab364aa4affaeec9c5ff3c7d474d82.png

 

Stopped the array, reassigned the disk I just unassigned.

image.thumb.png.86215928c2a84aaf9fc3a69845ac307b.png

 

And start the array.

 

image.thumb.png.6c52bc7ee48681313132268eb2d39f53.png

 

I assume, the waiting game starts ? ;)

image.png.41d3819eb543d6a443ef10ead798493e.png

image.png.d7dabfdd641a525b7d394c42dc618e65.png

 

 

 

 

Edited by Deler7

29 minutes ago, Deler7 said:

I assume, the waiting game starts ?

Yes and no. My theory didn't work, if it did, the disk slot being rebuilt would be mounted already. Let the rebuild complete, and wait for @JorgeB

  • Community Expert

This has happened before with those Adaptec RAID controllers; they can apparently sometimes overwrite the MBR of the disks. But it's not good news that the emulated disk didn't mount. If you haven't rebooted since doing that, post the diagnostics.

  • Author

After a night of spinning, the rebuild is complete.

image.thumb.png.567dd9230e904e9b5981068619884803.png

 

13 minutes ago, JorgeB said:

This has happened before with those Adaptec RAID controllers; they can apparently sometimes overwrite the MBR of the disks. But it's not good news that the emulated disk didn't mount. If you haven't rebooted since doing that, post the diagnostics.

 

I haven't rebooted since my previous steps. My new diagnostics are below 👍

 

 

parodius-diagnostics-20240316-1120.zip

  • Community Expert

No valid filesystem is being detected on the rebuilt disk1, post the output of:

blkid

and also

gdisk /dev/sdi

the latter is for disk2, to check the current partition layout.

  • Author

All right.

 

root@PARODIUS:~# blkid
/dev/sda1: LABEL_FATBOOT="UNRAID" LABEL="UNRAID" UUID="2736-60C3" BLOCK_SIZE="512" TYPE="vfat"
/dev/loop1: TYPE="squashfs"
/dev/mapper/nvme3n1p1: LABEL="nvme-two" UUID="4128737249177850753" UUID_SUB="5585851513646370683" BLOCK_SIZE="4096" TYPE="zfs_member"
/dev/nvme0n1p1: LABEL="cache" UUID="17440156396158875726" UUID_SUB="14585744415475683753" BLOCK_SIZE="4096" TYPE="zfs_member"
/dev/nvme3n1p1: UUID="ad1f4b74-88bc-409b-8586-e81baf646027" TYPE="crypto_LUKS"
/dev/nvme2n1p1: UUID="b8d4d0c9-ec99-4505-8cce-c9a19a817ba1" TYPE="crypto_LUKS"
/dev/loop2: UUID="139601bd-7ef3-471e-9dc5-5e5e4f78d045" BLOCK_SIZE="512" TYPE="xfs"
/dev/loop0: TYPE="squashfs"
/dev/mapper/nvme2n1p1: LABEL="nvme-one" UUID="18166102060409839496" UUID_SUB="9952805448310778193" BLOCK_SIZE="4096" TYPE="zfs_member"
/dev/nvme1n1p1: LABEL="cache" UUID="17440156396158875726" UUID_SUB="926196455089919428" BLOCK_SIZE="4096" TYPE="zfs_member"
/dev/sdt1: PARTUUID="30cecb74-aef3-47de-99a5-1b6d6033178a"

 

and

 

From DISK2

root@PARODIUS:~# gdisk /dev/sdy
GPT fdisk (gdisk) version 1.0.9.1

Caution: invalid main GPT header, but valid backup; regenerating main header
from backup!

Warning: Invalid CRC on main header data; loaded backup partition table.
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.

Warning! Main partition table CRC mismatch! Loaded backup partition table
instead of main partition table!

Warning! One or more CRCs don't match. You should repair the disk!
Main header: ERROR
Backup header: OK
Main partition table: ERROR
Backup partition table: OK

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: damaged

Found invalid MBR and corrupt GPT. What do you want to do? (Using the
GPT MAY permit recovery of GPT data.)
 1 - Use current GPT
 2 - Create blank GPT

Your answer: ^C

 

I did the same for DISK3 through DISK12; all gave results identical to the above.

 

In case it helps, this is the output for the rebuilt DISK1 (see below):

root@PARODIUS:~# gdisk /dev/sdt
GPT fdisk (gdisk) version 1.0.9.1

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.

Command (? for help): ^C
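
A rough way to see the same difference without entering gdisk is to look at two well-known signatures: the bytes 55 AA at offset 510 (MBR) and the text EFI PART at the start of sector 1 (primary GPT header). The sketch below is read-only and assumes 512-byte logical sectors; hex_at and check_signatures are illustrative names, not existing tools. gdisk additionally validates CRCs and the backup table, so a "present" result here does not prove the table is healthy.

```shell
#!/bin/bash
# Read `count` bytes at byte offset `skip` and print them as lowercase hex.
hex_at() {
    local target="$1" skip="$2" count="$3"
    dd if="$target" bs=1 skip="$skip" count="$count" 2>/dev/null \
        | od -An -tx1 | tr -d ' \n'
}

# Print a one-line summary of the MBR and primary GPT signatures.
# Assumption: 512-byte logical sectors, so the GPT header starts at byte 512.
check_signatures() {
    local target="$1" mbr gpt
    mbr=$(hex_at "$target" 510 2)   # MBR boot signature, expected 55aa
    gpt=$(hex_at "$target" 512 8)   # "EFI PART" in hex when present
    [ "$mbr" = "55aa" ] && mbr="present" || mbr="missing"
    [ "$gpt" = "4546492050415254" ] && gpt="present" || gpt="missing"
    echo "MBR signature: $mbr, primary GPT signature: $gpt"
}
```

Example: check_signatures /dev/sdX, where /dev/sdX is a placeholder for the whole disk (not a partition).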

 

 

 

 

  • Community Expert
  • Solution

The gdisk output confirms the partitions got clobbered, likely by the RAID controller. You can try to fix one with gdisk to see if it works afterwards; to play it safe, I would recommend cloning the disk first with dd and doing it on the clone.
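
A minimal sketch of the clone-first approach, assuming GNU dd, cmp and blockdev as found on a typical Linux system. clone_and_verify is an illustrative helper name, and the device paths in the usage line are placeholders. dd overwrites the target completely, so double-check which device is which before running anything like this.

```shell
#!/bin/bash
# Copy `src` onto `dst` byte for byte, then compare the two.
# DESTRUCTIVE for `dst`: everything on it is overwritten.
# clone_and_verify is an illustrative name, not an existing tool.
clone_and_verify() {
    local src="$1" dst="$2" size
    dd if="$src" of="$dst" bs=1M conv=fsync || return 1
    # Compare only the source's length, since the target disk may be larger.
    if [ -b "$src" ]; then
        size=$(blockdev --getsize64 "$src")
    else
        size=$(( $(wc -c < "$src") ))
    fi
    cmp -n "$size" "$src" "$dst"
}
```

Example: clone_and_verify /dev/sdX /dev/sdY, with sdX as the damaged disk and sdY as a spare of at least the same size. Any gdisk experiments can then be done on the copy while the original stays untouched.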

  • Author

A small update from my side.

I made a dd copy of one disk to one of my newly installed HDDs before continuing, to avoid potentially screwing things up (more) ;)

Not sure how to use the gdisk tool, I just followed a guide, using gdisk on the drive that had just been backed up. I went for the "r" option (recovery and transformation options), then chose "b" (use backup GPT header (rebuilding main)), and when finished, wrote the table and exited.

Then I rebooted the server, but still no luck, same error message.

Then, in maintenance mode in the Unraid UI, I selected the same disk and ran xfs_repair without any arguments. I rebooted the server again, did a regular array start, and there it was: this particular disk with all its files accessible again!

There were a few (really only a few) files in the lost+found folders, but that's OK for me.

So, I guess I'm on the right track (?)

At the moment, I'm dd'ing every disk to my newly installed hard drives so I at least have a backup. When that is finished, should I follow the same procedure on each disk?

 

Edited by Deler7

  • Community Expert
9 hours ago, Deler7 said:

When that is finished, should I follow the same procedure on each disk?

If it worked for one it should work for the other ones, except maybe disk1, since that one was rebuilt.

 

 

  • Author
1 hour ago, JorgeB said:

If it worked for one it should work for the other ones, except maybe disk1, since that one was rebuilt.

 

 

It worked, even for DISK1 👍

image.thumb.png.36f62c315fcefd045b84b7b41dbed2eb.png

 

Thank you all very much for the support !!!

 

  • Community Expert

If you can, I would recommend replacing that controller, since it's a common issue with them.

  • Author
20 hours ago, JorgeB said:

If you can, I would recommend replacing that controller, since it's a common issue with them.

 

Oh yes, I instantly ordered a cheap LSI 9211-8i controller and installed it in the server yesterday. As I expected some complications, I kept my backups, but the change from Adaptec to LSI went flawlessly. It was more like plug and play: all drives were recognised and the array started without errors.

To make sure everything is OK, I started a parity check.

image.png.e174d393ab6f3ecb7634eee8eede7da4.png

Perfect !

All my issues are resolved now.

 

However, this may be useful information for others who read this thread in the future and have the same issue: I have found the source of my original issue!

I can now reproduce it.

It was not the swap from SAS-to-SATA cables to the SAS expander that caused this.

It was a change of controller mode.

 

The Adaptec 7 series has 4 different controller modes you can pick for operation:

- Auto.

- RAID hide RAW. Acts as a pure HW RAID controller; unassigned disks are not exposed to the OS.

- RAID expose RAW (default). Same as above, but disks not assigned to a HW RAID volume are exposed to the OS. This is the factory default setting, i.e. on new cards.

- HBA. No RAID volumes; all disks are exposed RAW to the OS, comparable to IT mode. This is the mode you should run with Unraid.

 

Here is the thing: although you might expect "RAID expose RAW" to be the same as "HBA" when there are no RAID volumes, IT ISN'T. Switching between those 2 modes somehow messes up your partition tables. I did some tests on a discardable array: creating the array with the controller in HBA mode and then changing the controller to RAID expose RAW reproduces the issue from my first post in this topic (drives are unmountable). Even after switching back to HBA mode, the damage is already done and the disk partitions need to be fixed.

 

So how could this happen to me, as I wasn't touching the controller settings when migrating to the new enclosure?

My mainboard's UEFI BIOS. 🤬

As some are aware, when installing the 7 series (and perhaps newer) Adaptec cards on mainboards that have UEFI, you no longer get the legacy CTRL+A menu option at boot. Instead, you enter the controller BIOS from within your mainboard's UEFI BIOS, in the Add-On devices sub-menu.

 

For a reason unknown to me, the mainboard also remembers which slot the controller is plugged into and keeps its settings per slot. This means that when you change the controller's PCIe slot on my Gigabyte Z590 mainboard, the controller reverts to its original default setting, "RAID expose RAW".

And indeed, when I was migrating to the enclosure, I booted the system once with the controller in a different PCIe slot, as I wanted to test a GPU in its main slot.

I didn't check the array at that time. I shut down and moved the controller back to its original slot, but the 'damage' was already done. Hence my issue above.

This only seems to happen on my Gigabyte Z590 UEFI-based mainboard; my older dual Xeon legacy BIOS mainboard does not behave like this. On the legacy board, I can move the controller to any PCIe slot I like and it does not change its mode. But when I install the controller in my Gigabyte Z590, it stores my settings per PCIe slot.

My lesson learned: do NOT change the PCIe slot of Adaptec cards on UEFI-based mainboards without verifying the settings before the first boot. Or buy an LSI HBA in IT mode and never look back ;)
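
One way to notice this kind of damage early is to fingerprint the partition-table area of each disk before a hardware change and compare afterwards. The sketch below is an assumption-laden illustration rather than anything used in this thread: it hashes the first 34 sectors (MBR, primary GPT header and a standard 128-entry table) assuming 512-byte logical sectors, and table_fingerprint is a made-up name. It does not cover the backup GPT at the end of the disk.

```shell
#!/bin/bash
# Print a SHA-256 fingerprint of the first 34 sectors of a disk or image.
# Assumption: 512-byte logical sectors, so 34 * 512 = 17408 bytes cover the
# MBR, the primary GPT header and a standard 128-entry partition table.
# Read-only; table_fingerprint is an illustrative name.
table_fingerprint() {
    local target="$1"
    head -c 17408 "$target" | sha256sum | cut -d' ' -f1
}
```

Example: run table_fingerprint /dev/sdX for every array disk and save the output before moving the controller, then run it again after the first boot. /dev/sdX is a placeholder; a changed fingerprint means something rewrote the start of the disk.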

  • Community Expert

Thanks for posting the above, it may help other users in the future, I'll keep a link to this thread.
