Upgraded mobo/cpu and half of array is unmountable

July 30, 2015

Hello all:

I built my system 6 years ago and some of my drives have been in service since day one. I recently upgraded the mobo etc in order to run docker containers better but when powering on the system half of the array is unmountable. It sees the drives and they are green.

I have two 8-port SATA add-on cards, so my guess is one of them got fried somehow. I went ahead and ordered a new one but thought I would get a sanity check from the log to see if you folks think this is indeed the case.

A lot of read errors on the unmountable drives so I assume all 7 drives didn't go bad and it must be the add-on card.

syslog attached.

hog-syslog-20150730-2342.zip

July 30, 2015

Author

Here are some relevant messages from the log:

Jul 30 23:37:00 Hog emhttp: shcmd (170): mkdir -p /mnt/disk6

Jul 30 23:37:00 Hog emhttp: shcmd (171): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md6 /mnt/disk6 |& logger

Jul 30 23:37:00 Hog kernel: REISERFS (device md6): found reiserfs format "3.6" with standard journal

Jul 30 23:37:00 Hog kernel: REISERFS (device md6): using ordered data mode

Jul 30 23:37:00 Hog kernel: reiserfs: using flush barriers

Jul 30 23:37:00 Hog kernel: REISERFS warning (device md6): sh-462 check_advise_trans_params: bad transaction max size (4294967295). FSCK?

Jul 30 23:37:00 Hog kernel: REISERFS warning (device md6): sh-2022 reiserfs_fill_super: unable to initialize journal space

Jul 30 23:37:00 Hog logger: mount: wrong fs type, bad option, bad superblock on /dev/md6,

Jul 30 23:37:00 Hog logger: missing codepage or helper program, or other error

Jul 30 23:37:00 Hog logger: In some cases useful info is found in syslog - try

Jul 30 23:37:00 Hog logger: dmesg | tail or so

Jul 30 23:37:00 Hog logger:

Jul 30 23:37:00 Hog emhttp: shcmd: shcmd (171): exit status: 32

Jul 30 23:37:00 Hog emhttp: mount error: No file system (32)

Jul 30 23:37:00 Hog emhttp: shcmd (172): rmdir /mnt/disk6

Jul 30 23:37:00 Hog emhttp: shcmd (173): mkdir -p /mnt/disk7

Jul 30 23:37:00 Hog emhttp: shcmd (174): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md7 /mnt/disk7 |& logger

Jul 30 23:37:00 Hog kernel: REISERFS (device md7): found reiserfs format "3.6" with standard journal

Jul 30 23:37:00 Hog kernel: REISERFS (device md7): using ordered data mode

Jul 30 23:37:00 Hog kernel: reiserfs: using flush barriers

Jul 30 23:37:00 Hog kernel: REISERFS warning (device md7): sh-462 check_advise_trans_params: bad transaction max size (4294967295). FSCK?

Jul 30 23:37:00 Hog kernel: REISERFS warning (device md7): sh-2022 reiserfs_fill_super: unable to initialize journal space

Jul 30 23:37:00 Hog logger: mount: wrong fs type, bad option, bad superblock on /dev/md7,

Jul 30 23:37:00 Hog logger: missing codepage or helper program, or other error

Jul 30 23:37:00 Hog logger: In some cases useful info is found in syslog - try

Jul 30 23:37:00 Hog logger: dmesg | tail or so

Jul 30 23:37:00 Hog logger:

Jul 30 23:37:00 Hog emhttp: shcmd: shcmd (174): exit status: 32

Jul 30 23:37:00 Hog emhttp: mount error: No file system (32)

Jul 30 23:37:00 Hog emhttp: shcmd (175): rmdir /mnt/disk7

Jul 30 23:37:00 Hog emhttp: shcmd (176): mkdir -p /mnt/disk8

Jul 30 23:37:00 Hog emhttp: shcmd (177): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md8 /mnt/disk8 |& logger

Jul 30 23:37:00 Hog kernel: REISERFS (device md8): found reiserfs format "3.6" with standard journal

Jul 30 23:37:00 Hog kernel: REISERFS (device md8): using ordered data mode

Jul 30 23:37:00 Hog kernel: reiserfs: using flush barriers

Jul 30 23:37:00 Hog kernel: REISERFS warning (device md8): sh-462 check_advise_trans_params: bad transaction max size (4294967295). FSCK?

Jul 30 23:37:00 Hog logger: mount: wrong fs type, bad option, bad superblock on /dev/md8,

Jul 30 23:37:00 Hog logger: missing codepage or helper program, or other error

Jul 30 23:37:00 Hog logger: In some cases useful info is found in syslog - try

Jul 30 23:37:00 Hog logger: dmesg | tail or so

Jul 30 23:37:00 Hog logger:

Jul 30 23:37:00 Hog emhttp: shcmd: shcmd (177): exit status: 32

Jul 30 23:37:00 Hog emhttp: mount error: No file system (32)

Jul 30 23:37:00 Hog emhttp: shcmd (178): rmdir /mnt/disk8
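For anyone reading a long syslog like this, pairing each mount command with its exit status makes it quicker to see which disks failed. Below is a minimal Python sketch, assuming the emhttp line format shown in the excerpt above; the names `failed_mounts`, `MOUNT_RE` and `STATUS_RE` are my own and not part of unRAID.

```python
import re

# Lines copied from the syslog excerpt above (mount commands and their exit statuses).
EXCERPT = """\
Jul 30 23:37:00 Hog emhttp: shcmd (171): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md6 /mnt/disk6 |& logger
Jul 30 23:37:00 Hog emhttp: shcmd: shcmd (171): exit status: 32
Jul 30 23:37:00 Hog emhttp: shcmd (174): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md7 /mnt/disk7 |& logger
Jul 30 23:37:00 Hog emhttp: shcmd: shcmd (174): exit status: 32
Jul 30 23:37:00 Hog emhttp: shcmd (177): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md8 /mnt/disk8 |& logger
Jul 30 23:37:00 Hog emhttp: shcmd: shcmd (177): exit status: 32
"""

# Hypothetical helper patterns, written against the line format seen in this thread.
MOUNT_RE = re.compile(r"shcmd \((\d+)\): .*mount -t (\w+) .* (/dev/md\d+) ")
STATUS_RE = re.compile(r"shcmd \((\d+)\): exit status: (\d+)")


def failed_mounts(lines):
    """Return {device: (fstype, exit_status)} for mounts that logged a nonzero exit status."""
    pending = {}  # shcmd id -> (device, fstype)
    failed = {}
    for line in lines:
        mount = MOUNT_RE.search(line)
        if mount:
            pending[mount.group(1)] = (mount.group(3), mount.group(2))
            continue
        status = STATUS_RE.search(line)
        if status and status.group(1) in pending and status.group(2) != "0":
            device, fstype = pending[status.group(1)]
            failed[device] = (fstype, int(status.group(2)))
    return failed


for device, (fstype, code) in sorted(failed_mounts(EXCERPT.splitlines()).items()):
    print(device, fstype, "exit status", code)
```

To run it against a full log, read the extracted syslog file and pass its lines to `failed_mounts` instead of `EXCERPT`.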

July 31, 2015

I built my system 6 years ago and some of my drives have been in service since day one. I recently upgraded the mobo etc in order to run docker containers better but when powering on the system half of the array is unmountable. It sees the drives and they are green.

I have two 8-port SATA add-on cards, so my guess is one of them got fried somehow. I went ahead and ordered a new one but thought I would get a sanity check from the log to see if you folks think this is indeed the case.

A lot of read errors on the unmountable drives so I assume all 7 drives didn't go bad and it must be the add-on card.

Unfortunately, it's not quite that simple, as various drives from BOTH cards are having the same trouble and are unusable at present. There is a series of roughly 30 second hangs, each followed by trouble reported by numerous drives, all attached to the 2 cards only. There are also sections of DMAR errors, reminiscent of the Marvell controller issues with virtualization. Since they are both Marvell chipset based cards, try turning off IOMMU (or possibly all virtualization) in the BIOS settings, and start again. By the way, under these conditions, I certainly would not assign Disk 13, or do much of anything with the array, until all drives are operational again.

I have to say that this is the first time I've seen a problem quite like yours, with the 30 second hangs, that start even before the initialization is complete. Something's very wrong, but I don't know what. Drives on both cards were able to identify themselves correctly, were fully set up without issue, then after each hang return bad values and become unusable. Perhaps disabling the virtualization will help, but if so, check for firmware updates. You've obviously got a large investment in your system, be a shame to lose virtualization capabilities, if that's the problem.
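Hangs like the ones described above show up in a syslog as jumps between consecutive timestamps, so they can be located with a short script. Here is a rough Python sketch, assuming the standard "Mon DD HH:MM:SS" syslog prefix and the year 2015; `parse_stamp`, `find_gaps` and `count_dmar` are made-up helper names. The two sample lines are taken from a later excerpt in this thread, which only contains a 2 second jump, so the demo lowers the threshold to show a hit.

```python
from datetime import datetime

# Two consecutive lines taken from a syslog excerpt posted later in this thread.
EXCERPT = [
    "Jul 31 18:16:17 Hog kernel: reiserfs: using flush barriers",
    "Jul 31 18:16:19 Hog kernel: REISERFS warning (device md10): sh-462 "
    "check_advise_trans_params: bad transaction max size (4294967295). FSCK?",
]


def parse_stamp(line, year=2015):
    """Parse the syslog timestamp prefix; the year is an assumption (syslog omits it)."""
    month, day, clock = line.split(None, 3)[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def find_gaps(lines, min_gap_seconds=30):
    """Return (seconds, line_before, line_after) for each jump of at least min_gap_seconds."""
    gaps = []
    prev = None
    for line in lines:
        try:
            stamp = parse_stamp(line)
        except ValueError:
            continue  # skip blank or continuation lines without a timestamp
        if prev is not None:
            delta = (stamp - prev[0]).total_seconds()
            if delta >= min_gap_seconds:
                gaps.append((delta, prev[1], line))
        prev = (stamp, line)
    return gaps


def count_dmar(lines):
    """Count lines that mention DMAR (the IOMMU-related errors discussed above)."""
    return sum(1 for line in lines if "DMAR" in line)


print(find_gaps(EXCERPT))                      # no 30 second hang in this excerpt
print(len(find_gaps(EXCERPT, min_gap_seconds=2)))
print(count_dmar(EXCERPT))
```

On a full syslog, the lines on either side of each reported gap are the place to start reading.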

July 31, 2015

Author

Thanks for the help. I did disable IOMMU and the boot was much faster (no 30 second delay), but the drives are still showing as unmountable.

There are some BIOS updates, so I may try that as well. I have attached a new syslog with IOMMU turned off.

Excerpt showing no hang:

----------

Jul 31 18:16:17 Hog emhttp: shcmd (51): mkdir -p /mnt/disk10

Jul 31 18:16:17 Hog kernel: REISERFS warning (device md9): sh-2022 reiserfs_fill_super: unable to initialize journal space

Jul 31 18:16:17 Hog emhttp: shcmd (52): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md10 /mnt/disk10 |& logger

Jul 31 18:16:17 Hog kernel: REISERFS (device md10): found reiserfs format "3.6" with standard journal

Jul 31 18:16:17 Hog kernel: REISERFS (device md10): using ordered data mode

Jul 31 18:16:17 Hog kernel: reiserfs: using flush barriers

Jul 31 18:16:19 Hog kernel: REISERFS warning (device md10): sh-462 check_advise_trans_params: bad transaction max size (4294967295). FSCK?

Jul 31 18:16:19 Hog logger: mount: wrong fs type, bad option, bad superblock on /dev/md10,

Jul 31 18:16:19 Hog logger: missing codepage or helper program, or other error

Jul 31 18:16:19 Hog logger: In some cases useful info is found in syslog - try

Jul 31 18:16:19 Hog logger: dmesg | tail or so

Jul 31 18:16:19 Hog logger:

Jul 31 18:16:19 Hog emhttp: shcmd: shcmd (52): exit status: 32

Jul 31 18:16:19 Hog emhttp: mount error: No file system (32)

Jul 31 18:16:19 Hog emhttp: shcmd (53): rmdir /mnt/disk10

Jul 31 18:16:19 Hog kernel: REISERFS warning (device md10): sh-2022 reiserfs_fill_super: unable to initialize journal space

Jul 31 18:16:19 Hog emhttp: shcmd (54): mkdir -p /mnt/disk11

Jul 31 18:16:19 Hog emhttp: shcmd (55): set -o pipefail ; mount -t xfs -o noatime,nodiratime /dev/md11 /mnt/disk11 |& logger

Jul 31 18:16:19 Hog kernel: XFS (md11): Mounting V5 Filesystem

Jul 31 18:16:19 Hog kernel: XFS (md11): Ending clean mount

Jul 31 18:16:19 Hog emhttp: shcmd (56): xfs_growfs /mnt/disk11 |& logger

Jul 31 18:16:19 Hog logger: meta-data=/dev/md11 isize=512 agcount=4, agsize=244188659 blks

Jul 31 18:16:19 Hog logger: = sectsz=512 attr=2, projid32bit=1

Jul 31 18:16:19 Hog logger: = crc=1 finobt=1

Jul 31 18:16:19 Hog logger: data = bsize=4096 blocks=976754633, imaxpct=5

Jul 31 18:16:19 Hog logger: = sunit=0 swidth=0 blks

Jul 31 18:16:19 Hog logger: naming =version 2 bsize=4096 ascii-ci=0 ftype=1

Jul 31 18:16:19 Hog logger: log =internal bsize=4096 blocks=476930, version=2

Jul 31 18:16:19 Hog logger: = sectsz=512 sunit=0 blks, lazy-count=1

Jul 31 18:16:19 Hog logger: realtime =none extsz=4096 blocks=0, rtextents=0

Jul 31 18:16:19 Hog emhttp: shcmd (57): mkdir -p /mnt/disk12

Jul 31 18:16:19 Hog emhttp: shcmd (58): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md12 /mnt/disk12 |& logger

Jul 31 18:16:19 Hog kernel: REISERFS (device md12): found reiserfs format "3.6" with standard journal

Jul 31 18:16:19 Hog kernel: REISERFS (device md12): using ordered data mode

Jul 31 18:16:19 Hog kernel: reiserfs: using flush barriers

Jul 31 18:16:19 Hog kernel: REISERFS warning (device md12): sh-462 check_advise_trans_params: bad transaction max size (4294967295). FSCK?

Jul 31 18:16:19 Hog logger: mount: wrong fs type, bad option, bad superblock on /dev/md12,

Jul 31 18:16:19 Hog logger: missing codepage or helper program, or other error

Jul 31 18:16:19 Hog logger: In some cases useful info is found in syslog - try

Jul 31 18:16:19 Hog logger: dmesg | tail or so

Jul 31 18:16:19 Hog logger:

Jul 31 18:16:19 Hog emhttp: shcmd: shcmd (58): exit status: 32

Jul 31 18:16:19 Hog emhttp: mount error: No file system (32)

Jul 31 18:16:19 Hog emhttp: shcmd (59): rmdir /mnt/disk12

-----------------------

hog-syslog-20150731-1820.zip

July 31, 2015

Author

And just for completeness' sake, the mobo in question is:

GA-Z97X-UD5H-BK running BIOS F6

http://www.gigabyte.com/products/product-page.aspx?pid=4978#ov

July 31, 2015

Author

Upgraded the BIOS to the latest version with the same results. I am crying today.

July 31, 2015

Author

Also booted into Ubuntu and the same drives cannot be mounted, which rules out unRAID itself as having a problem.

July 31, 2015

Author

Installed a new SATA card today with same results.

Guys, I am really at a loss here. Could it be the motherboard?

--

gs

July 31, 2015

Author

Just upgraded to 6.1-rc2 with same unmountable drives. 6.1-rc2 log attached.

hog-syslog-20150731-1607.zip

July 31, 2015

Author

Oh man... I really am in upgrade hell.

I had a good quad core AMD board lying around that I decided to try out. The first couple of boots I was seeing missing disks (2 of them), so I rechecked all the cables and rebooted. Boot happened fine, the BIOS on the cards saw all drives, and unRAID started.

However, even with a different board/cpu I am now seeing the exact same unmountable drives as when I was using the new Z97 motherboard.

I have no clue.

July 31, 2015

Author

Yeah, so I give up officially. Re-installed original mobo, cpu, ram and same errors.

Tried new sata cards, three motherboards, new cables, with same result.

August 1, 2015

Nuke it from orbit, it's the only way to be sure...

Could it be the SATA interface of one of the drives playing tricks with the mobo? Can you reboot with only the working HDs connected and add one HD per reboot?

August 1, 2015

Author

I think I figured it out.....

For some reason, the journal parameters on the unmountable reiserfs drives became toast. To fix this, I ran the following:

# reiserfsck --check /dev/sdh1

// sdh1 is an example. Use the right sdXN device for your drive, as shown in the menu.

reiserfs_open_journal: journal parameters from the superblock does not match

to the journal headers ones. It looks like that you created your fs with old

reiserfsprogs. Journal header is fixed.

I hope my two days of frustration help people in the future.... not sure how this happened to so many drives.
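A side note on the number in the "bad transaction max size (4294967295)" warnings: it is exactly the largest value an unsigned 32-bit field can hold, which is what -1 looks like when read as unsigned. That fits a journal parameter that was overwritten or never set properly rather than a plausible size, though this is only an observation about the value, not a diagnosis of how it got there. A quick Python check of the arithmetic:

```python
import struct

# Value reported in the "bad transaction max size" warnings in this thread.
BAD_TRANS_MAX = 4294967295

# True when the value equals the maximum of an unsigned 32-bit integer (all bits set).
is_u32_max = BAD_TRANS_MAX == 2**32 - 1 == 0xFFFFFFFF

# Reinterpret the same 32 bits as a signed integer.
as_signed = struct.unpack("<i", struct.pack("<I", BAD_TRANS_MAX))[0]

print(is_u32_max)
print(as_signed)
```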

August 1, 2015

The proper way to run reiserfsck is against the md devices. Since you've run it against sdh1, parity is now no longer 100% in sync with the changes to the drive.

You'll notice that if you do a non-correcting parity check there will be a number of parity errors. If you are satisfied that the drive is indeed now mounting and accessing correctly, you should run a correcting parity check to bring everything back in tune.

https://lime-technology.com/wiki/index.php/Check_Disk_Filesystems
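To make the md-device point concrete: the mount commands in the syslog excerpts pair /mnt/diskN with /dev/mdN, so the check for disk N would target /dev/mdN rather than a raw sdX1 partition. The following Python sketch only builds and prints the command lines and does not run anything; `reiserfsck_check_command` is a hypothetical helper, and the disk numbers are the ones whose mounts failed in the excerpts posted in this thread. See the wiki page linked above for the conditions under which the check should actually be run.

```python
import shlex


def reiserfsck_check_command(disk_number):
    """Build the reiserfsck check command for an array disk, targeting its md device."""
    if disk_number < 1:
        raise ValueError("array disk numbers start at 1")
    return ["reiserfsck", "--check", f"/dev/md{disk_number}"]


# Disks whose reiserfs mounts failed in the excerpts posted in this thread.
for disk in (6, 7, 8, 10, 12):
    print(shlex.join(reiserfsck_check_command(disk)))
```

Going through the md device means any repair write also updates parity, which is the reason given above for avoiding the raw sdX1 partition.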

August 1, 2015

Author

Yeah, the plan is to rebuild parity after I am done with all of this.

August 1, 2015

Thanks for the help. I did disable IOMMU and the boot was much faster (no 30 second delay), but the drives are still showing as unmountable.

There are some BIOS updates, so I may try that as well. I have attached a new syslog with IOMMU turned off.


Just checked the new syslog, compared it with the previous, and wow, the improvement is night and day! Turning off IOMMU has completely fixed the problem. All drives are now working fine.

The part you have been including above is certainly a problem, but it's rather minor compared to all the exceptions that were occurring with so many drives. It's just damage from previous crashes, and as you have found is fixable. I think you are fine now, with IOMMU turned off, once you get each of the drives with damaged file systems repaired. I'm sorry I couldn't get back to you sooner, and save you some anguish and work.

If you could from a command prompt provide the results of the lspci command, I would appreciate it. I will probably want to check and possibly add the card model numbers to the Marvell chipsets & virtualization 'black list'. Your situation was similar in some ways, but different in others. The problem did not occur before the upgrade to the 64 bit kernels with virtualization enabled, which is characteristic of this. But what was different is that your drives did all initialize without issue, then later most but not all failed. Plus there's those 30 second hangs.
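If the lspci output is long, the controller lines can be pulled out with a small filter. This is a minimal Python sketch; `marvell_lines` is a made-up helper, and the sample text is HYPOTHETICAL (it only illustrates the general shape of lspci output and is not from the poster's machine).

```python
# HYPOTHETICAL sample text for illustration only; not taken from the system in this thread.
SAMPLE_LSPCI = """\
00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode]
03:00.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)
"""


def marvell_lines(lspci_text):
    """Return the lspci lines that mention a Marvell device (case-insensitive match)."""
    return [line for line in lspci_text.splitlines() if "marvell" in line.lower()]


for line in marvell_lines(SAMPLE_LSPCI):
    print(line)
```

On a real system the text would come from running lspci at the command prompt and pasting or piping its output into the function.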

August 1, 2015

Author

Rob, thanks for all of the help.

So, I figured a few things out that may be helpful to others.

I actually turned IOMMU and all virt settings back on without issues. The problem did end up being both SATA add-on cards in combination with virt on. On the new mobo (GA-Z97X-UD5H-BK) with the i7-4790K Devil's Canyon CPU, my old PCI SATA cards would cause these hangs every single time, and sometimes system crashes. I tried both cards independently with only one drive hooked up and would still get the issues. Since these cards are so old and only support 3.0Gb/s, it was probably time for an upgrade anyway.

Old sata add-on cards causing the issue: SUPERMICRO AOC-SAT2-MV8 64-bit PCI-X133MHz SATA II (3.0Gb/s) Controller Card

New cards that work great out of the box: SUPERMICRO AOC-SAS2LP-MV8 PCI-Express 2.0 x8 SATA / SAS 8-Port Controller Card

I think all my frustration came from the following scenario:

With virt disabled, old cards worked but I still saw those reiserfs mount errors which made me think the cards were still bad. I should have paid closer attention to the logs and put two and two together. Old cards do work fine if you disable virt (as per your post in the defect/bug forum).

Links to parts in question:

Motherboard: http://www.newegg.com/Product/Product.aspx?Item=N82E16813128722&cm_re=ga-z97x-ud5h-bk-_-13-128-722-_-Product

CPU: http://www.newegg.com/Product/Product.aspx?Item=N82E16819117369&cm_re=BX80646I74790K_i7-4790K-_-19-117-369-_-Product

Old Sata add-on cards not working with virt: http://www.newegg.com/Product/Product.aspx?Item=N82E16815121009

New Sata add-on cards working with virt: http://www.newegg.com/Product/Product.aspx?Item=N82E16816101792

August 1, 2015

Author

I am currently rebuilding parity and only have 1.5 hours left. In the meantime, I thought I would brag about my new system.

Before, I was running anywhere from 70-100% CPU load, depending on whether the server was sitting idle.

Memory (2 GB) was around 60% when idle and 100% when doing anything of note.

Check out this screenshot of the new system.

August 1, 2015

Thanks, I've added the old card's model number (88SX6081) to the Defect report. I'm glad you have found a much better solution.
