xoC (Author), posted September 5, 2023:
I'm currently rebuilding, so it should update everything, shouldn't it? If not, do I need to stop the rebuild, and how do I run a correcting check? Thanks.
xoC (Author), posted September 5, 2023:
BTW, both disk1 and disk2 were disabled. If I'm not mistaken, the only way to get them enabled again in the array is to unselect the drives, start the array (in maintenance mode here), stop the array, reselect the drives, and start the array again (maintenance mode again); they then appear in blue and a rebuild can be started. Is that right?
JorgeB, posted September 5, 2023:
29 minutes ago, xoC said: "BTW both disk1 & 2 were disabled."
If the disks were disabled, you fixed the filesystem on the actual disk, not the emulated disk, and rebuilding will rebuild the emulated disk on top. Were the emulated disks mounting before rebuilding?
itimpi, posted September 5, 2023:
44 minutes ago, xoC said: "It seems the GUI repair didn't fix anything even if it said so"
Did you run it from the GUI without the -n option? The default is to run a read-only check.
xoC (Author), posted September 5, 2023:
48 minutes ago, JorgeB said: "If the disks were disabled, you fixed the filesystem on the actual disk, not the emulated disk, and rebuilding will rebuild the emulated disk on top. Were the emulated disks mounting before rebuilding?"
One was mounting, and I didn't do anything to that one. The other didn't mount, and the system log said "can't mount disk, run xfs repair because bad primary superblock" or something like that.
41 minutes ago, itimpi said: "Did you run it from the GUI without the -n option? The default is to run a read-only check."
I ran it with -n, then without -n, and it said to run it with -L, which I did. After it completed, I ran -n again and there were no errors, but the disk was still unmountable. When I ran it from the command line with -L, the disk did mount afterwards.
JorgeB, posted September 5, 2023:
1 hour ago, xoC said: "the other didn't mount"
And is it mounting now during the rebuild?
xoC (Author), posted September 5, 2023:
I'm in maintenance mode; how can I mount an array disk? When not rebuilding, I was mounting via Unassigned Devices to verify that it mounts, but I don't know how to do that for an array disk. If that helps: since the beginning of the rebuild, the system log has been free of errors.
JorgeB, posted September 5, 2023:
25 minutes ago, xoC said: "how can I mount an array disk?"
You can't; wait for the rebuild to finish and start in normal mode. P.S. A rebuild can be done in normal mode.
26 minutes ago, xoC said: "When not in rebuild, I was mounting via unassigned devices"
Except for emulated disks, you can only do this if the disks are mounted read-only, or it will invalidate parity, and there's no point in doing it for emulated disks.
xoC (Author), posted September 5, 2023:
So, the rebuild is done and I restarted the array in normal mode: no Lost+Found folder on any disk. I'll shut down the server, replug my cache disk, and see if it works again power-wise.
xoC (Author), posted September 5, 2023:
So it seems to work with everything plugged back in as it has been for 2+ years. Could the corrupted file system simply have been preventing the rebuilds, as it kept trying again and again to mount the disk during the rebuild? Anyway, I'll monitor it closely over the next few days. Thanks a lot for your answers.
xoC (Author), posted November 10, 2023:
Hello! So, it worked for ~20 days and then I got some errors. It was late September, and I had too much work and no time to check, so my server has been powered down since then. I managed to get the diagnostics before shutting down; they are attached. Thanks in advance.
nastorm-diagnostics-20230927-1715.zip
JorgeB, posted November 10, 2023:
The disk dropped offline, so there's no SMART report, but since it's on a Marvell controller, that could be the problem; they are known to have that issue (and more). If possible, use a non-Marvell controller.
xoC (Author), posted November 10, 2023:
It's one of the controllers on the motherboard (which has 3 controllers managing 10 ports). Maybe it is failing?
JorgeB, posted November 10, 2023:
47 minutes ago, xoC said: "It's one of the controller from the motherboard (which has 3 controllers managing 10 ports)"
6 of those ports are from the Intel SATA controller, and those are fine; 2 are from a JMB controller, and they *should* be fine; the remaining 2 are from a Marvell controller, and that's where the disabled disk is connected.
xoC (Author), posted November 10, 2023:
Thanks for your clarification.
xoC (Author), posted November 21, 2023:
So I've bought a 6-port card based on the ASM1166 and upgraded its firmware to the latest version. What's the procedure to migrate the disks? I plan to move all the ones connected to the Marvell and JMicron chipsets. Can I just move the 4 disks at the same time, restart, and have them recognized?
JorgeB, posted November 21, 2023:
47 minutes ago, xoC said: "Can I just move the 4 disks at the same time, restart and it will be recognized?"
Yep.
xoC (Author), posted November 21, 2023:
Thanks. Currently rebuilding the disabled disk; let's hope it will be good this time!
xoC (Author), posted November 27, 2023:
So, we're back to SMART errors. The disk was at "reported uncorrect = 1" when I re-plugged it (before rebuilding the array). I acknowledged the issue and let the server run; it rebuilt and there were no errors.
During the night I received an email after the parity check saying it failed, with "Disk 5 - ST4000VN006-3CW104_ZW603BKR (sdk) - active 32 C (disk has read errors) [NOK]", and reported uncorrect had gone to 2.
The system log shows some read errors on disk 5, on sectors close to each other, but no more disk resets with the new controller. It is quite a recent disk, BTW.
Nov 26 04:15:21 NAStorm kernel: ata22.00: exception Emask 0x0 SAct 0x7f SErr 0x0 action 0x0
Nov 26 04:15:21 NAStorm kernel: ata22.00: error: { UNC }
Nov 26 04:15:21 NAStorm kernel: I/O error, dev sdk, sector 146077392 op 0x0:(READ) flags 0x0 phys_seg 59 prio class 2
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077328
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077336
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077344
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077352
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077360
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077368
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077376
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077384
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077392
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077400
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077408
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077416
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077424
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077432
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077440
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077448
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077456
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077464
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077472
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077480
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077488
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077496
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077504
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077512
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077520
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077528
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077536
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077544
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077552
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077560
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077568
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077576
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077584
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077592
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077600
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077608
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077616
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077624
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077632
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077640
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077648
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077656
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077664
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077672
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077680
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077688
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077696
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077704
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077712
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077720
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077728
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077736
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077744
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077752
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077760
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077768
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077776
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077784
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077792
Nov 26 04:30:15 NAStorm root: Fix Common Problems: Error: disk5 (ST4000VN006-3CW104_ZW603BKR) has read errors
I'm attaching current diagnostics.
nastorm-diagnostics-20231127-1642.zip
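A long run of per-sector log lines like the one in this post is easier to read when collapsed into contiguous ranges. The following is a minimal Python sketch, not an Unraid tool: it assumes the "md: diskN read error, sector=S" line format seen in the log above and a gap of 8 between consecutive reported sectors (as observed in that excerpt). The function name collapse_read_errors and the stride parameter are hypothetical, chosen for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as: "... kernel: md: disk5 read error, sector=146077328"
PATTERN = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def collapse_read_errors(log_text, stride=8):
    """Group md read-error sectors per disk into contiguous runs.

    Returns {disk: [(first_sector, last_sector, entry_count), ...]}.
    `stride` is the assumed gap between consecutive reported sectors
    (8 in the excerpt above); it is an assumption, not a documented constant.
    """
    sectors = defaultdict(set)
    for disk, sector in PATTERN.findall(log_text):
        sectors[disk].add(int(sector))

    runs = {}
    for disk, found in sectors.items():
        ordered = sorted(found)
        disk_runs = []
        start = prev = ordered[0]
        count = 1
        for s in ordered[1:]:
            if s - prev == stride:
                # Still inside the same contiguous run.
                prev = s
                count += 1
            else:
                # Gap found: close the current run and start a new one.
                disk_runs.append((start, prev, count))
                start = prev = s
                count = 1
        disk_runs.append((start, prev, count))
        runs[disk] = disk_runs
    return runs
```

Fed the syslog text from this post, the function should report a single run on disk5 from sector 146077328 to 146077792 with 59 entries, which lines up with the "phys_seg 59" in the I/O error line and is consistent with one small damaged region rather than scattered failures.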
JorgeB, posted November 27, 2023:
It's logged as a disk problem. Since it's not the first time, I would consider replacing it, or giving it one last chance.
xoC (Author), posted November 27, 2023:
OK... I bought that disk in March 2023, so it seems like a pretty bad one.
xoC (Author), posted November 27, 2023:
Do you think it's a big enough reason to get a warranty replacement?
JorgeB, posted November 27, 2023:
Run an extended SMART test. If it fails, it should be easy to get a replacement; with just the "reported uncorrect" errors, I'm not sure.
xoC (Author), posted November 29, 2023:
Extended SMART test passed; I'm gonna give it one last chance.