xoC

User Customizations · December 11, 2023

5 hours ago, mgutt said:

Yes

Thanks for your answer.

Sometime ago, I did a mistake and began to move files from the GUI (not with Rsync) and it just filled my disks with full copies.
I've deleted the non important backups, but looking at the overall size of the backups and considering my backup is from a folder where I only add files, never delete, it is way bigger than it should be if I didn't mess with a bad moving command.

Since then, I've followed your commands in that topic to move all my backups will be on the Share / Disks I want with the Rsync command. Is it possible then to run a "check" and if a file is there multiple times (from a same source path), full copies of the same file in different folders, to "convert" all this copies to one hardlink each and shrink my backup size ?

edit : oh and one more question, to better understand the "system" : if a backup share is on multiple disk, when we move with Rsync from one disk to another, can hardlinks from one drive link to data on another disk ? Because when we do "rsync --archive --hard-links --remove-source-files" it only speaks about disks, not shares. How could it know ?

User Customizations · December 11, 2023

Small question : if I want to manually delete a backup from day X, can I naively use delete from the unraid GUI or do I need an Rsync delete command ?

December 11, 2023

I have the same thing happening since an update somewhere in september/october IIRC. Usually at these times, the dashboard shows this :

image.png.7f51d49a571909dec0f1147759339efe.png

Dashboard becomes unresponsive. Everything else also. Can't even power off it doesn't respond (even from a one push on the HW button). And the terminal doesn't load so I can see what happens with HTOP...

Edit : Since the last few months, I managed to get some diagnostics during one of this hang, showing nothing also.

Edit2 : it was before September actually :

December 11, 2023

Hi. Yes the Dynamix one. I'm disabling it to test, thanks.

December 11, 2023

Hello,

My CPU activity is a bit weird. See following screenshot from just freshly rebooted server :

Is it normal ?

November 29, 2023

Extended SMART test passed, I'm gonna give it one last chance.

November 27, 2023

Do you think it's a big enough reason to get a warranty replacement ?

November 27, 2023

Ok... I bought that disk in march 2023, so it seems like a pretty bad one

November 27, 2023

So, we're back at smart errors.

Disk was at "reported uncorrect = 1" when I re-plugged it (before rebuilding the array).
I acknowledged the issue and let the server run, it rebuilt and there was no error.
This night I received a mail after parity check saying it failed, with "Disk 5 - ST4000VN006-3CW104_ZW603BKR (sdk) - active 32 C (disk has read errors) [NOK]" with reported uncorrect gone to 2.

System log show some read errors on disk 5, on sectors close to each other, but no more disk reset with the new controller. It is a quite recent disk BTW.

Nov 26 04:15:21 NAStorm kernel: ata22.00: exception Emask 0x0 SAct 0x7f SErr 0x0 action 0x0
Nov 26 04:15:21 NAStorm kernel: ata22.00: error: { UNC }
Nov 26 04:15:21 NAStorm kernel: I/O error, dev sdk, sector 146077392 op 0x0:(READ) flags 0x0 phys_seg 59 prio class 2
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077328
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077336
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077344
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077352
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077360
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077368
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077376
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077384
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077392
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077400
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077408
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077416
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077424
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077432
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077440
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077448
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077456
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077464
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077472
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077480
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077488
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077496
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077504
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077512
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077520
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077528
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077536
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077544
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077552
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077560
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077568
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077576
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077584
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077592
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077600
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077608
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077616
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077624
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077632
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077640
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077648
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077656
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077664
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077672
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077680
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077688
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077696
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077704
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077712
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077720
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077728
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077736
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077744
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077752
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077760
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077768
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077776
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077784
Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077792
Nov 26 04:30:15 NAStorm root: Fix Common Problems: Error: disk5 (ST4000VN006-3CW104_ZW603BKR) has read errors

I'm attaching current diagnostics.

nastorm-diagnostics-20231127-1642.zip

November 21, 2023

Thanks. Currently rebuilding the disabled disk, let's hope it will be good this time !

November 21, 2023

So I've bought a 6 port card based on ASM1166, upgraded the firmware to the latest.

How's the procedure to migrate disk ?

I have in mind to move all the ones connected to the maxwell & jmicron chipsets. Can I just move the 4 disks at the same time, restart and it will be recognized ?

November 10, 2023

Thanks for your clarification

November 10, 2023

It's one of the controller from the motherboard (which has 3 controllers managing 10 ports). Maybe it is failing ?

November 10, 2023

Hello !

So, it worked for ~20 days and then I got some error. It was late september, and I had too much work and no time to check, so my server has been powered down since that time.
I managed to get the diagnostics before shutting down, they are attached.

Thanks in advance.

nastorm-diagnostics-20230927-1715.zip

September 5, 2023

So it seems to work with everything plugged back as it was since 2+ years.

Could it be possible that the corrupted file system was just preventing rebuilds, as it just tried again and again to mount the disk during rebuild ?

Anyway, I'll monitor closely the next few days and thanks a lot for your answers.

September 5, 2023

So, rebuild is done, I restarted the array in normal mode :

image.png.cf54c76eb742ad431a62d79ba274bf4d.png

No Lost+Found folder on any disk.

I'll shut off the server, replug my cache disk, and see if it works back powerwise.

September 5, 2023

I'm in maintenance mode, how can I mount an array disk ? When not in rebuild, I was mounting via unassigned devices to verify it's mounting, but I don't know how to do that to an array disk.

If that helps :

image.png.a35d958cfc13d535b759ec1fbfa7695b.png

image.png.6076439ca32923dcd7dd6c949fe006c9.png

Since the beginning of the rebuild, the system log is free of error.

September 5, 2023

48 minutes ago, JorgeB said:

If the disks were disable you fixed the filesystem on the actual disk, not the emulated disk, and rebuilding will rebuild the emulated disk on top, were the emulated disks mounting before rebuilding?

One was mounting, I didn't do anything to that one, the other didn't mount and it said in the system log "can't mount disk, run xfs repair because bad primary superblock" or something like that.

41 minutes ago, itimpi said:

Did you run it from the GUI without the -n option? The default is to run a read-only check.

I did -n, then with no -n, it said to do it with -L which I did. After completing, I ran -n again and there was no error. But the disk was still unmountable. When done from command line with -L, it did mount after that.

September 5, 2023

BTW both disk1 & 2 were disabled.

If I'm not mistaken, the only way to make then enabled again in the array is to unselect the drive, run the array (maintenance mode here), stop the array, reselect the drives, re-run the array (maintenance mode again here), they now appear in blue and then start a rebuild ?

September 5, 2023

I'm currently rebuilding so it should update everything, no ?

If no, do I need to stop the rebuild, and how to run a correcting check ?

Thanks

September 5, 2023

So I unplugged my two cache disks and put their power connector on disk1 & disk2.

Parity 1, 2 and disk 1, 2 were on the same power line, in series.

I tried to mount disk 1 : unmountable, no valid superblock > please use xfs_repair

disk 2 : mountable.

For some unknown reasons, I ran again xfs_repair from the GUI and both didn't found issues.
I ran xfs_repair -n /dev/sde1 and there were error, I tried without -n it said to use -L.

I used with -L and now the disk is mountable. It seems the GUI repair didn't fix anything even if it said so, and the command line repair actually worked.

I'm right now in maintenance mode and rebuilding the array, I will see what the outcome is in a few hours.

September 1, 2023

Thanks for the info.

September 1, 2023

Is it risky to rebuild with read error problems ? Can it write "wrong" data to my disk 1 & 2 by trying to rebuild but having lot of read errors ?

September 1, 2023

On the power supply connector, from memory, there are 4 or 6 disks in series.
I'm gonna check later when I go back to my server.
Maybe I can try to swap in their bays disk 1/2 with 3/4 and then rebuild and check if the reset errors goes to disk 3/4 ?

September 1, 2023

Thanks for your answer.

In the notification archives I found that where it said parity finished but with lots of errors.

For the power, that's a good question. The Server is running with the same HW (except a few disk swaps since then) for at least two years, but the power supply is not young.

IIRC it's a 600W one. My server is connected to a 650W UPS and it doesn't show that much load.

image.png.3754cfe5f9691a63ab0ca6da3c71ebda.png

Do you think 650W is too low though for 8 disks + 2 SSD, no GPU or pcie extension cards ?

I'm remote for the server today, and it said extended smart disks need no spin down delay. Everytime I apply "never" and come back, it's back on 15min so it seems I can't run the extended tests remote. Do I need to shut down and restart the array for that option to change ?

And the short tests do absolutely nothing.

xoC

Posts

Joined

Last visited

Content Type

Profiles

Forums

Downloads

Store

Gallery

Bug Reports

Documentation

Landing

Posts posted by xoC

rsync Incremental Backup

rsync Incremental Backup

Server unreachable constantly

Regular CPU spikes from find command

Regular CPU spikes from find command

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here

2 disks have failed, and one of them stucks the array to "mouting" even when not here