
xoC

Members
  • Posts: 39

Posts posted by xoC

  1. 5 hours ago, mgutt said:

    Yes

     

    Thanks for your answer.


    Some time ago, I made a mistake and moved files from the GUI (not with rsync), which simply filled my disks with full copies.
    I've deleted the unimportant backups, but the overall size is still much bigger than it should be, considering I back up a folder where I only ever add files and never delete any, so that bad move clearly left full copies behind.

    Since then, I've followed the commands from this topic so that all my backups end up on the share / disks I want, moved with the rsync command. Is it possible to run a "check" and, where the same file (from the same source path) exists as a full copy in several backup folders, "convert" those copies into a single hardlinked copy and shrink my backup size? (See the sketch at the end of this post for what I have in mind.)

     

    edit: one more question, to better understand the "system": if a backup share spans multiple disks and we move backups from one disk to another with rsync, can a hardlink on one drive point to data on another disk? The command "rsync --archive --hard-links --remove-source-files" only deals with disk paths, not shares, so how would it know?
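
    Something like this untested sketch is what I have in mind (not taken from your script; BACKUP_ROOT is just an example path, and I would try it on a copy first). It hashes every file on one backup disk and replaces byte-identical duplicates with hardlinks to the first copy found:

    #!/bin/bash
    # Untested sketch: replace byte-identical duplicate files under BACKUP_ROOT
    # with hardlinks to the first copy found. BACKUP_ROOT is an example path.
    # Hardlinks cannot span filesystems, so this is meant to be run per disk (/mnt/diskX).
    BACKUP_ROOT="/mnt/disk1/Backups"
    declare -A seen                          # checksum -> first file seen with that content
    while IFS= read -r -d '' f; do
        h=$(md5sum "$f" | cut -d' ' -f1)
        if [[ -n "${seen[$h]}" ]]; then
            ln -f "${seen[$h]}" "$f"         # swap the duplicate for a hardlink
        else
            seen[$h]="$f"
        fi
    done < <(find "$BACKUP_ROOT" -type f -print0)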

  2. I have had the same thing happening since an update somewhere in September/October, IIRC. When it happens, the dashboard usually shows this:

     

    [dashboard screenshot]

    The dashboard becomes unresponsive, and so does everything else. I can't even power off; the server doesn't respond to a single press of the hardware button. And the terminal doesn't load, so I can't see what is happening with htop...

     

    Edit: over the last few months I did manage to grab diagnostics during one of these hangs, and they showed nothing either.

    Edit 2: it was before September, actually.
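
    For the next hang I plan to leave a small logger running in the background, so there is at least something on the flash drive to read after a forced reboot. Rough, untested sketch; the log path under /boot and the 60-second interval are just my choices:

    #!/bin/bash
    # Append a load snapshot to the flash drive every 60 seconds so some data
    # survives a hard reset. Log path and interval are arbitrary choices.
    LOG=/boot/logs/hang-watch.log
    mkdir -p /boot/logs
    while true; do
        {
            date
            uptime                       # load averages
            free -m                      # memory usage
            top -b -n 1 | head -n 20     # top processes at that moment
            echo "-----"
        } >> "$LOG"
        sleep 60
    done

    I'd start it with "nohup bash /boot/hang-watch.sh &" from a terminal (the path being wherever the script is saved) so it keeps running after the session closes.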

     

  3. So, we're back to SMART errors.

    Disk was at "reported uncorrect = 1" when I re-plugged it (before rebuilding the array).
    I acknowledged the issue and let the server run, it rebuilt and there was no error.
    This night I received a mail after parity check saying it failed, with "Disk 5 - ST4000VN006-3CW104_ZW603BKR (sdk) - active 32 C (disk has read errors) [NOK]" with reported uncorrect gone to 2.

    [screenshot]

     

    The system log shows some read errors on disk 5, on sectors close to each other, but no more disk resets with the new controller. It's quite a recent disk, BTW.

     

    Nov 26 04:15:21 NAStorm kernel: ata22.00: exception Emask 0x0 SAct 0x7f SErr 0x0 action 0x0
    Nov 26 04:15:21 NAStorm kernel: ata22.00: error: { UNC }
    Nov 26 04:15:21 NAStorm kernel: I/O error, dev sdk, sector 146077392 op 0x0:(READ) flags 0x0 phys_seg 59 prio class 2
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077328
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077336
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077344
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077352
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077360
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077368
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077376
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077384
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077392
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077400
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077408
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077416
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077424
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077432
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077440
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077448
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077456
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077464
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077472
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077480
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077488
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077496
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077504
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077512
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077520
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077528
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077536
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077544
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077552
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077560
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077568
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077576
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077584
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077592
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077600
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077608
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077616
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077624
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077632
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077640
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077648
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077656
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077664
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077672
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077680
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077688
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077696
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077704
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077712
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077720
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077728
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077736
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077744
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077752
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077760
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077768
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077776
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077784
    Nov 26 04:15:21 NAStorm kernel: md: disk5 read error, sector=146077792
    Nov 26 04:30:15 NAStorm root: Fix Common Problems: Error: disk5 (ST4000VN006-3CW104_ZW603BKR) has read errors

     

    I'm attaching current diagnostics.

    nastorm-diagnostics-20231127-1642.zip
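
    In case it is useful, the same counters can also be read from a terminal with standard smartctl calls (/dev/sdk is disk 5 on my box; adjust the device as needed):

    smartctl -A /dev/sdk          # attribute table: Reported_Uncorrect, pending/reallocated sectors, ...
    smartctl -l error /dev/sdk    # the drive's own error log
    smartctl -x /dev/sdk          # everything, including interface/PHY event counters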

  4. 48 minutes ago, JorgeB said:

    If the disks were disabled, you fixed the filesystem on the actual disk, not the emulated disk, and rebuilding will rebuild the emulated disk on top. Were the emulated disks mounting before rebuilding?

     

    One was mounting and I didn't do anything to that one; the other didn't mount, and the system log said something like "can't mount disk, run xfs_repair, bad primary superblock".

     

    41 minutes ago, itimpi said:

    Did you run it from the GUI without the -n option? The default is to run a read-only check.

     

    I ran it with -n, then without -n; it said to run it with -L, which I did. After it completed, I ran -n again and there were no errors, but the disk was still unmountable. When I did the -L repair from the command line, the disk did mount afterwards. The sequence is sketched just below.
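
    For anyone hitting the same thing, the repair steps boil down to the following. The device name is an example: as I understand it, on Unraid the repair should target the parity-protected array device from maintenance mode (shown here as /dev/md1 for disk 1, possibly /dev/md1p1 on newer releases, so check your version) so that parity stays in sync:

    xfs_repair -n /dev/md1     # dry run: report problems only, change nothing
    xfs_repair /dev/md1        # may refuse and ask to either mount the disk once or use -L to zero the log
    xfs_repair -L /dev/md1     # zero the log and repair (last resort; can lose the most recent metadata changes)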

  5. So I unplugged my two cache disks and moved their power connectors to disks 1 & 2.

    Parity 1 & 2 and disks 1 & 2 were daisy-chained on the same power line.

     

    I tried to mount disk 1: unmountable, no valid superblock > please use xfs_repair.

    Disk 2: mountable.

     

    For some unknown reason, I ran xfs_repair again from the GUI and it found no issues on either disk.
    I then ran xfs_repair -n /dev/sde1 and there were errors; I tried without -n and it said to use -L.

    I ran it with -L and now the disk is mountable. It seems the GUI repair didn't actually fix anything even though it said it did, while the command-line repair worked.

     

    I'm in maintenance mode right now, rebuilding the array; I'll see what the outcome is in a few hours.
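
    Once the rebuild finishes I'll also check whether the -L repair orphaned anything, since xfs_repair moves files it cannot reconnect into lost+found (path shown for disk 1; adjust the disk number):

    ls -la /mnt/disk1/lost+found 2>/dev/null    # files xfs_repair could not re-attach end up here
    du -sh /mnt/disk1/lost+found 2>/dev/null    # how much data, if any, was orphaned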

  6. Thanks for your answer.

    In the notification archive I found this entry, where it says the parity check finished but with lots of errors:

    [notification screenshot]

     

    Regarding the power supply, that's a good question. The server has been running on the same hardware (apart from a few disk swaps) for at least two years, but the PSU is not young.

    IIRC it's a 600 W unit. The server is connected to a 650 W UPS, which doesn't show that much load.

    [UPS load screenshot]

     

    Do you think 650 W is too low, though, for 8 disks + 2 SSDs, with no GPU or PCIe expansion cards?
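
    My own rough estimate, using typical figures rather than anything measured on this exact hardware:

        8 HDDs spinning up : 8 x ~25-30 W ≈ 200-240 W, for a few seconds (staggered spin-up helps)
        8 HDDs spinning    : 8 x ~5-10 W  ≈ 40-80 W
        2 SATA SSDs        : 2 x ~3 W     ≈ 6 W
        CPU + board + fans : ~60-150 W with no GPU

    Even the worst case stays well under 600 W, so the wattage itself doesn't look like the limit; an aging unit or a sagging 12 V rail could still cause trouble, though.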

    I'm away from the server today, and it says extended SMART tests need the spin-down delay set to never. Every time I apply "never" and come back, it's back on 15 min, so it seems I can't run the extended tests remotely. Do I need to shut down and restart the array for that option to take effect?

    And the short tests do absolutely nothing.
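
    If the GUI keeps resetting the delay, I suppose the extended test can also be started straight from a terminal and checked on later. These are standard smartctl calls; /dev/sdk is just the disk I'd test first, and the drive has to stay spun up for the hours the test takes:

    smartctl -t long /dev/sdk      # start the extended self-test (runs inside the drive, returns immediately)
    smartctl -c /dev/sdk           # "Self-test execution status" shows the % remaining
    smartctl -l selftest /dev/sdk  # log of finished tests, including the LBA where a read failed, if any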

     
