
Posts posted by Adrian

  1. 35 minutes ago, JorgeB said:

    Also note that because one of the devices is smaller, and it's using raid1, pool can only use up to the smallest device size (128GB).

     

    Ok, so if I wanted to just fix everything and get this set up correctly, could I simply stop Docker, copy everything from /root/media/cache (that's where I see the appdata, docker, and some other folders that are on the SSD) to a backup folder, then replace the 128GB drive with another 256GB drive, recreate the RAID1 cache pool, and copy the folders back?
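    Something like this is what I have in mind from the console (just a sketch; the backup destination /mnt/disk1/cache_backup is only an example, and I'm assuming the recreated pool mounts at /mnt/cache):

    # with Docker stopped (Settings -> Docker -> Enable Docker: No)
    rsync -avh /root/media/cache/ /mnt/disk1/cache_backup/   # copy everything off the pool
    # ...swap the 128GB SSD for the 256GB one, recreate the raid1 pool, then restore:
    rsync -avh /mnt/disk1/cache_backup/ /mnt/cache/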

  2. 1 hour ago, JorgeB said:

    Did you have both cache devices assigned before upgrading?

     

    It's been years since I set this up, but I've uploaded the latest logs, taken while everything is working. If you need to know any of the configuration setting values, let me know which ones specifically and I'll reply back with them.

    It looks like I have 2 physical SSDs in my server; I'm not really sure what I was trying to do. I see one listed under Pool Devices and the other under Unassigned Devices. I think I was trying to have one for cache and one for Docker.

     

    Please note, it's quite possible that I didn't have all this configured in a typical way.

    mediaserver-diagnostics-20230628-1139.zip

  3. 39 minutes ago, JorgeB said:

    Problem is that your cache pool is not mounting, so /mnt/cache does not exist. There's only one device assigned to the pool, but the pool currently consists of two devices. Unassign the pool device, start the array, stop the array, assign both pool devices to the pool, start the array, and post new diags.

     

    While the cache pool not mounting is a problem, the real problem is 6.12, which obviously has some changes that are somehow breaking it. I may have jumped the gun, and I'm not really ready to deal with working through the issues. I'll try again when I have more time so I can provide more info to help sort this out; for now I've reverted to 6.11.5 and everything is working again. I'll read up on the changes and post in the release forum. Maybe there is something I can change in preparation for upgrading to 6.12.

  4. 5 hours ago, JorgeB said:
    Jun 27 20:47:14 MEDIASERVER emhttpd: no mountpoint along path: /mnt/cache/docker

     

    Looks like this path does not exist; you need to create or correct it.

     

    Yes, I saw that, but it's been years since I set all this up and I don't know how to fix it. Do you do this under Unassigned Devices, in the "Change Disk Mount Point" dialog that pops up when I click the drive? When I tried to change that to "cache", I got an error saying, "Jun 28 08:39:45 MEDIASERVER unassigned.devices: Error: Device '/dev/sdc1' mount point 'cache' - name is reserved, used in the array or a pool, or by an unassigned device."

     

    It sounds like 6.12 introduced some breaking changes. I can name it something else and mount it, but now I see it as /mnt/disks/ssd, so it looks like "disks" got added to the mount path. This should let me fix the Docker vDisk location, but not the default appdata storage location, which I also had on the SSD. It only lists the user shares on the array, so I can't set it to /mnt/disks/ssd/appdata.
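    For reference, this is what I plan to check from the console (my understanding of where Unraid keeps these settings -- please verify before editing anything):

    cat /boot/config/docker.cfg    # DOCKER_IMAGE_FILE should show the vDisk location
    ls /mnt/disks/ssd              # confirm what actually lives on the UD mount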

     

    Looks like other users are having issues after the upgrade.  


    I'm reading up now to see what solutions are already out there.

  5. I recently updated to 6.12 and found my Docker service down. When I access the Docker tab, I see the following message: "Docker Service failed to start."

    I noticed there was an update to 6.12.1 and upgraded thinking maybe it had a fix for this issue. Same error message.

     

    It's been forever since I set all this up, so I'm not even sure where to start troubleshooting. I do see the drive listed under Unassigned Devices, and if I mount it I can browse the contents and see that everything is there, including the docker image.
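    This is how I checked the contents from the console (the mount point name and folder layout are from my setup, so adjust for yours):

    ls -lh /mnt/disks/ssd/docker/    # the docker.img vDisk shows up here on mine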

     

    Thanks in advance,

    Adrian

    mediaserver-diagnostics-20230627-2051.zip

  6. 2 hours ago, trurl said:

    As long as you don't remove or add data drives, parity is valid. Parity2 has to be rebuilt if drives are reordered.

     

    New Config, preserve all, change assignments as needed, unassign parity2. When starting the array check the parity valid box.

     

    Then stop, assign parity2, start to begin parity2 build 

    Great, thanks again. Parity2 building now.

  7. My current system has 2 parity drives and 17 data disks. I want to physically rearrange the disks and want to confirm the steps.

     

    Stop Array

    New Config -> Preserve Current assignments = None?

    Configure Parity and Disk 1 - 17 in Array Devices

     

    When do I state that Parity #1 is valid?

    When do I set Parity 2 so that it rebuilds only the 2nd parity?

     

    Thank you,

    Adrian

     

  8. 1 hour ago, trurl said:

    So repairs have nothing to do with the contents of former disk14. I assume it doesn't have lost+found on it, unless you had repaired it sometime in the past.

     

    If there isn't much in lost+found and you have been able to figure out what it is everything is probably OK.

     

    The logs might have contained incomplete transactions, but they couldn't be used without mounting the disk, and the disk couldn't be mounted. Sometimes you will see these logs mentioned in syslog when a disk is being mounted.

     

    You can try to do the same with former disk1.

     

    Disregard everything I said about the folders and names being different. I was looking in the wrong place; my fault. It's all good.

     

    Thank you trurl and everyone else for your help and guidance, I really appreciate it!

  9. 11 minutes ago, trurl said:

    Did you have to repair that unassigned disk (former disk 14) to get it to mount? 

    The only repairs I ran were the ones I was previously instructed to run, and I was told those were only on the emulated disks.

    Recently I just put the drive in the available slot and mounted it as an unassigned drive. I used the Web UI to browse the contents.

  10. 8 minutes ago, trurl said:

    Repair puts things in lost+found when it can't figure them out. Usually it makes up a name and doesn't know what folder it belongs to. 

    Yeah, that part I understood.

     

    It's just 2-3 other folders in their normal location that were not named correctly. The folders had some videos and were named with the event (concert) name, but I had renamed them to include the date and city/state, as the concerts are part of a tour. It's the date and city/state part of the folder names that was missing.

  11. Just now, trurl said:

    Which former array disk is this? 

    Former disk 14.

     

    Disk 15 was always unassigned; it's where I would preclear from. At some point I want to rearrange the disks so I don't have a disk ID being skipped, since it adds to the confusion.

  12. 1 hour ago, trurl said:

    Reviewed the thread.

     

    Since that disk isn't assigned, you can remove it and use that bay to try to mount one of the original disks using Unassigned Devices.

    Well, this is odd: the original drive looks the same. I recall the repair saying something about logs being discarded. Maybe the folder renames were in the logs that were discarded?

     

    Any recommended tools/scripts to run to compare the drives?
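    Something like this is what I was thinking of, in case it helps (a sketch; the mount points are examples, and rsync's -n flag makes it a dry run so nothing gets copied or deleted):

    # checksum-compare the original drive (mounted via Unassigned Devices) against the rebuilt disk
    rsync -rcn --itemize-changes /mnt/disks/old_disk14/ /mnt/disk14/
    # or a plain recursive comparison
    diff -rq /mnt/disks/old_disk14 /mnt/disk14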

  13. 15 hours ago, trurl said:

    boot up, assign the new disks to the same slots as the drives being replaced, start array to

     

     

    Rebuild completed.

     

    I quickly looked at disk 14, which is the one that had the lost+found folder, and looking at some files I know I saved recently, the folder names are different. I put 3 folders with files in each, and then I renamed the folders with the City/State, but I see them now with the original folder names. I recall seeing them correctly when the drive was emulated.

     

    What would I do next to get the original drive mounted so I can compare the contents?

     

    I've attached the latest diagnostics after the rebuild.

    mediaserver-diagnostics-20220721-1411.zip

  14. 25 minutes ago, trurl said:

    boot up, assign the new disks to the same slots as the drives being replaced, start array to

     

     

    Alrighty, both drives replaced and rebuilding now. Will see what happens tomorrow.

  15. 30 minutes ago, trurl said:

    rebuild to the new drives.

     

    So all my bays except one (where Disk 15 is) are full. Would I shut down, pull disks 1 and 14, insert the new drives in their place, and rebuild both at the same time?

  16. 10 minutes ago, JonathanM said:

    Do the emulated drives mount normally now?

     

    I think so. They still show disabled/emulated, but I can access them through their direct shares.

    One of the disks has a lost+found folder, which I assume is from the repair?

     

    So would I next set these aside, rebuild onto new drives, and then compare the rebuilt drives to the repaired ones?

  17. Ran it with the -L option.

    Disk 1

    Phase 1 - find and verify superblock...
    sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
    resetting superblock root inode pointer to 128
    sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
    resetting superblock realtime bitmap inode pointer to 129
    sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
    resetting superblock realtime summary inode pointer to 130
    Phase 2 - using internal log
            - zero log...
    ALERT: The filesystem has valuable metadata changes in a log which is being
    destroyed because the -L option was used.
            - scan filesystem freespace and inode maps...
    clearing needsrepair flag and regenerating metadata
    sb_icount 0, counted 63776
    sb_ifree 0, counted 179
    sb_fdblocks 1952984865, counted 929448093
            - found root inode chunk
    Phase 3 - for each AG...
            - scan and clear agi unlinked lists...
            - process known inodes and perform inode discovery...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - agno = 4
            - agno = 5
            - agno = 6
            - agno = 7
            - process newly discovered inodes...
    Phase 4 - check for duplicate blocks...
            - setting up duplicate extent list...
            - check for inodes claiming duplicate blocks...
            - agno = 0
            - agno = 2
            - agno = 4
            - agno = 3
            - agno = 5
            - agno = 6
            - agno = 7
            - agno = 1
    Phase 5 - rebuild AG headers and trees...
            - reset superblock...
    Phase 6 - check inode connectivity...
            - resetting contents of realtime bitmap and summary inodes
            - traversing filesystem ...
            - traversal finished ...
            - moving disconnected inodes to lost+found ...
    Phase 7 - verify and correct link counts...
    Maximum metadata LSN (1:141778) is ahead of log (1:2).
    Format log to cycle 4.
    done

     

    Disk 14

     

    Phase 1 - find and verify superblock...
    sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
    resetting superblock root inode pointer to 128
    sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
    resetting superblock realtime bitmap inode pointer to 129
    sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
    resetting superblock realtime summary inode pointer to 130
    Phase 2 - using internal log
            - zero log...
    ALERT: The filesystem has valuable metadata changes in a log which is being
    destroyed because the -L option was used.
            - scan filesystem freespace and inode maps...
    clearing needsrepair flag and regenerating metadata
    sb_icount 0, counted 14784
    sb_ifree 0, counted 254
    sb_fdblocks 1952984865, counted 936669596
            - found root inode chunk
    Phase 3 - for each AG...
            - scan and clear agi unlinked lists...
            - process known inodes and perform inode discovery...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - agno = 4
            - agno = 5
            - agno = 6
            - agno = 7
            - process newly discovered inodes...
    Phase 4 - check for duplicate blocks...
            - setting up duplicate extent list...
            - check for inodes claiming duplicate blocks...
            - agno = 0
            - agno = 7
            - agno = 4
            - agno = 1
            - agno = 3
            - agno = 6
            - agno = 5
            - agno = 2
    Phase 5 - rebuild AG headers and trees...
            - reset superblock...
    Phase 6 - check inode connectivity...
            - resetting contents of realtime bitmap and summary inodes
            - traversing filesystem ...
            - traversal finished ...
            - moving disconnected inodes to lost+found ...
    disconnected dir inode 11307331946, moving to lost+found
    Phase 7 - verify and correct link counts...
    resetting inode 191 nlinks from 2 to 3
    Maximum metadata LSN (1:93159) is ahead of log (1:2).
    Format log to cycle 4.
    done
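    (For reference, this was run from the console against the emulated disks, along these lines -- my understanding is that Unraid repairs go against the /dev/mdX devices so parity stays in sync, and the numbers below just match my disk slots:)

    xfs_repair -L /dev/md1     # emulated disk 1
    xfs_repair -L /dev/md14    # emulated disk 14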

  18. Disk 1

     

    Phase 1 - find and verify superblock...
    bad primary superblock - bad CRC in superblock !!!
    attempting to find secondary superblock...
    .found candidate secondary superblock...
    verified secondary superblock...
    writing modified primary superblock
    sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
    resetting superblock root inode pointer to 128
    sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
    resetting superblock realtime bitmap inode pointer to 129
    sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
    resetting superblock realtime summary inode pointer to 130
    Phase 2 - using internal log
            - zero log...
    ERROR: The filesystem has valuable metadata changes in a log which needs to
    be replayed. Mount the filesystem to replay the log, and unmount it before
    re-running xfs_repair. If you are unable to mount the filesystem, then use
    the -L option to destroy the log and attempt a repair. Note that destroying
    the log may cause corruption -- please attempt a mount of the filesystem
    before doing this.

     

     

     

    Disk 14

     

    Phase 1 - find and verify superblock...
    bad primary superblock - bad CRC in superblock !!!
    attempting to find secondary superblock...
    .found candidate secondary superblock...
    verified secondary superblock...
    writing modified primary superblock
    sb root inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 128
    resetting superblock root inode pointer to 128
    sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
    resetting superblock realtime bitmap inode pointer to 129
    sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
    resetting superblock realtime summary inode pointer to 130
    Phase 2 - using internal log
            - zero log...
    ERROR: The filesystem has valuable metadata changes in a log which needs to
    be replayed. Mount the filesystem to replay the log, and unmount it before
    re-running xfs_repair. If you are unable to mount the filesystem, then use
    the -L option to destroy the log and attempt a repair. Note that destroying
    the log may cause corruption -- please attempt a mount of the filesystem
    before doing this.

  19. I performed the check filesystem on both disks and this is what it displayed for both drives:

     

    Phase 1 - find and verify superblock...
    bad primary superblock - bad CRC in superblock !!!
    attempting to find secondary superblock...
    .found candidate secondary superblock...
    verified secondary superblock...
    would write modified primary superblock
    Primary superblock would have been modified.
    Cannot proceed further in no_modify mode.
    Exiting now.
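    (That was the webUI's default read-only check; as I understand it, it runs xfs_repair with -n, which only reports problems without changing anything. The console equivalent would be something like:)

    xfs_repair -n /dev/md1     # -n = no modify, report only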

  20. 1 hour ago, trurl said:

    Both disks 1,14 disabled/emulated, and both emulated disks are unmountable as you should see on Main.

     

    There are disks 16,17,18, but nothing assigned as disk15, is that as it should be?

     

    We always recommend repairing the emulated filesystems and checking the results of the repair before rebuilding on top of the same disk. Even better would be to rebuild to spares after repairing the emulated filesystems so you keep the originals as they are as another possible way to recover files.

     

    Do you have any spares?

     

    Yes, disks 1 and 14 show disabled on Main.

     

    Disk 15 isn't used. I do have a physical disk in that slot, but I just never added it to the array. I think I precleared it and then left it there/forgot about it :)

     

    I do have spares. Would I replace both Disk 1 and Disk 14 with the spares at the same time and then rebuild?

     

  21. 11 hours ago, itimpi said:

    This is what I would recommend.  No reason not to be running the extended test on both drives in parallel as the test is completely internal to the drive.

     

    Good to know for next time, if it ever happens again.

     

    Both extended tests completed and it looks like no errors were reported.

     

    With both tests completed, I started the array; attached is the diagnostics file generated after starting it.

    mediaserver-diagnostics-20220719-1417.zip