Everything posted by Devotee

  1. I'm running the non-correcting parity check right now: 35% and zero errors found. Yay! I still have a few things to do (test the RAM and upgrade the 14TB drive to 18TB using the old spare parity drive) but once I'm done I think I will suggest a few changes to the Unraid team:

     1. After a rebuild is done, suggest to the user that, even if no errors were found during the rebuild process, performing a non-correcting parity check is recommended.
     2. Since the "apply corrections" option is enabled by default on the Main page, maybe it would be a good idea to disable it by default and, if the user enables it, show a warning suggesting a non-correcting parity check before writing any corrections/changes to the array.
     3. When scheduling a parity check, do not set "Write corrections to parity disk" to yes by default, and warn the user that (and I'm quoting you) "automatic parity checks should always be non-correcting. If errors are found, there should be an investigation and a probable cause found before action is taken".

     I always ran parity checks with the default "apply corrections" setting enabled because I never gave it a second thought. If I had read the recommendations here in the forum (which usually doesn't happen until you have a problem) I would have always done a non-correcting parity check first, as you suggest. Even the scheduled parity check is set to "Write corrections to parity disk" by default which, after what you've told me, is not the best thing to do (hence the third suggestion above).
  2. Parity check ended with 521800 sync errors corrected. Diagnostics attached. I checked the rebuilt data drives against my original checksum files: the first drive rebuilt was totally fine (0 mismatches), but the second drive rebuilt had ONE single file where the checksum did not match (a 6.6G mkv file). I will recover that particular file from the old data drive. That seems consistent with the sync errors corrected; I'm not sure how much data one sync error represents, but if it's one bit or even one byte it makes sense that so little data (one single file) was affected. I'm also not sure if it's normal that only one file is affected by corruption, instead of having more errors scattered around.

     I still have one more data drive to swap (the 14TB Seagate drive for the old 18TB Western Digital I was using for parity) but right now I'm a bit worried 😅 The next logical steps before doing that drive replacement and data rebuild should be a new non-correcting parity check as @JonathanM recommended, followed by a new Memtest86 Pro long test (as long as possible) to rule out a possible problem with RAM.

     Since the first thing I did was replace the parity drive, and the first 20TB data drive rebuilt was totally fine (all files matched the original), I have to assume the problem appeared when I rebuilt the second data drive, although I didn't do a non-correcting parity check after every step. At that point I had a corrupted file on that drive, and doing a correcting Parity-Check just made things worse, as parity "errors" were found and "corrected" when, in fact, the parity was fine. Had I done a non-correcting parity check after the second data drive rebuild and caught the corrupted file, it might have been solved by just rebuilding that data drive again (although the XFS error was already telling me that something was not right). Lesson learned. Besides comparing checksums for all files after a data drive rebuild, I will also perform a non-correcting parity check after every rebuild process from now on.

     I'll report the results of the non-correcting parity check and the Memtest86 Pro tests. boxy-diagnostics-20240427-1103.zip
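     For reference, this is a minimal sketch of the kind of per-disk verification described above. It assumes the manifest was created with md5sum using paths relative to the disk's mount point; the paths and file names are illustrative:

     ```bash
     # Re-check a rebuilt disk against its pre-rebuild checksum manifest.
     cd /mnt/disk2
     md5sum -c /boot/checksums/disk2.md5 --quiet > /tmp/disk2-mismatches.txt
     # --quiet suppresses the "OK" line for files that verify, so the output
     # file lists only files that FAILED; an empty file means everything matched.
     ```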
  3. After I swapped the first data drive (12TB) for the new one (20TB) and the rebuild was done, I checked the data on it against my checksum file and found no discrepancies. I have not checked the second data drive that was rebuilt because, just after the rebuild, it started with the XFS errors. I'll check it against the original checksum file from the old drive as soon as the parity check ends. Coincidentally, one of the things I did before I started the server upgrade was retesting the memory using Memtest86 Pro, and no errors were found. I'll do another memory test anyway, leaving it running for as long as possible, just in case.
  4. Ah, thanks, I'm always extra careful with these things and I usually take extra steps to ensure everything is OK. Since I was replacing three "big" drives this time, and doing a parity check after each replacement would mean an extra day "lost", and there was (apparently) nothing wrong with the parity or data rebuilds, I thought I would be safe enough with a final parity check after all drives had been swapped and rebuilt. My fault then 😔 I will take note of your suggestion and from now on I will play on the even safer side, doing extra checks after each important step. Maybe it would be nice to add some kind of warning after a rebuild is done, to let the user know that a non-correcting parity check is recommended?

     The parity check is at 54% right now with 521794 sync errors corrected. Once it ends I will test the rebuilt data drives' contents against my checksum files to verify that all files match, and I will let you know the result. If checksums match, I will do a non-correcting parity check that will hopefully return zero errors. If there's a problem with the file checksums I will also come back running to ask for more advice (although I'm assuming it would be "plug the old data drive in and copy the good files over").

     I said it once before and I will say it again: that's one of the beauties of Unraid. If something goes wrong you can always go to each individual drive and try to recover as much data as possible. It's even possible and easy to go back to a previous state by just re-adding an old drive to the array (if the rebuild was done in maintenance mode and you are sure that nothing was changed on the array).
  5. These are the diagnostics from the rebuild of the second data drive (after rebuilding the parity drive and one data drive without errors).

     Old disk: WDC_WD120EDAZ-11F3RA0_8CKXAD2F
     Replacement disk: WDC_WD200EDGZ-11B9PA0_2HG0PK2N (sdf)

     While the rebuild was running I precleared the first old drive, WDC_WD120EDAZ-11F3RA0_5PG9BXWC (sdg). I will post the diagnostics again when the Parity-Check task ends. I did make checksums of the contents of both data drives I planned on replacing; I think I will check them once the Parity-Check ends to be sure that all files (the rebuilt data) are correct. I already checked the first 20TB drive I had replaced (with no mismatch in checksums, that's why I precleared the old 12TB drive) so I will do the second 20TB disk next. The good thing is I still have the second 12TB drive untouched, so if there is any mismatch I should be able to recover the original files from it. boxy-diagnostics-20240424-1231.zip
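     In case it's useful to anyone, this is a sketch of how such a per-disk manifest can be created before swapping a drive out (paths are illustrative):

     ```bash
     # Build a checksum manifest for one array disk before replacing it.
     # Running from the mount point stores relative paths, so the manifest
     # can be re-checked from the rebuilt disk's mount point later.
     cd /mnt/disk2
     find . -type f -exec md5sum {} + > /boot/checksums/disk2.md5
     ```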
  6. I'm in the process of upgrading my Unraid server by replacing some drives in the array with bigger ones. This is what I've done so far:

     1. I precleared all new drives before doing anything. Meanwhile, I ran a Parity-Check on the existing array just in case, which went fine:
        Parity-Check 2024-04-14, 04:33:07 | 18 TB | 1 day, 10 hr, 25 min, 34 sec | 145.2 MB/s | OK | 0 errors
     2. Since I had an 18TB parity drive and the new drives were all 20TB, I removed the 18TB parity drive and installed a 20TB drive. Parity-Sync went fine:
        Parity-Sync 2024-04-16, 05:27:27 | 20 TB | 1 day, 15 hr, 24 min, 56 sec | 141.0 MB/s | OK | 0 errors
     3. I removed one of my drives from the array (12TB) and installed a 20TB drive. Data-Rebuild went fine:
        Data-Rebuild 2024-04-18, 08:02:15 | 20 TB | 1 day, 14 hr, 40 min, 55 sec | 143.6 MB/s | OK | 0 errors
     4. Next, I removed another of my drives from the array (12TB) and installed a 20TB drive. Again, Data-Rebuild went (apparently) fine:
        Data-Rebuild 2024-04-24, 12:09:44 | 20 TB | 1 day, 11 hr, 7 min | 158.2 MB/s | OK | 0 errors

     However, when I restarted the array, the last drive that was rebuilt couldn't be mounted (the previous one, from step 3, was fine) and the logs showed an XFS error ("Corruption warning: Metadata has LSN ahead of current LSN") and instructed me to "Please unmount and run xfs_repair". I searched the forum for information on this error and ran xfs_repair as suggested in many discussions. This seemed to fix the issue, and starting the array would now mount all drives.

     I'm now running a Parity-Check, and early in the process it showed 31 sync errors corrected. It stayed like that for a while, but now (~29%) it just jumped to 521770 sync errors corrected. I never had a single parity check error before (and the Unraid server has been running fine since September 2022). Should I be worried? Should I have done anything differently? Should I be doing something else now? I'm guessing I should let the Parity-Check process finish and then run it again (in non-correcting mode this time). If it doesn't show any errors I should be good, right?

     To be honest, I'm not sure why the second drive rebuild "failed", leaving the drive in an unmountable state (though xfs_repair seemed to fix it easily and didn't find any major problems with it, AFAIK), and I also don't know why the Parity-Check should fail now (unless xfs_repair changed something in the drive data that requires the parity to be adjusted). Any hints/suggestions are appreciated.
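     For anyone landing here from a search, this is roughly the xfs_repair sequence those forum discussions describe. A sketch, not the exact commands from this thread; the md device name depends on the Unraid version (e.g. /dev/md2 on older releases, /dev/md2p1 on 6.12+), so adjust it for your disk number:

     ```bash
     # With the array started in maintenance mode:
     xfs_repair -n /dev/md2   # -n = check only; reports problems without modifying anything
     xfs_repair /dev/md2      # actual repair, only after reviewing the dry-run output
     # If xfs_repair refuses to run because of a dirty log, mounting the
     # filesystem once normally replays the log; -L (zero the log) is a last
     # resort, since it can discard recent metadata changes.
     ```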
  7. Same issue here, my USB stick failed and I needed 6.10.3, which is the version I was running on my server. There's an unofficial mirror here with almost all the old versions (if not all of them): https://www.unraid.gutt.it/

     The Internet Archive / Wayback Machine also has several old versions stored:
     6.2.4 to 6.9.2 - https://web.archive.org/web/*/https://s3.amazonaws.com/dnld.lime-technology.com/stable/*
     6.9.2 to 6.12.0 - https://web.archive.org/web/*/https://unraid-dl.sfo2.cdn.digitaloceanspaces.com/stable/*

     Of course, I would rather have an official Unraid repository with all the trusted official releases, including checksums. I'm not sure whether it exists or not (I couldn't find it) and, if it doesn't exist, what the reason for that is.
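     Until something official exists, a download from a mirror can at least be checked against a hash published somewhere trustworthy (a release announcement or forum post). A sketch, with an illustrative filename:

     ```bash
     # Print the hash of the downloaded zip and compare it with the trusted value:
     sha256sum unRAIDServer-6.10.3-x86_64.zip
     # Or verify automatically if you already have the expected hash
     # (note the two spaces between hash and filename):
     echo "<expected-sha256>  unRAIDServer-6.10.3-x86_64.zip" | sha256sum -c -
     ```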
  8. I can't recall exactly, but I think in my case it also happened if I completely powered down the server; the only way to make the USB drive boot again was to change it to another port. Let's see what @Austin Detzel says. I also recall that it was not a problem related to Unraid or its boot drive. To isolate the problem I tried other bootable systems (Ubuntu and PassMark's Memtest) and had the same issue: they would boot fine the first time but would refuse to boot again unless I moved the USB drive to another port. By the way, this seems to be a known issue.

     I vaguely remember something about Fast Boot, so I would recommend @Austin Detzel to test the following (in order):

     1. Disable "Fast Boot" and try again.
     2. If that doesn't work, in "Settings > Advanced > USB Configuration" set "XHCI Hand-off" to "Disabled" and "Legacy USB Support" to "Auto" and try again.
     3. If it still doesn't work, try to tweak the boot options to enable UEFI booting/Secure Boot and make sure you're booting the Unraid USB drive in UEFI mode (rename the "EFI-" folder to "EFI").

     I can't restart my server right now because I'm rebuilding parity, but if you still have problems let me know and I will check what my settings are in the BIOS so you can compare and see if copying them solves your problem.
  9. I recently upgraded my system to an MSI MPG Z590 GAMING PLUS motherboard and had that exact same problem. The system booted fine, but once I rebooted or powered it off it would not boot from the USB drive unless it was moved to a different port. I don't recall right now exactly what I did to fix it, but I think I had to tweak the Secure Boot options and/or switch to UEFI booting (my USB drive was configured for legacy boot on my previous motherboard). If you're not using UEFI, try that first (enable it in the BIOS and rename the "EFI-" folder on the USB drive to "EFI"). If it still doesn't work, try enabling/disabling the option(s) related to Secure Boot. Also be sure to properly configure the boot devices and priority in the BIOS (remove everything but the USB drive), although if I remember correctly that didn't fix my problem until I did what I explained. I'm not sure if it was because I upgraded from a quite old motherboard (10 or 12 years old), but the MSI BIOS seemed confusing and complicated compared to what I was used to. Let me know if you need any additional help.
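     For reference, the rename can also be done from the Unraid console (a sketch: on a running server the flash drive is mounted at /boot; on another machine, rename the folder at the root of the USB drive instead):

     ```bash
     # The trailing "-" in the folder name disables UEFI boot; removing it enables it.
     mv /boot/EFI- /boot/EFI
     ```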
  10. @trurl, thanks a lot for the quick replies! 😊 Oh, OK... I see what you mean. Meh, I should have asked before doing anything, mea culpa 😔 I briefly thought about this approach but the word "rebuild" sounded a bit scary. I guess I should trust Unraid and go down this path so I can experience the process of replacing a drive and rebuilding it. One of the best Unraid features, in fact the one I love most about it, is that even if I remove the 8TB drive, it will still contain an accessible copy of all its data, so if the rebuild fails, I should still have a working "backup". I will follow your advice. If I understood everything correctly, these are the steps required to swap both drives:

      1. Stop the array
      2. New config removing the two 18TB drives and the 14TB drive (all three still empty at the moment)
      3. Start the array and rebuild parity
      4. Stop the array
      5. Replace Data Drive 1 (8TB) with one of the 18TB drives
      6. Start the array and let it rebuild everything
      7. Stop the array
      8. Replace Data Drive 2 (8TB) with the second 18TB drive
      9. Start the array and let it rebuild everything
      10. Stop the array
      11. Add the remaining 14TB drive to the array
      12. Start the array

      On one hand, I'm thinking that at this point I can just remove the two 18TB drives, leave the 14TB drive, do a New Config with the 18TB parity + the original 8TB drives + the already installed 14TB, then replace the 8TB drives from there. Steps 11 and 12 would not be needed. I'm also not sure if I have to rebuild parity at any other point besides step 3 (for example, after a drive is rebuilt, though parity at that point shouldn't need to be rebuilt). Do I need to preclear the 18TB drives again ("Clear Disk"), or at least do some kind of verification ("Verify Signature" or "Verify Disk"), before using them to replace the 8TB drives?

      One of my doubts was whether content could be copied between drives (via /mnt/diskX) with the array stopped; according to your comment it has to be done while the array is started, so it makes sense that parity must be updated. That's one of the things that puzzles me... Is it OK to have duplicate data between disks while the array is running?? It's the reason why I thought that the array had to be stopped to copy data between data drives. I'm assuming that you "remove" the data from those drives when they are unassigned from the array or physically removed from the system, is that right? Unraid will automatically "see" the old data on the new drive.
  11. Hello! I've been testing Unraid for a few months with a basic config (an old MB I had around and three drives) and now I'm upgrading it and adding space. Initially I had:

      Parity Drive - Seagate 14TB
      Data Drive 1 - Western Digital 8TB
      Data Drive 2 - Western Digital 8TB

      I bought 3 x Western Digital 18TB; my idea was to replace the 14TB drive with one Western Digital 18TB drive, then add the 2 remaining Western Digital 18TB and the old Seagate 14TB drive to the array. I've already done that and everything worked like a charm, ending with this configuration:

      Parity Drive - Western Digital 18TB
      Data Drive 1 - Western Digital 8TB
      Data Drive 2 - Western Digital 8TB
      Data Drive 3 - Western Digital 18TB
      Data Drive 4 - Western Digital 18TB
      Data Drive 5 - Seagate 14TB

      The new, precleared disks showed as "unmountable" when I started the array, so I formatted them. Right now I'm running a parity check "just in case". Since the drives were cleared I think that step was not needed, but I don't mind spending the extra time on it while I ask a few things here and prepare myself for the next steps 😊

      The next thing I would like to do is move all data from the Western Digital 8TB drives to the Western Digital 18TB drives, in order to remove them from the array. I'm using high-water allocation so both disks have approximately the same amount of data. I know I could use tools like unBALANCE to do the job, but since it seems an easy process (copy all data from Drive 1 to Drive 3 and all data from Drive 2 to Drive 4) I'm thinking of doing it manually. How should I do this? I'm guessing "rsync /mnt/disk1/ /mnt/disk3/" and "rsync /mnt/disk2/ /mnt/disk4/" would do the job, but I'm not sure whether I have to do this while the array is stopped or whether it has to be running. Would the following approach be correct?

      1. Stop the array
      2. rsync /mnt/disk1/ /mnt/disk3/ and rsync /mnt/disk2/ /mnt/disk4/ (I guess I can run both in parallel to maximize bandwidth and save time)
      3. rm -rf /mnt/disk1/* /mnt/disk2/*
      4. Start the array
      5. Maybe do a parity check?
      6. Stop the array
      7. Remove Data Drives 1 and 2 from the array
      8. Start the array

      I would say that steps 4-5 are overkill and that maybe I just need to remove the 8TB drives at step 4 and then start the array but, as a newbie, I'm not completely sure about that. 😅

      Last but not least, once I have the two Western Digital 8TB drives out of the array, I'm thinking of using them in a 2-disk BTRFS RAID0 cache pool ("Use cache pool" set to "Only") so I have a total of 16TB I could use for intensive torrent downloading and seeding. Since all the data stored there is not important (I can redownload it if needed) I think it's better to "isolate" it from the array for performance and, more importantly, to avoid stressing the parity drive with constant new data. Any advice about this approach? Thanks in advance!
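      One detail worth noting for anyone copying this: a bare "rsync SRC DST" would not actually copy anything useful here, because without -r or -a rsync does not recurse into directories. A hedged sketch of the copy (archive mode preserves permissions, ownership and timestamps; per the replies in this thread, it should run with the array started so parity stays in sync):

      ```bash
      # Copy the contents of disk1 onto disk3 (the trailing slash on the source
      # copies its contents rather than the directory itself). -X also preserves
      # extended attributes; add --dry-run first to preview.
      rsync -avX /mnt/disk1/ /mnt/disk3/
      rsync -avX /mnt/disk2/ /mnt/disk4/
      # Verify the copies (e.g. against checksums) before any rm -rf on the sources.
      ```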
  12. Hey! This (or something very similar) just happened to me a few days ago! Initially I had:

      Parity Drive - Seagate 14TB
      Data Drive 1 - Western Digital 8TB
      Data Drive 2 - Western Digital 8TB

      I bought 3 x Western Digital 18TB; my idea was to replace the 14TB drive with one Western Digital 18TB drive, then add the 2 remaining Western Digital 18TB and the old Seagate 14TB drive to the array. It's not related to this, but my next planned step was to move all data from the 8TB drives to the new free space on the new drives, then replace them with 2 Western Digital 12TB drives I have around; but that's another story, for another topic I was planning to create.

      With all disks in the system, I precleared all three Western Digital 18TB drives as usual. Then I stopped the array, removed the 14TB from it (not physically from the machine), added one of the Western Digital 18TB drives as the new parity drive, then started the array. My heart skipped a beat when I suddenly was notified of this:

      Those two disks were perfect; I checked the SMART logs to see if anything had changed, but there were no errors there. I even ran short and long SMART tests on both drives (after parity was rebuilt) and no errors were found. I'm not sure why this happened; it seems to be an error due to "hot swapping" devices, and the system might have been confused by old/stale SMART data lying around for those devices. I'm including diagnostics just in case it helps. boxy-diagnostics-20230130-2308.zip
  13. I'm still finishing my Unraid system, but after two trials I would say: the possibility of using different drive sizes, being able to expand storage easily by adding new drives or replacing existing ones and, in case of problems, not really losing all your data. Even if both the parity drive and a data drive fail, you can still copy over the data from the remaining disks (and whatever you can salvage from the failed data drive, if it didn't die completely).