Everything posted by AgentXXL

  1. As mentioned above, there are USB enclosures that do pass SMART and other info from the drive(s). However, if you haven't already bought drives, why not just use the WD EasyStore/Elements or Seagate offerings? Both the stock WD and Seagate enclosures for their 10TB+ models have passed SMART and other info via USB for me. Unless you're concerned about getting a longer warranty (3 - 5 yrs for retail bare drives, 2 years for drives in USB enclosures), just buy the less expensive USB drives and shuck the bare drive from the enclosure after you've let them preclear and/or stress test.
     The WD enclosures almost always contain 'white label' REDs (the WD NAS series). Every 10TB+ Seagate that I've shucked has been a Barracuda Pro. And yes, the bare drives are still warrantied after being shucked from the USB enclosure, but in most cases only for 2 years. I've returned a bare drive to Seagate that came from an enclosure, using the serial number on the bare drive itself. Not one question about why it wasn't in the USB enclosure - they just sent me a replacement sealed retail Barracuda Pro. Others have had similar experiences with recent WD drives.
     You save considerable money buying the USB drives over the bare drives. My queries to both Seagate and WD on why they do this have never been answered: they sell the same bare drive installed in a USB enclosure for less than the bare retail drive. But as mentioned above, most of the USB drives from WD/Seagate carry only 2 years of warranty, even though the drives inside are the same models as their bare retail drives. Hope that helps.
  2. New backplanes? Unless I've missed something, Norcotek is now out of business. A few months ago I emailed and tried calling (disconnected number) to see if I could get replacement backplanes for my 4220. No response to the email, and I've seen other mentions that Norcotek is dead.
     Not sure what kind of speeds you're seeing, but I've got 18 drives in mine and writes to the array max out at about 150 MB/s; it's more common to see around 80 - 110 MB/s. The SATA/SAS controller and the motherboard+CPU combo play into this as well. 16 of my drives are connected to my LSI 9201-16i, which is a PCIe 2.0 card installed in a PCIe x8 slot. Max speed of this LSI is 6 Gbps per port (SATA3), but it's also limited by the rest of the system and how many PCIe lanes are in use and/or dedicated to other hardware.
     I'm looking at a Supermicro enclosure to eventually replace my 4220, but for now I've removed the defective backplanes and direct-wired each drive using miniSAS SFF-8087 to 4x SATA forward breakout cables, with separate power for each drive too. Definitely a LOT more mess than using the backplanes, but at least my system is no longer throwing the random UDMA CRC errors it did when using the backplanes. I may look at upgrading the LSI to a PCIe 3.0 version with 12 Gbps capability, but not until after I get a new motherboard/CPU. I'm budgeting to eventually pick up a Threadripper setup so I can run a few more VMs and still have some CPU core headroom. Dale
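     To put rough numbers on the 'limited by the rest of the system' point, here's a quick back-of-envelope sketch; the per-drive sustained rate is my assumption, not a measurement:
        # Back-of-envelope numbers for the bandwidth comment above. The ~180 MB/s
        # sustained rate per drive is an assumption; the PCIe figure is the usual
        # usable bandwidth for a PCIe 2.0 lane after 8b/10b encoding.
        pcie2_per_lane_mb_s = 500
        lanes = 8
        drive_sustained_mb_s = 180
        drives = 16

        link_budget = pcie2_per_lane_mb_s * lanes          # ~4000 MB/s for an x8 slot
        worst_case_demand = drive_sustained_mb_s * drives  # ~2880 MB/s, all drives streaming
        print(f"link ~{link_budget} MB/s vs worst-case demand ~{worst_case_demand} MB/s")
        # Only a parity check/rebuild hits all 16 drives at once; everyday array
        # writes are limited by the parity read-modify-write, not the HBA link.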
  3. No problem. Glad that's helping, but you don't have to delete all files and re-download - just delete the bad/renamed files for each affected download and attempt a re-postprocess from nzbget. For example, delete all files that have been renamed with the extension '.1', or any files that have had the extra leading '0' attached to the part identifier. Occasionally I'll have to ask nzbget to 'download remaining files' and let it attempt another repair before the unpack even tries to start.
     For some older content that often has more missing articles, I wish I could find a way to tell nzbget to download ALL remaining files, as it sometimes stops downloading the next parity file and just marks the download as failed. Some older (and occasionally even new) content needs the full set of parity files for par to successfully repair the archive.
     Note that on the failed 7zip extracts that hang, I will sometimes just stop the nzbget Docker container and then use my Windows box with 7zip installed to do the extract manually. This is rare, as most times I can clean up the intermediate download folder and nzbget will then successfully call 7zip and proceed with the extract. Dale
  4. I and other users are seeing the same issue, and I've discovered a few problems that seem to be related. First, the par check/repair stage seems to fail randomly. Sometimes nzbget reports 'PAR Success' but no matter how many times I try to re-postprocess the download, the unpack fails or gets stuck. If I run QuickPAR from Windows using the same PAR set, it often finds 1 or 2 files that have all blocks present but need to be re-joined. Once QuickPAR has re-joined these blocks/files, nzbget can successfully unpack.
     The second issue is that some PAR repairs leave the renamed damaged files in the source folder. I find this confuses nzbget's unpack processing, especially when the first file in the archive set has a renamed copy. For example, when nzbget's PAR does a repair/rejoin, it sometimes seems to create a file with one more leading '0' in the filename, e.g. xxxxxxxxxxxxxxxxx.7z.001 is repaired/rejoined but there is a copy of the bad file named xxxxxxxxxxxxxxxxx.7z.0001. The same can happen with rar archives - the filename might be xxxxxxxxxxxxxxxxx.part001.rar and after the repair/rejoin there's a 2nd file named xxxxxxxxxxxxxxxxx.part0001.rar. If you go into the source folder (the 'intermediate' folder for most, depending on how you have nzbget configured), delete all the 'bad' files that have been renamed and then do a re-postprocess, the unpack will usually succeed.
     The 3rd failure case I've found is the complete 'halt' of the extract/unpack process, which seems to be a bug in the way 7zip is called to process .7z archives. The logs show the unpack request is calling 7zip, but the unpack hangs for some reason the logs don't identify.
     Hope these findings help others and maybe even help the nzbget team further refine their post-processing routines. Note that I've also found these same issues when using the Linuxserver.io build of the nzbget Docker container, which means the issues are likely inherent to the nzbget app and/or the par/unrar/7zip extensions. Dale
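     For anyone who wants to script that cleanup, here's a minimal sketch of the idea - the folder path is whatever your nzbget intermediate directory maps to, and the filename patterns are just the ones I've observed, so treat it as a starting point rather than a finished tool:
        # Cleanup sketch for the renamed leftovers described above. Point it at one
        # job's intermediate folder; it removes par-renamed leftovers ("*.1") and
        # extra-leading-zero copies only when the correctly named sibling exists.
        import os
        import re
        import sys

        folder = sys.argv[1] if len(sys.argv) > 1 else "."
        part_re = re.compile(r"^(?P<stem>.+?\.(?:7z\.|part))(?P<num>0\d+)(?P<ext>(?:\.rar)?)$")

        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if not os.path.isfile(path):
                continue
            # Case 1: damaged file renamed with a trailing ".1" after a repair.
            if name.endswith(".1") and os.path.isfile(path[:-2]):
                print("removing renamed damaged file:", name)
                os.remove(path)
                continue
            # Case 2: extra leading zero, e.g. xxx.7z.0001 alongside xxx.7z.001,
            # or xxx.part0001.rar alongside xxx.part001.rar.
            m = part_re.match(name)
            if m:
                shorter = m.group("stem") + m.group("num")[1:] + m.group("ext")
                if os.path.isfile(os.path.join(folder, shorter)):
                    print("removing extra-zero copy:", name)
                    os.remove(path)
     After it has cleaned a job's folder, trigger a re-postprocess from the nzbget history and the unpack usually goes through.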
  5. I was happy when I bought it over 6 years ago and used it for FreeNAS for many years with only 8 of the 20 bays populated with drives. When I moved to unRAID about 9 months ago, I had major issues with the hot-swap SATA backplanes that Norcotek had installed in the case. I eventually had to remove all the backplanes and now the drives are direct cabled - no more easy hot-swap, but I never really needed that anyway. And as far as I know, Norcotek is now out of business: they haven't responded to multiple emails asking about replacement backplanes and their phone number has been disconnected. This means you'll have to look for something else - I'm considering a Supermicro 24-disk enclosure myself, but also picked up a Rosewill 4500 so I can do a 2nd unRAID setup with up to 15 x 3.5" drives (again, all direct cabled).
  6. Try the Krusader Docker container... it's quite full-featured as a file/directory utility. Just make sure to add the paths to the mountpoints for your UD device(s) so you can copy to the array.
  7. @TechMed So if you have UPS units and they're correctly configured to do shutdowns, why does this problem happen? If the remote shares on the other systems are also UPS protected, you just need to tweak your UPS shutdown sequence so that unRAID shuts down before the other systems do. This should prevent UD lockups, as the remote shares will still be valid during the unRAID shutdown. The other thing to remember is to have your network gear UPS protected as well: if your router/switches go down, that can cause the same issue where the remote shares/systems are no longer reachable until they restart. I have 2 remote mounts and haven't encountered a UD lockup like you describe. Hopefully it's just a matter of setting the shutdown times so that unRAID shuts down before the others.
  8. If your power is failing regularly (and even if it's stable), consider adding a UPS to protect your systems from 'instant shutdowns'. UPS units are relatively inexpensive and very beneficial when you have irregular power. If the outages are brown-outs or short-duration drops, a UPS will ride through them and prevent the problem you're encountering. For longer outages, the UPS can signal the OS to do a controlled shutdown if its battery level drops too low. Instead of trying to make it work under UD, the real answer is correcting/alleviating your power issues.
  9. That's likely the cause - the single USB connection is presenting the 3 drives with incorrect/identical identification via its internal controller. This happens with port multiplier setups too. There's no easy way to correct it other than putting each of the 3 drives into a separate USB enclosure. Or just attach them via SATA to your unRAID box if possible, and then transfer the data from them.
  10. Possibly... note that the 4th drive that is mounting properly also has the same ending sequence as the 3 drives that appear identical in their identification info. It could be a compatibility issue where the 1TB hard drives have a longer model/serial number that doesn't differentiate until later in the sequence. The only way to know is to try. If the user removes the forgotten devices at the bottom of the UD section, that might also help eliminate the potential for them to be identified as the same drive.
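     Before buying new enclosures, it's worth checking what the bridge actually reports; something like this from an unRAID terminal will show it (it just lists the standard udev by-id links, nothing UD-specific):
        # List the whole-disk identifiers udev sees. If two different devices end
        # up with the same serial-bearing by-id name, UD has no way to tell them
        # apart either.
        import os

        by_id = "/dev/disk/by-id"
        for name in sorted(os.listdir(by_id)):
            if "part" in name:          # skip partition links, keep whole-disk entries
                continue
            target = os.path.realpath(os.path.join(by_id, name))
            print(f"{target:>12}  {name}")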
  11. A simple workaround is to disconnect 2 of the USB drives and rename the mountpoint with only one drive attached. Then remove it, attach the next, and lastly do the 3rd.
  12. Not sure they made a huge difference, but the full parity rebuild on my dual parity drives took 24 hrs, about 3 - 4 hrs less than previously. That's 18 data drives and 2 parity drives. Regardless of the time, the more important result is that there were zero errors after replacing the SATA cable. My 168TB+ unRAID array is running quite nicely now.
  13. If you have another PC to use, you could boot a USB Linux distro, mount the image and then share it over your network for copying to the unRAID array. Or copy the data onto another disk formatted with a filesystem that UD can use. Mounting it on the QNAP and transferring from there is probably just as simple, though.
  14. No port multipliers: 6 x SATA from the motherboard (all Intel SATA) and 16 from the LSI 9201-16i in IT mode. The old HGST 4TB drives were also 5400 rpm, whereas the rest of the drives (and the 4 new 10TB replacements) are all 7200 rpm. I know rotational speed doesn't always translate to higher performance, but having all drives the same won't hurt. After replacing the cable I'm now 5% into the complete parity rebuild (used the Tools -> New Config method) with no errors (CRC or otherwise). As I said above, the data and the drives themselves are fine - it was just a bad SATA cable. That's one of the disadvantages of the LSI cards - you can't just replace the cable for a single drive, since you need the SFF-8087 miniSAS to SATA breakouts (4 drives per cable). At least I have spare new cables on hand. Thanks again! Dale
  15. That's my suspicion too... the full parity rebuild after doing a Tools -> New Config took about 27 hrs, but I've had my monthly non-correcting parity checks take up to 45 hrs. I'll do the full parity rebuild again as soon as I shut down and replace the cabling. Thanks!
  16. While I realize I could have let each 4TB drive replacement rebuild from parity, it took less time to move the data off the remaining 4TB drives than it would have taken to rebuild each one onto the new 10TB replacements. Plus I used the opportunity to use the unBalance plugin to gather certain folders so that all of their content is on one drive only (an OCD thing of mine). As for the preclear, I mentioned it only as a way to do an initial test of the drives before shucking them. Regardless, the drives (and the data on them) appear to be fine. I'm certain that the issues reported after the new config are cabling related, so I'll go ahead and replace the cable and then run another parity check... I assume just leaving the 'Write corrections to parity' option checked? Or am I better off doing the Tools -> New Config route again to completely rebuild parity?
  17. I'm replying to my own topic as I went ahead with the procedure in the 1st post. All seemed to go well except that the post-replacement parity resync detected thousands (273K+) of UDMA CRC errors on one of the new drives - the 1st WD 10TB one that I installed and used to migrate data off the old 4TB drives. The Main tab of the unRAID webgui shows that drive (drive 18) as having 94078 errors, and the message in the parity check section says:
     Last check completed on Tue 17 Dec 2019 11:50:04 PM MST (yesterday), finding 93267 errors. Duration: 1 day, 3 hours, 1 minute, 56 seconds. Average speed: 102.8 MB/sec
     The drive is still marked good (no red X) and I've tried accessing some of the files on it with no issues seen (i.e. movies/video play properly). When the parity check completed it seems to have been set to 'Write corrections to parity', I assume because of the errors. As I'm fairly certain the drive itself and the data written to it are fine, I suspect the SATA cable might have been damaged when I installed the remaining 3 new 10TB drives. All 4 new 10TB drives are connected with the same SFF-8087 mini-SAS to SATA forward breakout cable, so I'll replace it with a new one.
     Here's my question/dilemma: after I replace the cable, is the best solution to just start the array with the 'Write corrections to parity' option checked, or should I attempt another Tools -> New Config to rebuild the parity from scratch? Diagnostics attached. animnas-diagnostics-20191218-1229.zip
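     As a quick sanity check on those reported numbers (plain arithmetic, nothing unRAID-specific), the duration times the average speed should land near the parity drive size if nothing was being skipped:
        # Duration x average speed from the parity check report above.
        duration_s = ((24 + 3) * 60 + 1) * 60 + 56   # 1 day, 3 hours, 1 minute, 56 seconds
        avg_mb_s = 102.8
        covered_tb = duration_s * avg_mb_s * 1e6 / 1e12
        print(f"~{covered_tb:.1f} TB scanned")        # ~10.0 TB, the full 10TB drive
        # The check ran at full speed over the whole drive, which fits the
        # suspicion above that the cable, not the drive, is the problem.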
  18. I waited until the preclears finished before going any further, and sure enough the CPU usage dropped to more normal levels once the preclears were complete. I suspect my system just doesn't have enough CPU cores/threads and PCIe lanes to adequately handle the large number of physical drives attached - 22 SATA connections in total: 2 parity, 18 data, 1 cache SSD, 1 UD-mounted app/VM SSD. I'm watching for a used Threadripper 2950X setup on eBay/local classifieds to upgrade the system, which is currently a 4-core i7-6700K @ 4GHz with 32GB of RAM on a microATX motherboard. I did upgrade to 6.8 now that it's been released and the NVidia build is available. So far things are settling back down to somewhat normal operation, but the system upgrade is still planned.
     I use preclear mainly to weed out any disks that are bad right from the factory. If you just add a disk and let unRAID do the clear, it takes just as much time, but if it fails I'm still not sure whether you're left with a degraded array or whether it aborts the addition of the new drive. Regardless, I'm fine with spending the time doing the preclear in advance of adding the disk(s) to the array.
     As for the remote shares possibly slowing down the system, I have indeed seen that when one of them went offline. I've cleaned up those remote mounts and still have two of them active while I shuffle data between them and the unRAID build. So far I've not experienced the lock-ups that others have seen when remote mounts go offline, but as soon as I've moved my data to where I want it, those shares will be unmounted and removed. Dale
  19. I'm in the process of replacing 4 x 4TB drives (8.5 years old each) with some new 10TB drives (shucked WD Elements). I followed the 'safer procedure' listed here in the Wiki: https://wiki.unraid.net/Replacing_Multiple_Data_Drives_with_a_Single_Larger_Drive
     The 1st drive rebuilt with no issues and since then I've been migrating data off the other 4TB drives to it and to other free storage on the array. I've been using a combination of the command line, Krusader and the unBalance plugin to clear the remaining 3 drives. Once these drives are clear, I plan to follow this procedure:
     1. Stop the array.
     2. Unassign the 3 x 4TB drives to be removed from the array - set to 'No device'.
     3. Power down the unRAID system, pull the 3 x 4TB drives and replace them with 3 x new 10TB drives (all precleared).
     4. Power up (array autostart set to No before power down) and choose Tools -> New Config.
     5. Re-assign all drives to their appropriate slots and then start the array. This will launch a parity rebuild.
     I was just wondering if there was a quicker way to do it, but from what I can tell there's no fast way to replace an empty smaller drive with a precleared larger drive without doing the parity rebuild. I don't have any more available SATA connections, so I think the method I'm following is my only choice. Any suggestions or thoughts? Thanks! Dale
  20. I'm preclearing all 4 drives at the same time, but from the looks of things all 4 will complete at around the 46 - 48 hr mark, so late tomorrow afternoon. Each major stage (pre-read, zero, post-read) takes about the same amount of time: the 1st stage completed in 16 hrs for all 4 drives, and the zero stage is 85% complete at 13 hrs, so it also looks like 16 hrs. Add another 16 hrs for the post-read and that's 48 hrs. 4 days is twice the time mine usually take, so it could be related to the SATA controller you're using.
     Regardless, I'm going to wait it out as I'd rather not waste the time already spent on the preclears. In the past I've been able to reboot unRAID and the preclear was resumable where it left off; I'm just not sure I want to trust that it'll work the same after all the recent updates. I'm actually curious whether the excessive CPU usage reported by the web GUI (and the poor disk I/O I'm seeing) might be because I'm preclearing all 4 drives at the same time. I haven't seen preclear slow down the system in the past, but I've previously only done 2 drives simultaneously. Regardless, I'll wait unless I can find a way to save the state of the preclear and ensure I can resume after a reboot.
     EDIT: I've attached the diagnostics that I was able to get by using the 'diagnostics' command from a terminal shell. Dale animnas-diagnostics-20191209-2301.zip
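     For anyone wondering where the 16 hrs per stage comes from, each stage reads or writes the whole drive once, so it's basically capacity divided by sustained throughput; the rate below is my guess at what these USB enclosures average, so adjust it for your own drives:
        # Rough preclear duration estimate from capacity and sustained rate.
        capacity_tb = 10
        rate_mb_s = 175                # assumed average sustained USB 3.0 rate
        stages = 3                     # pre-read, zero, post-read

        hours_per_stage = capacity_tb * 1e12 / (rate_mb_s * 1e6) / 3600
        total_hours = hours_per_stage * stages
        print(f"~{hours_per_stage:.0f} h per stage, ~{total_hours:.0f} h total")  # ~16 h / ~48 h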
  21. As this question relates to both the UD and Preclear plugins, I'll post it in both support threads. I'm one of the users who hadn't had any UD or preclear issues until recently. I am currently preclearing 4 x 10TB drives (USB attached) via the UD integration with the preclear plugin. I'm 60% through the zero stage on all 4 drives but have seen some unusual issues today. The only change to unRAID prior to these issues was updating the UD plugin to 2019.12.08 earlier this morning. I have been applying the various updates for UD and preclear as they've been released over the last 2 weeks, and up until today things seemed normal.
     Earlier this afternoon I started seeing a LOT of sporadic buffering/disk I/O issues, and the CPU load on my unRAID system is showing 65 - 75% pretty consistently. This is based on the unRAID Dashboard tab of the web GUI; top in a terminal session reports about 15% utilization. The buffering/disk I/O issues are real, as transfers to/from unRAID speed up and slow down at random, and I'm also seeing major interruptions to streams from Plex. As I can't determine why the web GUI shows high CPU load but top doesn't, I'm wondering if the last update(s) to UD/Preclear might have broken something?
     I'm still running unRAID 6.7.2 and note that most of the updates I'm seeing are for 6.8 related issues. I could update to 6.8 RC9 (with NVidia extensions) but I've held off. As the preclear is 60% through stage 2 (zeroing), I'm concerned about the upgrade and reboot possibly wiping the preclear state and making me start over. If I have to, I'll tough it out until the preclear finishes tomorrow on all 4 drives. Any thoughts on whether the preclear status will be saved if I update? Is there a way I can manually stop the preclears and save the state for restarting after the upgrade?
     I've tried shutting down all unnecessary plugins/Dockers/VMs and the unRAID web GUI is still showing 60%+ CPU load, and I'm still seeing disk I/O issues. What could cause the discrepancy between top in a terminal session and the web GUI? Thoughts? Suggestions? Diagnostics would be attached, but 3 attempts to download them have taken over an hour during the collection stage... another bug/related issue? Tried on 2 separate computers with 3 different browsers (Safari/Firefox on macOS Mojave and Firefox/Chrome on Ubuntu 18.04).
     EDIT: I'm attaching the diagnostics that I was able to get by running the 'diagnostics' command from a terminal shell. Still couldn't get the Tools -> Diagnostics option in the web GUI to work. Dale animnas-diagnostics-20191209-2301.zip
  22. I've been reading through this thread about the random unpacking failures with nzbget. I too started experiencing them about 2 weeks ago, at the time using the linuxserver.io container. After reading through their support thread, it appears that this issue affects both it and the binhex container. I've been using the binhex container for a few days to try to troubleshoot, but now the issue is occurring very frequently. Scheduling restarts of the container every 30 or 60 minutes hasn't worked, as one of the downloads has been stuck repairing and the time estimate is more than 1 hr; every time the container restarts, the repair process kicks off again. I'm about to manually use the par commands on my Windows or Mac box to try the repair instead, but in the meantime other downloads have failed unpacking and also look like they need repair. There were no reports of health issues during the downloads, so I wonder why the par checks are failing when they shouldn't, and whether that's part of the cause of the stuck unpacks. I'm thinking of trying a sabnzbd container next, but thought I'd at least post my comments. I can post some logs if required.
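     For anyone wanting to try the manual par route outside the container, this is roughly what I mean; it assumes par2cmdline is installed wherever you copy the job's files, and the filename is just a placeholder:
        # Verify first, then repair only if verification reports problems.
        import subprocess
        import sys

        par2_file = sys.argv[1] if len(sys.argv) > 1 else "archive.par2"

        verify = subprocess.run(["par2", "verify", par2_file])
        if verify.returncode != 0:
            # Non-zero exit from verify means damaged or missing blocks.
            repair = subprocess.run(["par2", "repair", par2_file])
            sys.exit(repair.returncode)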
  23. Although I'm not one of the experts here, 46 hours for one full pass of the preclear (pre-read, zero and post-read) isn't unreasonable. I get similar numbers for my 10TB drives, and my 8TB preclears usually take about 40 hrs. I've seen similar errors in my syslog as well, but as the drives passed the preclear and no SMART attributes were of concern, I've used those drives with no issues. I suspect you are fine, but I too would be interested to know why random sectors report errors that don't affect the preclear. Perhaps those are dealt with through the drive's sector reallocation list, but SMART has shown no reallocated sectors on any of the drives that saw those errors. In any case, I think you're safe to use the drive without concern. Hope it all goes well... now that my connection issues have been resolved for a month, I'm finding the system to be extremely stable. I haven't seen even one UDMA CRC error - those are fully recoverable but do tend to indicate connection issues.
  24. Thanks @DanielCoffey and @Pauven for the quick responses. I guess I hadn't read up enough on what the tool does, but it was recommended since the tunables are what's been suggested to try to improve write performance. As it only does reads, I feel a lot safer using it. My 'server room' is air conditioned, so I might just set the air conditioner to run cooler than normal to combat the heat generated by the test. Looking forward to the results!