visionmaster

Members
  • Posts

    35

Everything posted by visionmaster

  1. OK thanks for the guidance. So far I've changed the power supply from the 750w to a 1000w that I had. I checked all connections and reseated everything. I changed out disk 16 to a spare and started rebuild. It mounted fine and the data was emulated. About a minute in, disk 15 started racking up read errors and then no data was visible on that disk or 16. I stopped and shut down and moved those 2 drives onto a different controller and booted up again. Everything appears good with no errors. Rebuild restarted and the data is emulated again and visible on both drives. I guess I'll let it finish, and hopefully with no errors. I'm going to research a whole new build to get this 13 year old hardware updated. I'd like a big case like a rack mount 36 bay Supermicro hot swap. Thanks again for the assistance. -R
  2. Thanks for the replies so far. Is there a way to re-enable disk 16, which got disabled (possibly because of a loose connection), trust the parity, and recheck everything from there? Thanks, -R tower-diagnostics-20230214-0728.zip
  3. Hi, Before I mess anything up, I'm just looking for advice on how to proceed. I have 24 drives and 1 parity (I know I need to add another one, but my case is full). The system has been stable for 12+ years. The last parity check, a couple of months ago, finished with no issues (0 errors). I was going to upgrade a drive or 2 to get rid of old drives and go from 2TB to 8TB. I started a parity check to make sure it was good to go before the upgrade, and about 41% in, I got 1 drive disabled and 4 others with errors (all read errors). My gut tells me it's a loose connection issue. The webGUI has paused the operation. Should I power down, re-seat all connections and do another parity check? What about the disk that is currently disabled? Thanks, -R syslog.1.txt syslog.2.txt syslog.txt
  4. I've posted in the past regarding my issue with the Seagate 8TB ST8000VN0022 drives. ( https://forums.unraid.net/topic/105350-unraid-os-version-692-available/page/5/?tab=comments#comment-981758 ) and after reviewing this thread ( https://forums.unraid.net/topic/103938-69x-lsi-controllers-ironwolf-disks-disabling-summary-fix/ ) ultimately my fix was just to keep these 2 drives of mine set to never spin down as the errors occurred when the drives spin up from a spun down state. Last week I noticed that drive 12 temp wasn't reading out on the dashboard and then disk disabled. I had tried running SMART tests but it wouldn't allow it. I then removed that drive and replaced it with another same model ST8000VN0022 (as I had 2 more of these as spares which had been precleared and ready to go) and parity rebuilt drive 12 normally and completed with 0 errors. Stupidly, I set that drive to spin down at default which is 1 hour, and when it spun up again the read errors started and the disk went into a disabled state. I have not written anything to any disk since before the rebuild. I attempted to find the trust my array procedure to reenable the disk and use the parity as it had just rebuilt with zero errors. But I ended up reassigning the disks in the original configuration and when starting the array, it would only allow it with a parity sync. I left the check box for correct sync errors and was expecting it to run and complete with 0 errors as it had just rebuilt drive 12 with no errors and nothing has been written to any drive. Well, it is 7 hours in and there are 581 sync errors corrected so far. The data on drive 12 is mostly just movies and the few movies I've tested seem to play fine. What would be the best thing to do at this point? I still have the original drive 12 which I plugged into my other test server and is reading normally and appears to have no SMART errors. 
Should I let it finish rebuilding parity with all these new errors, or just put the original disk 12 back into the array and rebuild the parity again? Maybe a cable wasn't seated properly? It looks like I don't have any data loss at this point... I'd like to keep it that way and get an error-free, protected array back online ASAP. Thanks for any help, as I'm still a noob with this, even though I've been running unRaid successfully for over 10 years... syslog.txt
  5. I tried updating to 6.9.1 two weeks ago from 6.8.3, which has been stable for years. I hit the Seagate drive issue: after the drives spin down and then spin back up, I start getting read errors on 2 models of my Seagate drives (ST8000VN004 and ST8000VN0022). Once I reverted, there were no issues and the system was stable again. I just tried to upgrade to 6.9.2 and immediately got the same spin-up read errors on the same 2 drives, so I went back to stable on 6.8.3. Hopefully this is fixed in the next update. Those are the only types of Seagate drives affected on my system. My 4TB Seagates work fine. My drive controllers are Supermicro (MV88SX6081 8-port SATA II PCI-X Controller and MV64460/64461/64462 System Controller, Revision B).
  6. I've tried to set this up on unRaid 6.5.0 on scheduler without success. I'd like the parity check to run 4 times a year automatically, at midnight once every 3 months. Is this possible in the settings? Thanks! -Rich
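"Midnight, once every 3 months" corresponds to the standard cron expression `0 0 1 */3 *` (minute 0, hour 0, day 1, every third month counting from January). Whether the 6.5.0 scheduler accepts a custom cron string is an assumption here and is not confirmed in this post. The sketch below only illustrates which dates that expression selects; the function name `quarterly_runs` is made up for the example.

```python
from datetime import datetime

def quarterly_runs(year):
    """Return the run times selected by the cron expression `0 0 1 */3 *`.

    In the month field, */3 steps through 1..12 starting at 1,
    so it matches months 1, 4, 7 and 10.
    """
    months = range(1, 13, 3)  # 1, 4, 7, 10
    return [datetime(year, m, 1, 0, 0) for m in months]

# Print the four runs for an example year.
for run in quarterly_runs(2018):
    print(run.strftime("%Y-%m-%d %H:%M"))
```

That gives four checks a year, on the first day of January, April, July and October.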
  7. OK, thanks for the info! The parity upgrade completed and all still reports 0 errors. So when I get home from work I will upgrade and retire disk 17 that is having read errors.
  8. Hi, I've gotten great help in the past, so I always appreciate this forum. I can upload logs if needed, but this is the scenario; I just need to know how I should proceed. I am on unRAID 6.5.0. I am planning to expand my parity to 8TB from 4TB, so I can upgrade a couple of other HDDs to 8TB afterwards. I bought five 8TB HGST NAS drives on Black Friday that I just got around to preclearing in my other rig. I was going to recycle the old 4TB drive to my Windows machine, because I needed it there, once the parity upgrade completed, and my goal was to add a second 8TB parity soon after. The old 4TB parity drive is sitting unused awaiting the completion of this process, and the server has not been used (100% sure no writes to it, and 98% sure no reads from it). I completed a parity sync/check with 0 errors 2 days ago. Yesterday, about 6pm, I powered down and replaced the old 4TB parity with the newly precleared 8TB, checked all cables and started up. I reassigned the parity slot to the new drive and started a rebuild. About 6 hours in, I started to get notifications that disk 17 is giving off SMART raw read errors, which are slowly starting to climb: from 32774 at 12:16am to 32956 at 8:11am. Disk 17 is a Samsung 2TB HD203WI. The main GUI screen still reports 0 errors from all drives, and the parity rebuild is still moving along, about 14 hours in and at 24.4%, with an estimated finish in 1-2 days as the speed is currently around 42 MB/sec. Disk 17 has 0 pending and 0 reallocated sectors (SMART report included). So, should I let it finish and make disk 17 the first drive upgraded? If I let it finish, am I writing corrupted data to my parity, with these raw read errors on disk 17?
Or should I stop, reinsert the old 4TB parity, set it to trust that parity is correct as the server has not been used, and rebuild disk 17 with a spare 4TB I have on hand, then try again to upgrade parity, maybe adding a second parity first with an 8TB drive then upgrade the P1 slot with an 8TB. Or is there something else I should be doing or thinking? Thanks! tower-smart-20180417-0738.zip
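As a rough check on the numbers in this post, here is a minimal sketch of the arithmetic. It assumes that 8TB means 8e12 bytes, that 42 MB/sec means 42e6 bytes per second, and that the speed stays constant for the rest of the rebuild, which it usually does not. Whether the SMART raw value is a plain error count depends on the drive vendor, so the per-hour figure only describes how fast the counter is moving.

```python
from datetime import timedelta

# Figures quoted in the post.
raw_start, raw_end = 32774, 32956
# 12:16am to 8:11am on the same morning.
elapsed = timedelta(hours=8, minutes=11) - timedelta(hours=0, minutes=16)

raw_increase = raw_end - raw_start
errors_per_hour = raw_increase / (elapsed.total_seconds() / 3600)

# Assumptions (not from the post): decimal units and constant speed.
parity_bytes = 8e12          # 8TB parity drive
fraction_done = 0.244        # 24.4% complete
speed_bytes_per_sec = 42e6   # 42 MB/sec

remaining_bytes = parity_bytes * (1 - fraction_done)
eta_hours = remaining_bytes / speed_bytes_per_sec / 3600

print(f"raw read error counter rose by {raw_increase} ({errors_per_hour:.1f} per hour)")
print(f"estimated time remaining: {eta_hours:.1f} hours")
```

Under those assumptions the counter rose by 182 in just under 8 hours, and the remaining time comes out at about 40 hours, which is consistent with the GUI estimate of 1-2 days.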
  9. Hi, sorry for the late reply. Just saw this. It's mostly for the slow parity check, but several months ago, I had an issue with drives dropping out during a monthly check. That was the only time and since then, no more problems, just slow. I guess I will play with the tunables and wait for now. Thanks!
  10. I haven't had issues in years with my server, but recently I have had a few problems, including slow parity checks, which may be due in part to the 2 SAS2LP cards I currently have. I'd like to swap them for Dell H310 cards flashed to the LSI firmware in IT mode. I looked on eBay, but since flashing might be an issue, as I have no hardware other than my server, I'd rather save myself some time and a headache. Anyone selling? I live in Florida.
  11. According to the GUI, the parity check ended with "error code:user abort" after finding 718 errors. But I didn't stop it; it stopped itself 9.5 hours in. I guess this happened when the cable to disk 8 and several other cables got unseated? So, since it was doing a monthly parity check with correct errors set to yes, did incorrect parity get written? My question at this point is: is there a way to restart the array with disk 8 in a green state and recheck the parity? If it comes back all OK, I'd chalk it up to loose cables; if there are errors this time, I'd assume disk 8 is bad and rebuild with a new disk. I'm not sure I'm explaining my question correctly, but it makes sense to me in my delirious state. Thanks.
  12. According to scheduler, in the unRAID GUI, it was set to write corrections. So does that mean that parity is incorrect and rebuilding from parity will place errors on the rebuilt disk? My gut says that the data disks are actually good (even disk 8 which is red balled).
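The concern in this post can be illustrated with a toy model. The sketch assumes single parity computed as a bytewise XOR across the data disks; the 2-byte "disks" and the function names are made up for the example. It does not establish what actually happened on this server, only why a parity "corrected" against a bad read would reproduce that bad data in a later rebuild.

```python
from functools import reduce

def xor_parity(blocks):
    """Bytewise XOR across equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild(missing_index, blocks, parity):
    """Reconstruct one missing block from the other blocks plus parity."""
    others = [b for i, b in enumerate(blocks) if i != missing_index]
    return xor_parity(others + [parity])

# Three tiny hypothetical data disks and their correct parity.
disks = [b"\x10\x20", b"\x0f\x0f", b"\xaa\x55"]
good_parity = xor_parity(disks)

# With correct parity, the missing disk is rebuilt exactly.
rebuilt_from_good = rebuild(2, disks, good_parity)

# Suppose a correcting check saw one wrong byte from disk index 2
# (for example, through a loose cable) and rewrote parity to match it.
bad_read = [disks[0], disks[1], b"\xaa\x54"]
corrected_parity = xor_parity(bad_read)

# Rebuilding from that parity reproduces the bad read, not the real data.
rebuilt_from_bad = rebuild(2, disks, corrected_parity)

print(rebuilt_from_good == disks[2])
print(rebuilt_from_bad == disks[2])
```

In this model the first rebuild matches the real data and the second one does not, which is why a non-correcting check is the safer choice when the reads themselves are suspect.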
  13. It took a while to unmount everything, and a bunch of errors popped up on several other disks. I went to the cmd console, forced an unmount, and then did a powerdown. All cables seemed fine, and I double-checked and replugged everything. I powered up and everything came online OK, with disk 8 in a red ball state but everything else OK. I checked SMART on all the drives and all pass and seem good: 0 reallocated sectors and 0 pending sectors on all 23 drives. The parity check was going well for 9 and a half hours before the disks errored. I'm wondering if a cable or connection came loose, which caused the issues. I'm sure nothing was written to the server since yesterday, when everything was working fine, and before the 9.5 hours of parity check. At this point, should I rebuild disk 8 with a new disk, or bring it back online and run another non-correcting parity check?
  14. Silly question, but should I power down and try reseating all the cables? I just looked and everything seems to be well connected. Will I be in danger of losing any data by powering down at this point?
  15. Hi guys, I need help on how to best proceed. My monthly parity started at midnight as usual and my system was fully up to date and has been working well without any errors. I'm on 6.2.4 and it's been up for 51 days since last reboot. Around 9:35am this morning, I got 3 email notifications that said I had problems: 1) Disk 8 in error state 2) Disk 6 - 78 errors, Disk 9 - 78 errors, Disk 11 - 78 errors, Disk 8 - 690 errors 3) Parity check ended with 718 errors. I am waiting to power down, reboot, disable disks, rebuild disks, etc until I get your good advice. Let me know if I forgot anything or if more info is needed. Thanks! tower-diagnostics-20170201-1818.zip
  16. I was searching around and this seems to do what I want. (The previous work around was telnet into server and type lsof). I'm a total noob with linux. http://lime-technology.com/forum/index.php?topic=42881.0
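The `lsof` workaround mentioned above can be sketched as a small parser that groups open files by disk number. This is a minimal sketch, not the linked plugin's method: it assumes data disks are mounted at paths of the form `/mnt/disk<N>/`, and the sample `lsof` text, file names and function name are all made up for the example.

```python
import re

# Assumption: data disks are mounted as /mnt/disk1, /mnt/disk2, ...
DISK_PATH = re.compile(r"(/mnt/disk(\d+)/.*)$")

def open_files_by_disk(lsof_text):
    """Map disk number -> list of open file paths found in lsof output."""
    result = {}
    for line in lsof_text.splitlines():
        match = DISK_PATH.search(line)
        if match:
            disk_number = int(match.group(2))
            result.setdefault(disk_number, []).append(match.group(1))
    return result

# Hypothetical lsof output; real output will differ.
sample = """COMMAND  PID USER   FD   TYPE DEVICE   SIZE/OFF NODE NAME
smbd    4321 root   30r   REG    9,5 4700000000   12 /mnt/disk5/Movies/Example Movie.mkv
smbd    4400 root   31r   REG   9,12 1200000000   34 /mnt/disk12/Movies/Other.mkv
"""

for disk, paths in sorted(open_files_by_disk(sample).items()):
    for path in paths:
        print(f"disk {disk}: {path}")
```

On a live server the text would come from running `lsof` and capturing its output; the file name is matched to the end of the line so that names containing spaces are kept whole.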
  17. Hi. I just upgraded to Pro 6.1.9 from 5.0.6. I have the same question as stourwalk. I had a script on unMenu that Joe had written that would show the active stream including the disk number. I was not planning on reinstalling unMenu. Is there a way to have the active stream include the disk number? Thanks!
  18. Also don't forget that if purchased with some credit cards, those credit cards will extend the warranty by 1 year. I know this is true with my amex plat and visa preferred.
  19. As always, thanks for the quick input. Much appreciated. I love this forum!
  20. Hi, I was looking through the last couple of pages and it seems like this new drive I just did a 1 round preclear on, is pretty similar to axeman's, but the seek error rate and raw read error rate seem really, really high. I know the important reallocated markers are 0. Should I just consider this a healthy drive and use it as normal? It is a Seagate removed from an enclosure. I also did a preclear on a 2TB refurb WD drive that has way better looking numbers. Thanks for your input! preclear_results-3tb6-4-13.txt Fatboy_2TB_WD-WMC300165370_preclear_results.txt
  21. I also use the yearly plan for usenetserver. I added ssl-news as my fill server (EU based) about 2 months ago when the mass of incompletes were happening, and it seemed to be solid then and has a group or 2 that usenetserver doesn't.