Everything posted by ratmice

  1. Unraid 6.7.2. So I woke up this morning to an array with a disabled disk (single-parity system) and seven disks all showing millions of read errors. The array isn't mounted and is unreachable from the network. In the Shares pane only the disk shares are showing up, none of the other shares. I pulled a diagnostics report (attached below) and now wonder what the safe thing to do is. I did start a read test as indicated on the Main page, but paused it almost immediately, not knowing if it would screw things up.

     Last night I rebooted the server, as I was having some trouble with the TV system as well. All seemed OK, and Plex was able to rescan the TV shows and find some new items that had been added. I watched a couple of episodes and went to bed.

     My inclination is that I need to shut the server down and check cabling, reseat controllers, etc., as that seems the most likely cause for millions of read errors all of a sudden, but I don't want to do anything that might compromise the system. I need to figure out whether all the affected disks are on the same controller, but I'd have to shut it down and get at the disks to see. Poking through the forums this morning, I see that Marvell controllers can be an issue, so I assume it's due to one of those. The trouble appears to start around 5:59 in the log, and there are indications that the controller is the issue. I am woefully deficient at understanding these logs, however. This server has been running faithfully for many years with the current HW, just FYI. Also, I am assuming that the system disabled that one particular disk only because it can disable just one with single parity, and it chose that one at random when the array crapped out.

     Additionally, the system log tool has only one entry:

     Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 134115360 bytes) in /usr/local/emhttp/plugins/dynamix/include/Syslog.php on line 20

     Any help on how to proceed would be greatly appreciated. I am not very knowledgeable about the deep inner workings of the UnRAID system, or Linux in general, but I am a quick learner. Hopefully some of you forum denizens will be able to help me out and point me in the right direction. Thanks for listening. tower-diagnostics-20200625-1429.zip
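     From some searching, I gather that commands along these lines should show which controller each disk hangs off of without opening the case (assuming I can still get at a local console; this is a best guess on my part, corrections welcome):

       # The PCI path portion of each by-path symlink identifies
       # the controller a given disk is attached to
       ls -l /dev/disk/by-path/

       # List the storage controllers themselves, to see whether
       # one of them is a Marvell part
       lspci | grep -iE 'sata|sas|raid'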
  2. Thanks again, Johnnie. Just to be extra clear (paranoid): the UnRAID managed device number should always be the same as the disk number, correct? So if I need to zero disk 16, I would use md16, something like the command below. Sorry for the cluelessness.
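     Writing it out so someone can sanity-check me (this assumes disk 16 really does map to /dev/md16, and that the array is started in maintenance mode so nothing else writes to the disk while it runs):

       # Zero the device; writing through /dev/md16 should keep
       # parity in sync as the zeros land on the disk
       dd bs=1M if=/dev/zero of=/dev/md16 status=progress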
  3. OK, so back again. I am trying to use the 'clear array drive' script in order to shrink my array. I added a drive to the array earlier today and shortly afterward realized that another drive was acting up. I am in the process of trying to remove the newly added drive by clearing it, so I can then redeploy it as the rebuild target for the dodgy drive. When I run the script, it finishes instantly, and the 'clear-me' folder still remains on the drive in question. This drive was only added to the array and formatted (to do so), so it does not have any data on it. I don't see any pesky hidden files, so I am wondering how to get it to zero the drive.
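     For anyone else hitting this, here is what I have been using to double-check the disk before re-running the script (assuming the new drive is disk 16; substitute the right disk number). If I understand the script correctly, it refuses to run unless the disk holds the marker folder and nothing else:

       # Show everything on the disk, including dotfiles the GUI hides
       ls -la /mnt/disk16/

       # The script looks for an empty marker folder with this exact name
       mkdir -p /mnt/disk16/clear-me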
  4. Thanks, Johnnie. You always seem to be around to answer these questions and I really appreciate it. Have a great day.
  5. Thanks for the explanation. Also, what happens if I screw up the exclusion/inclusion thing?
  6. Thanks. Just one stupid question: where the clear-and-remove option says "Make sure that the drive you are removing has been removed from any inclusions or exclusions for all shares, including in the global share settings," does this apply to settings that are set to "all" as well? So basically, just change all the inclusions and exclusions to "none"?
  7. So, I had a precleared disk lying about and decided to add it to an open slot in my array. No problem there; it formatted, and I was off to the races. However, just after (of course), I noticed one of my older disks is showing signs of age. Is there an easy (safe) way to remove that newly added, empty disk and just rebuild the dodgy disk onto it without having to rebuild parity? Nothing has been written to the array since adding the new disk.
  8. So all of a sudden the buttons on the front panel of my Norco 4220 enclosure are not working anymore. I did play around with the connector, but they still seem dead. The MB is an X8SIL-F and has been working fine for years. Any help would be appreciated by this not-very-savvy user.
  9. So my UPS battery died yesterday and forced an unclean shutdown. When I started the array again, a parity check was performed automatically. It came back with ~1500 sync errors. The other interesting thing is that the last parity check (~2 mos. ago) found exactly the same number of sync errors. I have included the diagnostics. I was under the impression that, unless you specifically uncheck the option, parity checks were correcting. This does not seem to be the case, as I see NOCORRECT in the syslog. In the main UnRAID window the correcting box was checked when I went to look after it completed. My questions are:
     1. Why wasn't a correcting check done, as I thought that was the default?
     2. Does anyone see anything helpful in the syslog?
     3. Should I go ahead and do a correcting check, or is something else warranted?
     4. Can anyone tell if it is a particular disk that may be the culprit?
     Thanks for any wisdom people are willing to impart. tower-diagnostics-20180619-2146.zip
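     In case it is useful, this is how I spotted the NOCORRECT entry, plus what I believe (a best guess pieced together from other forum threads, happy to be corrected) would start a correcting check from the console:

       # Show how recent parity checks were started (CORRECT vs NOCORRECT)
       grep -i 'mdcmd' /var/log/syslog | grep -i check

       # Kick off a correcting parity check manually
       mdcmd check CORRECT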
  10. @Limy: Truthfully, that sounds like a decent option; however, I am unsure that my current ability is up to the task. I have been avoiding VM utilization on my Mac, but I may have to rethink it. Thanks for the answer.
  11. Thank you, John. Maybe I should give OSXFUSE another try. I only need read access, so that the data doesn't get dead-ended on the drives if something gets messed up.
  12. SSIA. I would really like a way to at least get at the data if the server craps out. I would prefer a native method, as VMs may be beyond my knowledge level at this point. In the past I tried OSXFUSE, but my knowledge was obviously inadequate to get it running. Any help would be appreciated.
  13. I thought it had remained offline through a previous power cycle. Anyway, it's zeroing and will be removed ASAP. Once again, thanks all who contributed to this thread, I am always glad that this community has so many helpful, and smart, souls.
  14. Overnight the pre-failing disk was zeroed, and I removed it from the array this morning. All went well. An interesting side note is that the red-balled disk is now back online like nothing is wrong? Not sure what to make of that.
  15. So, this brings up an interesting dilemma. In terms of the probability of having another issue while extricating myself from the current situation, what would be the best way to proceed?
     1. Zero and remove the emulated drive first - worried about the stress of zeroing the emulated drive making the pre-failing drive actually fail.
     2. Zero and remove the pre-failing drive first - worried about the stress of zeroing on that drive making it actually fail.
     3. Just bite the bullet and pull both, add the new one, do a new config, and live without parity protection for a day.
     4. I suppose if 1 or 2 fails I can always fall back to 3.
     Am I waaaaaaay overthinking this?
  16. I'm not surprised, as I'm a bit confused as well. What I really want to do is first get rid of the 2 flaky disks, hopefully while still having parity protection. I think part of my (infectious) confusion is unfamiliarity with the trust-parity procedure. I think Johnnie has helped in that regard. Adding the new, larger disk will happen soon, but it doesn't have to be at the same time. I will start with getting rid of the emulated disk, and thus regain parity protection for the other disks in the array once it's gone. That way, if anything gets FUBAR'd, I will at least have parity protection for the other disks. Then I can get to removing the failing disk and adding the new disk.

     Thank you for this; I thought I had read that the script was having trouble. Just to be extra sure: do you have to remove all files from the disk before manual zeroing, or was that just to get the script to run? (Not that I'm using the script.)

     Good info about the controllers; I will definitely look into it. I have a drive ready to become my second parity. I was waiting for the array to get into some semblance of stability before I deploy it.
  17. Thank you for taking the time to help me out. Happy Holidays! Just confirming that it is OK to zero the emulated drive (in maintenance mode) and then use the trust-parity procedure to remove the disk without borking parity. I think that's what you said above.
  18. here is a new set of diagnostics after reboot. tower-diagnostics-20171216-1752.zip
  19. I was editing above as you were replying. Thanks, I see that now.
  20. Johnnie, in the write-up you link to above, the following is in the procedure:

     "One quick way to clean a drive is to reformat it! To format an array drive, you stop the array, and then on the Main page click on the link for the drive and change the file system type to something different than it currently is, then restart the array. You will then be presented with an option to format it. Formatting a drive removes all of its data, and the parity drive is updated accordingly, so the data cannot be easily recovered."

     Is this not way quicker than writing all zeros?

     Edit: OK, I see the issue. The drive still needs to be cleared by writing all zeros, but this could save a bunch of time by not having to erase all the files first.
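     For my own peace of mind, I also ran across a way to confirm a drive really is all zeros after clearing, before pulling it from the array (treat this as unverified on my part, and again assuming the disk in question is md16):

       # cmp reads both streams byte by byte; if the only output is
       # 'cmp: EOF on /dev/md16', every byte on the disk was zero
       cmp /dev/md16 /dev/zero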
  21. So, just to reiterate, all the info on these 2 drives (#1 and #16) has been copied elsewhere. So I can reformat the failing one and then remove it from the array, I think by utilizing the trust-parity procedure; the drive should be empty after the format and won't have any effect on parity. For the red-balled drive, can you follow a similar procedure by removing all files from the (emulated) disk and then removing it from the array, trusting parity again? I'm just not sure if that works with emulated drives that are offline. I understand (mostly) your procedure using UD. However, the data is already copied elsewhere, so I don't need to retain any data currently on them. I have attached the diagnostics to this post.
     Red-balled drive: (sdv) ST3000DM001-9YN166_S1F0KX93
     Failing drive: (sdl) ST2000DM001-9YN164_S1E06XWM
     tower-diagnostics-20171216-1711.zip
  22. So I have a situation where there are 2 disks I want to remove from my array. It started the other day when I noticed 2 disks were showing lots of reallocated sectors. At the same time I had trouble shutting down the server, and when I restarted, a forced parity check was initiated. During this parity check one of the failing drives dropped, and the parity check aborted. So the current situation is one failed drive and one about to fail. I have backed up all information from both drives. I also have a pre-cleared, larger drive ready to deploy into the array. My question is: can the 2 drives be removed while keeping parity? I know I can get the failing drive that is currently mountable out of the array by emptying, or formatting, it first. But is there a way to get rid of the faulty drive as well, while keeping parity intact? I'm extra paranoid right now about another drive failing during the normal parity build if I were to just go with a new config. I do not have dual parity yet, but that is the next thing on the list, as soon as I get the array back in shape. Some help from someone who understands the finer nuances of these kinds of procedures would be greatly appreciated.
  23. No comments here? Flip a coin? Go by age? Any guidance would be much appreciated.