mort78, posted January 19, 2022

Good evening everyone,

I was hoping someone might know what's going on with my server. I was cleaning up my study and wanted to clean underneath the server. It was running at the time, and I lifted the bottom of the computer up so that I could get at the dust.

Later that evening, I logged into the server and noticed that Unraid was warning me of read errors and that the parity disk had been disconnected from the array. This surprised me, as it is basically a new drive (3 months old). It's an Exos 8TB drive. As I didn't believe the drive was the issue, I went through the process to get that drive working again:

1. Deselected that drive as the parity
2. Restarted the array
3. Stopped the array
4. Selected the same parity drive again
5. Let the Unraid server do a self check on that parity

After about 20 hours, the parity check was done, and 0 errors were found. The server continued to run flawlessly, as always, for about another week, when suddenly the exact same thing happened again. I went through the steps above, and again, 0 errors were found. That was about 4 days ago. But it happened again tonight.

So I'm just wondering what I can do to fix this. Did I "hurt" the drive by tilting and moving the server whilst it was running, when I was cleaning underneath it? Any thoughts would be greatly appreciated.

Kind regards,
Mort
trurl, posted January 19, 2022

Moving your server could cause connection problems. Attach diagnostics to your NEXT post in this thread.
mort78 (Author), posted January 19, 2022

Good morning Trurl,

Thanks for the reply. The rebuild is still ongoing, but I thought I'd get the diagnostics now. If I need to wait until the parity is resynced, I'll post another one in about 6 hours when it's complete. Many thanks for your time.

Kind regards,
Mort

Attachment: unraid-diagnostics-20220120-0746.zip
trurl, posted January 20, 2022

SMART for parity looks OK, and the parity sync had been proceeding for several hours without further I/O errors.
mort78 (Author), posted January 20, 2022

Yes, it has 3 hours to go and so far all is fine. 0 errors on ANY drive. This is what happened the last 2 times as well: it would complete with 0 errors and run fine for several days, but then, out of the blue, it would get errors.

I'll attach the final diagnostics for your perusal once it's finished. If the errors pop up again, I'll take a diagnostic in the "error state" and attach that too.

Again, many thanks for your time! Much appreciated.
mort78 (Author), posted January 20, 2022

OK, so the parity check finished. As expected, it came out error free. I have attached a fresh diagnostic for your perusal.

Many thanks,
Mort

Attachment: unraid-diagnostics-20220120-1453.zip
trurl, posted January 20, 2022

Looks OK.

12 hours ago, mort78 said: take a diagnostic in the "error state"

Unrelated:

Jan 19 19:26:01 unRaid root: Fix Common Problems: Error: Default docker appdata location is not a cache-only share ** Ignored

Why do you have appdata all over the array? And the system share? Typically, you want the appdata and system shares on fast storage (cache) so that Docker/VM performance isn't affected by the slower parity array, and so that array disks aren't kept spun up, since these files are always open. The domains share is another one you should consider keeping entirely on fast storage if it will fit.
mort78 (Author), posted January 21, 2022

Hi Trurl,

The reason I had appdata etc. all over the array was simply that I had tried to move it to cache only but wasn't able to get it moved. That is, until today. I thought I'd have another crack at it, and I came across a post you had replied to. I realized that I too was doing it slightly wrong. I had tried using the "cache only" option, but as you mentioned, the mover can't touch the share then. I also hadn't turned Docker and VMs off in the settings before hitting the move function.

This time I followed your instructions from that thread: I set those 3 shares (appdata, domains, system) to "prefer cache", then turned off the Docker and VM functions, then hit the move function. It worked like a treat, so now all those files are on the cache disk. I ran the Fix Common Problems app, and it came back with no issues. Thanks for the tip.

So far my server hasn't shown any read errors since the parity check yesterday, so we'll see how it goes.

Many thanks again,
Mort
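A move like the one described above can be double-checked by looking at which mounts still hold files for each share. The following is only a minimal sketch, not an official Unraid tool. It assumes the usual layout in which array disks are mounted at /mnt/disk1, /mnt/disk2, and so on, and the cache pool at /mnt/cache (a pool may be named differently). The function name `share_locations` and the `mnt_root` parameter are hypothetical, chosen for illustration.

```python
from pathlib import Path


def share_locations(share, mnt_root="/mnt"):
    """Return the names of the mounts that hold files for `share`.

    Assumption: array disks are mounted as <mnt_root>/disk<N> and the
    cache pool as <mnt_root>/cache. Other mounts are ignored.
    """
    root = Path(mnt_root)
    if not root.is_dir():
        return []
    found = []
    for mount in sorted(root.iterdir()):
        name = mount.name
        is_array_disk = name.startswith("disk") and name[4:].isdigit()
        if not (is_array_disk or name == "cache"):
            continue
        share_dir = mount / share
        # Count the share only if its directory exists and is not empty.
        if share_dir.is_dir() and any(share_dir.iterdir()):
            found.append(name)
    return found


if __name__ == "__main__":
    for share in ("appdata", "domains", "system"):
        print(share, "->", share_locations(share) or "not found")
```

If the move completed, each of the three shares should be reported on `cache` only; any `disk<N>` entry in the result means some files are still on the array.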
mort78 (Author), posted January 24, 2022

Good evening,

As I expected, it happened again. I haven't touched the array since doing the parity check the other day, i.e. I haven't installed anything or messed with it. I just logged in and was greeted with the errors. I have attached the diagnostics of the current situation.

As always, I really appreciate your time in helping out.

Kind regards,
Mort

Attachment: unraid-diagnostics-20220124-1909.zip
mort78 (Author), posted January 24, 2022 (edited January 24, 2022 by mort78)

By the way, I just noticed that there have been no temperature readouts since the error messages popped up! I also noticed that the parity drive didn't change to the orange triangle. I did a simple shutdown and started the array up again, and it was back to normal. No idea why the errors are being posted in the first place!
JorgeB, posted January 24, 2022

Problem with the HBA:

Jan 24 09:33:38 unRaid kernel: mpt3sas_cm0: SAS host is non-operational !!!!

Make sure it's well seated and sufficiently cooled. You can also try a different PCIe slot if available; failing that, try a different HBA.
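Since the fault only shows up every few days, it can help to pull every occurrence of that kernel message out of a syslog and compare the timestamps. The snippet below is a rough sketch under stated assumptions: the pattern is built only from the single message quoted above, so other mpt3sas fault messages are not matched, and the name `find_hba_faults` is hypothetical.

```python
import re

# Built from the kernel message quoted in this thread. The controller
# index (cm0, cm1, ...) is allowed to vary; nothing else is matched.
HBA_FAULT = re.compile(r"mpt3sas_cm\d+: SAS host is non-operational")


def find_hba_faults(lines):
    """Return the log lines that report a non-operational SAS host."""
    return [line.rstrip("\n") for line in lines if HBA_FAULT.search(line)]


if __name__ == "__main__":
    # Both sample lines are taken from the posts in this thread.
    sample = [
        "Jan 19 19:26:01 unRaid root: Fix Common Problems: Error: "
        "Default docker appdata location is not a cache-only share ** Ignored",
        "Jan 24 09:33:38 unRaid kernel: mpt3sas_cm0: "
        "SAS host is non-operational !!!!",
    ]
    for hit in find_hba_faults(sample):
        print(hit)
```

To scan a real log, pass an open file object instead of the sample list, for example the syslog inside an extracted diagnostics zip. Because the function accepts any iterable of lines, it does not depend on where the log is stored.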
mort78 (Author), posted January 24, 2022

Hi JorgeB,

Many thanks for the reply. Much appreciated.

Hmm, I did clean this entire computer out about 3 weeks ago, using a portable air blower to get rid of all the dust collected inside the case. Perhaps the card got slightly dislodged during the cleaning. It's weird that the error only pops up every week or so, and not more frequently.

I'll open the case up tomorrow and give everything the once-over. I'll report back what I find.

Many thanks,
Mort
trurl, posted January 24, 2022

2 hours ago, mort78 said: no temperature readout

Normal with a spun-down drive.
mort78 (Author), posted January 24, 2022

Hi Trurl,

I wasn't aware of that! I hadn't noticed it before. Just as well I'm not an investigator.

Regards,
Mort