PrisonMike Posted January 23

Hello, unfortunately I'm writing because I'm having several problems with my Unraid server. About a week ago I checked the GUI and noticed the CPU was pinned at 100%. I waited a day and checked again; it was still at 100%. After a reboot I noticed the CPU was not pinned while the Docker service was stopped, so I concluded the problem was probably with a Docker container. I updated the containers that needed updating, but the problem persisted. In addition, I could no longer access some of my containers, such as Sonarr, Radarr, and Prowlarr, while others like Audiobookshelf and Mealie worked fine.

I then updated Unraid from 6.9.2 to 6.11.5, which changed nothing. I normally run Unraid with a static IP, but I noticed it was reporting a second, different MAC address to my router (UniFi) while still using the same IP (192.168.1.5), so two MAC addresses were effectively both trying to use 192.168.1.5. I switched Unraid to DHCP and reserved 192.168.1.5 for the original MAC address that I know belongs to the adapter. That fixed the address conflict, but I still could not reach the containers I needed.

Next I deleted one container and tried to reinstall it with Unraid backup/restore. When that failed, I resorted to deleting the Docker vdisk and restarting the server, after which I got an error that the Docker service could not be started. Some posts said this error appears when the vdisk is too full, so I increased its size from 50 GB to 100 GB. That didn't work either. I deleted the (btrfs) vdisk several times and power-cycled the server to no avail, and switching the vdisk to XFS didn't help.
As a final Hail Mary I rolled the server back to 6.9.2 via the GUI, and I am still stuck; deleting the vdisk and switching to XFS again made no difference. So I guess the first issue is figuring out why the Docker service won't start. Once that's resolved, I can see what was pinning my CPU, if it even is Docker. Here is a link to the diagnostics: https://www.mediafire.com/file/n2gh16l3fb53xbu/poseidon-diagnostics-20230121-1955.zip/file
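For anyone hitting the same wall, a minimal sketch of how to dig the actual Docker start-up error out of the syslog from a terminal. The paths here are stock Unraid defaults and are assumptions — verify the image location under Settings → Docker before relying on them:

```shell
# Check that the vdisk actually exists at the configured path and how
# big it is (default Unraid location -- adjust if yours differs):
ls -lh /mnt/user/system/docker/docker.img

# The GUI's "Docker Service failed to start" banner hides the real
# error; the syslog usually records it:
grep -i docker /var/log/syslog | tail -n 20
```

In this thread the syslog turned out to be the key evidence, so checking it before recreating the vdisk again would have saved several rebuild cycles.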
PrisonMike Posted January 25

Anyone have any suggestions? Did I post the wrong diagnostics? I'm really not sure what to do at this point. Is my server even salvageable?
trurl Posted January 25

Attach diagnostics to your NEXT post in this thread.
PrisonMike Posted January 27

Hello, here is a copy of my diagnostics attached to this post. poseidon-diagnostics-20230127-1130.zip
trurl Posted January 27

Your system share has files on the array. Ideally, the appdata, domains, and system shares should be on a fast pool (cache) and set to stay there, so Docker/VM performance isn't impacted by slower parity writes, and so array disks can spin down, since these files are always open. You can worry about that later, though.

You have completely filled log space, so nothing has been logged in a few days. But the logs you do have show problems communicating with the cache. Do you not see errors in Main → Pool Devices?

Shut down, then check all connections, SATA and power, at both ends, including splitters. A reboot will clear the logs.
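The log-space point above can be checked from a terminal in seconds. This is a generic sketch; the only Unraid-specific assumption is that /var/log is a small tmpfs that syslog stops writing to once full:

```shell
# How full is the log filesystem? At 100% use, nothing new gets logged.
df -h /var/log

# Which files are eating the space? Repeated device errors usually make
# syslog itself the biggest offender.
du -sm /var/log/* 2>/dev/null | sort -rn | head
```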
PrisonMike Posted January 31

Hello @trurl, thank you for your response. I do not see any errors in Pool Devices. It looks like my log file keeps filling up after a few days. Here is a diagnostics file, requested about 8 hours after a reboot. poseidon-diagnostics-20230130-2030.zip
trurl Posted January 31

Cache2 might be going bad. Run an extended SMART self-test on cache2.
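The same test can be started from a terminal with smartctl, which ships with Unraid. The device name /dev/sdX below is a placeholder — look up the real one under Main before running anything:

```shell
# Kick off an extended (long) offline self-test; it runs inside the
# drive's firmware in the background and can take hours.
smartctl -t long /dev/sdX

# Once it's done, read back the self-test log:
smartctl -l selftest /dev/sdX

# Pull just the last column (LBA of first error) from the most recent
# extended test result:
smartctl -l selftest /dev/sdX | awk '/Extended offline/ {print $NF}'
```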
PrisonMike Posted February 3

Hello, after trying to run an extended SMART test on cache2 I get the following error: "Errors occurred - Check SMART report". Please find the SMART report attached. Thanks for your assistance. poseidon-smart-20230202-1944.zip
trurl Posted February 3

Num  Test_Description   Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline   Completed: read failure  00%        8922             830987072

Replace the drive.
PrisonMike Posted February 6

On 2/2/2023 at 8:26 PM, trurl said:
    # 1  Extended offline  Completed: read failure  00%  8922  830987072 — replace

Hello, thanks for the help! Since I have a pool of two cache drives, can I remove the bad one and run the server off a single cache drive until I can get a new drive?
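Whether a pool can run degraded on one drive depends on its btrfs profile, which is easy to inspect before pulling anything. A sketch only — /mnt/cache is the default Unraid pool mount point (an assumption; adjust to your pool name), and these commands change nothing:

```shell
# A two-device Unraid pool defaults to btrfs RAID1, which tolerates the
# loss of one member; a "single" profile does not.
btrfs filesystem df /mnt/cache    # look for RAID1 on the Data/Metadata lines
btrfs filesystem show /mnt/cache  # lists the member devices and usage
```

If the profile is RAID1, the usual Unraid route is to stop the array, unassign the failing device from the pool, and start the array again; still, back up appdata first if you can.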