[Solved] Please HELP! I think i broke my server


ttttubby

Ok, let me start at the beginning.

7-8 years ago I built an Unraid server using an ASRock H77 Pro4/MVP motherboard and an Intel Pentium G630T CPU with 8GB of RAM. I currently have 8 drives in my array (9 with the parity drive) and a 500GB cache drive that is at least 9 years old (maybe 10). I've never had a drive failure, though I did remove a 1TB drive about 6 months back because I was running out of SATA connectors between the ones on the motherboard and the two extras from a SYBA Sil3132 PCI-E to SATA II controller card that I installed. Around mid-September, Unraid became unresponsive and I narrowed it down to the SYBA expansion card, which had always been a bit wonky, requiring me to physically reseat the SATA connectors any time the server was powered down in order for its drives to be recognized. I replaced the expansion card with another of the same model and everything seemed to be working again, and the SATA connectors no longer needed to be reseated. At this time I was running Unraid 6.5.1.

 

Fast forward to a couple of weeks ago (early October) and I started noticing that my Plex docker wasn't working properly. One of my libraries seemed to have become corrupted, and I couldn't delete it or refresh it from within the Plex web GUI. Not a big deal, because I didn't use Plex that often, so I put it in the back of my mind and didn't let it bother me. Then, about a week later, I noticed that the mover didn't seem to be moving my downloads from the cache drive to the array on schedule. It was supposed to do so nightly, but I started to notice downloads hanging out on the cache drive a week after they downloaded. As one of the uses for my server is to archive photos for my wife's photography business, this worried me, since anything that stayed on the cache was unprotected. I could invoke the mover manually, and everything seemed to work, but this was a more immediate problem than the one facing Plex.

 

Yesterday (10/25) I decided to see if I could figure out the problems. I started by rebooting the server (which may or may not have fixed the mover issue) and then proceeded to update to Unraid 6.6.3. It was at this time that I started checking the logs and noticed some pretty significant errors (which I can't really make heads or tails of). I wanted to reboot again last night, but I downloaded the diagnostics file and the syslog before doing so. See attachments to this post.

 

After I rebooted, I noticed btrfs errors which, through Google, I was led to believe meant that the problem was with my docker image. So I thought maybe, if I deleted my docker image and re-pulled my docker apps, I might be able to fix my problem. Not so. I think I may have made my problem worse. I'll respond to this post to explain why and to upload today's diagnostics/syslog.

 

tower-diagnostics-20181025-2343.zip

tower-syslog-20181025-2358.zip

So I deleted my docker image and started to re-pull my docker apps (using my user templates). CouchPotato pulled fine. So far so good. SABnzbd, on the other hand, was a disaster. I started getting the following error:

Error: failed to register layer: ApplyLayer exit status 1 stdout: stderr: unexpected EOF 

 

I looked this up and the consensus was to try again.  This time it seemed to get a little bit further, but then it spat out

Error: failed to register layer: ApplyLayer exit status 1 stdout: stderr: archive/tar: invalid tar header

 

It was then that I looked at my system log and I was getting a ton of CPU errors like the following:

Tower kernel: CPU: 1 PID: 13982 Comm: udevadm Tainted: G B 4.18.15-unRAID #1

 

Followed by a stream of: 

Tower kernel: swap_info_get: Bad swap file entry 3ffffbdffffff

 

and

Tower kernel: swap_info_get: Bad swap file entry 3fffffdffffff
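
A side note on those two values: as far as I can tell they differ in only one bit, which can be checked by XOR-ing them. The snippet below is just a quick sketch using the two hex strings copied from the log lines above; the variable names are my own. A one-bit difference would be consistent with a flipped bit in memory, though on its own it does not prove a hardware fault.

```python
# Hex values copied from the two "Bad swap file entry" lines above.
entry_a = int("3ffffbdffffff", 16)
entry_b = int("3fffffdffffff", 16)

# XOR leaves a 1 only in the bit positions where the two values differ.
diff = entry_a ^ entry_b

# Collect the index (0 = least significant) of every differing bit.
flipped_bits = [i for i in range(diff.bit_length()) if (diff >> i) & 1]

print(hex(diff), flipped_bits)  # prints: 0x40000000 [30]
```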

 

I've attached my diagnostics file and syslog file from today, which includes these new errors.
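
For anyone who wants to count how often messages like these show up in their own syslog, here is a rough sketch. The function name `tally_signatures` and the two patterns are purely illustrative choices based on the lines quoted above, not anything provided by Unraid, and the sample list simply reuses those quoted lines instead of reading a real file.

```python
import re
from collections import Counter

# Illustrative patterns only, based on the messages quoted in this post.
SIGNATURES = {
    "bad_swap_entry": re.compile(r"swap_info_get: Bad swap file entry"),
    "tainted_kernel": re.compile(r"Tainted: "),
}

def tally_signatures(lines):
    """Count how many lines match each named pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts

# Sample input: the three log lines quoted earlier in this post.
sample = [
    "Tower kernel: CPU: 1 PID: 13982 Comm: udevadm Tainted: G B 4.18.15-unRAID #1",
    "Tower kernel: swap_info_get: Bad swap file entry 3ffffbdffffff",
    "Tower kernel: swap_info_get: Bad swap file entry 3fffffdffffff",
]
counts = tally_signatures(sample)
print(dict(counts))
```

To run it against a real log, the same function could be passed an open file object instead of the sample list, since it only iterates over lines.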

 

Right now my data on the array seems safe, but I'm worried that my cache drive might be done (despite the fact that my main page says no errors). I'm also worried that something might be bad with my CPU and/or motherboard. Maybe it's my SATA cables?

 

Please help.

tower-diagnostics-20181026-1048.zip

tower-syslog-20181026-1049.zip

When I tried to recreate my docker image I tinkered with the settings a bit, moving the image into a /cache/system/docker/ folder and bumping the image size from 30GB to 50GB. Just now I deleted that image again, set it back to 30GB, and placed it back in /cache/, and I was able to pull my dockers for SABnzbd, CouchPotato, Sick Beard, Dropbox, and Krusader again without any errors. So that stuff is working again, thankfully. I'm not going to bother with Plex for now, though, and I'd still very much like to get to the bottom of this.

Ok,

 

It looks like I found the bad stick of RAM. How much of a performance impact would I see if I dropped down from 8GB to 4GB on my server? Does it matter that I'm dropping from dual channel to single? It just so happens that I have another couple of sticks of 4GB DDR3. Should I replace the one still in there? Add them for a total of 12GB?

 

 

Secondary question,

 

What kind of performance difference would I see if I upgraded my Pentium dual core (basically a Celeron replacement) to a 3rd gen Core i5 quad core? I'm thinking about buying one used on eBay and I'm wondering if it's worth it for improved docker performance (for things like Plex in particular).

 

Archived

This topic is now archived and is closed to further replies.
