
Server getting very slow and hanging with bad drive


RockDawg


I've been having an issue this week where the server really started acting up.  Wednesday night I was ripping some content to the server and my wife was watching some content on the server via Kodi.  During this time my wife started experiencing buffering (which we never get).  It gradually got worse until things were unplayable.  Right around this time I started getting notifications that disk 3 was piling up "current pending sectors" and "offline uncorrectable".  It turns out that my ripping happened to be writing to that drive, so I stopped that and assumed my wife's streaming would return to normal, but it didn't.  So I tried to stop the array from the GUI and it just hung.  I tried the command line and other things, and the only way to get the system to do anything was to power it down with the server's physical power button.

 

When it rebooted things seemed to run fine, although both sector counts kept increasing.  Yesterday my new drive arrived.  Because of the situation I didn't bother with a preclear; I just replaced the bad drive with the new drive and let unRAID do its thing.  Everything seemed fine until sometime in the middle of the night.  I wake up this morning and Kodi can't play anything.  If it even tries, it buffers almost instantly.  I go to the unRAID GUI and see the data rebuild is going at something like 52KB/s.  Once again, unRAID seems somewhat unresponsive.  I can click on different tabs okay, but some, like the system log, just open an empty window that never fills.  I also try to click on the attributes for the disk in the dashboard, but only the first few lines show and nothing else.

 

So again I try to stop the array.  Right away it hangs at "spinning up drives" even though none of my drives are spun down.  I decide to leave it and it eventually goes to "shutting down processes" and sits there for a good while.  All told it must have taken about 15 minutes or so to shut down.  I power it back up and it automatically starts a data rebuild again and it's at ~50MB/s.  I refresh the page and it's down to 30.  Refresh again and it's down to 25.  So I just cancelled the rebuild.

 

I have had drives go bad in the past and it never made the system slow or unresponsive or affected any streaming.  Now it's happened twice, and one of those times was with a new drive (although maybe a faulty one).  I just want to make sure there isn't something more at work here than two faulty drives, since my system never acts like this.

 

 

wopr-diagnostics-20180421-1050.zip

syslog-20180421-105808.txt

Link to comment
11 minutes ago, RockDawg said:

I refresh the page and it's down to 30.  Refresh again and it's down to 25

Was that 25MB/s? There are some reports of low write performance with Seagate ST8000DM004 drives, so canceling might have been premature. There are no disk-related errors in the log. Try the rebuild again, but if it drops to KB/s speeds, grab and post new diags.
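To put the MB/s versus KB/s difference in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes an 8 TB disk (the nominal capacity of the ST8000DM004, not something confirmed from the diags) and a constant speed, which a real rebuild will not hold. `rebuild_hours` is just an illustrative helper name.

```python
def rebuild_hours(capacity_bytes, speed_bytes_per_s):
    """Hours needed to write the whole disk at a constant speed."""
    return capacity_bytes / speed_bytes_per_s / 3600


# Assumption: 8 TB (decimal) disk; speeds are the ones mentioned in this thread.
CAPACITY = 8 * 10**12

for label, speed in [("50 MB/s", 50e6), ("25 MB/s", 25e6), ("52 KB/s", 52e3)]:
    print(f"{label}: about {rebuild_hours(CAPACITY, speed):,.0f} hours")
```

Under those assumptions, 25 to 50 MB/s means a rebuild measured in days, while 52 KB/s would take years, so only the KB/s case points to something actually being wrong.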

Link to comment
18 hours ago, RockDawg said:

Those logs should have been from when the write speed was into KB/s and everything was really slow.

OK, still no disk-related errors. There is one strange thing: there are simultaneous reads and writes on the parity disk. All other disks look normal, i.e., writes only on the disk being rebuilt and reads only on the remaining data disks. Parity should also see reads only during a rebuild. I can't think of what could be causing the writes, but it could explain the low speed.

 

Start the array, cancel the rebuild, and check on the Main page whether disk activity stops on all disks. If it does, restart the rebuild and check disk activity again: there should be writes only on the disk being rebuilt and reads only on all remaining disks.
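If you would rather check this from the command line than watch the counters on the Main page, a rough sketch like the one below compares two snapshots of `/proc/diskstats` (a standard Linux file) and labels each device. The function names are made up for illustration, the output also includes partitions and loop devices, and mapping `sdX` names to array slots is left to you.

```python
def parse_diskstats(text):
    """Return {device: (sectors_read, sectors_written)} from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or malformed lines
        # Layout: major, minor, name, then counters; index 5 is sectors read,
        # index 9 is sectors written.
        stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def classify(before, after):
    """Label what each device did between two snapshots."""
    result = {}
    for name, (r1, w1) in after.items():
        if name not in before:
            continue
        r0, w0 = before[name]
        reads, writes = r1 - r0, w1 - w0
        if reads and writes:
            result[name] = "reads+writes"
        elif reads:
            result[name] = "reads only"
        elif writes:
            result[name] = "writes only"
        else:
            result[name] = "idle"
    return result


def snapshot(path="/proc/diskstats"):
    """Read and parse the current counters (Linux only)."""
    with open(path) as f:
        return parse_diskstats(f.read())
```

To use it, call `snapshot()`, wait a few seconds, call it again, and pass both results to `classify()`. During a rebuild you would expect "writes only" on the disk being rebuilt and "reads only" everywhere else, parity included.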

Link to comment

My concern was not with the low write speeds; it was the slowdown of the server, i.e. severely buffering streaming, some webGUI pages not loading, and the system either unable to be shut down or taking an extremely long time to do so.  This happened both when the system had a bad data drive and then again when the replacement drive was rebuilding.  The low write speeds of that drive (KB/s) made me think maybe it also happened to be bad.  My main concern is why the system started performing so poorly in both instances.  Like I said, I have had bad drives in the array before and never experienced an issue like that.  The simultaneous reads and writes to the parity were probably because something was being written to the array during that time.

 

Anyway, I purchased a second drive and replaced the questionable drive and it is just now finishing the rebuild.  So it's been okay so far.  When it's complete I'll run preclear on the questionable drive and see what comes back.

Link to comment
15 hours ago, RockDawg said:

The simultaneous reads and writes to the parity were probably because something was being written to the array during that time.

Only if you were writing to the disk being rebuilt, as there were no writes to other disks. That could explain the low read/write performance, since both parity and the disk being rebuilt use shingled recording and might hit a wall when doing that.

Link to comment

Archived

This topic is now archived and is closed to further replies.
