Disk spin up woes


lbosley


For several months now I have experienced problems with drive spin-up on my unRaid array.  I am hoping to get some support from other users who may be experiencing some of these same symptoms.  Considering the intermittent nature of this problem, it is difficult to pinpoint exactly when these crazy spin-ups started for me, but I believe it goes back to the 6.2.x releases or earlier.  Apologies for the length of this report.

 

My config:

unRaid 6.3.2 w/ 20 drives, WD Red unless noted (12x 4TB, 6x 6TB, 1x 8TB parity, 1x 2TB Hitachi cache drive)

SuperMicro X10SLH-F Motherboard w/8GB Kingston memory (upgraded to 16GB yesterday)

Using on-board SATA for 6-drives

SuperMicro SAS-2LP SAS/SATA controller (8-drives)

LSI 9207-8e SAS controller (6-drives in external enclosure)

CyberPower CP1500 UPS

 

Software:

I run a handful of basic plug-ins (which were all removed during my extensive troubleshooting of this problem).  These include Unassigned Devices, Cache_Dirs, Recycle Bin, Tips and Tweaks, Preclear, Active Streams, System Stats, and UPS.  I also installed the Lime Tech Plex docker, but only in the past couple of weeks.  I have one share in play – Movies.  The share spans all of the disks and is cached.  I have around 3,100 folders, containing approximately 160,000 files.  My allocation method is high-water, and all but one disk is full.  The disks are set for 30-minute spin-down and I do not use spin-up groups.

               

The issue:

I am randomly experiencing delays associated with disk spin-up when performing rather straightforward operations – such as creating a file/folder, playing a movie from the array, or running the Mover.  Probably 90% of the time the system works as I expect: apart from opening a file, browsing and file creation work without delay.  Yet the very next operation may make me wait while disks are spun up one after another.  This happens not just with Windows SMB operations, but also when doing little more than poking around the GUI, where the interface may suddenly lock up while disks spin up.  This GUI locking and spin-up would ALWAYS happen when checking dlandon’s File Activity monitor, which was first introduced for this issue.  The logs routinely show most or all of my disks being spun up in the middle of the night when the Mover is running.  I primarily work from a Windows 7 SP1 workstation to upload content to unRaid.  My other client machine is a Windows 10 Kodi workstation.  I should also mention that my cache disk is set to never spin down (location of Plex docker files).

 

I admit it is possible that my system was already spinning up drives unexpectedly without my knowledge.  But I can say with conviction that it never caused much of a delay in my operations.  I was accustomed to browsing into a folder that wasn’t cached and having to wait as the associated disk spun up.  Now these delays seem to be frustratingly longer as multiple disks are brought up one at a time.

 

I opened an SSH session to my array last night and issued a basic find command to enumerate my Movies directory (basically, doing what Cache_Dirs does).  This is the only folder set to cache in Cache_Dirs.  When the array was first started last night the find command initially took 20-30 seconds to complete.  Subsequent (cached) runs would finish in about 4 seconds – even with all disks manually spun down.  I re-ran the command several times as I created folders and fired the Mover to see what would happen.  Most times the find operation would zip right through, but occasionally it would stall as a disk spun up.  On two of the tests the Mover spun up an extra drive.  Later I fired up a movie from my Kodi machine.  It spun for the usual 5 seconds as the associated disk was spun up.  Then I watched the movie sputter for the next couple of minutes, realizing that my array was busy spinning up several other disks.
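For anyone who wants to repeat this kind of test, here is a minimal Python sketch of the same idea: walk the whole share and time it, so a slow run (directory entries read from disk) can be told apart from a fast, cached one. It is not the exact find command used above, which was not posted. The path /mnt/user/Movies and the function name walk_and_time are assumptions for illustration, and it assumes Python is available on the server.

```python
import os
import time


def walk_and_time(root):
    """Enumerate every entry under root (roughly what `find root` does)
    and return (entry_count, elapsed_seconds)."""
    start = time.monotonic()
    entries = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Count names only; file contents are never opened.
        entries += len(dirnames) + len(filenames)
    return entries, time.monotonic() - start


# Example call (assumed share path, adjust to your own setup):
#   count, seconds = walk_and_time("/mnt/user/Movies")
#   print(f"{count} entries in {seconds:.1f}s")
```

Running it a few times in a row with the disks spun down should show whether the directory tree is still being served from memory.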

 

This morning I checked the find command and watched it run several times, completing in 4 seconds without a delay and without spinning up a drive (directory cached).  Literally one minute later I tried to create a new folder in the Movies share from my Windows machine and waited for several minutes as all but one disk spun up.  This is a perfect example of how my system has been performing.

 

I rebuilt my array configuration from scratch several weeks ago.  I also rebuilt my Windows machine, just to say that it too is clean.  I do not believe this is related to SMB or client access.  I also do not blame Cache_Dirs per se.  But I believe the directory cache is being flushed prematurely for no apparent reason.  My array has certainly grown in the past year with more files and more disks.  Maybe I’ve hit a threshold of some kind?  I have made the changes to the cache pressure and vm.dirty settings recommended by others in this forum, to no avail.  Although I had no expectations for success, I finally gave in and purchased additional RAM.
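When comparing notes on those tunables, it helps to post the values actually in effect. The sketch below reads them from the standard Linux /proc/sys/vm directory. Which vm.dirty keys were changed is not stated above, so the key list here (vfs_cache_pressure, dirty_ratio, dirty_background_ratio) is an assumption, and read_vm_settings is a hypothetical helper name.

```python
import os

# Assumed set of keys; the post does not say which vm.dirty settings were changed.
VM_KEYS = ("vfs_cache_pressure", "dirty_ratio", "dirty_background_ratio")


def read_vm_settings(base="/proc/sys/vm", keys=VM_KEYS):
    """Return {key: int value, or None if the key is missing or unreadable}."""
    settings = {}
    for key in keys:
        try:
            with open(os.path.join(base, key)) as fh:
                settings[key] = int(fh.read().strip())
        except (OSError, ValueError):
            settings[key] = None
    return settings


# Example call on the server:
#   print(read_vm_settings())
```

It only reads the values and changes nothing.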

 

I am not only looking for suggestions, but I would like to hear from others with similar configurations – large array with 15+ disks (w/ spin down enabled) and running a cache drive.  Take a look at your logs and see if your Mover operation needs to spin up more than the target disk that will be written to.  If someone thinks the behavior I am seeing is normal, please explain it to me.  Also, let me know if you are experiencing similar spin up frustration that was previously not an issue. 

 

Thanks for your input.

Link to comment

I'm guessing everyone is just shrugging their shoulders on this one.  Can't blame you, but I think there is a bug in the software or I have something behaving oddly in my hardware.

 

This evening I updated the firmware on my HBAs.  The LSI controller firmware was fairly old.  The SuperMicro controller was up to date.

 

I tested again by starting a find command for the entire shared folder to build up the directory cache.  Then I spun down all disks and created a couple of folders and files.  I continued to repeat the find command just to see if anything would spin up.  Only the cache disk spun up during the write operations and they finished fine.  Then I started the Mover from the GUI.  I watched the expected drive (disk 18) start up. Seconds later the parity disk spun up and all appeared normal.  But then a few seconds later disk 1 spun up.  Statistics showed the system had 1 read and 3 writes to disk 1.  
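The read/write counts above come from the GUI statistics. A rough way to cross-check them from the command line is to snapshot the kernel's per-device counters in /proc/diskstats before and after running the Mover and compare the two. The sketch below is generic Linux, not unRAID's own statistics code. The function names are made up for illustration, and mapping device names such as sdb to "disk 1" is left out.

```python
def parse_diskstats(text):
    """Parse /proc/diskstats text into {device: (reads_completed, writes_completed)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        # Layout: major, minor, name, reads completed, reads merged,
        # sectors read, ms reading, writes completed, ...
        stats[fields[2]] = (int(fields[3]), int(fields[7]))
    return stats


def diff_stats(before, after):
    """Return {device: (extra_reads, extra_writes)} for devices whose counters changed."""
    changed = {}
    for dev, (reads, writes) in after.items():
        old_reads, old_writes = before.get(dev, (0, 0))
        delta = (reads - old_reads, writes - old_writes)
        if delta != (0, 0):
            changed[dev] = delta
    return changed


# Example use on the server:
#   before = parse_diskstats(open("/proc/diskstats").read())
#   ... run the Mover ...
#   after = parse_diskstats(open("/proc/diskstats").read())
#   print(diff_stats(before, after))
```

Any device that shows up in the diff but is not the target disk or parity is a candidate for the unexplained spin-up.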

 

I then shut down my only Docker (Plex) and removed every installed plug-in to repeat this test.  Once again the system spun up disk 1 when the Mover rsync'ed the files from my cache disk to disk 18.  There is no new file, nor any indication of a folder or file being modified, on disk 1.

 

This is not how this system should function.  Can anyone explain what I am seeing?  Looks like a bug to me, folks.

Link to comment

There's another recent thread with a couple of users seeing unexpected spinups, and they have had ongoing issues with it.  I made some comments, but I can't remember what the thread was.  I don't have a solution.

 

The way you describe what happens is what used to happen to me, with v5 when there was heavy I/O on the server.  I didn't have much RAM, 1GB only, and because v5 was 32 bit, the buffering was limited to 'lowmem', what was left in the first gigabyte of RAM not used by the system.  CacheDirs worked fine most of the time, but if I was moving very large files fast enough, then it apparently recovered memory buffers from everywhere to handle the transfers, and I would see Disk 1 spin up, then Disk 2, then possibly more, apparently depending on the demand.  So it does seem as if it's out of the particular memory buffer the directory caching needs, even if it seems the system has plenty of RAM.

 

Any chance you have allocated too much to VMs?  And to confirm, you have CacheDirs only caching Movies, and not the User Shares (the -u option)?

Link to comment

I was one of the two people from the earlier thread.  I think you recommended the vm.dirty settings.  I only recently enabled VMs and Dockers - like within the past couple of weeks.  And yes, Cache_Dirs is only set to cache the Movies directory.  I also now believe this is not actually related to directory caching.  In my mind the test I ran this evening seems to point in another direction.  I could repeatedly complete a find command of the entire Movies directory in a matter of seconds without any spin-ups.  This tells me the directory is cached in memory.  Yet when I started the Mover process an extra disk spun up.  As I stated, I repeated this test with Cache_Dirs running, and with it (and pretty much everything else) uninstalled.

 

Maybe I have a weird hardware problem with one of my HBAs where activity on one is triggering something on the other?  It was interesting that I was finally able to reproduce this behavior.  I tried it again just now and got the same result - disk 1 spins up with a handful of reads and writes as the Mover sends a couple of folders to disk 18.  But this time when I initially wrote the files to the cache drive a couple of array disks also spun up.

Link to comment

@lbosley: I have exactly the same problem since 6.3.2 - when I open a share from Windows, it takes about 25 sec and then "all" drives are spun up - strange.

When I start the Mover it happens in a different way. unRAID starts the parity disk and the disk the file will be copied to (let's say Disk 9) AND THEN it ALWAYS starts Disk 1 and 2.

This never happened in unRAID v5, but since v6 this behavior exists and I can't really say why.

Edited by Zonediver
Link to comment
  • 3 months later...

I have this exact same problem.  No matter what testing and tweaking I do, I cannot get the cache dirs feature to work all the time. Sometimes it's fine, sometimes I get a couple of extra disks spun up, sometimes I start a movie and watch all 15 drives spin up....  It's getting super annoying!

 

PS: I never had this problem on Unraid 5.X on the same hardware.  I moved to 6.3.X and have had it ever since :-(

Link to comment
  • 4 weeks later...
  • 1 year later...

Old thread, but it seems to have not been resolved yet?  I found an issue that was causing this on my server - with less impact than what I see described here - but maybe this will help. Also, I wasn't using Cache_Dirs prior to posting this, but I just installed it through CA.

 

I found that the disks which spun up unnecessarily had empty folders from shares I had moved to other disks.  So any access to that share would spin up the unnecessary disk(s) as well as the share disks.  In terminal, I made sure that any disks not (supposed to be) associated with the share had no traces of the share's folder structure on them.  I only ever found empty folders, and I'm not sure whether they were left by accident, or created somehow after I moved the share data to other disks.  But removing the empty folders stopped the unnecessary spin-ups.
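The exact terminal commands used for this check were not posted, so here is a minimal Python sketch of the same check. For each disk it looks for the share's top-level folder and reports it if the whole tree underneath holds no files. The /mnt/diskN paths and the function name disks_with_empty_share are assumptions for illustration. The sketch only lists candidates and deletes nothing. Note that running it will itself spin up every disk it inspects.

```python
import os


def disks_with_empty_share(disk_roots, share):
    """Return the <disk>/<share> folders that exist but contain no files
    anywhere in their tree (leftover empty folder structure)."""
    leftovers = []
    for disk in disk_roots:
        top = os.path.join(disk, share)
        if not os.path.isdir(top):
            continue  # this disk has no trace of the share
        file_count = sum(len(files) for _dir, _subdirs, files in os.walk(top))
        if file_count == 0:
            leftovers.append(top)
    return leftovers


# Example call (assumed mount points, adjust the range to your disk count):
#   disks = [f"/mnt/disk{n}" for n in range(1, 19)]
#   print(disks_with_empty_share(disks, "Movies"))
```

Review anything it reports by hand before removing it.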

Link to comment
