Dynamix - V6 Plugins



So, I'm setting up unRAID on my main media server, an Asus X99 / i7-6800K / 32 GB machine.

I installed Auto Fan, but it doesn't detect any PWM controller or any fans.

Is my motherboard not supported, or am I messing something up?

System Temp is running fine on the same system.
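In case it helps narrow things down: Auto Fan needs the kernel to expose PWM fan controls under /sys/class/hwmon. A rough sketch of how to check follows; the nct6775 driver name is only an assumption for typical Asus X99 boards with a Nuvoton Super I/O chip, so let sensors-detect identify the right one for your board.

sensors-detect                         # identify the Super I/O / hwmon driver (part of lm_sensors)
modprobe nct6775                       # assumed driver for common Asus X99 Nuvoton chips
ls /sys/class/hwmon/hwmon*/pwm[0-9]    # Auto Fan can only control fans if PWM files like these exist
sensors                                # confirm fan RPM readings show up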

 


 

Link to comment
On 1/20/2018 at 12:00 AM, palmio said:

fstrim error

/sbin/fstrim -v /mnt/cache
gives me:
fstrim: /mnt/cache: FITRIM ioctl failed: Remote I/O error

Any suggestions?
I have never paid much attention to TRIM before, so I don't know whether this has always been an issue or whether it started with the recent unRAID update (as rjorgenson describes).

Running: unRAID v6.4.0
Cache: 2x Samsung 840 SSD
HBA: Flashed LSI SAS2008-8I (9211-8i)

My problem is really about stalling file transfers, but since it only happens when writing data to unRAID, not when reading it back, I suspect TRIM is the issue.
I only get 30 MB/s transfers (writes) with lots of stalls and retries, but 100 MB/s reads, and I previously had 100 MB/s writes as well (FTP over a 1 Gbit network).

 

On 1/18/2018 at 5:14 PM, rjorgenson said:

Since upgrading to 6.4.0 a couple of days ago, my TRIM job has failed every day with


fstrim: /mnt/cache: FITRIM ioctl failed: Remote I/O error

Did something change in the update? The plugin version is 2017.04.23a and has been since before the update. I received this error previously with the SSDs in my cache pool because my HBA didn't support some flag; I replaced the HBA with a newer model and TRIM worked fine until I updated unRAID. If it's relevant, I hadn't rebooted since installing the HBA until the update was installed.

 

Same error here. It didn't occur in 6.3.5, but it does now in 6.4.1.
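For anyone else hitting this, a quick way to see whether discard/TRIM is being reported through your controller at all is sketched below; /dev/sdX is a placeholder for your cache device.

lsblk --discard /dev/sdX             # non-zero DISC-GRAN / DISC-MAX means the kernel will allow discard on it
hdparm -I /dev/sdX | grep -i trim    # shows whether the drive itself advertises TRIM support

Note that some SAS HBAs don't pass the required commands through even when the drive itself supports TRIM.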

Link to comment
On 20/1/2018 at 9:00 AM, palmio said:

fstrim error

/sbin/fstrim -v /mnt/cache
gives me:
fstrim: /mnt/cache: FITRIM ioctl failed: Remote I/O error

Any suggestions?
I have never paid much attention to TRIM before, so I don't know whether this has always been an issue or whether it started with the recent unRAID update (as rjorgenson describes).

Running: unRAID v6.4.0
Cache: 2x Samsung 840 SSD
HBA: Flashed LSI SAS2008-8I (9211-8i)

My problem is really about stalling file transfers, but since it only happens when writing data to unRAID, not when reading it back, I suspect TRIM is the issue.
I only get 30 MB/s transfers (writes) with lots of stalls and retries, but 100 MB/s reads, and I previously had 100 MB/s writes as well (FTP over a 1 Gbit network).



Someone suggested that my controller probably did not support TRIM and that I should connect the cache drives directly to the motherboard.
I did that today and ran TRIM successfully.

HUGE difference in transfers now.
Before, a simple FTP transfer would stall after a few seconds, sometimes to the point that FTP timed out and the transfer had to be resumed, giving me annoying prompts about overwriting files.

Just did a 200 GB transfer with minimal stalls (all less than 4 seconds)
(probably Network Buffer -> SSD cache transfers)

My point: if you are using an SSD as cache, you need TRIM; it really makes a big difference.
I am very happy.

Connect the cache drives directly to the motherboard if possible.
(I did not do any preparation; I just unplugged the drives, connected them directly to the motherboard, and unRAID found them without any reconfiguration.)

 

BTW: I am now running v6.4.1 (no TRIM issues from the command prompt).

My system:
X99 MB in Norco 4224 cabinet, Intel i7, 64GB DDR4
4x 500GB Samsung SSD
14x 8TB Western Digital Gold / Seagate Ironwolf

Edited by palmio
Link to comment
5 hours ago, maitaijim said:

Sleep is no longer working for me after the latest update. I'm running 6.4.0 on an Asus P6T Deluxe V2 with an i7 920. Is there a way to revert to the previous version? 

I'm encountering this issue too. I thought it was working OK after the recent sleep plugin update, but now (I'm on 6.4.1) I'm repeatedly finding that the plugin thinks my SSD cache drive is active when in fact it is not, and so it fails to initiate the sleep sequence when it should.

 

However, if I "spin-up" and then "spin-down" the cache drive from Dynamix Main tab , the sleep plugin will then recognize that the cache drive is inactive and proceeds with putting the server to sleep.

Tue Feb 13 09:54:04 EST 2018: Wake-up now
Tue Feb 13 09:54:04 EST 2018: System woken-up. Reset timers
Tue Feb 13 09:55:04 EST 2018: Disk activity on going: sdb
Tue Feb 13 09:55:04 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 09:56:04 EST 2018: Disk activity on going: sdg
Tue Feb 13 09:56:04 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 09:57:04 EST 2018: Disk activity on going: sdg
Tue Feb 13 09:57:04 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 09:58:04 EST 2018: Disk activity on going: sdg
Tue Feb 13 09:58:04 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 09:59:04 EST 2018: Disk activity on going: sdg
Tue Feb 13 09:59:04 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 10:00:04 EST 2018: Disk activity on going: sdg
Tue Feb 13 10:00:04 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 10:01:04 EST 2018: Disk activity on going: sdg
Tue Feb 13 10:01:04 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 10:02:04 EST 2018: All monitored HDDs are spun down
Tue Feb 13 10:02:04 EST 2018: Extra delay period running: 30 minute(s)
Tue Feb 13 10:03:04 EST 2018: All monitored HDDs are spun down
Tue Feb 13 10:03:04 EST 2018: Extra delay period running: 29 minute(s)

Note: at ~10:00-10:01+ sdg (cache) is shown as inactive per Dynamix Main. After the "spin-up" / "spin-down" commands were executed, sleep.log correctly reports "All monitored HDDs are spun down".
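One way to cross-check what the GUI reports from the console, assuming the cache is sdg as in the log above; neither command should wake the drive:

hdparm -C /dev/sdg                 # reports "active/idle" or "standby" without spinning the drive up
smartctl -i -n standby /dev/sdg    # -n standby skips the SMART query entirely if the drive is in standby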

 

We had an issue several years back where the sleep plugin was losing track of whether drives were awake. An after-wake-up command string was inserted to address the issue. I still had this string active when I noticed this recent problem. I have now removed it to see whether it makes any difference, but it does not appear to.

 

/usr/bin/wget  -q  -O  -  localhost/update.htm?cmdSpinupAll=true >/dev/null

It seems like something similar is going on again, but possibly only related to SSDs or the cache?

Edited by lewcass
Link to comment
16 minutes ago, bonienl said:

/usr/bin/wget -q -O - localhost/update.htm?cmdSpinupAll=true >/dev/null

Since unRAID 6.1, security enhancements are in place which prohibit direct commands like the one above unless the CSRF token is included.

OK, but I don't know what a CSRF token is or how to include one.

 

Does it even relate to solving the issue we're reporting?
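For reference, a commonly suggested way on 6.4.x to include the token looks like the sketch below. The file and variable name are assumptions based on where emhttp typically exposes it, so check /var/local/emhttp/var.ini on your own system first.

csrf=$(grep csrf_token /var/local/emhttp/var.ini | cut -d'"' -f2)    # read the token from emhttp's var.ini (assumed location)
/usr/bin/wget -q -O - "localhost/update.htm?cmdSpinupAll=true&csrf_token=${csrf}" >/dev/null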

 

Edited by lewcass
Link to comment

 

36 minutes ago, lewcass said:

the plugin thinks that my SSD cache drive is active, when in fact it is not,

Not that I use this plugin, but I will say that if you have your docker.img file (if you use Docker applications) stored on your cache drive, then it's pretty much a given that at any point in time your cache drive is active.

Link to comment
4 minutes ago, Squid said:

 

Not that I use this plugin, but I will say that if you have your docker.img file (if you use Docker applications) stored on your cache drive, then it's pretty much a given that at any point in time your cache drive is active.

Docker is disabled. I am not currently using Docker or VMs. The cache drive is not active when this issue with the sleep plugin occurs, at least not according to the Dynamix Main tab or Dashboard.

Edited by lewcass
Link to comment
3 hours ago, lewcass said:

However, if I "spin-up" and then "spin-down" the cache drive from Dynamix Main tab , the sleep plugin will then recognize that the cache drive is inactive and proceeds with putting the server to sleep.

 

The debug log shows the cache disk as still being spun up. In other words, something previously caused the cache disk to spin up, and it will only spin down when the 'regular' spin-down time is reached. This usually happens when an application is accessing the cache disk and prevents the spin-down timer from kicking into action.

 

Link to comment

OK. I did a little testing, and it appears that the sleep plugin's confusion about the status of the cache drive may be related to my having two hot spares online (but not in the array).

 

I just rebooted with the hot spares removed from the drive cage and tested sleep. This time the plugin did not report the cache drive as active when Dynamix showed it was not, and the sleep script proceeded as it should.

 

#############################################

 

Here are the earlier sleep plugin drive-monitoring parameters with the hot spares installed.

Feb  9 13:14:12 Tower s3_sleep: ----------------------------------------------
Feb  9 13:14:12 Tower s3_sleep: included disks=sdb sdc sdd sdg
Feb  9 13:14:12 Tower s3_sleep: excluded disks=sda sde sdf

i.e., the failure mode, where sleep.log incorrectly shows the cache drive (sdg) as active.

 

I should note here that Dynamix always shows the hot spares, sde and sdf, as inactive.

 

#############################################

 

Now the drive-monitoring parameters while testing without the hot spares connected.

 

Feb 13 12:41:41 Tower s3_sleep: ----------------------------------------------
Feb 13 12:41:41 Tower s3_sleep: included disks=sdb sdc sdd sde
Feb 13 12:41:41 Tower s3_sleep: excluded disks=sda

 

The cache drive in this case is sde, which sleep.log appropriately shows as inactive, and sleep proceeds correctly.

 

############################################

 

I'll have to see if this now works consistently, but at first pass it seems likely to be related.

Link to comment
9 minutes ago, bonienl said:

 

The debug log shows the cache disk as still being spun up. In other words, something previously caused the cache disk to spin up, and it will only spin down when the 'regular' spin-down time is reached. This usually happens when an application is accessing the cache disk and prevents the spin-down timer from kicking into action.

 

Yes, but it was consistently showing it as spun up for hours while the Dynamix GUI was continually showing it as inactive. Please consider my previous post and whether having two additional disks connected but not in the array may be causing the plugin to lose track of the status of the final (cache) disk.

 

Thanks.

Edited by lewcass
Link to comment
2 minutes ago, lewcass said:

Yes, but it was consistently showing it as spun up for hours while the Dynamix GUI was continually showing it as inactive. Please consider my previous post and whether having two additional disks connected but not in the array may be causing the plugin to lose track of the status of the final (cache) disk.

 

Thanks.

 

Do you have your cache disk directly connected to an onboard SATA port, or are you using a controller card?

 

Link to comment
3 minutes ago, lewcass said:

All disks/drives are connected to onboard SATA.

 

Ok.

The default value for the Tunable (poll_attributes) setting is 1800 seconds (= 30 minutes). This is a relatively long interval for polling SMART information and updating the disk activity state. You could lower it to 3 to 5 minutes to make it more 'real-time'.

Note: a shorter polling time may affect normal disk operation.
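Independent of the poll interval, you can also watch the raw kernel I/O counters for the cache device to see whether anything is really touching it. A rough sketch, with sdg assumed from the earlier log; if the read/write counts stop changing between samples, the kernel sees no I/O on the device:

while true; do
  echo -n "$(date +%T)  "
  awk '$3 == "sdg" {print "reads:", $4, "writes:", $8}' /proc/diskstats
  sleep 60
done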

Link to comment
3 minutes ago, bonienl said:

 

Ok.

The default value for the Tunable (poll_attributes) setting is 1800 seconds (= 30 minutes). This is a relatively long interval for polling SMART information and updating the disk activity state. You could lower it to 3 to 5 minutes to make it more 'real-time'.

Note: a shorter polling time may affect normal disk operation.

Would that parameter setting explain why the Dynamix GUI has been showing the cache as inactive at the same time the sleep plugin was logging "Disk activity on going"?

Link to comment

OK, I'll give it a shot. I'm dubious, though, because the plugin was consistently logging the cache as active for hours at a time (like all night and then some) when nothing should have been happening, so it's difficult to understand why polling every thirty minutes would not have been sufficient to update the status correctly.

Link to comment
5 hours ago, bonienl said:

 

Ok.

The default value for the Tunable (poll_attributes) setting is 1800 seconds (= 30 minutes). This is a relatively long interval for polling SMART information and updating the disk activity state. You could lower it to 3 to 5 minutes to make it more 'real-time'.

Note: a shorter polling time may affect normal disk operation.

Set the Tunable (poll_attributes) to 180 (~3 minutes). Shut down. Reinstalled the hot-spare disks in the cage. Rebooted. Left the server to do its thing (idling). Drives spun down. Sleep plugin script did its thing. Server went to sleep.

 

However, I then woke the server with WOL and played media for an hour using only one disk. Left the server idle. The disk that was in use spun down.

 

Then same issue as before.

 

Tue Feb 13 17:43:10 EST 2018: Disk activity on going: sdd
Tue Feb 13 17:43:10 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 17:44:10 EST 2018: Disk activity on going: sdd
Tue Feb 13 17:44:10 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 17:45:10 EST 2018: Disk activity on going: sdd
Tue Feb 13 17:45:10 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 17:46:10 EST 2018: Disk activity on going: sdd
Tue Feb 13 17:46:10 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 17:47:10 EST 2018: Disk activity on going: sdg
Tue Feb 13 17:47:10 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 17:48:10 EST 2018: Disk activity on going: sdg
Tue Feb 13 17:48:10 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 17:49:10 EST 2018: Disk activity on going: sdg
Tue Feb 13 17:49:10 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 17:50:10 EST 2018: Disk activity on going: sdg
Tue Feb 13 17:50:10 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 17:51:10 EST 2018: Disk activity on going: sdg
Tue Feb 13 17:51:10 EST 2018: Disk activity detected. Reset timers.
...
Tue Feb 13 19:49:16 EST 2018: Disk activity on going: sdg
Tue Feb 13 19:49:16 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 19:50:16 EST 2018: Disk activity on going: sdg
Tue Feb 13 19:50:16 EST 2018: Disk activity detected. Reset timers.
Tue Feb 13 19:51:16 EST 2018: Disk activity on going: sdg
Tue Feb 13 19:51:16 EST 2018: Disk activity detected. Reset timers.

As soon as the active array disk spun down, the sleep plugin once again began logging the cache disk (sdg) as active, and it is continuing to do so over two hours later as I write this. All this time the cache disk has been shown as inactive in the Dynamix GUI.

 

So changing the disk poll attribute has not addressed the issue.

 

Tomorrow I will remove the hot spares again and test longer to see if the sleep plugin will work consistently without non-array drives present.

 

 

Edited by lewcass
Link to comment
13 hours ago, maitaijim said:

I have it set to ignore my cache drive, so that's not my issue. The previous version of the sleep plugin functioned as expected; with the current version, sleep is never initiated.

 

Is there a way to revert to the previous version?

The previous version is not compatible with unRAID 6.4.x.

 

If you enable the debug log, you may be able to find out why the current version is not working for you. I have mine set to log to flash. You can view the log in the logs directory of your flash drive: \\TOWER\flash\logs\s3_sleep.log
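Since the flash share is mounted at /boot on the server itself, the same log can also be followed live from a console session (path assumed from the share location above):

tail -f /boot/logs/s3_sleep.log    # /boot is the flash drive, i.e. \\TOWER\flash\logs\s3_sleep.log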

 

I don't see any setting to ignore the cache drive. If that is possible, I would like to know how.

 

 

SleepCache4.JPG

Link to comment
