garycase Posted September 24, 2015

... By the way, the little green dot that indicates the drive is spun up works just fine => it's only the Temp column that's not working [it's filled with asterisks instead of temperatures].

Temperature display is controlled independently from the other disk readings on the main page and may cause status to go out of sync. To minimize sync issues, you can lower the Tunable (poll_attributes) time under disk settings (default is 30 minutes), at the expense of more disk reading (smartctl). Ultimately an improved disk spin up/down detection may be needed for the temperature reading, but this is something LT has to look into.

What's changed from v5 (and for that matter v4 before it) that made this stop working? In the past, if a disk was spun up, you saw its temp; if not, you didn't. Simple as that. The spin up/down state is clearly recognized (hence the green dot) ... and the temps are shown just fine when the array first starts. It's only if a disk spins down -- and later spins back up from some array activity -- that the temps are missing.

gfjardim Posted September 24, 2015

... By the way, the little green dot that indicates the drive is spun up works just fine => it's only the Temp column that's not working [it's filled with asterisks instead of temperatures].

Temperature display is controlled independently from the other disk readings on the main page and may cause status to go out of sync. To minimize sync issues, you can lower the Tunable (poll_attributes) time under disk settings (default is 30 minutes), at the expense of more disk reading (smartctl). Ultimately an improved disk spin up/down detection may be needed for the temperature reading, but this is something LT has to look into.

I handle this with this code:

    // TRUE unless hdparm reports the drive as being in standby.
    function is_disk_running($dev) {
      $state = trim((string) shell_exec("hdparm -C " . escapeshellarg($dev) . " 2>/dev/null | grep -c standby"));
      return ((int) $state === 0);
    }

    // Returns the cached temperature if it is less than 5 minutes old,
    // probes the drive otherwise, and returns "*" for a spun-down drive.
    function get_temp($dev) {
      $tc    = "/tmp/.hdd_temp.json";
      $temps = is_file($tc) ? json_decode(file_get_contents($tc), TRUE) : array();
      if (is_disk_running($dev)) {
        if (isset($temps[$dev]) && (time() - $temps[$dev]['timestamp']) < 300) {
          return $temps[$dev]['temp'];
        } else {
          // Quote the device name so it cannot break the shell command.
          $temp = trim((string) shell_exec("smartctl -A -d sat,12 " . escapeshellarg($dev) . " 2>/dev/null | grep -m 1 -i Temperature_Celsius | awk '{print \$10}'"));
          $temp = (is_numeric($temp)) ? $temp : "*";
          $temps[$dev] = array('timestamp' => time(), 'temp' => $temp);
          file_put_contents($tc, json_encode($temps));
          return $temp;
        }
      } else {
        return "*";
      }
    }

I haven't measured the performance penalty, but it keeps the temperature in sync with the spin status and probes for a new temperature after 5 minutes.

bonienl Posted September 24, 2015

What's changed from v5 (and for that matter v4 before it) that made this stop working? In the past, if a disk was spun up, you saw its temp; if not, you didn't. Simple as that. The spin up/down state is clearly recognized (hence the green dot) ... and the temps are shown just fine when the array first starts. It's only if a disk spins down -- and later spins back up from some array activity -- that the temps are missing.

In the older unRAID versions emhttp would query the disk SMART info each time a page change or page refresh was done in the webGUI. Hence the advice not to change or refresh webGUI pages too frequently when - for example - a parity operation is in progress, as it would interfere with the ongoing disk activity.

In unRAID v6 the temperature readings (SMART info) are done independently from webGUI page changes, and hence using the webGUI should not (and does not) interfere with disk operation. emhttp has a built-in mechanism to keep track of the disk spin up/down status, so that it knows whether temperatures need to be read or not, but as observed it can sometimes go out of sync.

garycase Posted September 24, 2015

Understood -- so which is a "better" fix ....

(1) Add the code gfjardim posted. [And, if so, HOW and WHERE do I add that? I'm a total non-Linux guy.]

or

(2) Change the Tunable (poll_attributes) time to a lower value (e.g. 5 or 10 minutes). Do I correctly understand that this should then make the temperatures "appear" after whatever value this is set to? Also, does this mean that temps, when displayed, are only updated at this interval ??

dlandon Posted September 24, 2015

This item lost a bit of its visibility, I guess due to the workarounds which were introduced at the time. It would be good to put it on the table again, though don't expect high priority.

When I remove a plugin on the command line there is no hang. This is obviously a webGUI issue.

bonienl Posted September 25, 2015

Understood -- so which is a "better" fix ....

(1) Add the code gfjardim posted. [And, if so, HOW and WHERE do I add that? I'm a total non-Linux guy.]

or

(2) Change the Tunable (poll_attributes) time to a lower value (e.g. 5 or 10 minutes). Do I correctly understand that this should then make the temperatures "appear" after whatever value this is set to? Also, does this mean that temps, when displayed, are only updated at this interval ??

For unassigned devices emhttp doesn't read the temperature, and gfjardim's solution takes care of that in his plugin.

For array devices the GUI depends on emhttp, and lowering the timer value will shorten the reading interval AFTER emhttp has received the trigger that a disk is in the spun-up state.

RobJ Posted September 25, 2015

(2) Change the Tunable (poll_attributes) time to a lower value (e.g. 5 or 10 minutes). Do I correctly understand that this should then make the temperatures "appear" after whatever value this is set to? Also, does this mean that temps, when displayed, are only updated at this interval ??

This has come up a number of times, and is one of a number of reasons the upgrade guide was written, to deal with the gotchas and behavioral quirks between v6 and earlier. There are a number of recommendations in the Configuring the Settings section, including 'Tunable (poll_attributes)' in the 'Disk Settings'.

I've added a feature request to change the default setting for this from 30 minutes to 2 or 3 minutes. I like 2 minutes, but would like to see some testing to discover what impact there is with different values.

garycase Posted September 25, 2015

I changed it to 5 minutes (300 seconds) and like that a lot better ... at least when a disk is spinning the temperature is now displayed (albeit with a delay of up to 5 min).

I may drop it to 2 minutes, as I can't imagine that a few msec of activity every 120 seconds really matters. Suppose it takes a full msec (unlikely) to grab the SMART data from each disk, and that they're all done sequentially, so you waste a full msec/disk to get the temp. Even with a 20-disk array that would be 20 msec every 120000 msec ... or about 0.02% of the time that it would be "wasting".

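The back-of-the-envelope estimate in the post above can be checked with a few lines of Python. This is only a sketch of that arithmetic: the 1 msec per disk figure is the post's own guess rather than a measurement (a real smartctl call may well take longer), and the function name is made up for illustration.

```python
def polling_overhead_percent(num_disks: int, ms_per_read: float,
                             poll_interval_s: float) -> float:
    """Percentage of wall-clock time spent reading SMART data,
    assuming the disks are read one after another."""
    if poll_interval_s <= 0:
        raise ValueError("poll_interval_s must be positive")
    busy_ms = num_disks * ms_per_read          # time spent per polling round
    interval_ms = poll_interval_s * 1000.0     # length of one polling round
    return 100.0 * busy_ms / interval_ms


if __name__ == "__main__":
    # 20 disks at an assumed 1 msec each, for the intervals mentioned in the thread.
    for interval_s in (1800, 300, 120, 90):
        pct = polling_overhead_percent(20, 1.0, interval_s)
        print(f"{interval_s:>5} s interval: {pct:.4f}% of the time")
```

For the 120-second case this gives 20 / 120000, which is about 0.017% and rounds to the 0.02% quoted above. The overhead scales linearly with the per-disk read time, so the conclusion only holds if that assumed figure is in the right ballpark.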
garycase Posted September 25, 2015

Rob => just read your note in the "Upgrading" guide on this, and I definitely agree it should be less than the default. After thinking about that, and what I just wrote as well, I changed mine to 90 seconds.

gfjardim Posted September 25, 2015

For unassigned devices emhttp doesn't read the temperature, and gfjardim's solution takes care of that in his plugin.

For array devices the GUI depends on emhttp, and lowering the timer value will shorten the reading interval AFTER emhttp has received the trigger that a disk is in the spun-up state.

I posted the code only to show the logic I'm using. The bottom line is that emhttp should probe for a new temperature if the device was in standby and is now spinning, and discard the old temperature record if it has spun down. And keep "poll_attributes" for the regular temperature updates.

RobJ Posted September 25, 2015

For unassigned devices emhttp doesn't read the temperature, and gfjardim's solution takes care of that in his plugin.

For array devices the GUI depends on emhttp, and lowering the timer value will shorten the reading interval AFTER emhttp has received the trigger that a disk is in the spun-up state.

I posted the code only to show the logic I'm using. The bottom line is that emhttp should probe for a new temperature if the device was in standby and is now spinning, and discard the old temperature record if it has spun down. And keep "poll_attributes" for the regular temperature updates.

One issue with that is that hdparm apparently doesn't work with certain controllers (Areca, others?). However, UnMENU's MyMain has been able to get around that with some special detection and programming.

gfjardim Posted September 25, 2015

One issue with that is that hdparm apparently doesn't work with certain controllers (Areca, others?). However, UnMENU's MyMain has been able to get around that with some special detection and programming.

unRAID doesn't support it either. If anyone with an Areca card could send me some code for detecting spin status and probing temperature, I'll gladly add it.

But let's stay on topic: the question here is how to keep spin status and temperature probing in sync. IMHO, if the disk is spinning and there's no valid temperature reading, it should be read immediately, even if the poll_attributes interval hasn't elapsed.

garycase Posted September 25, 2015

... IMHO, if the disk is spinning and there's no valid temperature reading, it should be read immediately, even if the poll_attributes interval hasn't elapsed.

Absolutely agree !!

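The behaviour proposed in the posts above (read at once when a spinning disk has no valid temperature, drop the old reading on spin-down, and otherwise refresh on the poll interval) can be sketched as a small tracker. This is a minimal illustration in Python, not emhttp's actual code: `TempTracker`, `is_spinning` and `read_temp` are hypothetical names, and the two callables stand in for whatever spin-state and SMART queries the real system would make.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TempTracker:
    """Keeps the displayed temperature in step with the spin state (illustrative only)."""

    def __init__(self,
                 is_spinning: Callable[[str], bool],
                 read_temp: Callable[[str], Optional[int]],
                 poll_interval_s: float = 1800.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._is_spinning = is_spinning
        self._read_temp = read_temp
        self._poll_interval_s = poll_interval_s
        self._clock = clock
        self._cache: Dict[str, Tuple[float, int]] = {}  # dev -> (timestamp, temp)

    def display_temp(self, dev: str) -> str:
        if not self._is_spinning(dev):
            # Spun down: discard the old reading and never touch the disk.
            self._cache.pop(dev, None)
            return "*"
        now = self._clock()
        cached = self._cache.get(dev)
        if cached is None or now - cached[0] >= self._poll_interval_s:
            # No valid reading (e.g. the disk just spun up) or reading too old:
            # probe immediately instead of waiting for the next poll.
            temp = self._read_temp(dev)
            if temp is None:
                return "*"
            self._cache[dev] = (now, temp)
            return str(temp)
        return str(cached[1])


if __name__ == "__main__":
    # Fake disk for demonstration; a real setup would query the hardware here.
    tracker = TempTracker(is_spinning=lambda dev: True,
                          read_temp=lambda dev: 35,
                          poll_interval_s=300)
    print(tracker.display_temp("/dev/sdb"))
```

Because the cached reading is dropped as soon as the disk is seen spun down, the first call after it spins back up always triggers a fresh read, so the poll interval only governs how often an already-spinning disk is re-read.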
ogi Posted September 26, 2015

Just upgraded, but now I'm unable to start my VM, I'm being given the following error:

unsupported configuration: host doesn't support VFIO PCI passthrough

Under 6.1.2 (and earlier) I had a network card passthrough which was functioning fine.

Here is the XML file I have for my VM:

<domain type='kvm'>
  <name>ubuntu-server</name>
  <uuid>dccdd050-89bc-6f12-491b-86f2872ca517</uuid>
  <description>Ubuntu Server</description>
  <metadata>
    <vmtemplate name="Custom" icon="ubuntu.png" os="ubuntu"/>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-q35-2.3'>hvm</type>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='2' threads='1'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/cache/virtual-machines/ubuntu-server/vdisk1.img'/>
      <target dev='hdb' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x04' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x02' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x03' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:a2:f3:28'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/ubuntu-server.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' websocket='-1' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='vmvga' vram='16384' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x06' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
</domain>

I followed the guide here: http://lime-technology.com/forum/index.php?topic=39638.0

Here is my syslinux.cfg file:

default /syslinux/menu.c32
menu title Lime Technology
prompt 0
timeout 50
label unRAID OS
  menu default
  kernel /bzimage
  append pci-stub.ids=8086:1502 initrd=/bzroot
label unRAID OS Safe Mode (no plugins)
  kernel /bzimage
  append initrd=/bzroot unraidsafemode
label Memtest86+
  kernel /memtest

Thornwood Posted September 27, 2015

Just a comment; I will post in help once it happens again. Since I updated to this version, the system has twice lost the Pro key and become unable to read/write to the USB drive. Here is the weirdest part: this happens once the array has been up for a few days. The glitch happens and the system keeps running, but the main page turns default white (I use black), the VPN goes down, and nothing on the main page works except Stop Array. Once Stop is hit I get an advertising page to buy a key, and the restart button is gone. I have the Dynamix power plugin installed, so I use the right-side restart feature, just so I don't have to force a power cycle. Either way, it comes back up to a parity check with everything normal.

The only thing weird about my system is that I am using WD Blacks; during parity checks they reach 50C and I start getting emails about shutting down, but it never really shuts down. On previous versions this error did happen but the us. Never acted like this.

Well, I thought this was a glitch of the flash drive, so I did several tests to make sure it was not. But since it restarted there is no log file, and since it loses the ability to write to the USB drive I'm scratching my head on how I am going to get the log once it happens again. I will post in general help at that time. For now I thought this would be a good FYI.

Thank you
Thornwood

bonienl Posted September 27, 2015

Just a comment; I will post in help once it happens again. Since I updated to this version, the system has twice lost the Pro key and become unable to read/write to the USB drive. Here is the weirdest part: this happens once the array has been up for a few days. The glitch happens and the system keeps running, but the main page turns default white (I use black), the VPN goes down, and nothing on the main page works except Stop Array. Once Stop is hit I get an advertising page to buy a key, and the restart button is gone. I have the Dynamix power plugin installed, so I use the right-side restart feature, just so I don't have to force a power cycle. Either way, it comes back up to a parity check with everything normal.

The only thing weird about my system is that I am using WD Blacks; during parity checks they reach 50C and I start getting emails about shutting down, but it never really shuts down. On previous versions this error did happen but the us. Never acted like this.

Well, I thought this was a glitch of the flash drive, so I did several tests to make sure it was not. But since it restarted there is no log file, and since it loses the ability to write to the USB drive I'm scratching my head on how I am going to get the log once it happens again. I will post in general help at that time. For now I thought this would be a good FYI.

Thank you
Thornwood

unRAID version 6.1.3 doesn't have a function to shut down upon disk overheating. Do you have anything installed using unMENU, or are you perhaps still running the v5 version of Dynamix Disk Health?

It also sounds like your flash drive becomes unreadable at some point; in that case Dynamix reverts to its default settings, which include the white theme.

BRiT Posted September 27, 2015

Look through your syslog for USB errors, especially for USB disconnects. It seems like maybe you have the USB flash drive plugged into a USB3 slot; try plugging it into a different slot. There have been other users who have had USB disconnect issues and "fixed" them by moving the flash drive to a different USB slot.

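The syslog check suggested above can be roughly automated. The following is a minimal Python sketch, assuming the usual kernel message form (for example "usb 1-1: USB disconnect, device number 2"); the exact wording can differ between kernel versions, the function name is made up, and the sample lines are invented for illustration rather than taken from anyone's log.

```python
import re
from typing import Iterable, List

# Matches kernel USB messages that mention a disconnect, reset or error.
USB_PROBLEM = re.compile(r"usb \d+-[\d.]+: .*(disconnect|reset|error)", re.IGNORECASE)


def find_usb_problems(lines: Iterable[str]) -> List[str]:
    """Return the log lines that look like USB disconnects, resets or errors."""
    return [line.rstrip("\n") for line in lines if USB_PROBLEM.search(line)]


if __name__ == "__main__":
    # Invented sample lines; on a real server, pass an open syslog file instead.
    sample = [
        "Sep 27 10:00:01 Tower kernel: usb 1-1: new high-speed USB device number 3 using xhci_hcd",
        "Sep 27 10:05:42 Tower kernel: usb 1-1: USB disconnect, device number 3",
        "Sep 27 10:05:43 Tower kernel: usb 2-1.2: reset high-speed USB device number 4 using ehci-pci",
    ]
    for hit in find_usb_problems(sample):
        print(hit)
```

To scan a real log, open the file and pass the file object to `find_usb_problems`; the path of the log (often /var/log/syslog) is an assumption to verify on the system in question. As noted earlier in the thread, the log may be lost on restart, so it would need to be copied off before rebooting.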
Thornwood Posted September 27, 2015

Thank you, I will try and report back.

pyrater Posted September 27, 2015

Was there an SSD bug introduced? Ever since I rebooted my server, it will not boot with ANY SSDs installed.

REF: http://lime-technology.com/forum/index.php?topic=43150.0

BRiT Posted September 27, 2015

Was there an SSD bug introduced? Ever since I rebooted my server, it will not boot with ANY SSDs installed.

REF: http://lime-technology.com/forum/index.php?topic=43150.0

Nope. Boots fine for me with my 480GB SSD cache drive using XFS.

scottc Posted September 27, 2015

I have no problem booting with an SSD cache drive.

pyrater Posted September 27, 2015

Weird. OK, thank you for the replies. I will continue to troubleshoot now that I know my "data" is safe and it's just the VM drives....

mr007 Posted September 28, 2015

I'm running an AOC-SASLP-MV8 in my server. Is it safe to upgrade to 6.1.3 with the removal of the Adaptec AIC94xx SAS/SATA support from the kernel?

sureguy Posted September 29, 2015

I'm running an AOC-SASLP-MV8 in my server. Is it safe to upgrade to 6.1.3 with the removal of the Adaptec AIC94xx SAS/SATA support from the kernel?

Yes

ikosa Posted September 29, 2015

Is it possible that upgrading from 6.1.2 to 6.1.3 slowed down my parity check? My old parity checks ran at between 85 and 90 MB/s, but my last parity check ran at 65 MB/s.
