July 26, 201213 yr Hope this is a simple fix for those in the know. The code below is filling up my syslog - i have only done an example as its got up to well over 2500 lines of exactly the same thing now and adds a new entry every couple of mins. So what is it and how do i make it stop ! As a bit of history i think at some point i used this command line to try to spin down my cache drive as it was spun up all the time and i think its a left over from that but where do i go to kill it ! disk 3 is a new drive i just pre cleared and added into the array, all showing green and fine Thanks for any input Jul 26 17:59:30 Backupsvr emhttp: shcmd (201): /usr/sbin/hdparm -y /dev/sda $stuff$> /dev/null (Drive related) Jul 26 18:14:33 Backupsvr kernel: mdcmd (28): spindown 3 (Routine) Jul 26 18:14:35 Backupsvr kernel: mdcmd (29): spindown 3 (Routine) Jul 26 18:14:37 Backupsvr kernel: mdcmd (30): spindown 3 (Routine) Jul 26 18:14:40 Backupsvr kernel: mdcmd (31): spindown 3 (Routine) Jul 26 18:14:45 Backupsvr kernel: mdcmd (32): spindown 3 (Routine) Jul 26 18:14:48 Backupsvr kernel: mdcmd (33): spindown 3 (Routine) Jul 26 18:14:50 Backupsvr kernel: mdcmd (34): spindown 3 (Routine) Jul 26 18:14:55 Backupsvr kernel: mdcmd (35): spindown 3 (Routine) Jul 26 18:14:59 Backupsvr kernel: mdcmd (36): spindown 3 (Routine) Jul 26 18:15:01 Backupsvr kernel: mdcmd (37): spindown 3 (Routine) Jul 26 18:15:05 Backupsvr kernel: mdcmd (38): spindown 3 (Routine) Jul 26 18:15:10 Backupsvr kernel: mdcmd (39): spindown 3 (Routine) Jul 26 18:15:12 Backupsvr kernel: mdcmd (40): spindown 3 (Routine) Jul 26 18:15:15 Backupsvr kernel: mdcmd (41): spindown 3 (Routine) EDIT - Think this is the cause, but why and whats the cure - the below line appears every thousand or so lines .... Jul 26 20:44:09 Backupsvr kernel: mdcmd (2622): spindown 3 Jul 26 20:44:10 Backupsvr emhttp: shcmd (204): /usr/sbin/hdparm -y /dev/sda &> /dev/null Jul 26 20:44:13 Backupsvr kernel: mdcmd (2623): spindown 3 EDIT 2 - even though all disks are shown as spun down i still have a disk activity light on my server and it sounds like the drive is spinning and trying to park over and over every few seconds. I have tried the following commands in case they helped root@Backupsvr:~# /usr/sbin/hdparm -y /dev/sdd /dev/sdd: issuing standby command root@Backupsvr:~# hdparm -C /dev/sdd /dev/sdd: drive state is: active/idle Think i might have to shut it down until i get some feedback as its going to do itself no good i think EDIT 3 - Tried to do a short smart test on the disk and you will see in the attached syslog that at the end it said something about an error with preclear - did not see this before as the result of the preclear was fine and disk went into array etc no problems at all and this did not show up until now - parity is still showing green. Also at the start of the syslog are some "errors" relating to ata1, not sure if these are relevant? Think i need some help as to what is the issue and what steps to take Thanks again syslog-2012-07-26.zip
July 27, 201213 yr Anyone got a view on this - would like to get it fixed today if possible That may not be easy. These lines show the drive was NOT pre-cleared. The exact same output was sent to your screen. Apparently, you missed it. It indicates the script was NOT able to read all zeros back from the drive after clearing it. (could be an indication of a bad drive, or bad disk controller port, or bad memory, or stressed power supply, or just about anything...) Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: ========================================================================1.13 Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == invoked as: ./preclear_disk.sh -A /dev/sdd Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == Disk /dev/sdd has NOT been successfully precleared Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == Postread detected un-expected non-zero bytes on disk== unRAID had some issue with disk spin-down in the earlier 5.0releases. This is supposed to be fixed in the current "rc6-test" release you can find here: http://lime-technology.com/forum/index.php?topic=21597.0 There have been a lot of fixes since your beta14 release. I might suggest you upgrade to it before going too far. Since you are already on the 5.0beta series it is just the replacement of bzimage and bzroot from the newest test release. Follow the instructions in the release notes for upgrading. Joe L.
July 27, 201213 yr Author Thank you Joe L If it said failed then i missed it ! i also looked in Unmenu and it showed the drive as precleared and when i put it into the array it accepted and formatted the disk just fine. It was only last night it suddenly showed a different result in unmenu - but i take on board that it now says it did not work correctly, would just have been nice for it to show up less than 2 days after it was put into the array and started. Any reason for the late appearance in the syslog of this info ?? Think i will do the upgrade to the latest Beta (RC6 ?) so i have the current fixes in place (like everyone else i would like to see a stable 5.0 soon so everyone knows which to use). Will also take the disk out of the array and preclear the disk again and see what the results are. I ran a smart report and that showed everything as fine and this disk was running in another unraid server for a couple of months before being taken out, cleared (or at least i thought so) and then put into this array - but you never know. I have no data on the disk so another preclear over the weekend is fine.
July 27, 201213 yr Thank you Joe L If it said failed then i missed it ! i also looked in Unmenu and it showed the drive as precleared and when i put it into the array it accepted and formatted the disk just fine. It was only last night it suddenly showed a different result in unmenu - but i take on board that it now says it did not work correctly, would just have been nice for it to show up less than 2 days after it was put into the array and started. Any reason for the late appearance in the syslog of this info ?? It is date-time-stamped. It finished on Jul 26 22:18:47 Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == Last Cycle's Pre Read Time : 6:14:05 (66 MB/s) Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == Last Cycle's Zeroing time : 5:56:19 (70 MB/s) Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == Last Cycle's Post Read Time : 14:23:01 (28 MB/s) Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == Last Cycle's Total Time : 26:34:27 Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == Jul 26 22:18:47 Backupsvr preclear_disk-diff[10569]: == Total Elapsed Time 26:34:27 Think i will do the upgrade to the latest Beta (RC6 ?) so i have the current fixes in place (like everyone else i would like to see a stable 5.0 soon so everyone knows which to use). Will also take the disk out of the array and preclear the disk again and see what the results are. I ran a smart report and that showed everything as fine and this disk was running in another unraid server for a couple of months before being taken out, cleared (or at least i thought so) and then put into this array - but you never know. I have no data on the disk so another preclear over the weekend is fine. Report the BUG on the report in myMain to bjp999. Clearly he is not showing the correct status upon preclear completion in the myMain display. I did not write that code.... I always use the output written by preclear itself. If the disk is assigned to the array, it will not be possible to run preclear on it. You'll need to un-assign it.
July 27, 201213 yr Author Hmm, Disk 3 (the new one) had been added to array and showed up as a new disk before then, it then showed up as needing to be formatted - this time and date stamp is a long time after i did that so i can only assume that it tried to clear the drive again and that is where the error came in? i can see some stuff in the syslog but dont know enough to be able to really tell, i can just see a new disk a long time before this error timestamp. The error did show in unmenu at the time it is in the syslog but everything reported fine up until then Anyway i will go through the process again and do an update to latest version and hope that reolves the issue Thanks again for the input
July 27, 201213 yr The error did show in unmenu at the time it is in the syslog but everything reported fine up until then Up until then it was reading, then zeroing the drive. It is only in the last phase that the disk is re-read and verified that it contains the expected zeroed content. That third/last re-read phase was enhanced to perform the verify to catch exactly the type of issue you have. A disk that does not return zeros as expected, but shows no other error.
July 29, 201213 yr Author OK, precleared sucessful, added disk to array and started rebuild etc, hope that resolves the issue Could somone just check the following in mt syslog on boot up - does it mean i have a disk mounting at a slow speed as the preclear took 33 hours for a 1.5 TB which is much slower than 2TB's have been Jul 29 10:51:04 Backupsvr kernel: scsi6 : usb-storage 2-3:1.0 (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata5: SATA link down (SStatus 0 SControl 300) (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata1.00: failed to enable AA (error_mask=0x1) (Errors) Jul 29 10:51:04 Backupsvr kernel: ata1.00: ATA-8: VB0250EAVER, HPG7, max UDMA/100 (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata1.00: 488397168 sectors, multi 0: LBA48 NCQ (depth 31/32) (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata6.00: ATA-8: ST3500320AS, SD15, max UDMA/133 (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata6.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32) (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata3.00: ATA-8: ST32000542AS, CC34, max UDMA/133 (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata3.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32) (Drive related) Jul 29 10:51:04 Backupsvr kernel: ata1.00: failed to enable AA (error_mask=0x1) (Errors) Jul 29 10:51:04 Backupsvr kernel: ata1.00: configured for UDMA/100 (Drive related) Jul 29 10:51:04 Backupsvr kernel: scsi 0:0:0:0: Direct-Access ATA VB0250EAVER HPG7 PQ: 0 ANSI: 5 (Drive related) Jul 29 10:51:04 Backupsvr kernel: sd 0:0:0:0: [sda] 488397168 512-byte logical blocks: (250 GB/232 GiB) (Drive related) Jul 29 10:51:04 Backupsvr kernel: sd 0:0:0:0: [sda] Write Protect is off (Drive related)
July 31, 201213 yr Author Seems a good solid pre clear solved the majority of issues. Parity check done and all is now fine with disks spinning down properly and no rapidly growing syslog. Issue still open is the speed errors in the syslog as post above. These errors are consistant anfter recent restart ata1 max udma 100 when others are at udma 133. Does this mean i have a disk on a slow connection as this would affect parity check speed etc - how to diagnose which disk and why would be great if this is an issue Thanks
Archived
This topic is now archived and is closed to further replies.