WeeboTech Posted November 19, 2010 There seems to be a lot of change to the interface driver's output. Does mdVersion still exist? Does it change when the interface changes? That could be a telltale sign for future validation. Guess it depends on what you think is "a lot". The 'kernel oops' bug fix was actually found and fixed in 5.0-beta, so I am using that driver in 4.6-rc1. There are a few other changes in the driver done during 5.0 development (such as elimination of the diskNumber variable). I have always considered the driver interface to be "private", though in practice it's obviously used (via mdcmd) by 3rd-party add-ons. The intent is that with 5.0 I can make the driver interface truly private (because I'm planning big changes), but I guess this is a 5.0 discussion. In the meantime, I want to restore any functionality necessary to make 4.6 100% backward compatible. Part of what I was trying to say/ask is: does the version number in the driver output change when interface changes occur? This would be a clear indication that the interface has changed, and developers should alert on an interface version change. I.e., a tool could use the version number to handle behavioral changes based on interface options.
limetech Posted November 19, 2010 Author In the past I have not imposed much rigor on this, since I've always considered the driver interface to be 'private' (a mistake, I know). The version number was changed when an incompatibility existed between the driver and the higher-level management code (i.e., emhttp). This is because the driver is part of the Linux kernel image 'bzimage' and emhttp is part of the root file system 'bzroot'. Hence emhttp checks the md version to ensure someone doesn't, e.g., upgrade just bzroot or just bzimage when updating their software.
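For add-on authors, the version check discussed above might look roughly like the sketch below. It is only an illustration under stated assumptions: the status text is assumed to arrive as one key=value pair per line, the value 1.1.0 is simply the mdVersion shown in the 4.6-rc2 dump in this thread, and the names parse_status, interface_is_known and KNOWN_VERSIONS are made up for the example. Since the version number has not been bumped rigorously on every interface change, a matching version is a hint rather than a guarantee.

```python
# Hypothetical: driver versions this add-on has actually been tested against.
KNOWN_VERSIONS = {"1.1.0"}


def parse_status(text):
    """Parse driver status text, assumed to be one key=value pair per line."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            status[key.strip()] = value.strip()
    return status


def interface_is_known(status, known=KNOWN_VERSIONS):
    """Return True only if mdVersion is present and in the tested set."""
    return status.get("mdVersion") in known


if __name__ == "__main__":
    sample = "mdVersion=1.1.0\nmdState=STARTED\n"
    status = parse_status(sample)
    if not interface_is_known(status):
        print("Warning: untested driver interface version:",
              status.get("mdVersion"))
```

An add-on could run a check like this at startup and warn the user, rather than silently misreading fields that were renamed or dropped.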
limetech Posted November 19, 2010 Author I came home to an unresponsive Unraid. I have been running this new build (moved from my old server) for about 3 weeks with no problems. I have IPMI, so I logged in and saw this: Is that a server crash? I rebooted and the server is coming up; disks still show as mounting, writes are taking place, and slowly they are mounting one by one... I'll upgrade to RC2 shortly, but want to make sure I know what the problem was... Thanks. G Yes, that's a crash, but it doesn't look like the one this release fixes. If you saved the system log, please post it.
limetech Posted November 19, 2010 Author Ok, well things do look better. I hate doing this, but now it's acting similar to your version 4.5.6, where mdcmd displays a status for all 20 disks (whether installed or not). It's not a big deal because your interface hides them and so does unraid notify, but Bubba's shows them all: sbName=/boot/config/super.dat sbVersion=0.95.4 sbCreated=1279688098 sbUpdated=1290115134 sbEvents=121 sbState=0 sbNumDisks=7 sbSynced=1290029380 sbSyncErrs=0 mdVersion=1.1.0 mdState=STARTED mdNumProtected=7 mdNumDisabled=0 mdDisabledDisk=0 mdNumInvalid=0 mdInvalidDisk=0 mdNumMissing=0 mdMissingDisk=0 mdNumNew=0 mdResync=0 mdResyncPos=0 mdResyncPrcnt=0 mdResyncFinish=0 mdResyncSpeed=0 diskNumber.0=0 diskName.0= diskSize.0=1953514552 diskState.0=7 diskModel.0=WDC WD20EARS-00M diskSerial.0=WD-WMAZ20043234 diskId.0=WDC_WD20EARS-00M_WD-WMAZ20043234 rdevActive.0=1 rdevNumber.0=0 rdevStatus.0=DISK_OK rdevName.0=sdc rdevSize.0=1953514552 rdevModel.0=WDC WD20EARS-00M rdevSerial.0=WD-WMAZ20043234 rdevId.0=WDC_WD20EARS-00M_WD-WMAZ20043234 rdevNumReads.0=0 rdevNumWrites.0=0 rdevNumErrors.0=0 rdevLastIO.0=1290115135 rdevSpinupGroup.0=0 diskNumber.1=1 diskName.1=md1 diskSize.1=488386552 diskState.1=7 diskModel.1=Maxtor 6H500R0 diskSerial.1=H80J9HGH diskId.1=Maxtor_6H500R0_H80J9HGH rdevActive.1=1 rdevNumber.1=1 rdevStatus.1=DISK_OK rdevName.1=hda rdevSize.1=488386552 rdevModel.1=Maxtor 6H500R0 rdevSerial.1=H80J9HGH rdevId.1=Maxtor_6H500R0_H80J9HGH rdevNumReads.1=0 rdevNumWrites.1=0 rdevNumErrors.1=0 rdevLastIO.1=1290115162 rdevSpinupGroup.1=0 diskNumber.2=2 diskName.2=md2 diskSize.2=488386552 diskState.2=7 diskModel.2=WDC WD5000AAKS-7 diskSerial.2=WD-WMASY4103612 diskId.2=WDC_WD5000AAKS-7_WD-WMASY4103612 rdevActive.2=1 rdevNumber.2=2 rdevStatus.2=DISK_OK rdevName.2=sdd rdevSize.2=488386552 rdevModel.2=WDC WD5000AAKS-7 rdevSerial.2=WD-WMASY4103612 rdevId.2=WDC_WD5000AAKS-7_WD-WMASY4103612 rdevNumReads.2=0 rdevNumWrites.2=0
rdevNumErrors.2=0 rdevLastIO.2=1290115467 rdevSpinupGroup.2=0 diskNumber.3=3 diskName.3=md3 diskSize.3=976762552 diskState.3=7 diskModel.3=WDC WD1001FALS-0 diskSerial.3=WD-WMATV5570723 diskId.3=WDC_WD1001FALS-0_WD-WMATV5570723 rdevActive.3=1 rdevNumber.3=3 rdevStatus.3=DISK_OK rdevName.3=sda rdevSize.3=976762552 rdevModel.3=WDC WD1001FALS-0 rdevSerial.3=WD-WMATV5570723 rdevId.3=WDC_WD1001FALS-0_WD-WMATV5570723 rdevNumReads.3=0 rdevNumWrites.3=0 rdevNumErrors.3=0 rdevLastIO.3=1290115475 rdevSpinupGroup.3=0 diskNumber.4=4 diskName.4=md4 diskSize.4=1465138552 diskState.4=7 diskModel.4=WDC WD15EARS-00S diskSerial.4=WD-WCAVY1930651 diskId.4=WDC_WD15EARS-00S_WD-WCAVY1930651 rdevActive.4=1 rdevNumber.4=4 rdevStatus.4=DISK_OK rdevName.4=sdf rdevSize.4=1465138552 rdevModel.4=WDC WD15EARS-00S rdevSerial.4=WD-WCAVY1930651 rdevId.4=WDC_WD15EARS-00S_WD-WCAVY1930651 rdevNumReads.4=0 rdevNumWrites.4=0 rdevNumErrors.4=0 rdevLastIO.4=1290115472 rdevSpinupGroup.4=0 diskNumber.5=5 diskName.5=md5 diskSize.5=625131832 diskState.5=7 diskModel.5=WDC WD6400AAKS-0 diskSerial.5=WD-WCASY0373935 diskId.5=WDC_WD6400AAKS-0_WD-WCASY0373935 rdevActive.5=1 rdevNumber.5=5 rdevStatus.5=DISK_OK rdevName.5=sdb rdevSize.5=625131832 rdevModel.5=WDC WD6400AAKS-0 rdevSerial.5=WD-WCASY0373935 rdevId.5=WDC_WD6400AAKS-0_WD-WCASY0373935 rdevNumReads.5=0 rdevNumWrites.5=0 rdevNumErrors.5=0 rdevLastIO.5=1290115190 rdevSpinupGroup.5=0 diskNumber.6=6 diskName.6=md6 diskSize.6=488386552 diskState.6=7 diskModel.6=Hitachi HTS54505 diskSerial.6=100410PBN40017CZKJXE diskId.6=Hitachi_HTS54505_100410PBN40017CZKJXE rdevActive.6=1 rdevNumber.6=6 rdevStatus.6=DISK_OK rdevName.6=sdg rdevSize.6=488386552 rdevModel.6=Hitachi HTS54505 rdevSerial.6=100410PBN40017CZKJXE rdevId.6=Hitachi_HTS54505_100410PBN40017CZKJXE rdevNumReads.6=0 rdevNumWrites.6=0 rdevNumErrors.6=0 rdevLastIO.6=1290115149 rdevSpinupGroup.6=0 diskNumber.7=7 diskName.7= diskSize.7=0 diskState.7=0 diskModel.7= diskSerial.7= diskId.7= rdevActive.7=0 
rdevNumber.7=7 rdevStatus.7=DISK_NP rdevName.7= rdevSize.7=0 rdevModel.7= rdevSerial.7= rdevId.7= rdevNumReads.7=0 rdevNumWrites.7=0 rdevNumErrors.7=0 rdevLastIO.7=0 rdevSpinupGroup.7=0 diskNumber.8=8 diskName.8= diskSize.8=0 diskState.8=0 diskModel.8= diskSerial.8= diskId.8= rdevActive.8=0 rdevNumber.8=8 rdevStatus.8=DISK_NP rdevName.8= rdevSize.8=0 rdevModel.8= rdevSerial.8= rdevId.8= rdevNumReads.8=0 rdevNumWrites.8=0 rdevNumErrors.8=0 rdevLastIO.8=0 rdevSpinupGroup.8=0 diskNumber.9=9 diskName.9= diskSize.9=0 diskState.9=0 diskModel.9= diskSerial.9= diskId.9= rdevActive.9=0 rdevNumber.9=9 rdevStatus.9=DISK_NP rdevName.9= rdevSize.9=0 rdevModel.9= rdevSerial.9= rdevId.9= rdevNumReads.9=0 rdevNumWrites.9=0 rdevNumErrors.9=0 rdevLastIO.9=0 rdevSpinupGroup.9=0 diskNumber.10=10 diskName.10= diskSize.10=0 diskState.10=0 diskModel.10= diskSerial.10= diskId.10= rdevActive.10=0 rdevNumber.10=10 rdevStatus.10=DISK_NP rdevName.10= rdevSize.10=0 rdevModel.10= rdevSerial.10= rdevId.10= rdevNumReads.10=0 rdevNumWrites.10=0 rdevNumErrors.10=0 rdevLastIO.10=0 rdevSpinupGroup.10=0 diskNumber.11=11 diskName.11= diskSize.11=0 diskState.11=0 diskModel.11= diskSerial.11= diskId.11= rdevActive.11=0 rdevNumber.11=11 rdevStatus.11=DISK_NP rdevName.11= rdevSize.11=0 rdevModel.11= rdevSerial.11= rdevId.11= rdevNumReads.11=0 rdevNumWrites.11=0 rdevNumErrors.11=0 rdevLastIO.11=0 rdevSpinupGroup.11=0 diskNumber.12=12 diskName.12= diskSize.12=0 diskState.12=0 diskModel.12= diskSerial.12= diskId.12= rdevActive.12=0 rdevNumber.12=12 rdevStatus.12=DISK_NP rdevName.12= rdevSize.12=0 rdevModel.12= rdevSerial.12= rdevId.12= rdevNumReads.12=0 rdevNumWrites.12=0 rdevNumErrors.12=0 rdevLastIO.12=0 rdevSpinupGroup.12=0 diskNumber.13=13 diskName.13= diskSize.13=0 diskState.13=0 diskModel.13= diskSerial.13= diskId.13= rdevActive.13=0 rdevNumber.13=13 rdevStatus.13=DISK_NP rdevName.13= rdevSize.13=0 rdevModel.13= rdevSerial.13= rdevId.13= rdevNumReads.13=0 rdevNumWrites.13=0 rdevNumErrors.13=0 
rdevLastIO.13=0 rdevSpinupGroup.13=0 diskNumber.14=14 diskName.14= diskSize.14=0 diskState.14=0 diskModel.14= diskSerial.14= diskId.14= rdevActive.14=0 rdevNumber.14=14 rdevStatus.14=DISK_NP rdevName.14= rdevSize.14=0 rdevModel.14= rdevSerial.14= rdevId.14= rdevNumReads.14=0 rdevNumWrites.14=0 rdevNumErrors.14=0 rdevLastIO.14=0 rdevSpinupGroup.14=0 diskNumber.15=15 diskName.15= diskSize.15=0 diskState.15=0 diskModel.15= diskSerial.15= diskId.15= rdevActive.15=0 rdevNumber.15=15 rdevStatus.15=DISK_NP rdevName.15= rdevSize.15=0 rdevModel.15= rdevSerial.15= rdevId.15= rdevNumReads.15=0 rdevNumWrites.15=0 rdevNumErrors.15=0 rdevLastIO.15=0 rdevSpinupGroup.15=0 diskNumber.16=16 diskName.16= diskSize.16=0 diskState.16=0 diskModel.16= diskSerial.16= diskId.16= rdevActive.16=0 rdevNumber.16=16 rdevStatus.16=DISK_NP rdevName.16= rdevSize.16=0 rdevModel.16= rdevSerial.16= rdevId.16= rdevNumReads.16=0 rdevNumWrites.16=0 rdevNumErrors.16=0 rdevLastIO.16=0 rdevSpinupGroup.16=0 diskNumber.17=17 diskName.17= diskSize.17=0 diskState.17=0 diskModel.17= diskSerial.17= diskId.17= rdevActive.17=0 rdevNumber.17=17 rdevStatus.17=DISK_NP rdevName.17= rdevSize.17=0 rdevModel.17= rdevSerial.17= rdevId.17= rdevNumReads.17=0 rdevNumWrites.17=0 rdevNumErrors.17=0 rdevLastIO.17=0 rdevSpinupGroup.17=0 diskNumber.18=18 diskName.18= diskSize.18=0 diskState.18=0 diskModel.18= diskSerial.18= diskId.18= rdevActive.18=0 rdevNumber.18=18 rdevStatus.18=DISK_NP rdevName.18= rdevSize.18=0 rdevModel.18= rdevSerial.18= rdevId.18= rdevNumReads.18=0 rdevNumWrites.18=0 rdevNumErrors.18=0 rdevLastIO.18=0 rdevSpinupGroup.18=0 diskNumber.19=19 diskName.19= diskSize.19=0 diskState.19=0 diskModel.19= diskSerial.19= diskId.19= rdevActive.19=0 rdevNumber.19=19 rdevStatus.19=DISK_NP rdevName.19= rdevSize.19=0 rdevModel.19= rdevSerial.19= rdevId.19= rdevNumReads.19=0 rdevNumWrites.19=0 rdevNumErrors.19=0 rdevLastIO.19=0 rdevSpinupGroup.19=0 diskNumber.20=20 diskName.20= diskSize.20=0 diskState.20=0 diskModel.20= 
diskSerial.20= diskId.20= rdevActive.20=0 rdevNumber.20=20 rdevStatus.20=DISK_NP rdevName.20= rdevSize.20=0 rdevModel.20= rdevSerial.20= rdevId.20= rdevNumReads.20=0 rdevNumWrites.20=0 rdevNumErrors.20=0 rdevLastIO.20=0 rdevSpinupGroup.20=0 On the other hand, it might be impacting "cache_dir": when I try to stop the array, cache_dir does not get killed. I can't access my logs just yet; I'll do that when I get home, but if anyone else can replicate it with only cache_dir running, let us know. Any more info on this? I can release -rc3 to prevent variables for non-present drives from being output, if necessary.
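Until the driver output changes, an add-on that wants to hide the empty slots could filter them itself. The sketch below is an illustration, not how any existing add-on does it. It assumes the status text has one key=value pair per line and that rdevStatus=DISK_NP marks a slot with no drive present (inferred from the dump above, where every DISK_NP slot has an empty name and zero size). The function names group_by_slot and present_slots are hypothetical.

```python
from collections import defaultdict


def group_by_slot(text):
    """Group per-disk variables such as 'rdevStatus.7=DISK_NP' by slot number.

    Assumes one key=value pair per line. Array-wide keys without a numeric
    '.N' suffix (sbName, mdVersion, ...) are skipped.
    """
    slots = defaultdict(dict)
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        name, dot, index = key.strip().rpartition(".")
        if sep and dot and index.isdigit():
            slots[int(index)][name] = value.strip()
    return dict(slots)


def present_slots(slots):
    """Keep only slots whose rdevStatus is not DISK_NP (assumed: not present)."""
    return {i: v for i, v in slots.items()
            if v.get("rdevStatus") != "DISK_NP"}


if __name__ == "__main__":
    sample = (
        "mdState=STARTED\n"
        "diskName.6=md6\nrdevStatus.6=DISK_OK\nrdevName.6=sdg\n"
        "diskName.7=\nrdevStatus.7=DISK_NP\nrdevName.7=\n"
    )
    for slot, info in sorted(present_slots(group_by_slot(sample)).items()):
        print(slot, info["rdevName"], info["rdevStatus"])
```

Filtering on the status field rather than on the slot count keeps the add-on working whether or not a later release suppresses the empty entries.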
lewcass Posted November 19, 2010 Installed 4.6-rc2 and now unMENU (1.3) reports there is a parity check underway, when there actually is not. The official unRAID web menu indicates there is no parity check in progress... I had the same thing happen when I installed 4.5.8. Editing the Send Status Alert package (changed "Send mail even if status is normal" to no) and restarting UnMenu fixed it for me. I'm not sure if I had to do both things, but I no longer get the hourly parity emails (parity wasn't running; like yours, the progress was always at 0%). I've since upgraded to 4.6.1 RC and it didn't come back. I appreciate the input. However, I don't have the Send Status Alert package installed. I don't think I'll experiment with it either for now, unless Tom or Joe think it would be helpful. For now I think I'll wait and see what they come up with. Thanks. Edit: It might be worth mentioning that I am also running FEMUR.
flixxx Posted November 19, 2010 Any more info on this? I can release -rc3 to prevent variables for non-present drives from being output, if necessary. Hi, I'd like it to suppress the non-existent drives so they don't show up. This will make Bubba's interface display only the added drives.
Cache_dir behaved properly after further tests:
Nov 19 11:13:46 kenny cache_dirs: Suspending cache_dirs for 120 seconds to allow for clean shutdown of array
Nov 19 11:13:46 kenny cache_dirs: While suspended, pressing "Stop" on the unRAID management web-interface will shutdown the array
Nov 19 11:13:49 kenny emhttp: shcmd (85): umount /mnt/disk1 >/dev/null 2>&1
Nov 19 11:13:50 kenny emhttp: shcmd (86): rmdir /mnt/disk1 >/dev/null 2>&1
Nov 19 11:13:50 kenny emhttp: shcmd (87): umount /mnt/disk2 >/dev/null 2>&1
Nov 19 11:13:50 kenny emhttp: shcmd (88): rmdir /mnt/disk2 >/dev/null 2>&1
Nov 19 11:13:50 kenny emhttp: shcmd (89): umount /mnt/disk3 >/dev/null 2>&1
Nov 19 11:13:51 kenny emhttp: shcmd (90): rmdir /mnt/disk3 >/dev/null 2>&1
Nov 19 11:13:51 kenny emhttp: shcmd (91): umount /mnt/disk4 >/dev/null 2>&1
Nov 19 11:13:51 kenny emhttp: shcmd (92): rmdir /mnt/disk4 >/dev/null 2>&1
Nov 19 11:13:51 kenny emhttp: shcmd (93): umount /mnt/disk5 >/dev/null 2>&1
Nov 19 11:13:52 kenny emhttp: shcmd (94): rmdir /mnt/disk5 >/dev/null 2>&1
Nov 19 11:13:52 kenny emhttp: shcmd (95): umount /mnt/disk6 >/dev/null 2>&1
Nov 19 11:13:53 kenny emhttp: shcmd (96): rmdir /mnt/disk6 >/dev/null 2>&1
Nov 19 11:13:53 kenny emhttp: shcmd (97): umount /mnt/cache >/dev/null 2>&1
Nov 19 11:13:53 kenny emhttp: shcmd (98): rmdir /mnt/cache >/dev/null 2>&1
Nov 19 11:13:53 kenny kernel: mdcmd (106): stop
Nov 19 11:13:53 kenny kernel: md1: stopping
Nov 19 11:13:53 kenny kernel: md2: stopping
Nov 19 11:13:53 kenny kernel: md3: stopping
Nov 19 11:13:53 kenny kernel: md4: stopping
Nov 19 11:13:53 kenny kernel: md5: stopping
Nov 19 11:13:53 kenny kernel: md6: stopping
Nov 19 11:13:54 kenny emhttp: shcmd (99): rm /etc/samba/smb-shares.conf >/dev/null 2>&1
Nov 19 11:13:54 kenny emhttp: shcmd (100): cp /etc/exports- /etc/exports
Nov 19 11:13:54 kenny emhttp: shcmd (101): /etc/rc.d/rc.samba start | logger
On another note, I have unmenu installed also and I don't see "parity check in progress"; everything looks fine. (Mind you, I think I have an old unmenu installed, so take it with a grain of salt.)
gbdesai Posted November 19, 2010 Yes that's a crash, but doesn't look like the one this release fixes. If you saved the system log please post it. Couldn't do that, as the server was locked up tight as a drum; on restart I lost the log we needed. EDIT: Oh man, this is not good: another crash. I've been running Unraid for a long time now with no problems and suddenly I've got a problem. I am running 4.6-rc2. Here is a dump of the console screen; I am going to shut down, pull the flash, and see if I can get the log on another machine... G Oops. Forgot that the log is held in memory, so there is no way for me to get the log off the machine after a crash... Any ideas on what might be happening? Here are the specs on my system (at least the parts pertinent to this problem)...
- 3 x SUPERMICRO AOC-SASLP-MV8 PCI Express x4 Low Profile SAS SAS RAID Controller
- SUPERMICRO MBD-X8SIL-F-O Xeon X3400 / L3400 / Core i3 series Dual LAN Micro ATX Server Board
- Intel Core i3-560 Clarkdale 3.33GHz 4 x 256KB L2 Cache 4MB L3 Cache LGA 1156 73W Dual-Core Desktop Processor BX80616I35607200
- Kingston ValueRAM 4GB (2 x 2GB) 240-Pin DDR3 SDRAM ECC Unbuffered DDR3 1333 Server Memory
I have a combo of 1.5TB 7200RPM, 2TB 7200RPM (Hitachi and Seagate; all of these have been running for at least 8 months with no problems), and 2TB (Green WD EARS with jumper; these are relatively new, but many have been running for a few weeks with no problem). Thanks.
limetech Posted November 20, 2010 Author What you can do is keep a telnet window open on your PC with the syslog being output to it:
1. Start->Run-> telnet <server name>
2. Log in
3. Type command: tail -f /var/log/syslog
This will print the last 10 lines of the system log and then print each subsequent line as it occurs. If you see the crash again, select/copy/paste the contents of the telnet window.
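The idea behind the telnet trick is to get each log line off the server's memory before a crash takes it away. A variant of the same idea, not the telnet method itself, is a small script that keeps appending whatever is new in the log to a second file somewhere that survives a reboot. The sketch below is illustrative only: copy_new_bytes and follow_forever are made-up names, the destination path is whatever you choose, and whether the last few lines reach the destination during a hard lockup is not guaranteed.

```python
import os
import time


def copy_new_bytes(src_path, dst_path, offset):
    """Append everything in src_path past `offset` to dst_path.

    Returns the new offset. If the source is now shorter than `offset`
    (for example it was truncated), copying restarts from the beginning.
    """
    if os.path.getsize(src_path) < offset:
        offset = 0
    with open(src_path, "rb") as src:
        src.seek(offset)
        chunk = src.read()
    if chunk:
        with open(dst_path, "ab") as dst:
            dst.write(chunk)
            dst.flush()
            os.fsync(dst.fileno())  # ask the OS to push the data to the device
    return offset + len(chunk)


def follow_forever(src_path, dst_path, interval=1.0):
    """Poll the source file and copy new data until interrupted."""
    offset = 0
    while True:
        offset = copy_new_bytes(src_path, dst_path, offset)
        time.sleep(interval)
```

Calling follow_forever("/var/log/syslog", destination) would start by copying the whole existing log and then follow it, where the destination is a path of your choosing. Polling once a second is an arbitrary choice; a shorter interval loses less on a crash at the cost of more writes.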
limetech Posted November 20, 2010 Author That's a beast of a system. Why do you have 3 of those SAS controllers in there?
flambot Posted November 20, 2010 Oh man, this is not good: another crash. I've been running Unraid for a long time now with no problems and suddenly I've got a problem. Me too.... Mine has run perfectly for 3 years, then after upgrading to 4.5.6 I started having issues. Something happens and you can't connect to the server by any means. I'm assuming I'm having a similar problem where the server has crashed, but I am also unable to grab a log. Thx Tom for the log advice.
gbdesai Posted November 20, 2010 That's a beast of a system. Why do you have 3 of those SAS controllers in there? My first system was one you hand-built for me. I upgraded that system to this new one, and I have the 3 cards to support up to 24 disks. Right now I am at 20 + parity. Thanks for the telnet tip. I'll have to set up a laptop to watch for that. BTW, I don't know if it makes a difference, but I had a disk die on restart this time. It was fine SMART-wise; I don't know what happened. But the slot works fine: I put another disk into it to rebuild and it's chugging away...
gbdesai Posted November 20, 2010 Oh man, this is not good: another crash. I've been running Unraid for a long time now with no problems and suddenly I've got a problem. Me too.... ... I just set up an old laptop with telnet to watch the log. Hopefully we can track this problem down soon.
SSD Posted November 20, 2010 I have been away for about a week and not keeping up. Can't believe all of the activity! Thanks, Tom, for your support of the 3rd-party efforts! Wanted to request that we upgrade hdparm and smartctl in the 4.6 release (apologies if it's already in there). New smartctl New hdparm Thanks Tom!
purko Posted November 20, 2010 Tom, would you kindly update the hdparm package (ftp://ftp.osuosl.org/pub/slackware/slackware-current/slackware/a/)?
gbdesai Posted November 20, 2010 May or may not be related to my crashing problem, but I now get regular resync notifications even though no resync is happening...
Subject:unRaid Resync Notification
Status update for unRAID Tower
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Status: The unRaid array is resync/rebuilding parity.
Parity CHECK/RESYNC in progress, 0% complete, est. finish in 0 minutes. Speed: 0 kb/s.
Server Name: Tower
Server IP: 192.168.111.3
Date: Sat Nov 20 10:47:07 PST 2010
Is there something in the system that makes the notification script think a resync is happening?
Joe L. Posted November 20, 2010 Yes, it will need to be modified to deal with the new data structure unRAID is presenting. I'm waiting for rc3 before making any changes, though.
limetech Posted November 20, 2010 Author The driver that fixes the kernel oops is the same code I'm using for 5.0-beta. There were some changes in the driver to output all information, e.g., for disks that are not present. It also outputs the resync variables even if no resync is taking place. This was done to make it easier to generate the PHP ini-style files for 5.0-beta. What I'm doing now is restoring the original driver used in 4.5.6 and adding the kernel oops fix to it. This will be 4.6-rc3. I will also put in the updates to hdparm & smartctl if you think this won't also break an EXISTING add-on.
Joe L. Posted November 20, 2010 I have no issue at all with the extra variables all being present. As for the "resync variables", some older scripts just need a tiny update: rather than assuming a resync is ongoing whenever the variables are present, they should also check that the percentage is greater than 0. Since many of these scripts are deployed through unMENU, once they are updated in the package manager users will have an easy time getting back the status messages they are used to seeing. I like the idea of the updated hdparm and smartctl. I know of nothing they will break. Joe L.
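The check described above (the resync variables are present, and the percentage is also greater than 0) could be sketched roughly as follows. This is only an illustration in Python, not the actual notification script: it assumes the driver status arrives as plain key=value lines, and the variable name mdResyncPrcnt is a placeholder that has not been confirmed against the driver output discussed in this thread.

```python
def parse_status(text):
    """Parse key=value lines into a dict, skipping lines without '='."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            status[key.strip()] = value.strip()
    return status


def resync_in_progress(status, percent_key="mdResyncPrcnt"):
    """Return True only if the percentage variable is present AND > 0.

    percent_key is a hypothetical name; substitute whatever variable
    the driver really reports for resync progress.
    """
    raw = status.get(percent_key)
    if raw is None:
        # Variable absent: older driver output with no resync running.
        return False
    try:
        return float(raw) > 0
    except ValueError:
        # Non-numeric value: treat as "no resync" rather than alerting.
        return False


if __name__ == "__main__":
    # Made-up sample status text, for illustration only.
    idle = "mdState=STARTED\nmdResyncPrcnt=0\n"
    busy = "mdState=STARTED\nmdResyncPrcnt=12.5\n"
    print(resync_in_progress(parse_status(idle)))
    print(resync_in_progress(parse_status(busy)))
```

One trade-off of this rule: a resync that has just started and still reports 0% would not trigger a notification until the percentage moves above zero.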
bubbaQ Posted November 20, 2010 "I like the idea of the updated hdparm and smartctl. I know of nothing they will break." Same here.
SSD Posted November 20, 2010 "I like the idea of the updated hdparm and smartctl. I know of nothing they will break." "Same here." I asked for it, so obviously I concur.
bcbgboy13 Posted November 22, 2010 "I will also put in the updates to hdparm & smartctl if you think this won't also break an EXISTING add-on." And perhaps add the new Memtest86+ 4.10 as well, since this will not change anything. I believe this one also fixes some additional testing/detection of ECC memory (it is not in the change log). At least it did for me. Many people may be using, or may start using, ECC memory, since it is a requirement for some of the newer Supermicro boards (the X8SIL-F, for example).