fredsherbet Posted August 4, 2010

Hi,

I'm having trouble with one of my unRAID servers. It stopped being able to boot, and seemed to freeze at random points during the start sequence, somewhere in the go script. Through trial and error, I've discovered it's emhttp that causes the system to freeze. I've been looking for information about what emhttp does, and what is likely to make it freeze the entire computer.

I am at a point where my go script contains nothing, and I can boot the system and log in. When I manually start emhttp, by logging in on the console and running

/usr/local/sbin/emhttp &

the system is fine for a few seconds and then freezes. What should I do next to resolve this problem?

Thanks!
Fred
Joe L. Posted August 4, 2010

Perform a memory test. Bad memory is as likely as anything to cause the system to freeze.

If it passes a memory test with no errors (preferably several passes, or overnight), then I'd comment out the emhttp line in the go script (put a leading "#" on the line with emhttp) and boot the server without it being involved. Then log in on one telnet session and type

tail -f /var/log/syslog

and in another type

/usr/local/sbin/emhttp &

Hopefully the syslog will have the clues you need to fix the cause of the crash.
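The two-session procedure above could also be sketched as a single helper that follows the log into a file while the command starts. This is only a sketch under stated assumptions: the function name capture_while, the 30-second window, and the output path /boot/emhttp-crash.log are hypothetical, not part of unRAID.

```shell
# capture_while LOGFILE OUTFILE SECONDS COMMAND [ARGS...]
# Follow LOGFILE into OUTFILE while COMMAND starts in the background.
# (Hypothetical helper; names and paths are illustrative assumptions.)
capture_while() {
    local log="$1" out="$2" secs="$3"
    shift 3
    # -n 0: copy only lines written from now on, not the existing log.
    tail -n 0 -f "$log" >> "$out" &
    local tail_pid=$!
    sleep 1              # give tail a moment to open the log
    "$@" &               # start the command being investigated
    sleep "$secs"        # keep capturing for a while
    sync                 # ask the kernel to flush the capture to disk
    kill "$tail_pid" 2>/dev/null
    wait "$tail_pid" 2>/dev/null
    return 0
}

# Example (assumed output path on the flash drive):
#   capture_while /var/log/syslog /boot/emhttp-crash.log 30 /usr/local/sbin/emhttp
```

If the machine locks up hard before the data is flushed, the captured file may be incomplete, so watching tail -f in a second telnet session as described above remains the more dependable way to see the last lines.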
fredsherbet Posted August 4, 2010

Thanks, I'll try tailing syslog while starting emhttp and report back.

I've done a memory test. I only let it run for a couple of hours, but the crash reliably happens within seconds of emhttp starting, so I'm reasonably certain bad RAM isn't the cause.
fredsherbet Posted August 4, 2010

Here's the output from syslog from the point I start emhttp to the point that the system freezes.

To be clear, what I mean by frozen is that no keyboard input has any effect. I can't toggle caps lock. However, the prompt cursor on the console continues flashing. The telnet sessions time out.

Aug 4 18:37:02 Yoda emhttp: unRAID System Management Utility version 4.5.6 Aug 4 18:37:02 Yoda emhttp: Copyright (C) 2005-2010, Lime Technology, LLC Aug 4 18:37:02 Yoda emhttp: Pro key detected, GUID: 13FE-3123-0799-090894453F91 Aug 4 18:37:02 Yoda emhttp: shcmd (1): udevadm settle Aug 4 18:37:02 Yoda emhttp: Device inventory: Aug 4 18:37:02 Yoda emhttp: pci-0000:00:09.0-scsi-0:0:0:0 host0 (sda) SAMSUNG_HD203WI_S1UYJ1CZ404434 Aug 4 18:37:02 Yoda emhttp: pci-0000:00:09.0-scsi-3:0:0:0 host3 (sdb) SAMSUNG_HD203WI_S1UYJ1CZ404437 Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0a.0-scsi-0:0:0:0 host4 (sdc) HDS725050KLA360_KRVN67ZBGT98RF Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0a.0-scsi-1:0:0:0 host5 (sdd) MAXTOR_STM3250820AS_6QE17JM7 Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0a.0-scsi-2:0:0:0 host6 (sde) WDC_WD20EARS-00S8B1_WD-WCAVY3601452 Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0a.0-scsi-3:0:0:0 host7 (sdf) SAMSUNG_HD501LJ_S0MUJ1NQ139472 Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0f.0-scsi-0:0:0:0 host9 (sdg) WDC_WD4000YS-01MPB0_WD-WMANU1402051 Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0f.0-scsi-1:0:0:0 host10 (sdh) Hitachi_HDS722020ALA330_JK1171YAGRH1XS Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0f.1-ide-0:0 ide0 (hda) ST3300620A_9QF2NWK0 Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0f.1-ide-0:1 ide0 (hdb) HDT722525DLAT80_VD241BT4CR8G2C Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0f.1-ide-1:0 ide1 (hdc) ST3200822A_3LJ0P1DD Aug 4 18:37:02 Yoda emhttp: pci-0000:00:0f.1-ide-1:1 ide1 (hdd) ST3300620A_9QF4BK44 Aug 4 18:37:02 Yoda emhttp: shcmd (2): modprobe -rw md-mod 2>&1 | logger Aug 4 18:37:02 Yoda emhttp: shcmd (3): modprobe md-mod 
super=/boot/config/super.dat slots=8,112,3,0,3,64,22,0,22,64,8,32,8,48,8,64,8,80,8,16,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 2>&1 | logger Aug 4 18:37:02 Yoda kernel: xor: automatically using best checksumming function: pIII_sse Aug 4 18:37:02 Yoda kernel: pIII_sse : 7336.800 MB/sec Aug 4 18:37:02 Yoda kernel: xor: using function: pIII_sse (7336.800 MB/sec) Aug 4 18:37:02 Yoda emhttp: Spinning up all drives... Aug 4 18:37:02 Yoda emhttp: shcmd (4): /usr/sbin/hdparm -S0 /dev/sdg >/dev/null Aug 4 18:37:02 Yoda kernel: md: unRAID driver 0.95.4 installed Aug 4 18:37:02 Yoda kernel: md: import disk0: [8,112] (sdh) Hitachi HDS72202 JK1171YAGRH1XS offset: 63 size: 1953514552 Aug 4 18:37:02 Yoda kernel: md: import disk1: [3,0] (hda) ST3300620A 9QF2NWK0 offset: 63 size: 293036152 Aug 4 18:37:02 Yoda kernel: md: import disk2: [3,64] (hdb) HDT722525DLAT80 VD241BT4CR8G2C offset: 63 size: 244198552 Aug 4 18:37:02 Yoda kernel: md: import disk3: [22,0] (hdc) ST3200822A 3LJ0P1DD offset: 63 size: 195360952 Aug 4 18:37:02 Yoda kernel: md: import disk4: [22,64] (hdd) ST3300620A 9QF4BK44 offset: 63 size: 293036152 Aug 4 18:37:02 Yoda kernel: md: import disk5: [8,32] (sdc) HDS725050KLA360 KRVN67ZBGT98RF offset: 63 size: 488386552 Aug 4 18:37:02 Yoda kernel: md: import disk6: [8,48] (sdd) MAXTOR STM325082 6QE17JM7 offset: 63 size: 244198552 Aug 4 18:37:02 Yoda kernel: md: import disk7: [8,64] (sde) WDC WD20EARS-00S WD-WCAVY3601452 offset: 63 size: 1953514552 Aug 4 18:37:02 Yoda kernel: md: import disk8: [8,80] (sdf) SAMSUNG HD501LJ S0MUJ1NQ139472 offset: 63 size: 488386552 Aug 4 18:37:02 Yoda kernel: md: import disk9: [8,16] (sdb) SAMSUNG HD203WI S1UYJ1CZ404437 offset: 63 size: 1953514552 Aug 4 18:37:02 Yoda kernel: md: import disk10: [8,0] (sda) SAMSUNG HD203WI S1UYJ1CZ404434 offset: 63 size: 1953514552 Aug 4 18:37:02 Yoda kernel: mdcmd (2): set md_num_stripes 1280 Aug 4 18:37:02 Yoda kernel: mdcmd (3): set md_write_limit 768 Aug 4 18:37:02 Yoda kernel: mdcmd (4): set 
md_sync_window 288 Aug 4 18:37:02 Yoda kernel: mdcmd (5): set spinup_group 0 0 Aug 4 18:37:02 Yoda kernel: mdcmd (6): set spinup_group 1 4 Aug 4 18:37:02 Yoda kernel: mdcmd (7): set spinup_group 2 2 Aug 4 18:37:02 Yoda kernel: mdcmd (8): set spinup_group 3 16 Aug 4 18:37:02 Yoda kernel: mdcmd (9): set spinup_group 4 8 Aug 4 18:37:02 Yoda kernel: mdcmd (10): set spinup_group 5 0 Aug 4 18:37:02 Yoda kernel: mdcmd (11): set spinup_group 6 0 Aug 4 18:37:02 Yoda kernel: mdcmd (12): set spinup_group 7 0 Aug 4 18:37:02 Yoda kernel: mdcmd (13): set spinup_group 8 0 Aug 4 18:37:02 Yoda kernel: mdcmd (14): set spinup_group 9 1024 Aug 4 18:37:02 Yoda kernel: mdcmd (15): set spinup_group 10 512 Aug 4 18:37:02 Yoda kernel: mdcmd (16): spinup 0 Aug 4 18:37:02 Yoda kernel: mdcmd (17): spinup 1 Aug 4 18:37:02 Yoda kernel: mdcmd (18): spinup 2 Aug 4 18:37:02 Yoda kernel: mdcmd (19): spinup 3 Aug 4 18:37:02 Yoda kernel: mdcmd (20): spinup 4 Aug 4 18:37:02 Yoda kernel: mdcmd (21): spinup 5 Aug 4 18:37:02 Yoda kernel: mdcmd (22): spinup 6 Aug 4 18:37:02 Yoda kernel: mdcmd (23): spinup 7 Aug 4 18:37:02 Yoda kernel: mdcmd (24): spinup 8 Aug 4 18:37:02 Yoda kernel: mdcmd (25): spinup 9 Aug 4 18:37:02 Yoda kernel: mdcmd (26): spinup 10 Aug 4 18:37:04 Yoda emhttp: shcmd (5): /usr/local/sbin/set_ncq sdh 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (6): /usr/local/sbin/set_ncq hda 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (7): /usr/local/sbin/set_ncq hdb 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (8): /usr/local/sbin/set_ncq hdc 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (9): /usr/local/sbin/set_ncq hdd 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (10): /usr/local/sbin/set_ncq sdc 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (11): /usr/local/sbin/set_ncq sdd 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (12): /usr/local/sbin/set_ncq sde 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (13): /usr/local/sbin/set_ncq sdf 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (14): 
/usr/local/sbin/set_ncq sdb 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (15): /usr/local/sbin/set_ncq sda 1 >/dev/null Aug 4 18:37:04 Yoda emhttp: shcmd (16): /usr/local/sbin/set_ncq sdg 1 >/dev/null Aug 4 18:37:04 Yoda kernel: mdcmd (28): start STOPPED Aug 4 18:37:04 Yoda kernel: unraid: allocating 59320K for 1280 stripes (11 disks) Aug 4 18:37:04 Yoda kernel: md1: running, size: 293036152 blocks Aug 4 18:37:04 Yoda kernel: md2: running, size: 244198552 blocks Aug 4 18:37:04 Yoda kernel: md3: running, size: 195360952 blocks Aug 4 18:37:04 Yoda kernel: md4: running, size: 293036152 blocks Aug 4 18:37:04 Yoda kernel: md5: running, size: 488386552 blocks Aug 4 18:37:04 Yoda kernel: md6: running, size: 244198552 blocks Aug 4 18:37:04 Yoda kernel: md7: running, size: 1953514552 blocks Aug 4 18:37:04 Yoda kernel: md8: running, size: 488386552 blocks Aug 4 18:37:04 Yoda kernel: md9: running, size: 1953514552 blocks Aug 4 18:37:04 Yoda kernel: md10: running, size: 1953514552 blocks Aug 4 18:37:04 Yoda kernel: ata6: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen Aug 4 18:37:04 Yoda kernel: ata6: SError: { PHYRdyChg } Aug 4 18:37:04 Yoda kernel: ata6: hard resetting link Aug 4 18:37:05 Yoda emhttp: shcmd (17): udevadm settle Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk3 Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk4 Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk5 Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk6 Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk7 Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk8 Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk9 Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk10 Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/cache Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk2 Aug 4 18:37:05 Yoda emhttp: shcmd (18): mkdir /mnt/disk1 Aug 4 18:37:05 Yoda emhttp: shcmd (19): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime 
/dev/md3 /mnt/disk3 2>&1 | logger Aug 4 18:37:05 Yoda emhttp: shcmd (20): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md5 /mnt/disk5 2>&1 | logger Aug 4 18:37:05 Yoda kernel: mdcmd (30): check Aug 4 18:37:05 Yoda kernel: md: recovery thread woken up ... Aug 4 18:37:05 Yoda kernel: md: recovery thread checking parity... Aug 4 18:37:05 Yoda emhttp: shcmd (21): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md6 /mnt/disk6 2>&1 | logger Aug 4 18:37:05 Yoda emhttp: shcmd (22): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md7 /mnt/disk7 2>&1 | logger Aug 4 18:37:05 Yoda emhttp: shcmd (23): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md8 /mnt/disk8 2>&1 | logger Aug 4 18:37:05 Yoda emhttp: shcmd (24): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md9 /mnt/disk9 2>&1 | logger Aug 4 18:37:05 Yoda emhttp: shcmd (25): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md10 /mnt/disk10 2>&1 | logger Aug 4 18:37:05 Yoda kernel: md: using 1152k window, over a total of 1953514552 blocks. 
Aug 4 18:37:05 Yoda emhttp: shcmd (26): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md2 /mnt/disk2 2>&1 | logger Aug 4 18:37:05 Yoda emhttp: shcmd (27): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md1 /mnt/disk1 2>&1 | logger Aug 4 18:37:05 Yoda emhttp: shcmd (28): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md4 /mnt/disk4 2>&1 | logger Aug 4 18:37:05 Yoda emhttp: shcmd (29): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/sdg1 /mnt/cache 2>&1 | logger Aug 4 18:37:05 Yoda kernel: REISERFS (device md7): found reiserfs format "3.6" with standard journal Aug 4 18:37:05 Yoda kernel: REISERFS (device md7): using ordered data mode Aug 4 18:37:05 Yoda kernel: REISERFS (device md9): found reiserfs format "3.6" with standard journal Aug 4 18:37:05 Yoda kernel: REISERFS (device md9): using ordered data mode Aug 4 18:37:05 Yoda kernel: REISERFS (device md10): found reiserfs format "3.6" with standard journal Aug 4 18:37:05 Yoda kernel: REISERFS (device md10): using ordered data mode Aug 4 18:37:05 Yoda kernel: REISERFS (device md3): found reiserfs format "3.6" with standard journal Aug 4 18:37:05 Yoda kernel: REISERFS (device md3): using ordered data mode Aug 4 18:37:05 Yoda kernel: REISERFS (device sdg1): found reiserfs format "3.6" with standard journal Aug 4 18:37:05 Yoda kernel: REISERFS (device sdg1): using ordered data mode Aug 4 18:37:05 Yoda kernel: REISERFS (device sdg1): journal params: device sdg1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 Aug 4 18:37:05 Yoda kernel: REISERFS (device sdg1): checking transaction log (sdg1) Aug 4 18:37:05 Yoda kernel: REISERFS (device sdg1): replayed 2 transactions in 0 seconds Aug 4 18:37:05 Yoda kernel: REISERFS (device sdg1): Using r5 hash to sort names Aug 4 18:37:05 Yoda kernel: REISERFS (device md10): journal params: 
device md10, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 Aug 4 18:37:05 Yoda kernel: REISERFS (device md10): checking transaction log (md10) Aug 4 18:37:05 Yoda kernel: REISERFS (device md9): journal params: device md9, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 Aug 4 18:37:05 Yoda kernel: REISERFS (device md9): checking transaction log (md9) Aug 4 18:37:05 Yoda kernel: REISERFS (device md7): journal params: device md7, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 Aug 4 18:37:05 Yoda kernel: REISERFS (device md7): checking transaction log (md7) Aug 4 18:37:05 Yoda kernel: REISERFS (device md3): journal params: device md3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 Aug 4 18:37:05 Yoda kernel: REISERFS (device md3): checking transaction log (md3) Aug 4 18:37:05 Yoda emhttp: shcmd (30): cp /var/spool/cron/crontabs/root- /var/spool/cron/crontabs/root Aug 4 18:37:05 Yoda emhttp: shcmd (31): echo '# Generated mover schedule:' >>/var/spool/cron/crontabs/root Aug 4 18:37:05 Yoda emhttp: shcmd (32): echo '40 3 * * * /usr/local/sbin/mover 2>&1 | logger' >>/var/spool/cron/crontabs/root Aug 4 18:37:05 Yoda emhttp: shcmd (33): crontab /var/spool/cron/crontabs/root Link to comment
Joe L. Posted August 4, 2010

Since I see this error in the log lines you posted, I'd disconnect the drive associated with ata6 and see whether it is the one causing the whole server to freeze.

Aug 4 18:37:04 Yoda kernel: ata6: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
Aug 4 18:37:04 Yoda kernel: ata6: SError: { PHYRdyChg }
Aug 4 18:37:04 Yoda kernel: ata6: hard resetting link

You'll need to look further back in the syslog to figure out which drive is affiliated with ata6.

Joe L.
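Looking further back in the syslog for the drive behind ata6 can be done with a grep. The helper below is a minimal sketch; the function name ata_lines is hypothetical. On many kernels the boot-time lines for a port include an entry such as "ata6.00:" carrying the drive's model string, but that format is an assumption about your kernel's output, not something shown in the log excerpt above.

```shell
# ata_lines PORT LOGFILE
# Print kernel messages for one libata port, e.g. "ata6:" and "ata6.00:".
# (Hypothetical helper name; the pattern assumes standard syslog prefixes.)
ata_lines() {
    port="$1"
    log="$2"
    grep -E "kernel: ata${port}(\.[0-9]+)?: " "$log"
}

# Example: show the earliest messages for port 6.
#   ata_lines 6 /var/log/syslog | head -n 20
```

Matching on "ata6" followed by an optional ".NN" and a colon keeps ata60 or ata16 from slipping into the results.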
fredsherbet Posted August 4, 2010

Thanks for the help tracking down the cause. Do you have any suggestions for what might be wrong with the drive, or how to find out?

Thanks a lot!
klipsch Posted August 4, 2010

Reconnect the drive, telnet in, don't run emhttp, and then run

smartctl -a -d ata /dev/drive

Example:

smartctl -a -d ata /dev/sda

You could awk out the parts you want and write the SMART report to a file as well, if you'd like to access it from Windows. Anything saved to /boot would be in \\tower\flash.

Example:

smartctl -a -d ata /dev/sda >/boot/emhttpdrive.txt

Hopefully the SMART report can give some details on the drive.
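Once the report is saved, a short filter can pull out the attribute rows that are commonly checked first. This is a sketch, not a diagnosis: the function name smart_warnings is hypothetical, and the choice of these four attributes is a judgment call. Not every drive reports all of them.

```shell
# smart_warnings REPORT
# Print SMART attribute rows often checked for failing sectors or
# cabling trouble. (Hypothetical helper; attribute list is a judgment call.)
smart_warnings() {
    grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count' "$1"
}

# Example, using the file written by the command above:
#   smartctl -a -d ata /dev/sda >/boot/emhttpdrive.txt
#   smart_warnings /boot/emhttpdrive.txt
```

The last column of each printed row is the raw value. Non-zero raw values for the sector attributes usually point at the disk itself, while a growing UDMA_CRC_Error_Count more often suggests a cable or connector problem.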