October 2, 2010
I am having a problem where I cannot get the array to start on my unRAID system. When the system boots, the configuration is seen as valid, but the array refuses to start. The issue started when I upgraded my CPU but evidently did not have the heatsink on tight enough. The system rebooted in the middle of a parity check and restarted several times during the boot process, so I am not sure whether something on the flash drive has become corrupt. I assume this is the interesting part of the syslog:

Oct 2 19:32:39 Backup emhttp: shcmd (22): /usr/local/sbin/set_ncq sde 1 >/dev/null
Oct 2 19:32:39 Backup emhttp: shcmd (23): /usr/local/sbin/set_ncq sdc 1 >/dev/null
Oct 2 19:32:39 Backup emhttp: shcmd (24): /usr/local/sbin/set_ncq hda 1 >/dev/null
Oct 2 19:32:39 Backup emhttp: shcmd (25): /usr/local/sbin/set_ncq hdb 1 >/dev/null
Oct 2 19:32:39 Backup emhttp: shcmd (26): /usr/local/sbin/set_ncq hdc 1 >/dev/null
Oct 2 19:32:40 Backup emhttp: shcmd (27): /usr/local/sbin/set_ncq hdd 1 >/dev/null
Oct 2 19:32:40 Backup emhttp: shcmd (28): /usr/local/sbin/set_ncq sdf 1 >/dev/null
Oct 2 19:32:40 Backup emhttp: shcmd (29): /usr/local/sbin/set_ncq sdb 1 >/dev/null
Oct 2 19:32:40 Backup emhttp: shcmd (30): /usr/local/sbin/set_ncq sda 1 >/dev/null
Oct 2 19:32:40 Backup emhttp: shcmd (31): /usr/local/sbin/set_ncq sdd 1 >/dev/null
Oct 2 19:32:40 Backup kernel: mdcmd (31): start STOPPED
Oct 2 19:32:40 Backup kernel: md: do_run: lock_rdev error: -6
Oct 2 19:32:40 Backup emhttp: shcmd (32): rm /etc/samba/smb-shares.conf >/dev/null 2>&1
Oct 2 19:32:40 Backup emhttp: shcmd (33): cp /etc/exports- /etc/exports
Oct 2 19:32:40 Backup emhttp: shcmd (34): killall -HUP smbd
Oct 2 19:32:40 Backup emhttp: shcmd (35): /etc/rc.d/rc.nfsd restart | logger

I have also attached the full syslog for review. Where do I go from here?
syslog-2010-10-02.zip
October 2, 2010
Is your IDE Maxtor disk still working fine? The error from lock_rdev() looks to me like the issue, although I have no idea what error code -6 means.

Oct 2 19:32:21 Backup emhttp: Device inventory:
Oct 2 19:32:21 Backup emhttp: pci-0000:00:1f.1-ide-0:0 ide0 (hda) ST3300831A_3NF01BZB
Oct 2 19:32:21 Backup emhttp: pci-0000:00:1f.1-ide-0:1 ide0 (hdb) Maxtor_6L300R0_L627Z3JG
Oct 2 19:32:21 Backup emhttp: pci-0000:00:1f.1-ide-1:0 ide1 (hdc) WDC_WD2000JB-32FUA0_WD-WMAEP1092479
Oct 2 19:32:21 Backup emhttp: pci-0000:00:1f.1-ide-1:1 ide1 (hdd) ST3300831A_5NF1JR0Q
Oct 2 19:32:21 Backup emhttp: pci-0000:00:1f.2-scsi-0:0:0:0 host9 (sde) WDC_WD10EADS-00L5B1_WD-WCAU49187988
Oct 2 19:32:21 Backup emhttp: pci-0000:00:1f.2-scsi-1:0:0:0 host10 (sdf) WDC_WD10EACS-00ZJB0_WD-WCASJ0939774
Oct 2 19:32:21 Backup emhttp: pci-0000:02:0b.0-scsi-0:0:0:0 host4 (sda) WDC_WD10EAVS-00D7B0_WD-WCAU41073998
Oct 2 19:32:21 Backup emhttp: pci-0000:02:0b.0-scsi-1:0:0:0 host5 (sdb) WDC_WD10EACS-00D6B1_WD-WCAU43963169
Oct 2 19:32:21 Backup emhttp: pci-0000:02:0b.0-scsi-2:0:0:0 host6 (sdc) Maxtor_7H500F0_H819VMCH
Oct 2 19:32:21 Backup emhttp: pci-0000:02:0b.0-scsi-3:0:0:0 host7 (sdd) WDC_WD10EAVS-00D7B0_WD-WCAU41173764
Oct 2 19:32:21 Backup emhttp: get_fstype: open /dev/hdb1: No such file or directory
Oct 2 19:32:40 Backup kernel: mdcmd (31): start STOPPED
Oct 2 19:32:40 Backup kernel: md: do_run: lock_rdev error: -6
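On the question of what -6 means: Linux kernel code conventionally returns negated errno values, and errno 6 on Linux is ENXIO ("No such device or address"). The sketch below pulls the code out of the syslog line and looks up its symbolic name. It assumes that the md driver's lock_rdev follows that convention, which the thread does not confirm; the helper name `decode_kernel_error` is made up for illustration.

```python
import errno
import re

# Assumption: the number after "error:" is a negated Linux errno value.
ERROR_RE = re.compile(r"error: (-?\d+)")


def decode_kernel_error(log_line):
    """Return (code, errno_name) for a syslog line, or None if it has no error code."""
    match = ERROR_RE.search(log_line)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps positive errno numbers to names such as "ENXIO".
    return code, errno.errorcode.get(abs(code), "UNKNOWN")


line = "Oct 2 19:32:40 Backup kernel: md: do_run: lock_rdev error: -6"
print(decode_kernel_error(line))
```

If that reading is right, it would be consistent with the earlier "open /dev/hdb1: No such file or directory" message, since both point at a device node that could not be found.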
October 2, 2010 Author
> Is your IDE Maxtor disk still working fine? The error from lock_rdev() looks to me like the issue, although I have no idea what error code -6 means.
Everything shows up as "green" in the unRAID interface. I had no problems with the system before I upgraded the CPU.
October 2, 2010
>> Is your IDE Maxtor disk still working fine? The error from lock_rdev() looks to me like the issue, although I have no idea what error code -6 means.
> Everything shows up as "green" in the unRAID interface. I had no problems with the system before I upgraded the CPU.
That holds only if the device driver is still working the way it should. For a disk such as hdb, hdb1 is the first partition on that disk. If unRAID cannot open hdb1, it looks to me as though there is a problem accessing that partition. You can double-check the other IDE disks in your system under the /dev directory.
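The check described above (does each IDE disk have a first-partition node under /dev?) can be sketched in a few lines. This is only an illustration: the function name `missing_first_partitions` is hypothetical, and the disk names are the four IDE devices listed in the device inventory earlier in the thread.

```python
import os


def missing_first_partitions(disks, dev_dir="/dev"):
    """Return the disks whose first-partition node (e.g. hdb1) is absent from dev_dir."""
    return [
        disk
        for disk in disks
        if not os.path.exists(os.path.join(dev_dir, disk + "1"))
    ]


if __name__ == "__main__":
    # IDE disks taken from the device inventory in the syslog excerpt.
    ide_disks = ["hda", "hdb", "hdc", "hdd"]
    print(missing_first_partitions(ide_disks))
```

On a healthy system the list would be empty; a disk appearing in it matches the "open /dev/hdb1: No such file or directory" symptom.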
October 2, 2010 Author
How do I check the other drives? Is there a command I can use to check hdb1 for file system problems? I have minimal Linux knowledge beyond some basic commands, but I am good at following instructions.
October 2, 2010
> How do I check other drives?
Log in to the system and, at the console, type "ls -l /dev/hd*". In the output, do you see hda1, hdc1, and hdd1? How about hdb1?
> Is there a command I can use to check hdb1 to ensure no issues with the file system?
The relationship is disk -> partition(s) -> file system(s). You can run a read-only file system check on hdb by following the instructions at the link below. hdb is imported as md3, so use /dev/md3 in your check.
http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems
Oct 2 19:32:21 Backup kernel: md: import disk3: [3,64] (hdb) Maxtor 6L300R0 L627Z3JG offset: 63 size: 293057320
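The step from "hdb" to "/dev/md3" comes from the "md: import disk3" line in the syslog. A rough sketch of that lookup is shown here. It assumes that "diskN" in the import line corresponds to /dev/mdN, as the post above states; the regular expression and the function name `md_device_for` are illustrative only.

```python
import re

# Matches lines such as: "md: import disk3: [3,64] (hdb) Maxtor ..."
IMPORT_RE = re.compile(r"md: import disk(\d+): \[\d+,\d+\] \((\w+)\)")


def md_device_for(syslog_lines, disk):
    """Return the /dev/mdN path for a disk name, based on md import lines, or None."""
    for line in syslog_lines:
        match = IMPORT_RE.search(line)
        if match and match.group(2) == disk:
            # Assumption: slot number N in "diskN" maps to /dev/mdN.
            return "/dev/md" + match.group(1)
    return None


log = [
    "Oct 2 19:32:21 Backup kernel: md: import disk3: [3,64] (hdb) "
    "Maxtor 6L300R0 L627Z3JG offset: 63 size: 293057320",
]
print(md_device_for(log, "hdb"))
```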
October 3, 2010 Author
I turned the system on to try the commands you provided, and it booted up and started a parity check... then the system powered off after a couple of minutes. So I swapped the CPU for a different one, and everything seems to be running fine now. I'll let the parity check run for a couple of hours before celebrating, though. Thanks for the help. CPUs can cause strange issues.
October 3, 2010
> I turned the system on to try the commands you provided, and it booted up and started a parity check... then the system powered off after a couple of minutes. So I swapped the CPU for a different one, and everything seems to be running fine now. I'll let the parity check run for a couple of hours before celebrating, though. Thanks for the help. CPUs can cause strange issues.
It could just as well be memory, since unRAID runs from a RAM file system. I suggest running a memtest as well.
October 3, 2010
> The issue started when I upgraded my CPU, but evidently did not have the heatsink on tight enough. The system rebooted while in the middle of a parity check and restarted several times during the boot process, ...
> I turned the system on to try the commands you provided, and it booted up and started a parity check... then the system powered off after a couple of minutes. So I swapped the CPU for a different one, and everything seems to be running fine now. I'll let the parity check run for a couple of hours before celebrating, though. Thanks for the help. CPUs can cause strange issues.
If the heatsink really was not properly attached initially, the CPU could have suffered some damage.
October 3, 2010
> The issue started when I upgraded my CPU, but evidently did not have the heatsink on tight enough. The system rebooted while in the middle of a parity check and restarted several times during the boot process, ...
> I turned the system on to try the commands you provided, and it booted up and started a parity check... then the system powered off after a couple of minutes. So I swapped the CPU for a different one, and everything seems to be running fine now. I'll let the parity check run for a couple of hours before celebrating, though. Thanks for the help. CPUs can cause strange issues.
Or, it might still not be on correctly...
> If the heatsink really was not properly attached initially, the CPU could have suffered some damage.
October 3, 2010
> If the heatsink really was not properly attached initially, the CPU could have suffered some damage.
It could, but nowadays most motherboards have a built-in facility that automatically shuts the system down when the CPU temperature crosses a certain threshold, and this threshold is usually well below the CPU vendor's spec. You can usually find this setting in the BIOS.
October 3, 2010
>> If the heatsink really was not properly attached initially, the CPU could have suffered some damage.
> It could, but nowadays most motherboards have a built-in facility that automatically shuts the system down when the CPU temperature crosses a certain threshold, and this threshold is usually well below the CPU vendor's spec. You can usually find this setting in the BIOS.
Yes, and that is exactly the symptom the user is describing.
October 3, 2010
> Yes, and that is exactly the symptom the user is describing.
The system repeatedly powering off and on is the symptom of this auto-shutdown. However, I am more interested to know why there is no /dev/hdb1, although strange side effects can happen when a CPU is overheating. I usually stay in the BIOS for a while every time I change hardware components, to make sure I am comfortable with readings such as CPU temperature, voltage, and fan speed.
October 3, 2010
This message seems to indicate that unRAID was unable to get exclusive use of one of the /dev/sdX devices. I have absolutely no idea what would cause that, other than some kind of hardware issue. Do you have a disk controller configured as a RAID controller rather than acting as just a bunch of disks?
Joe L.
October 4, 2010
> It could, but nowadays most motherboards have a built-in facility that automatically shuts the system down when the CPU temperature crosses a certain threshold, and this threshold is usually well below the CPU vendor's spec. You can usually find this setting in the BIOS.
Indeed, and as Joe points out, the symptoms could indicate that a thermal problem still exists. However, I wonder whether the auto-shutdown would really react quickly enough if there is no heat sink capacity at all?
October 4, 2010
> However, I wonder whether the auto-shutdown would really react quickly enough if there is no heat sink capacity at all?
Given that the OP mentioned s/he can boot the whole system as far as the parity check, that looks long enough, and heat accumulation is not that quick. That the system starts to fail at the parity check is also understandable, because that is the busiest time in unRAID and CPU utilization has picked up.
October 4, 2010
When I built the system that is in my signature, I forgot to plug in the fan mounted on the CPU, and within a matter of minutes it would shut off. On another PC I have, when I put the fan/heatsink on the processor, it somehow did not clip down, and the machine would shut down within seconds of boot. Don't ask me how or why I didn't check it during the CPU install. Yeah, I look like an idiot admitting to faults with two PCs, but out of dozens of machines, at least the only two problems I've ever had were my own.
October 4, 2010 Author
I didn't check these forums yesterday; sorry for not responding to everyone's questions. To answer Joe: no, there are no RAID controllers in the system; all are standard PCI SATA boards. This is an older system with no PCIe option. After digging into the issue, I believe the problem is due to a BIOS incompatibility with the processor I was using. I have an older ASUS board and was attempting to use a Pentium D 820 CPU. According to ASUS's website, this is not a supported CPU for the board I am using. I dropped down to a Pentium 640 and everything seems to be working fine. What is weird is that the BIOS knew exactly what CPU I was using, but unmenu showed only 1 CPU. Thanks to everyone for the help in narrowing down the cause. Hopefully this thread will help someone else with a similar problem in the future. As with most of my unRAID issues, this one was self-inflicted.
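For anyone wanting to cross-check the CPU count outside unmenu: on Linux, /proc/cpuinfo lists one "processor" entry per logical CPU the kernel has brought up, so a dual-core part such as the Pentium D would normally be expected to show two. The sketch below counts those entries. The function name `count_logical_cpus` is illustrative, and it is only an assumption that unmenu derives its figure from the same source.

```python
def count_logical_cpus(cpuinfo_text):
    """Count the 'processor : N' entries in the text of /proc/cpuinfo."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as cpuinfo:
            print(count_logical_cpus(cpuinfo.read()))
    except OSError:
        # Not a Linux system, or /proc is not mounted.
        print("/proc/cpuinfo is not available on this system")
```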