gbdesai Posted September 26, 2011

I concur, I have 3 of those cards supporting 22 drives total, 1 parity (3TB), 20 array drives (18x2TB, 2x3TB), and 1 (1.5TB) cache drive, with no problems on 5b12 so far (been running for at least 4 weeks...

I am running 9 (Nine) 3TB drives using this card, and 5B12 with no issues of any kind. So maybe your issues start with other hardware choices you made. After I ran pre_clear I did have a couple of issues with one drive. It was showing 723TB free at one point, I wish I had taken a screen shot. I did take one and posted it here when it was showing much less than that, but well over the 3TB capacity of the drive. But I have been up and running for a couple of weeks now, and all is good on the home front.

@Lars Olof, don't be sorry for the 'rant' .. I can take it ;-) I didn't start with 4.7 as I wanted to start with 3TB drives, but along the road I came to the conclusion that my "Supermicro 8-Port SAS/SATA Cards" (3xAOC-SASLP-MV8) don't support >2TB. I am fully aware of the BETA stages and I respect them, but let's say 4.7 works without the "BLK_NOT_HANDLED" issue, why would this be an issue in the new beta series (5bxx)? We perform regression tests to keep the current functional design operational; not sure how that works for Lime. We will wait and see.
cyrnel Posted September 26, 2011

"I concur, I have 3 of those cards supporting 22 drives total, 1 parity (3TB), 20 array drives (18x2TB,2x3TB), and 1 (1.5TB) cache drive, with no problems on 5b12 so far (been running for at least 4 weeks..."

Have any of those drives thrown errors? If not - if the drives are all behaving perfectly - then this is something of a false negative.
mtruffa Posted September 26, 2011

I am having trouble with the PowerDown. It seems like the system is hanging on the powerdown. It does not matter if it is a cron job or a manual powerdown from unmenu. I am able to access \\tower:8081 -sickbeard and \\tower:8082 - sabnzbd but cannot access \\tower or \\tower:8080. I can access the shares just to the gui's. It is set to powerdown at 6:00 am, it runs the script but never powers down. I have attached a syslog.

Mike

Does it power down without problems if you do not run sickbeard and sabnzbd (and any other extra software)?

No, I shut down both and it still will not power down. I even tried rc.unRAID stop and I get the following.

root@Tower:/etc/rc.d# rc.unRAID stop
Capturing information to syslog. Please wait...
version[23578]: Linux version 3.0.3-unRAID (root@Develop) (gcc version 4.4.4 (GCC) ) #7 SMP Fri Sep 2 16:44:33 MDT 2011
ls: cannot access /dev/hd[a-z]: No such file or directory

and then it hangs. When it was working I would get this but it would power down.

Mike

Now I am getting errors when trying to access a share. I will be 1/2 way through something and it will disconnect. I found the following in the syslog.

Permission denied
Sep 26 09:20:10 Tower cnid_dbd[22459]: main: fatal db lock error
Sep 26 09:20:10 Tower afpd[22127]: read: Connection reset by peer
Sep 26 09:20:10 Tower afpd[22127]: transmit: Request to dbd daemon (db_dir /mnt/user/TV) timed out.
Sep 26 09:20:10 Tower afpd[22127]: ===============================================================
Sep 26 09:20:10 Tower afpd[22127]: INTERNAL ERROR: Signal 11 in pid 22127 (2-2-0-p6)
Sep 26 09:20:10 Tower afpd[22127]: ===============================================================
Sep 26 09:20:10 Tower afpd[22127]: BACKTRACE: 10 stack frames:
Sep 26 09:20:10 Tower afpd[22127]: #0 /usr/sbin/afpd(netatalk_panic+0x2d) [0x8097dfd]
Sep 26 09:20:10 Tower afpd[22127]: #1 /usr/sbin/afpd() [0x8097f6d]
Sep 26 09:20:10 Tower afpd[22127]: #2 [0xb7897400]
Sep 26 09:20:10 Tower afpd[22127]: #3 /usr/sbin/afpd(dir_add+0x453) [0x8061873]
Sep 26 09:20:10 Tower afpd[22127]: #4 /usr/sbin/afpd() [0x806529d]
Sep 26 09:20:10 Tower afpd[22127]: #5 /usr/sbin/afpd(afp_over_dsi+0x55e) [0x8054aee]
Sep 26 09:20:10 Tower afpd[22127]: #6 /usr/sbin/afpd() [0x8053c55]
Sep 26 09:20:10 Tower afpd[22127]: #7 /usr/sbin/afpd(main+0x660) [0x8070bd0]
Sep 26 09:20:10 Tower afpd[22127]: #8 /lib/libc.so.6(__libc_start_main+0xe6) [0xb7480b86]
Sep 26 09:20:10 Tower afpd[22127]: #9 /usr/sbin/afpd() [0x8053a51]
Sep 26 09:26:40 Tower cnid_dbd[23201]: Set syslog logging to level: LOG_NOTE

I have attached syslog also.

Mike

syslog.txt
mtruffa Posted September 27, 2011

While looking at the powerdown file I noticed something:

#!/bin/bash
alias logger="/usr/bin/logger -is -plocal7.info -tpowerdown"
if [ ${DEBUG:=0} -gt 0 ]
then set -x -v
fi
if [ -z "${1}" ]
then OPT="-h"
else OPT="${1}"
fi
logger "Powerdown initiated"
if [ -f /var/run/powerdown.pid ]
then logger "Powerdown already active, this one is exiting"
     exit
else echo $$ > /var/run/powerdown.pid
fi
trap "rm -f /var/run/powerdown.pid" EXIT HUP INT QUIT
/etc/rc.d/rc.unRAID stop
# /sbin/poweroff
logger "Initiating Shutdown with ${1}"
/sbin/shutdown -t5 ${OPT} now

When I look into /var/run/ there is no file called powerdown.pid. Could this be the problem why it is not powering down?

Mike
Joe L. Posted September 27, 2011

No. It is not the reason.

The file is tested for in these lines in the script, and if it exists, the script does not continue as it is already running. If it does not exist, the file is created, and then the script continues. If you attempted to run the same command while the first is still running, it would find the file and exit.

if [ -f /var/run/powerdown.pid ]
then logger "Powerdown already active, this one is exiting"
     exit
else echo $$ > /var/run/powerdown.pid
fi

This is the line that creates the file when the powerdown script is in progress to prevent you from running a second instance of it at the same time:

echo $$ > /var/run/powerdown.pid

Joe L.
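For anyone who wants to see the lock-file behaviour Joe L. describes without touching the real powerdown script, here is a minimal sketch. It is not the add-on's code: the `LOCK` path and the `acquire_lock` / `release_lock` helper names are made up for illustration, and the lock lives in a temporary location rather than /var/run. Note that checking with `-f` and then writing the file is not atomic, so this only guards against casual double runs.

```shell
#!/bin/bash
# Sketch of a PID-file guard. LOCK is a hypothetical demo path,
# not the real /var/run/powerdown.pid.
LOCK="${TMPDIR:-/tmp}/powerdown-demo.$$.pid"

acquire_lock() {
    # Refuse to continue if the lock file already exists.
    if [ -f "$LOCK" ]; then
        echo "already active"
        return 1
    fi
    # Otherwise record our PID in the lock file and carry on.
    echo $$ > "$LOCK"
    echo "lock taken"
}

release_lock() {
    # The quoted script does this from a trap on EXIT/HUP/INT/QUIT.
    rm -f "$LOCK"
}

first=$(acquire_lock)             # no lock file yet, so it is created
second=$(acquire_lock) || true    # lock file exists, second run backs off
release_lock                      # simulate the first run finishing
third=$(acquire_lock)             # lock is free again
release_lock

echo "$first / $second / $third"
```

Because the quoted script removes the file from its trap whenever it exits, not finding powerdown.pid in /var/run after the fact is what you would expect to see.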
dgaschk Posted September 27, 2011

Powerdown is an add-on. Please take this discussion to the User Customizations forum.
tr0910 Posted September 27, 2011

I have a b12a server starting the testing process. Just preclearing 3TB drives now. Sorry, no problems to report yet.

Supermicro X9SCM-F
Intel i3-2100 and 4 gig RAM
Supermicro AOC-SASLP-MV8 8-Port
all in a Norco 4224

But I do have a question. With ver 5, how large a drive will unRaid support? 4TB drives are just announced, and 5TB drives are likely soon after that. When do we hit the next wall requiring major software retooling like is going on right now with the ver 5 betas....
prostuff1 Posted September 27, 2011

"I have a b12a server starting the testing process. Just preclearing 3 tb drives now. Sorry, no problems to report yet. But I do have a question. Once we get past ver 5 beta, how large a drive will unRaid support? 4tb drives are just announced, and 5tb drives are likely soon after that. When do we hit the next wall requiring major software retooling like is going on right now with the ver 5 betas...."

This probably should have been asked in the Lounge, but since I can't move it there I will say that there is nothing to worry about... version 5 of unRAID uses the GPT partition table (http://en.wikipedia.org/wiki/GUID_Partition_Table) for drives above 2.2TB.
tr0910 Posted September 27, 2011

Great, that is equivalent to 4 billion 2TB drives. Not in my lifetime am I likely to have to worry about it....

This reminds me of: "640K should be enough for anybody" - Bill Gates 1981

http://en.wikiquote.org/wiki/Bill_Gates
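The "2.2TB" and "4 billion" figures above can be checked with a little arithmetic. This is only a back-of-the-envelope sketch and it assumes 512-byte sectors: an MBR partition table stores sector addresses in 32 bits, while GPT uses 64-bit addresses. The variable names are just for illustration.

```shell
#!/bin/bash
# Assumption: 512-byte logical sectors.
sector=512

# MBR: 32-bit sector addresses -> 2^32 sectors addressable.
mbr_max=$(( (1 << 32) * sector ))

# GPT: 64-bit sector addresses. 2^64 itself overflows bash's signed
# 64-bit integers, so compute the ratio 2^64 / 2^32 = 2^32 instead.
ratio=$(( 1 << 32 ))

echo "MBR limit: $mbr_max bytes"        # about 2.2 TB
echo "GPT is larger by a factor of: $ratio"
```

So the GPT ceiling works out to roughly 4.29 billion times the MBR one, which is where the "4 billion 2TB drives" remark comes from.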
tyrindor Posted September 27, 2011

I'm on Beta12a, and I just upgraded to another 4x 3TB, this time the SATA3 EZRX models. Precleared all drives using v1.13 of the preclear script, no errors on any. Added them to the array one at a time. Everything went fine.

Restarted my server, started the array, and one of the disks just sits at "resizing" even though the array has been up for over a day. This feels like a rather critical bug that may only be affecting the SATA3 3TB drives, but my knowledge of this stuff is slim. The drive isn't shown in my share in Windows, and doesn't seem to be exporting. Yet I can access all my other drives, and there is no red dot next to any drive. I would assume this would throw parity off, causing parity sync issues, resulting in complete corruption of rebuilt data? This is pretty scary stuff.

I think this kernel error is the cause, but I'm not sure. Full system log is attached.

UPDATE: Stopping the array hung the entire server. I let it sit there for an hour before improperly restarting the server. I did a cold boot, and everything seems to be working now - but this did happen for a reason and it seems like a software bug, not a hardware bug. This drive was completely empty with just an empty share folder in it.
Sep 22 09:09:28 SERVER kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
Sep 22 09:09:28 SERVER kernel: IP: [] queue_delayed_work_on+0x33/0xbf
Sep 22 09:09:28 SERVER kernel: *pdpt = 000000002fbee001 *pde = 0000000000000000
Sep 22 09:09:28 SERVER kernel: Oops: 0000 [#1] SMP
Sep 22 09:09:28 SERVER kernel: Modules linked in: md_mod xor sata_mv e1000e i2c_i801 i2c_core [last unloaded: md_mod]
Sep 22 09:09:28 SERVER kernel:
Sep 22 09:09:28 SERVER kernel: Pid: 2170, comm: emhttp Not tainted 3.0.3-unRAID #7 Supermicro X7SB4/E/X7SB4/E
Sep 22 09:09:28 SERVER kernel: EIP: 0060:[] EFLAGS: 00210246 CPU: 1
Sep 22 09:09:28 SERVER kernel: EIP is at queue_delayed_work_on+0x33/0xbf
Sep 22 09:09:28 SERVER kernel: EAX: f8a2c138 EBX: ffffffff ECX: f8a2c134 EDX: 00000000
Sep 22 09:09:28 SERVER kernel: ESI: 00000000 EDI: f8a2c134 EBP: ec8fbe40 ESP: ec8fbe34
Sep 22 09:09:28 SERVER kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Sep 22 09:09:28 SERVER kernel: Process emhttp (pid: 2170, ti=ec8fa000 task=f0092f40 task.ti=ec8fa000)
Sep 22 09:09:28 SERVER kernel: Stack:
Sep 22 09:09:28 SERVER kernel: f8a1c000 f0370f54 f0370f00 ec8fbe4c c1038c8e 0000000a ec8fbeb0 c10d87fe
Sep 22 09:09:28 SERVER kernel: f8a1c000 00000012 00000000 f6c42a10 00000000 03b26a10 00000000 00000010
Sep 22 09:09:28 SERVER kernel: f0370f18 00000012 00000004 f8a1c000 0000004b 00000000 f7baa000 f0f841e0
Sep 22 09:09:28 SERVER kernel: Call Trace:
Sep 22 09:09:28 SERVER kernel: [] queue_delayed_work+0x1b/0x1e
Sep 22 09:09:28 SERVER kernel: [] do_journal_end+0x747/0x92a
Sep 22 09:09:28 SERVER kernel: [] journal_end_sync+0x5b/0x63
Sep 22 09:09:28 SERVER kernel: [] reiserfs_sync_fs+0x32/0x51
Sep 22 09:09:28 SERVER kernel: [] __sync_filesystem+0x53/0x65
Sep 22 09:09:28 SERVER kernel: [] sync_filesystem+0x2c/0x3f
Sep 22 09:09:28 SERVER kernel: [] do_remount_sb+0x4c/0xd3
Sep 22 09:09:28 SERVER kernel: [] do_remount+0x74/0xc6
Sep 22 09:09:28 SERVER kernel: [] do_mount+0x10b/0x1c6
Sep 22 09:09:28 SERVER kernel: [] sys_mount+0x61/0x94
Sep 22 09:09:28 SERVER kernel: [] syscall_call+0x7/0xb
Sep 22 09:09:28 SERVER kernel: [] ? quirk_usb_disable_ehci+0x84/0x129
Sep 22 09:09:28 SERVER kernel: Code: d6 53 89 c3 f0 0f ba 29 00 19 d2 31 c0 85 d2 0f 85 9d 00 00 00 83 79 10 00 74 04 0f 0b eb fe 8d 41 04 39 41 04 74 04 0f 0b eb fe 06 02 b8 08 00 00 00 75 19 89 c8 e8 02 e5 ff ff 85 c0 74 08
Sep 22 09:09:28 SERVER kernel: EIP: [] queue_delayed_work_on+0x33/0xbf SS:ESP 0068:ec8fbe34
Sep 22 09:09:28 SERVER kernel: CR2: 0000000000000000
Sep 22 09:09:28 SERVER kernel: ---[ end trace f8be7fa413f5555b ]---
Sep 22 09:09:28 SERVER kernel: REISERFS (device md17): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md18): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md13): Using r5 hash to sort names
Sep 22 09:09:28 SERVER emhttp: shcmd (72): chmod 770 '/mnt/disk17'
Sep 22 09:09:28 SERVER kernel: REISERFS (device md19): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md2): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md6): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md9): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md16): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md4): Using r5 hash to sort names
Sep 22 09:09:28 SERVER emhttp: shcmd (73): chown nobody:users '/mnt/disk17'
Sep 22 09:09:28 SERVER kernel: REISERFS (device md14): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md3): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md11): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md12): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md1): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md8): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md5): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md15): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md10): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md20): Using r5 hash to sort names
Sep 22 09:09:28 SERVER emhttp: shcmd (74): chmod 770 '/mnt/disk18'
Sep 22 09:09:28 SERVER emhttp: shcmd (75): chown nobody:users '/mnt/disk18'
Sep 22 09:09:28 SERVER emhttp: shcmd (76): chmod 770 '/mnt/disk13'
Sep 22 09:09:28 SERVER emhttp: shcmd (77): chmod 770 '/mnt/disk2'
Sep 22 09:09:28 SERVER emhttp: shcmd (78): chmod 770 '/mnt/disk9'
Sep 22 09:09:28 SERVER emhttp: shcmd (79): chmod 770 '/mnt/disk14'
Sep 22 09:09:28 SERVER emhttp: shcmd (80): chown nobody:users '/mnt/disk13'
Sep 22 09:09:28 SERVER emhttp: shcmd (81): chmod 770 '/mnt/disk8'
Sep 22 09:09:28 SERVER emhttp: shcmd (82): chown nobody:users '/mnt/disk14'
Sep 22 09:09:28 SERVER emhttp: shcmd (83): chown nobody:users '/mnt/disk8'
Sep 22 09:09:28 SERVER emhttp: shcmd (84): chown nobody:users '/mnt/disk2'
Sep 22 09:09:28 SERVER emhttp: shcmd (85): chown nobody:users '/mnt/disk9'
Sep 22 09:09:28 SERVER emhttp: shcmd (86): chmod 770 '/mnt/disk16'
Sep 22 09:09:28 SERVER emhttp: shcmd (87): chmod 770 '/mnt/disk1'
Sep 22 09:09:28 SERVER emhttp: shcmd (89): chmod 770 '/mnt/disk3'
Sep 22 09:09:28 SERVER emhttp: shcmd (90): chmod 770 '/mnt/disk4'
Sep 22 09:09:28 SERVER emhttp: shcmd (88): chmod 770 '/mnt/disk5'
Sep 22 09:09:28 SERVER emhttp: shcmd (91): chmod 770 '/mnt/disk20'
Sep 22 09:09:28 SERVER emhttp: shcmd (92): chown nobody:users '/mnt/disk16'
Sep 22 09:09:28 SERVER emhttp: shcmd (93): chmod 770 '/mnt/disk10'
Sep 22 09:09:28 SERVER emhttp: shcmd (94): chown nobody:users '/mnt/disk20'
Sep 22 09:09:28 SERVER emhttp: shcmd (95): chown nobody:users '/mnt/disk4'
Sep 22 09:09:28 SERVER emhttp: shcmd (96): chown nobody:users '/mnt/disk3'
Sep 22 09:09:28 SERVER emhttp: shcmd (97): chmod 770 '/mnt/disk15'
Sep 22 09:09:28 SERVER emhttp: shcmd (98): chown nobody:users '/mnt/disk5'
Sep 22 09:09:28 SERVER emhttp: shcmd (99): chown nobody:users '/mnt/disk1'
Sep 22 09:09:28 SERVER emhttp: shcmd (100): chown nobody:users '/mnt/disk10'
Sep 22 09:09:28 SERVER emhttp: shcmd (101): chown nobody:users '/mnt/disk15'
Sep 22 09:09:29 SERVER emhttp: shcmd (102): chmod 770 '/mnt/disk6'
Sep 22 09:09:29 SERVER emhttp: shcmd (103): chmod 770 '/mnt/disk19'
Sep 22 09:09:29 SERVER emhttp: shcmd (104): chmod 770 '/mnt/disk12'
Sep 22 09:09:29 SERVER emhttp: shcmd (105): chmod 770 '/mnt/disk11'
Sep 22 09:09:29 SERVER emhttp: shcmd (106): chown nobody:users '/mnt/disk6'
Sep 22 09:09:29 SERVER emhttp: shcmd (107): chown nobody:users '/mnt/disk19'
Sep 22 09:09:29 SERVER emhttp: shcmd (108): chown nobody:users '/mnt/disk11'
Sep 22 09:09:29 SERVER emhttp: shcmd (109): chown nobody:users '/mnt/disk12'
Sep 22 09:09:29 SERVER emhttp: shcmd (110): mkdir /mnt/user
Sep 22 09:09:29 SERVER emhttp: shcmd (111): /usr/local/sbin/shfs /mnt/user -disks 2097022 -o noatime,big_writes,allow_other,default_permissions,use_ino
Sep 22 09:09:29 SERVER emhttp: shcmd (112): crontab -c /etc/cron.d -d &> /dev/null
Sep 22 09:09:29 SERVER emhttp: shcmd (113): /usr/local/sbin/emhttp_event disks_mounted
Sep 22 09:09:29 SERVER emhttp_event: disks_mounted
Sep 22 09:09:29 SERVER emhttp: shcmd (114): :>/etc/samba/smb-shares.conf
Sep 22 09:09:29 SERVER emhttp: Restart SMB...
Sep 22 09:09:29 SERVER emhttp: shcmd (115): killall -HUP smbd
Sep 22 09:09:29 SERVER emhttp: shcmd (116): ps axc | grep -q rpc.mountd
Sep 22 09:09:29 SERVER emhttp: _shcmd: shcmd (116): exit status: 1
Sep 22 09:09:29 SERVER emhttp: shcmd (117): /usr/local/sbin/emhttp_event svcs_restarted
Sep 22 09:09:29 SERVER emhttp_event: svcs_restarted

Just bumping my post here. Full system log is attached to original post.

This just happened again when starting array, this time it was a 2TB drive that also has no errors, and is on a completely different SATA controller.
This all started with 12a. I can't be the only one? Syslog says kernel errors.

I'm 99.9% positive that it is caused by having an empty drive. This has been the case both times. I only have a folder on the drive called "Movies", which is the share, and no data files in that share.
gbdesai Posted September 28, 2011

"Have any of those drives thrown errors? If not - if the drives are all behaving perfectly - then this is something of a false negative."

No, no errors on any drive. I do have a spindown (or lack of spindown) problem, but no other issues.
hackztor Posted September 28, 2011

I have had the ASUS E35M1-I for a few months and have been on the 5.0 beta 12. Just the other day it seems that my unRAID no longer gets an IP address. I think this is one of the problematic Realtek NIC motherboards, but I have had no issue until just recently. Do you think my NIC is dead or what?
ambly Posted September 28, 2011

Logged in to my unRAID today and saw that a disk was marked red. I can't see anything in the log and I can browse the disk. Any idea?
Joe L. Posted September 28, 2011

If it is "red" a "write" to that disk failed and it was marked as being invalid. You are now reading and writing to the "simulated" drive made possible by reading parity in combination with all the other data disks.

Being able to browse the disk is not an indication of that disk's health. It is an indication your unRAID is protecting you from a single disk failure.

Now, the "write" might have failed because the disk itself failed, or it might be because a cable to it is loose, or because the power supply used is inadequate for the combined set of disks used, and it just did not get enough power.

Step one would be to get a SMART report on the drive. If it responds, it might give an indication of its health. If it does not respond, time to power down and re-check the cabling.

See here: http://lime-technology.com/wiki/index.php?title=FAQ#What_does_the_Red_Ball_mean.3F
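To illustrate how a "simulated" drive can be served from parity plus the other data disks, here is a toy sketch. It assumes simple XOR (even) parity across the data disks and uses three made-up one-byte "disks"; the values and variable names are purely illustrative and say nothing about how the unRAID driver is actually written.

```shell
#!/bin/bash
# Three hypothetical data "disks", one byte each.
d1=$(( 0xA5 ))
d2=$(( 0x3C ))
d3=$(( 0x0F ))

# Parity is the XOR of all data disks.
parity=$(( d1 ^ d2 ^ d3 ))

# Suppose d2 is marked red. Its contents can be recomputed from
# parity and the surviving disks, which is what browsing the
# "simulated" drive relies on.
rebuilt=$(( parity ^ d1 ^ d3 ))

printf 'parity=0x%02X rebuilt=0x%02X\n' "$parity" "$rebuilt"
```

The same XOR only recovers one missing value, which is why a second disk failing before the rebuild completes is a problem.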
hackztor Posted September 29, 2011

Never mind, I replaced the motherboard; bad onboard NIC. Only lasted 2 months. Made in China.. figures.
ambly Posted September 29, 2011

Found no SMART errors and no faulty cables. Disabled the disk, enabled it again and did a rebuild. Same red dot on the disk. Found this in the log.

Sep 29 14:59:23 HKSRV03 kernel: sd 2:0:0:0: [sdd] Unhandled error code
Sep 29 14:59:23 HKSRV03 kernel: sd 2:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00
Sep 29 14:59:23 HKSRV03 kernel: sd 2:0:0:0: [sdd] CDB: cdb[0]=0x2a: 2a 00 e5 fe 1a 07 00 03 68 00

Shall I replace the disk or can I fix it?
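The CDB line in that log can be read by hand. Opcode 0x2a is the SCSI WRITE(10) command, which fits Joe L.'s point that a failed write is what turns a disk red. In a WRITE(10) CDB, bytes 2-5 hold the starting block address and bytes 7-8 the number of blocks, both big-endian. The sketch below decodes them; `decode_write10` is a made-up helper name, not part of any tool. As far as I know host byte 0x04 is DID_BAD_TARGET in the Linux SCSI headers, but treat that as an assumption and check it against your kernel source.

```shell
#!/bin/bash
# Decode the ten bytes of a WRITE(10) CDB given as hex arguments.
decode_write10() {
    local opcode=$1
    # Bytes 2-5 (arguments 3-6): big-endian logical block address.
    local lba=$(( 0x$3$4$5$6 ))
    # Bytes 7-8 (arguments 8-9): big-endian transfer length in blocks.
    local blocks=$(( 0x$8$9 ))
    echo "opcode=0x$opcode lba=$lba blocks=$blocks"
}

# Bytes copied from the log line above.
result=$(decode_write10 2a 00 e5 fe 1a 07 00 03 68 00)
echo "$result"
```

So the failed command was a write of 872 blocks starting at block 3858635271. The log alone does not say whether the disk, the cable or the port is at fault.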
tyrindor Posted September 29, 2011

If you are using hotswap bays, try a different one. If you are not, try hooking the drive up to a different SATA port. I believe you either have a faulty SATA port, or a faulty hotswap bay and you should try bypassing both ports.

You should also try a 3rd SATA cable, and a new power cable. I've had 2 bad SATA cables in a row more than once... Seems like they come together. Some people on here will claim power cables tend to go bad more than SATA cables, so definitely try that too.

It could also be that your PSU is on the way out and is struggling to power everything, but if it was fine before and you aren't using a budget PSU - that's probably not it.
ambly Posted September 29, 2011

The HW setup has not changed for a year. Checked the cables but didn't have any spares at home. Doing a data rebuild now, but it will take some time to run (2TB disk).

Do SATA ports and cables fail all of a sudden?
tyrindor Posted September 29, 2011

SATA ports can die suddenly or randomly start causing issues. SATA cables usually don't cause issues unless one is faulty out of the box or someone bent it a little too much.. but I've had some cables go bad in my day.

I think I had a similar issue to yours a while back, where I could rebuild the data on the drive - but it'd still turn red after a little while. No SMART errors or any log errors. After replacing the drive it went away. It could be many things though.

I usually use a failing drive as an excuse to upgrade. I buy a new one and swap it out, if it fixes it - I RMA the bad drive, if it doesn't then you know it's something else.
jbartlett Posted September 29, 2011

My SATA4 port on my main PC is starting to flake out. Windows will suddenly hang and if I press the reset key, the BIOS detection on the SATA4 port will hang too. Powering down and up corrects it. Mental note: I need to move the SATA4 port cable.
ambly Posted September 30, 2011 Share Posted September 30, 2011 After my cable exercise, where I just checked by feel that the cables were connected, I did a rebuild, and it completed without errors this time. I think I will replace the cables, but for now everything is OK! Quote Link to comment
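The "simulated" drive and the rebuild mentioned in this thread both rest on single-parity arithmetic. The sketch below assumes plain XOR parity over equally sized blocks, which is the usual scheme for a single parity disk. The function names and the tiny three-disk example are invented for illustration and say nothing about how unRAID's own code is written.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_disks):
    """Parity block is the XOR of all data blocks."""
    return xor_blocks(data_disks)


def rebuild_missing(surviving_disks, parity):
    """Recover the one missing block from the survivors plus parity."""
    return xor_blocks(surviving_disks + [parity])


# Three tiny pretend "disks" (hypothetical data).
disks = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = compute_parity(disks)

# Pretend disk 1 is red-balled: reconstruct it from the others and parity.
rebuilt = rebuild_missing([disks[0], disks[2]], parity)
print(rebuilt == disks[1])
```

This is also why being able to browse a red-balled disk says nothing about its health: every read is answered from the other disks plus parity, so a second failure in that state would lose data.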
jayhawk Posted September 30, 2011 Share Posted September 30, 2011 I hate to ring the "hey, can we have an update" bell, but for those of us actively testing this, we seem to be sliding off support mountain. Yes, I realize this is a beta. However, updates were historically coming fast enough that if there were a bug or two, one could tolerate the bug, work around it, wait patiently and continue testing. We're pushing a month without a new beta revision, and weeks without a word from LimeTech (last post 15 days ago, forum wide). That is a little discouraging for those who are testing with hardware that isn't currently working properly and want to continue to help. Flame on. Quote Link to comment
Auggie Posted October 1, 2011 Share Posted October 1, 2011 Interesting. But if Tom is truly the only developer working on this product, he is prone to any of life's distractions, such as family emergencies, disruptive interruptions, even the need to step back for a vacation. All of these are private matters that do not need a public explanation to us forum members. Or he could be so deep in thought and bit twiddling on the next beta release that he has lost all concept of time! Quote Link to comment
SSD Posted October 1, 2011 Share Posted October 1, 2011 Or he may be waiting for progress on the Slackware Linux / drivers front, as most if not all of the identified issues are NOT in the unRaid code that Tom developed. As of 5.0b12a, he was at the newest Slackware version. I think he is waiting to see progress on some of the bugs that he has logged before packaging another beta release. Quote Link to comment
mav3r1ck Posted October 1, 2011 Share Posted October 1, 2011 It's all guesswork what Tom is currently up to. It would be best if Tom would just inform us about the unRAID beta activities he is currently working on. I understand that it's not possible to respond to every user/tester asking for support, but as mentioned above there's been radio silence for quite some time. "No" is also an answer ;-) Thanks! Quote Link to comment