unRAID Server Release 5.0-beta12a Available


limetech


I concur, I have 3 of those cards supporting 22 drives total, 1 parity (3TB), 20 array drives (18x2TB,2x3TB), and 1 (1.5TB) cache drive, with no problems on 5b12 so far (been running for at least 4 weeks...

 

 

I am running 9 (nine) 3TB drives using this card and 5b12 with no issues of any kind. So maybe your issues start with other hardware choices you made.

After I ran pre_clear I did have a couple of issues with one drive. It was showing 723TB free at one point; I wish I had taken a screenshot. I did take one and post it here when it was showing much less than that, but still well over the 3TB capacity of the drive.

But I have been up and running for a couple weeks now, and all is good on the home front.

 

@Lars Olof, don't be sorry for the 'rant' .. I can take it ;-)

 

I didn't start with 4.7 because I wanted to start with 3TB drives, but along the way I came to the conclusion that my Supermicro 8-port SAS/SATA cards (3x AOC-SASLP-MV8) don't support >2TB.

 

I am fully aware of the BETA stages and I respect them, but let's say 4.7 works without the "BLK_NOT_HANDLED" issue; why would this be an issue in the new beta series (5bxx)?

 

We perform regression tests to keep the current functional design operational; I'm not sure how that works for Lime.

 

We will wait and see.

 

 

 

 

 

Link to comment

I concur, I have 3 of those cards supporting 22 drives total, 1 parity (3TB), 20 array drives (18x2TB,2x3TB), and 1 (1.5TB) cache drive, with no problems on 5b12 so far (been running for at least 4 weeks...

 

Have any of those drives thrown errors? If not - if the drives are all behaving perfectly - then this is something of a false negative.

Link to comment

I am having trouble with the PowerDown. It seems like the system is hanging on the powerdown. It does not matter if it is a cron job or a manual powerdown from unmenu. I am able to access \\tower:8081 (sickbeard) and \\tower:8082 (sabnzbd) but cannot access \\tower or \\tower:8080. I can access the shares, just not the GUIs. It is set to power down at 6:00 am; it runs the script but never powers down. I have attached a syslog.

 

Mike

 

Does it power down without problems if you do not run sickbeard and sabnzbd (and any other extra software) ?

 

 

 

No, I shut down both and it still will not power down. I even tried rc.unRAID stop and I get the following:

 

 

root@Tower:/etc/rc.d# rc.unRAID stop

Capturing information to syslog. Please wait...

version[23578]: Linux version 3.0.3-unRAID (root@Develop) (gcc version 4.4.4 (GCC) ) #7 SMP Fri Sep 2 16:44:33 MDT 2011

ls: cannot access /dev/hd[a-z]: No such file or directory

 

and then it hangs. When it was working I would get this but it would power down.

 

Mike

 

Now I am getting errors when trying to access a share. I will be 1/2 way through something and it will disconnect. I found the following in the syslog.

 

Permission denied

Sep 26 09:20:10 Tower cnid_dbd[22459]: main: fatal db lock error

Sep 26 09:20:10 Tower afpd[22127]: read: Connection reset by peer

Sep 26 09:20:10 Tower afpd[22127]: transmit: Request to dbd daemon (db_dir /mnt/user/TV) timed out.

Sep 26 09:20:10 Tower afpd[22127]: ===============================================================

Sep 26 09:20:10 Tower afpd[22127]: INTERNAL ERROR: Signal 11 in pid 22127 (2-2-0-p6)

Sep 26 09:20:10 Tower afpd[22127]: ===============================================================

Sep 26 09:20:10 Tower afpd[22127]: BACKTRACE: 10 stack frames:

Sep 26 09:20:10 Tower afpd[22127]:  #0 /usr/sbin/afpd(netatalk_panic+0x2d) [0x8097dfd]

Sep 26 09:20:10 Tower afpd[22127]:  #1 /usr/sbin/afpd() [0x8097f6d]

Sep 26 09:20:10 Tower afpd[22127]:  #2 [0xb7897400]

Sep 26 09:20:10 Tower afpd[22127]:  #3 /usr/sbin/afpd(dir_add+0x453) [0x8061873]

Sep 26 09:20:10 Tower afpd[22127]:  #4 /usr/sbin/afpd() [0x806529d]

Sep 26 09:20:10 Tower afpd[22127]:  #5 /usr/sbin/afpd(afp_over_dsi+0x55e) [0x8054aee]

Sep 26 09:20:10 Tower afpd[22127]:  #6 /usr/sbin/afpd() [0x8053c55]

Sep 26 09:20:10 Tower afpd[22127]:  #7 /usr/sbin/afpd(main+0x660) [0x8070bd0]

Sep 26 09:20:10 Tower afpd[22127]:  #8 /lib/libc.so.6(__libc_start_main+0xe6) [0xb7480b86]

Sep 26 09:20:10 Tower afpd[22127]:  #9 /usr/sbin/afpd() [0x8053a51]

Sep 26 09:26:40 Tower cnid_dbd[23201]: Set syslog logging to level: LOG_NOTE

 

I have attached syslog also.

Mike

syslog.txt

Link to comment


While looking at the powerdown file I noticed something:

 

#!/bin/bash

alias logger="/usr/bin/logger -is -plocal7.info -tpowerdown"

if [ ${DEBUG:=0} -gt 0 ]
  then set -x -v
fi

if [ -z "${1}" ]
  then OPT="-h"
  else OPT="${1}"
fi

logger "Powerdown initiated"

if [ -f /var/run/powerdown.pid ]
  then logger "Powerdown already active, this one is exiting"
  exit
  else echo $$ > /var/run/powerdown.pid
fi

trap "rm -f /var/run/powerdown.pid" EXIT HUP INT QUIT

/etc/rc.d/rc.unRAID stop

# /sbin/poweroff
logger "Initiating Shutdown with ${1}"
/sbin/shutdown -t5 ${OPT} now

 

When I look in /var/run/ there is no file called powerdown.pid. Could this be why it is not powering down?

 

Mike

Link to comment

No.  It is not the reason. 

 

The file is tested for in these lines in the script, and if it exists, the script does not continue as it is already running.

 

If it does not exist, the file is created, and then the script continues.  If you attempted to run the same command while the first is still running, it would find the file and exit.

 

if [ -f /var/run/powerdown.pid ]
   then logger "Powerdown already active, this one is exiting"
   exit
   else echo $$ > /var/run/powerdown.pid
fi

 

This is the line that creates the file when the powerdown script is in progress to prevent you from running a second instance of it at the same time:

echo $$ > /var/run/powerdown.pid

 

Joe L.
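
For what it's worth, the trap line in the posted script removes /var/run/powerdown.pid whenever the script exits, so not finding the file while no powerdown is running is what you would expect. The check-then-create pattern can be sketched on its own as below. This is only an illustration, not the add-on's code: the function names and the /tmp path are made up, and it uses bash's noclobber option so that the test and the creation happen in one step, which the original script does not do.

```shell
#!/bin/bash
# Sketch only: PIDFILE path and function names are hypothetical.
PIDFILE="${PIDFILE:-/tmp/powerdown-demo.pid}"

# Try to take the lock. With noclobber set, '>' refuses to overwrite an
# existing file, so a second caller fails here instead of continuing.
acquire_lock() {
  if ( set -o noclobber; echo "$$" > "$PIDFILE" ) 2>/dev/null; then
    return 0
  fi
  return 1
}

# Drop the lock; the posted script does the equivalent from its trap.
release_lock() {
  rm -f "$PIDFILE"
}
```

A caller would run `acquire_lock || exit` near the top and then `trap release_lock EXIT` so the file disappears again when the script finishes.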

Link to comment


Powerdown is an add-on. Please take this discussion to the User Customizations forum.

Link to comment

I have a b12a server starting the testing process.  Just preclearing 3 tb drives now.  Sorry, no problems to report yet.

 

Supermicro X9SCM-F

Intel i3-2100 and 4 gig RAM

Supermicro AOC-SASLP-MV8 8-Port

all in a Norco 4224

 

But I do have a question.  With ver 5, how large a drive will unRaid support?

 

4tb drives are just announced, and 5tb drives are likely soon after that.

 

When do we hit the next wall requiring major software retooling like is going on right now with the ver 5 betas....

 

 

Link to comment


This probably should have been asked in the Lounge, but since I can't move it there I will say that there is nothing to worry about... version 5 of unRAID uses the GUID Partition Table (GPT, http://en.wikipedia.org/wiki/GUID_Partition_Table) for drives above 2.2TB.
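
To put rough numbers on that, here is a back-of-the-envelope sketch of where the 2.2TB figure comes from and how far away the GPT ceiling is. The function names are made up for illustration, and this only covers the partition table itself; the filesystem, controller or driver may well impose lower limits.

```shell
#!/bin/bash
# Illustrative arithmetic only; function names are made up for this sketch.

# MBR partition entries hold the start sector and sector count in 32-bit
# fields, so the addressable size is 2^32 sectors times the sector size.
mbr_limit_bytes() {
  local sector_size="${1:-512}"
  echo $(( (1 << 32) * sector_size ))
}

# GPT uses 64-bit LBAs. 2^64 sectors overflows bash's signed 64-bit math,
# so work with exponents: 2^64 sectors * 2^9 bytes = 2^73 bytes,
# and 1 ZiB = 2^70 bytes.
gpt_limit_zib_512() {
  echo $(( 1 << (64 + 9 - 70) ))
}

echo "MBR limit, 512-byte sectors: $(mbr_limit_bytes 512) bytes"
echo "GPT limit, 512-byte sectors: $(gpt_limit_zib_512) ZiB"
```

The first figure works out to roughly 2.2 TB, which is the wall the 3TB drives ran into; the second is far beyond any drive announced so far.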

Link to comment

I'm on Beta12a, and I just upgraded with another 4x 3TB, this time the SATA3 EZRX models. Precleared all drives using v1.13 of the preclear script, no errors on any. Added them to the array one at a time. Everything went fine. Restarted my server, started the array, and one of the disks just sits at "resizing" even though the array has been up for over a day. This feels like a rather critical bug that may only be affecting the SATA3 3TB drives, but my knowledge of this stuff is slim. The drive isn't shown in my share in Windows, and doesn't seem to be exporting. Yet I can access all my other drives, and there is no red dot next to any drive. I would assume this would throw parity off, causing parity sync issues and resulting in complete corruption of rebuilt data? This is pretty scary stuff.

I think this kernel error is the cause, but I'm not sure. Full system log is attached.

UPDATE: Stopping the array hung the entire server. I let it sit there for an hour before improperly restarting the server. I did a cold boot, and everything seems to be working now, but this did happen for a reason and it seems like a software bug, not a hardware bug. This drive was completely empty with just an empty share folder in it.

 

Sep 22 09:09:28 SERVER kernel: BUG: unable to handle kernel NULL pointer dereference at   (null)
Sep 22 09:09:28 SERVER kernel: IP: [] queue_delayed_work_on+0x33/0xbf
Sep 22 09:09:28 SERVER kernel: *pdpt = 000000002fbee001 *pde = 0000000000000000 
Sep 22 09:09:28 SERVER kernel: Oops: 0000 [#1] SMP 
Sep 22 09:09:28 SERVER kernel: Modules linked in: md_mod xor sata_mv e1000e i2c_i801 i2c_core [last unloaded: md_mod]
Sep 22 09:09:28 SERVER kernel: 
Sep 22 09:09:28 SERVER kernel: Pid: 2170, comm: emhttp Not tainted 3.0.3-unRAID #7 Supermicro X7SB4/E/X7SB4/E
Sep 22 09:09:28 SERVER kernel: EIP: 0060:[] EFLAGS: 00210246 CPU: 1
Sep 22 09:09:28 SERVER kernel: EIP is at queue_delayed_work_on+0x33/0xbf
Sep 22 09:09:28 SERVER kernel: EAX: f8a2c138 EBX: ffffffff ECX: f8a2c134 EDX: 00000000
Sep 22 09:09:28 SERVER kernel: ESI: 00000000 EDI: f8a2c134 EBP: ec8fbe40 ESP: ec8fbe34
Sep 22 09:09:28 SERVER kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Sep 22 09:09:28 SERVER kernel: Process emhttp (pid: 2170, ti=ec8fa000 task=f0092f40 task.ti=ec8fa000)
Sep 22 09:09:28 SERVER kernel: Stack:
Sep 22 09:09:28 SERVER kernel:  f8a1c000 f0370f54 f0370f00 ec8fbe4c c1038c8e 0000000a ec8fbeb0 c10d87fe
Sep 22 09:09:28 SERVER kernel:  f8a1c000 00000012 00000000 f6c42a10 00000000 03b26a10 00000000 00000010
Sep 22 09:09:28 SERVER kernel:  f0370f18 00000012 00000004 f8a1c000 0000004b 00000000 f7baa000 f0f841e0
Sep 22 09:09:28 SERVER kernel: Call Trace:
Sep 22 09:09:28 SERVER kernel:  [] queue_delayed_work+0x1b/0x1e
Sep 22 09:09:28 SERVER kernel:  [] do_journal_end+0x747/0x92a
Sep 22 09:09:28 SERVER kernel:  [] journal_end_sync+0x5b/0x63
Sep 22 09:09:28 SERVER kernel:  [] reiserfs_sync_fs+0x32/0x51
Sep 22 09:09:28 SERVER kernel:  [] __sync_filesystem+0x53/0x65
Sep 22 09:09:28 SERVER kernel:  [] sync_filesystem+0x2c/0x3f
Sep 22 09:09:28 SERVER kernel:  [] do_remount_sb+0x4c/0xd3
Sep 22 09:09:28 SERVER kernel:  [] do_remount+0x74/0xc6
Sep 22 09:09:28 SERVER kernel:  [] do_mount+0x10b/0x1c6
Sep 22 09:09:28 SERVER kernel:  [] sys_mount+0x61/0x94
Sep 22 09:09:28 SERVER kernel:  [] syscall_call+0x7/0xb
Sep 22 09:09:28 SERVER kernel:  [] ? quirk_usb_disable_ehci+0x84/0x129
Sep 22 09:09:28 SERVER kernel: Code: d6 53 89 c3 f0 0f ba 29 00 19 d2 31 c0 85 d2 0f 85 9d 00 00 00 83 79 10 00 74 04 0f 0b eb fe 8d 41 04 39 41 04 74 04 0f 0b eb fe  06 02 b8 08 00 00 00 75 19 89 c8 e8 02 e5 ff ff 85 c0 74 08 
Sep 22 09:09:28 SERVER kernel: EIP: [] queue_delayed_work_on+0x33/0xbf SS:ESP 0068:ec8fbe34
Sep 22 09:09:28 SERVER kernel: CR2: 0000000000000000
Sep 22 09:09:28 SERVER kernel: ---[ end trace f8be7fa413f5555b ]---
Sep 22 09:09:28 SERVER kernel: REISERFS (device md17): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md18): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md13): Using r5 hash to sort names
Sep 22 09:09:28 SERVER emhttp: shcmd (72): chmod 770 '/mnt/disk17'
Sep 22 09:09:28 SERVER kernel: REISERFS (device md19): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md2): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md6): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md9): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md16): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md4): Using r5 hash to sort names
Sep 22 09:09:28 SERVER emhttp: shcmd (73): chown nobody:users '/mnt/disk17'
Sep 22 09:09:28 SERVER kernel: REISERFS (device md14): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md3): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md11): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md12): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md1): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md8): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md5): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md15): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md10): Using r5 hash to sort names
Sep 22 09:09:28 SERVER kernel: REISERFS (device md20): Using r5 hash to sort names
Sep 22 09:09:28 SERVER emhttp: shcmd (74): chmod 770 '/mnt/disk18'
Sep 22 09:09:28 SERVER emhttp: shcmd (75): chown nobody:users '/mnt/disk18'
Sep 22 09:09:28 SERVER emhttp: shcmd (76): chmod 770 '/mnt/disk13'
Sep 22 09:09:28 SERVER emhttp: shcmd (77): chmod 770 '/mnt/disk2'
Sep 22 09:09:28 SERVER emhttp: shcmd (78): chmod 770 '/mnt/disk9'
Sep 22 09:09:28 SERVER emhttp: shcmd (79): chmod 770 '/mnt/disk14'
Sep 22 09:09:28 SERVER emhttp: shcmd (80): chown nobody:users '/mnt/disk13'
Sep 22 09:09:28 SERVER emhttp: shcmd (81): chmod 770 '/mnt/disk8'
Sep 22 09:09:28 SERVER emhttp: shcmd (82): chown nobody:users '/mnt/disk14'
Sep 22 09:09:28 SERVER emhttp: shcmd (83): chown nobody:users '/mnt/disk8'
Sep 22 09:09:28 SERVER emhttp: shcmd (84): chown nobody:users '/mnt/disk2'
Sep 22 09:09:28 SERVER emhttp: shcmd (85): chown nobody:users '/mnt/disk9'
Sep 22 09:09:28 SERVER emhttp: shcmd (86): chmod 770 '/mnt/disk16'
Sep 22 09:09:28 SERVER emhttp: shcmd (87): chmod 770 '/mnt/disk1'
Sep 22 09:09:28 SERVER emhttp: shcmd (89): chmod 770 '/mnt/disk3'
Sep 22 09:09:28 SERVER emhttp: shcmd (90): chmod 770 '/mnt/disk4'
Sep 22 09:09:28 SERVER emhttp: shcmd (88): chmod 770 '/mnt/disk5'
Sep 22 09:09:28 SERVER emhttp: shcmd (91): chmod 770 '/mnt/disk20'
Sep 22 09:09:28 SERVER emhttp: shcmd (92): chown nobody:users '/mnt/disk16'
Sep 22 09:09:28 SERVER emhttp: shcmd (93): chmod 770 '/mnt/disk10'
Sep 22 09:09:28 SERVER emhttp: shcmd (94): chown nobody:users '/mnt/disk20'
Sep 22 09:09:28 SERVER emhttp: shcmd (95): chown nobody:users '/mnt/disk4'
Sep 22 09:09:28 SERVER emhttp: shcmd (96): chown nobody:users '/mnt/disk3'
Sep 22 09:09:28 SERVER emhttp: shcmd (97): chmod 770 '/mnt/disk15'
Sep 22 09:09:28 SERVER emhttp: shcmd (98): chown nobody:users '/mnt/disk5'
Sep 22 09:09:28 SERVER emhttp: shcmd (99): chown nobody:users '/mnt/disk1'
Sep 22 09:09:28 SERVER emhttp: shcmd (100): chown nobody:users '/mnt/disk10'
Sep 22 09:09:28 SERVER emhttp: shcmd (101): chown nobody:users '/mnt/disk15'
Sep 22 09:09:29 SERVER emhttp: shcmd (102): chmod 770 '/mnt/disk6'
Sep 22 09:09:29 SERVER emhttp: shcmd (103): chmod 770 '/mnt/disk19'
Sep 22 09:09:29 SERVER emhttp: shcmd (104): chmod 770 '/mnt/disk12'
Sep 22 09:09:29 SERVER emhttp: shcmd (105): chmod 770 '/mnt/disk11'
Sep 22 09:09:29 SERVER emhttp: shcmd (106): chown nobody:users '/mnt/disk6'
Sep 22 09:09:29 SERVER emhttp: shcmd (107): chown nobody:users '/mnt/disk19'
Sep 22 09:09:29 SERVER emhttp: shcmd (108): chown nobody:users '/mnt/disk11'
Sep 22 09:09:29 SERVER emhttp: shcmd (109): chown nobody:users '/mnt/disk12'
Sep 22 09:09:29 SERVER emhttp: shcmd (110): mkdir /mnt/user
Sep 22 09:09:29 SERVER emhttp: shcmd (111): /usr/local/sbin/shfs /mnt/user -disks 2097022 -o noatime,big_writes,allow_other,default_permissions,use_ino 
Sep 22 09:09:29 SERVER emhttp: shcmd (112): crontab -c /etc/cron.d -d &> /dev/null
Sep 22 09:09:29 SERVER emhttp: shcmd (113): /usr/local/sbin/emhttp_event disks_mounted
Sep 22 09:09:29 SERVER emhttp_event: disks_mounted
Sep 22 09:09:29 SERVER emhttp: shcmd (114): :>/etc/samba/smb-shares.conf
Sep 22 09:09:29 SERVER emhttp: Restart SMB...
Sep 22 09:09:29 SERVER emhttp: shcmd (115): killall -HUP smbd
Sep 22 09:09:29 SERVER emhttp: shcmd (116): ps axc | grep -q rpc.mountd
Sep 22 09:09:29 SERVER emhttp: _shcmd: shcmd (116): exit status: 1
Sep 22 09:09:29 SERVER emhttp: shcmd (117): /usr/local/sbin/emhttp_event svcs_restarted
Sep 22 09:09:29 SERVER emhttp_event: svcs_restarted

 

 

 

Just bumping my post here. Full system log is attached to the original post. This just happened again when starting the array; this time it was a 2TB drive that also has no errors and is on a completely different SATA controller. This all started with 12a. I can't be the only one? Syslog says kernel errors.

I'm 99.9% positive that it is caused by having an empty drive. This has been the case both times. I only have a folder on the drive called "Movies", which is the share, and no data files in that share.

Link to comment

I concur, I have 3 of those cards supporting 22 drives total, 1 parity (3TB), 20 array drives (18x2TB,2x3TB), and 1 (1.5TB) cache drive, with no problems on 5b12 so far (been running for at least 4 weeks...

 

Have any of those drives thrown errors? If not - if the drives are all behaving perfectly - then this is something of a false negative.

 

No, no errors on any drive.  I do have a spindown (or lack of spindown) problem, but no other issues.

Link to comment

Logged in to my Unraid today and saw that a disk was marked red.

Can't see anything in the log, and I can browse the disk.

 

Any idea?

If it is "red", a "write" to that disk failed and the disk was marked as invalid.  You are now reading and writing to the "simulated" drive, made possible by reading parity in combination with all the other data disks.

 

Being able to browse the disk is not an indication of that disk's health.  It is an indication your unRAID is protecting you from a single disk failure.
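
As a toy illustration of how a "simulated" drive can work with single parity: parity is the XOR of all the data disks, so XOR-ing parity with the surviving disks gives back the missing one. The sketch below treats each disk as a single byte; the values and names are invented, and it says nothing about how the unRAID driver is actually implemented.

```shell
#!/bin/bash
# Toy model: each "disk" is one byte. Values and names are invented.

# XOR all arguments together and print the result in decimal.
xor_all() {
  local acc=0 v
  for v in "$@"; do
    acc=$(( acc ^ v ))
  done
  echo "$acc"
}

d1=0xA5; d2=0x3C; d3=0x0F

# Parity is the XOR of every data disk.
parity=$(xor_all "$d1" "$d2" "$d3")

# Pretend d2 is the red-balled disk: rebuild its contents from parity
# plus the remaining data disks.
simulated_d2=$(xor_all "$parity" "$d1" "$d3")

printf 'parity=0x%02X simulated_d2=0x%02X\n' "$parity" "$simulated_d2"
```

This is also why the protection covers only a single failure: with two disks missing there are two unknowns and only one parity value to solve them with.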

 

Now, the "write" might have failed because the disk itself failed, or it might be because a cable to it is loose, or because the power supply used is inadequate for the combined set of disks used, and it just did not get enough power.

 

Step one would be to get a SMART report on the drive.  If it responds, it might give an indication of its health.  If it does not respond, it is time to power down and re-check the cabling.

 

See here: http://lime-technology.com/wiki/index.php?title=FAQ#What_does_the_Red_Ball_mean.3F

Link to comment


Found no SMART errors and no faulty cables.

Disabled the disk, re-enabled it, and did a rebuild.

Same red dot on the disk.

Found this in the log.

 

Sep 29 14:59:23 HKSRV03 kernel: sd 2:0:0:0: [sdd] Unhandled error code

Sep 29 14:59:23 HKSRV03 kernel: sd 2:0:0:0: [sdd]  Result: hostbyte=0x04 driverbyte=0x00

Sep 29 14:59:23 HKSRV03 kernel: sd 2:0:0:0: [sdd] CDB: cdb[0]=0x2a: 2a 00 e5 fe 1a 07 00 03 68 00

 

Shall I replace the disk, or can I fix it?
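
For anyone trying to read those syslog lines: the CDB is the raw SCSI command that failed. A minimal sketch for decoding a 10-byte CDB follows, assuming the standard READ(10)/WRITE(10) layout (opcode in byte 0, LBA in bytes 2 to 5, transfer length in bytes 7 and 8). The function name is made up. As far as I know, hostbyte=0x04 corresponds to DID_BAD_TARGET in the Linux SCSI layer, which would point at the path to the drive rather than at a media error, but treat that as a hint and not a diagnosis.

```shell
#!/bin/bash
# Sketch: decode a 10-byte SCSI CDB given as ten hex bytes.
# decode_cdb10 is a hypothetical helper, not part of any tool.
decode_cdb10() {
  if [ "$#" -ne 10 ]; then
    echo "expected 10 CDB bytes" >&2
    return 1
  fi
  local opcode=$(( 16#$1 ))
  # Bytes 2-5 hold the logical block address, big-endian.
  local lba=$(( (16#$3 << 24) | (16#$4 << 16) | (16#$5 << 8) | 16#$6 ))
  # Bytes 7-8 hold the transfer length in blocks, big-endian.
  local blocks=$(( (16#$8 << 8) | 16#$9 ))
  local name="unknown"
  case "$opcode" in
    40) name="READ(10)" ;;   # 0x28
    42) name="WRITE(10)" ;;  # 0x2a
  esac
  echo "opcode=$name lba=$lba blocks=$blocks"
}

# The CDB from the log above.
decode_cdb10 2a 00 e5 fe 1a 07 00 03 68 00
```

So the failed command was a write, which fits the earlier explanation that a failed write is what turns a disk red.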

 

Link to comment


If you are using hotswap bays, try a different one. If you are not, try hooking the drive up to a different SATA port. I believe you either have a faulty SATA port, or a faulty hotswap bay and you should try bypassing both ports. You should also try a 3rd SATA cable, and a new power cable. I've had 2 bad SATA cables in a row more than once... Seems like they come together. Some people on here will claim power cables tend to go bad more than SATA cables, so definitely try that too.

 

It could also be that your PSU is on the way out and is struggling to power everything, but if it was fine before and you aren't using a budget PSU, that's probably not it.

Link to comment


The HW setup has not changed for a year.

Checked the cables but didn't have any spares at home.

Doing a data rebuild now, but it will take some time to run (2TB disk).

Do SATA ports and cables fail all of a sudden?

Link to comment


SATA ports can die suddenly or randomly start causing issues.

SATA cables usually don't cause issues unless one is faulty out of the box or someone bent it a little too much, but I've had some cables go bad in my day.

I think I had a similar issue a while back, where I could rebuild the data on the drive, but it'd still turn red after a little while. No SMART errors or any log errors. After replacing the drive it went away. It could be many things though. I usually use a failing drive as an excuse to upgrade. I buy a new one and swap it out; if it fixes it, I RMA the bad drive, and if it doesn't, then you know it's something else.

Link to comment

Logged in to my Unraid today and saw that a disk was marked red.

Can't see anything in the logg and i can browse the disk.

 

Any idea?

If it is "red" a "write" to that disk failed and it was marked as being invalid.   You are now reading and writing to the "simulated" drive made possible  by reading parity in combination with all the other data disks.

 

Being able to browse the disk is not an indication of that disk's health.  It is an indication your unRAID is protecting you from a single disk failure.

 

Now, the "write" might have failed because the disk itself failed, or it might be because a cable to it is loose, or because the power supply used is inadequate for the combined set of disks used, and it just did not get enough power.

 

Step one would be to get a SMART report on the drive.  If it responds, it might give an indication of its health.  Id it does not respond, time to power down and re-check the cabling.

 

See here: http://lime-technology.com/wiki/index.php?title=FAQ#What_does_the_Red_Ball_mean.3F

 

Found no SMART errors and no faulty cables.

Disabled the disk and enabled it back and did a rebuild.

 

Same red dot on the disk.

Fond this in the log.

 

Sep 29 14:59:23 HKSRV03 kernel: sd 2:0:0:0: [sdd] Unhandled error code

Sep 29 14:59:23 HKSRV03 kernel: sd 2:0:0:0: [sdd]  Result: hostbyte=0x04 driverbyte=0x00

Sep 29 14:59:23 HKSRV03 kernel: sd 2:0:0:0: [sdd] CDB: cdb[0]=0x2a: 2a 00 e5 fe 1a 07 00 03 68 00

 

Shall I replace the disk, or can I fix it?

 

 

If you are using hotswap bays, try a different one. If you are not, try hooking the drive up to a different SATA port. I believe you either have a faulty SATA port, or a faulty hotswap bay and you should try bypassing both ports. You should also try a 3rd SATA cable, and a new power cable. I've had 2 bad SATA cables in a row more than once... Seems like they come together. Some people on here will claim power cables tend to go bad more than SATA cables, so definitely try that too.

 

It could also be that your PSU is on the way out and is struggling to power everything, but if it was fine before and you aren't using a budget PSU, that's probably not it.

The HW setup has not changed for a year.

Checked the cables but didn't have any spares at home.

Doing a data rebuild now, but it will take some time to run (2TB disk).

 

Do SATA ports and cables fail all of a sudden?

 

SATA ports can die suddenly or randomly start causing issues.

SATA cables usually don't cause issues unless one is faulty out of the box or someone bent it a little too much, but I've had some cables go bad in my day.

 

I think I had a similar issue to yours a while back, where I could rebuild the data on the drive, but it would still turn red after a little while. No SMART errors or any log errors. After replacing the drive, the problem went away. It could be many things, though. I usually use a failing drive as an excuse to upgrade: I buy a new one and swap it in. If that fixes it, I RMA the bad drive; if it doesn't, then you know it's something else.

After my cable exercise, where I just checked by feel that the cables were connected, I did a rebuild.

And it completed without errors this time.

I think I will replace the cables, but for now everything is OK!

Link to comment

I hate to ring the "hey, can we have an update" bell, but for those of us actively testing this, we seem to be sliding off support mountain.  Yes, I realize this is a beta; however, updates were historically coming fast enough that if there were a bug or two, one could tolerate the bug, work through it, wait patiently and continue testing.

 

We're pushing a month without a new beta revision, and weeks without a word from LimeTech (last post 15 days ago forum wide).

 

A little discouraging for some that are testing with hardware that isn't currently working properly, and want to continue to help.

 

flame on.

Link to comment

Interesting.

 

But if Tom is truly the only developer working on this product, he is prone to any of life's distractions, such as family emergencies, disruptive interruptions; even the need to step back for a vacation.

 

All of these are private moments that do not need a public explanation to us forum members.

 

Or he could be so deep in thought and bit twiddling on the next beta release that he has lost all concept of time!  ;D

Link to comment

Or he may be waiting for progress on the Slackware Linux / drivers front, as most if not all of the identified issues are NOT in the unRAID code that Tom developed.

 

As of 5.0b12a, he was at the newest Slackware version. I think he is waiting to see progress on some of the bugs that he has logged before packaging another beta release. 

Link to comment

We are all just guessing at what Tom is currently up to.

 

It would be best if Tom would just inform us on the current unRAID beta activities he is working on.

I understand that it's not possible to respond to every user/tester asking for support, but as mentioned above there's been radio silence for quite some time.

 

No is also an answer ;-)

 

Thanks!

 

 
