unRAID Server Release 5.0-beta12 Available


I hate quoting myself, but has anyone else seen this?  The server starts, the Array status is "Starting" and it stays like that until you reboot.

 

I've had an intermittent error with b12 that I didn't have with b11.  I can start the server and the array will come up as "Starting" and nothing will happen.  I can't attach.  The Shares tab tells me the array needs to start first.

 

I've attached the syslog.

 

It also looks like I may have blown a disk.  I have a post in general help with the log files for that issue.  Not sure if it's related.

 

Any assistance on what to do with the dead drive would be handy.  I don't know if it's a hardware issue or not.

Link to comment

This happened to me once; I rebooted and it was 'fine' until I got the BLK_UNHANDLED issue with the SAS controller again :)  Think I'll be rolling back to 4.7 today.

 


Link to comment


 

 

 

Bad/marginal PSU, cabling or controller.  See here:

http://lime-technology.com/wiki/index.php?title=The_Analysis_of_Drive_Issues#Drive_Interface_Issues

 

Link to comment

When a new entry is added onto the cache drive (e.g. /films/new-entry), this entry becomes visible under the already existing share 'films', regardless of the setting "Use cache disk" in the share settings.

 

I would expect the setting "No" to exclude cache entries, the setting "Yes" to include cache entries. Is this a correct assumption?

My understanding is the setting tells UNRAID to use the cache drive (or not) when writing new content.

Link to comment


 

 

Not trying to beat a dead horse, but can you just ensure that the backplane for the drive that is failing in the Norco is properly connected to both power connectors?  I say this because I thought I was fine since all the connectors were connected (main and backup power sockets), but found that the cheap splitter I got actually had a pin that was pushed out of its crimp, so one of the two connectors wasn't actually providing power...  Once I replaced it all was good.  I have the same exact setup as you with 3 of those cards in a Norco and have not seen that problem with b12 running for about 4 days straight...  Good luck friend.

Link to comment


 

Certainly doesn't hurt to check.  Any way to tell which mvsas controller it's complaining about, or which disk is losing power, per the logs?  Perhaps if I knew which connector seemed to be shy of power it would help me get in there and hammer it out.

Link to comment


 

I can't be completely certain, but I would definitely start by checking the connections to Disk 4, sdk, the WD20EARS with serial ending in 286.

Link to comment

Just a side comment, I can see how confusing it is when looking at all the "sas eh calling libata port error handler" messages repeated in syslogs of those with the SAS cards.  There appears to be a bug in its driver module that causes spurious eh calls to appear in the syslog.  See the following example:

 

Aug 29 18:42:32 tower kernel: ata10: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata11: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata12: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata13: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata14: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata15: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: sas: sas_ata_hard_reset: Found ATA device.
Aug 29 18:42:32 tower kernel: ata15.00: ATA-8: WDC WD20EARS-00MVWB0, 51.0AB51, max UDMA/133

 

Above, it looks like 6 different error handler calls, but actually there is only one, the last one.  You should be able to determine the actual call by the subsequent lines, in response to that call.  As you can see, it's the ata15 call that was effective.

 

Just guessing, but by its behavior, I would say there is a byte or word being used to code (bit-mapped?) the ata number of the last eh caller, locally managed per SAS card.  Unfortunately, it is not being cleared after the eh call, so the logging of each subsequent call is reporting all of the ata eh calls made so far, for that card.  This being a very basic bug (but harmless), I would expect it to disappear with the next revision of the SAS driver module (mv_sas.c?).
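
For anyone who wants to pick the effective call out of a long syslog without reading it by eye, here is a minimal Python sketch of the rule described above: within a run of consecutive "sas eh calling libata port error handler" lines, only the last port is treated as the real caller. This rests on the observation above about how the messages repeat, not on confirmed driver behavior, and the function name `effective_eh_ports` is made up for illustration.

```python
import re

# Matches the repeated handler line, e.g.
# "Aug 29 18:42:32 tower kernel: ata15: sas eh calling libata port error handler"
EH_CALL = re.compile(r"kernel: (ata\d+): sas eh calling libata port error handler")


def effective_eh_ports(lines):
    """Return the ata port named by the last handler line of each consecutive run.

    Assumption (from the observation above): earlier lines in a run are
    spurious repeats, and only the final one reflects the actual call.
    """
    effective = []
    last_in_run = None
    for line in lines:
        match = EH_CALL.search(line)
        if match:
            last_in_run = match.group(1)   # keep overwriting until the run ends
        elif last_in_run is not None:
            effective.append(last_in_run)  # run ended: keep the last port seen
            last_in_run = None
    if last_in_run is not None:            # log ended in the middle of a run
        effective.append(last_in_run)
    return effective


sample = """\
Aug 29 18:42:32 tower kernel: ata10: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata11: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata12: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata13: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata14: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: ata15: sas eh calling libata port error handler
Aug 29 18:42:32 tower kernel: sas: sas_ata_hard_reset: Found ATA device.
Aug 29 18:42:32 tower kernel: ata15.00: ATA-8: WDC WD20EARS-00MVWB0, 51.0AB51, max UDMA/133
"""

print(effective_eh_ports(sample.splitlines()))
```

Run against the excerpt above, it reports only ata15, which matches the reading that the ata15 call was the effective one.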

Link to comment

I recall people posting in previous beta threads that their cache drive doesn't spin down on its own, that it's handled differently from the data drives.  That said, clicking on the "Spin Down" button in the GUI does spin the cache drive down along with any active data drives. The Disk settings page has one selection: "Default spin down delay", with no differentiation between data & cache drives.

 

If things are behaving the way they are supposed to behave, then the GUI should be updated to be more clear, or you'll get never-ending questions asking why the cache drive isn't spinning down.

Link to comment


 

Any way to tell from the logs, for certain?  I'd think that if it were a power issue, multiple drives would have the problem, since it's an entire row?  If it's a card, I could swap it or update the firmware, but I just reseated them all last night.  I did have a couple of cables that were making pretty tight turns (semi-crimped), so I straightened them out.  After doing all that, it did it within a couple of hours of coming up... I restarted it again last night and it's been up overnight.

 

I'll wait on the .13 release and wait to see if anyone can tell me definitively which card is showing the ata power up failure.  At least then I could move all the drives off of it to see if I have the same problem.

 

 

Link to comment


 

If you look at your syslog, at the very end, you can see that ata15 is the ONLY drive with an issue, and ata15 maps to the drive I mentioned above.  Because this is all new technology, we simply don't have enough experience to be definitive about it though.  This drive may just be the 'sentinel chicken' for a larger problem, and that's why I did not want to be definitive, especially since I don't have one of these cards myself, and this is the first time I have seen that drive/card/error sequence.

 

I'm off to my minimum wage try-to-pay-my-bills job now, so no time for more, but I would have really liked to comment on the use of 'definitively' here, respectfully of course!

Link to comment

Hello, sorry to bother, but is there any way I could run the new permissions script in telnet?  I would like to see what is happening and how long until it finishes.  (I know my file structure, so if I can see where it is working I can approximate.)

 

thank you. Great product

 

Sorry to bother again, but I really need this: the script seems to fall asleep in my Windows IE, and even though I have left it alone for more than 24 hrs it is still on disk 2.  I keep getting permission errors on my files; for instance, Windows 7 would not let me delete folders unless user root allowed it.  This was on things on disk 1 and did clear up, but what if I get this on disk 7?

 

Please help.

 

thank you in advance.

 

thornwood.

Link to comment


 

A search on the forum would have revealed your answer.

 

newperms /mnt/diskX

replace the "X" with the disk number.

 

If you can't get the newperms script to work on a particular disk then I would check for some filesystem corruption and/or a read only filesystem.
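
As a rough way to check for the read-only case, here is a small Python sketch that scans mount-table text in the standard Linux /proc/mounts layout (device, mount point, filesystem type, options, ...) and lists any /mnt/disk* mounts carrying the `ro` option. The function name `read_only_mounts` and the sample lines are made up for illustration; they are not taken from a real unRAID server.

```python
def read_only_mounts(mounts_text, prefix="/mnt/disk"):
    """Return mount points under `prefix` whose mount options include 'ro'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if mount_point.startswith(prefix) and "ro" in options:
            result.append(mount_point)
    return result


# Hypothetical sample in /proc/mounts format (illustrative values only).
SAMPLE = """\
/dev/md1 /mnt/disk1 reiserfs rw,noatime,nodiratime 0 0
/dev/md2 /mnt/disk2 reiserfs ro,noatime,nodiratime 0 0
tmpfs /dev/shm tmpfs rw 0 0
"""

print(read_only_mounts(SAMPLE))
```

On the server itself you would pass in the contents of /proc/mounts instead of the sample, e.g. `read_only_mounts(open("/proc/mounts").read())`. A disk that shows up in the result is mounted read-only, so newperms would not be able to change anything on it.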

Link to comment


 

Thank you very much.

I did search, but I guess I did not word the question correctly.

Link to comment

My early Labor Day start gave me time to do some more beta testing on unRAID.  Here are some parity check speeds for the various betas, kernels and drivers.  Tests were performed on the following system:

 

MSI 790FX-GD70

3 x BR10i Controllers (all in x8 slots)

20 x 2TB drives (mix of WD and Hitachi; all 5400rpm)

 

unRAID    Linux kernel   mptsas    Speed MB/s
version   version        version   (after about 5 minutes)
5b8d      2.6.37.6       3.04.17   100+
5b9       2.6.37.6       3.04.17   100+
5b10      2.6.39.3       3.04.18   45+
5b11      2.6.37.6       3.04.17   100+
5b12      3.0.3          3.04.19   70+

 

A quote from the 5b11 release notes:

 

"Also, this release uses the last really stable version (for unRaid) of the linux kernel: 2.6.37.6.  Something changed in the kernel starting with 2.6.38/39 which causes a throttling of I/O to hard drives as soon as you get beyond 6 hard drives accessing in parallel.  Eventually I'll figure this out."

 

Looks like 3.0.3 is better than 2.6.39.3 but still noticeably below that of 2.6.37.6.  In reviewing this thread, those that reported sync speeds close to that of 2.6.37.6 have tested only 6-9 drives in their array.  Anyone else with large arrays seeing the same thing?
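
To put rough numbers on that comparison, here is a short Python sketch that expresses each kernel's speed as a percentage of the 2.6.37.6 figure. The table only gives lower bounds ("100+", "45+", "70+"), so treating them as exact values is an assumption and the percentages are approximate.

```python
# Lower-bound parity check speeds in MB/s from the table above.
# Assumption: "100+" is treated as 100, "45+" as 45, "70+" as 70.
speeds = {"2.6.37.6": 100, "2.6.39.3": 45, "3.0.3": 70}


def percent_of_baseline(speeds, baseline_kernel="2.6.37.6"):
    """Return each kernel's speed as a whole-number percentage of the baseline."""
    baseline = speeds[baseline_kernel]
    return {kernel: round(100 * speed / baseline) for kernel, speed in speeds.items()}


for kernel, percent in percent_of_baseline(speeds).items():
    print(f"kernel {kernel}: about {percent}% of the 2.6.37.6 speed")
```

By that rough measure, 3.0.3 recovers part of what was lost with 2.6.39.3 but still sits well short of 2.6.37.6 on this 20-drive array.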

 

Regards,  Peter

Link to comment

Upgraded from 4.7 to b12; everything seems to be working fine, with the exception that I cannot access the disk2 smb share. The others are fine. I ran the fix permission process & rebooted (twice), no change.  I can browse once logged in as root and I can browse via the web GUI. The permissions look fine.  I could access the disk2 smb share under 4.7. I can see the files on disk2 in the user shares.

I've done nothing to correct this but now it's working.

Link to comment


 

What I am about to say is less than scientific, and more just a quick observation.  With 4.7 on my 20 drive, 1 parity, 1 cache drive build using AOC-SALSP-MV8 cards I am used to getting around 75MB/sec; with 5b12 I am getting closer to 40MB/sec....

Link to comment

Okay, did a little digging today because of my NIC problems (slow transfer speed).  This time I attached the syslog of b10 (working NIC speed around 40MB/s) and of b12 (speed peak 15MB/s today; don't know why it got better this time).

 

The only thing I found was that b10 used the r8168 driver and b12 used the r8169 driver, and that the r8169 driver complains that it couldn't find a firmware file:

r8169 0000:02:00.0: eth0: unable to load firmware patch rtl_nic/rtl8168d-2.fw (-2)

 

But I don't know if this has something to do with my NIC, which Asus labels as RTL8112L and unRAID recognizes as RTL8168d/8111d.
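
For reference, the "(-2)" at the end of the firmware message looks like a negative kernel error number, which would make it ENOENT ("No such file or directory") and fit the idea that the file simply isn't present. A minimal Python sketch to decode it, assuming the number in parentheses is indeed a negated errno (the function name `decode_firmware_error` is made up for illustration):

```python
import errno
import os
import re

# Matches the tail of the r8169 message: "... firmware patch <file> (<code>)"
FW_ERROR = re.compile(r"unable to load firmware patch (\S+) \((-?\d+)\)")


def decode_firmware_error(line):
    """Return (firmware file, errno name, description), or None if no match.

    Assumption: the number in parentheses is a negated errno value,
    so -2 corresponds to errno 2.
    """
    match = FW_ERROR.search(line)
    if match is None:
        return None
    firmware = match.group(1)
    code = -int(match.group(2))
    name = errno.errorcode.get(code, "unknown")
    return firmware, name, os.strerror(code)


line = "r8169 0000:02:00.0: eth0: unable to load firmware patch rtl_nic/rtl8168d-2.fw (-2)"
print(decode_firmware_error(line))
```

This only decodes the message; it does not say whether the missing firmware patch is what causes the slow transfers.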

 

After that I loaded the r8168.ko that gfjardim provided (http://lime-technology.com/forum/index.php?topic=14934.msg140628#msg140628) and it is working really well, which might be because it is v8.025 (b10 used v8.024). It has slightly higher performance than the driver from b10 (around 5-10MB/s).

 

By the way, what is the difference between r8168 and r8169?

syslog.zip

Link to comment
