unRAID Server Release 5.0-beta12a Available


limetech


Also in this version I keep getting the BLK_EH_NOT_HANDLED.

 

Sep  4 09:38:25 Goliath kernel: sas: command 0xee53e300, task 0xf74963c0, timed out: BLK_EH_NOT_HANDLED

Sep  4 09:38:25 Goliath kernel: sas: Enter sas_scsi_recover_host

Sep  4 09:38:25 Goliath kernel: sas: trying to find task 0xf74963c0

Sep  4 09:38:25 Goliath kernel: sas: sas_scsi_find_task: aborting task 0xf74963c0

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1818:<7>mv_abort_task() mvi=f70e0000 task=f74963c0 slot=f70f15d8 slot_idx=x0

Sep  4 09:38:25 Goliath kernel: sas: sas_scsi_find_task: querying task 0xf74963c0

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1747:mvs_query_task:rc= 5

Sep  4 09:38:25 Goliath kernel: sas: sas_scsi_find_task: task 0xf74963c0 failed to abort

Sep  4 09:38:25 Goliath kernel: sas: task 0xf74963c0 is not at LU: I_T recover

Sep  4 09:38:25 Goliath kernel: sas: I_T nexus reset for dev 0500000000000000

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 5 ctrl sts=0x89800.

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 5 irq sts = 0x1001

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2226:phy5 Unplug Notice

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 5 ctrl sts=0x199800.

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 5 irq sts = 0x1081

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 5 ctrl sts=0x199800.

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 5 irq sts = 0x10000

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2253:notify plug in on phy[5]

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1338:port 5 attach dev info is 0

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1340:port 5 attach sas addr is 5

Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 379:phy 5 byte dmaded.

Sep  4 09:38:25 Goliath kernel: sas: sas_form_port: phy5 belongs to port1 already(1)!

Sep  4 09:38:28 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 1701:mvs_I_T_nexus_reset for device[1]:rc= 0

Sep  4 09:38:28 Goliath kernel: sas: I_T 0500000000000000 recovered

Sep  4 09:38:28 Goliath kernel: sas: sas_ata_task_done: SAS error 8d

Sep  4 09:38:28 Goliath kernel: ata5: sas eh calling libata port error handler

Sep  4 09:38:28 Goliath kernel: ata6: sas eh calling libata port error handler

Sep  4 09:38:28 Goliath kernel: sas: sas_ata_task_done: SAS error 2

Sep  4 09:38:28 Goliath kernel: ata6: failed to read log page 10h (errno=-5)

Sep  4 09:38:28 Goliath kernel: ata6.00: exception Emask 0x1 SAct 0x1 SErr 0x0 action 0x6 t0

Sep  4 09:38:28 Goliath kernel: ata6.00: failed command: READ FPDMA QUEUED

Sep  4 09:38:28 Goliath kernel: ata6.00: cmd 60/00:00:10:d0:50/02:00:0f:00:00/40 tag 0 ncq 262144 in

Sep  4 09:38:28 Goliath kernel:          res 01/04:04:10:ce:50/00:00:0f:00:00/40 Emask 0x3 (HSM violation)

Sep  4 09:38:28 Goliath kernel: ata6.00: status: { ERR }

Sep  4 09:38:28 Goliath kernel: ata6.00: error: { ABRT }

Sep  4 09:38:28 Goliath kernel: ata6: hard resetting link

Sep  4 09:38:28 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 5 ctrl sts=0x89800.

Sep  4 09:38:28 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 5 irq sts = 0x1001001

Sep  4 09:38:28 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2226:phy5 Unplug Notice

 

etc

 

The drop-down box for assigning drives on the Main page still has only 20 entries in my case. Shouldn't it be 26?

The Main page reports beta12a as well... not sure why I don't see drop-down boxes for 26 drives.. ;-)

 

I only replaced the bzimage and bzroot files and removed the super.dat file (renamed it to super.bak).

After a reboot I reorganized the drives back to their original order. That worked without any problems. Started the parity check again... emhttp stopped responding within 30 mins.  Again, all due to BLK_EH_NOT_HANDLED.

 

I've got a big system which is ready to go, but failing on this error. Is there a version that works without these issues? I really need to get this started asap. Thanks!

 

 

 

 

According to this url: http://lime-technology.com/wiki/index.php?title=Hardware_Compatibility#PCI_SATA_Controllers

the SuperMicro AOC-SASLP-MV8 - 8 Port SATA II PCIe x4 is fully supported.

 

Rebuilding and even starting a new array ends in the BLK_EH_NOT_HANDLED error. I am not the only one having this issue... what's going on?

 

Tested on Beta11 / 12 and 12a

 

Link to comment


Next time, post your ENTIRE system log like I have asked several times; a snippet does me no good.

 

The AOC-SASLP-MV8 is supported and works fine.  I think there are two problems happening:

1. There is a hardware problem that is causing operation timeouts on the card; that is what this error is saying.

2. The card's driver is not recovering correctly from this particular error.

 

Probably you have a bad or inadequate power supply for 20 drives plus your other components (what PSU model is it?).  As a test, please try building an array with only 5 or 10 drives and see if you still get that error, and report back your result.

 

What I said about 26 drives in the drop-down box was not exactly right.  All it means is that unRAID will inventory up to 26 storage devices, of which one will be the Flash device; you can pick 21 of them to be in the array, and another to be the cache drive.  However, the list will never be longer than the number of installed hard drives.

Link to comment

Your Flash device is corrupted and you should remove it from the server, back up the contents of the 'config' directory, then reformat and re-install.  Then copy your 'config' backup to the flash.  This is probably not the cause of the SAS errors though.

 

The parity check starts at 19:11:17, but the first SAS error is reported 20 min later at 19:31:40.  During this time is the webGui responsive?

 

I see you have 5 drives plugged in.  Please simplify further, maybe a Parity drive and a single Data drive.  If this gets further than 20 min then it is probably OK; cancel, add another data drive or two, and repeat.  I'm trying to isolate the issue to see if it's a particular hard drive/port.
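
To help with that isolation, a rough Python sketch like the one below could tally which ports show up in the error lines of a saved syslog. It is only an illustration: the two regular expressions are modelled on the "failed command" and "Unplug Notice" lines quoted earlier in this thread, the function name tally_ports is made up, and other drivers or kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns modelled on lines quoted in this thread (an assumption, not a
# complete list of libata/mvsas error messages).
FAILED_CMD = re.compile(r"(ata\d+)\.\d+: failed command")
UNPLUG = re.compile(r"phy(\d+) Unplug Notice")


def tally_ports(lines):
    """Count failed commands per ata port and unplug notices per phy."""
    failed = Counter()
    unplugged = Counter()
    for line in lines:
        m = FAILED_CMD.search(line)
        if m:
            failed[m.group(1)] += 1
        m = UNPLUG.search(line)
        if m:
            unplugged["phy" + m.group(1)] += 1
    return failed, unplugged


# Sample lines copied from the syslog snippet posted above.
SAMPLE = [
    "Sep  4 09:38:25 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2226:phy5 Unplug Notice",
    "Sep  4 09:38:28 Goliath kernel: ata6.00: failed command: READ FPDMA QUEUED",
    "Sep  4 09:38:28 Goliath kernel: drivers/scsi/mvsas/mv_sas.c 2226:phy5 Unplug Notice",
]

failed, unplugged = tally_ports(SAMPLE)
print("failed commands:", dict(failed))
print("unplug notices:", dict(unplugged))
```

To use it on a real log, replace SAMPLE with the lines of the saved syslog file. If one port dominates the counts across several test runs, that drive, cable or port would be the first thing to swap.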

Link to comment

Flash drive corrupt? Can you point me towards this in the syslog?

 

During that 20 min parity period the webGUI is responsive. After the first SAS error it's bye bye.

 

Total of 5 disks. 1 parity,  3 storage and 1 250GB cache drive. I will simplify to 1 parity and 1 storage drive.

Will do that tomorrow (GMT+2) over here.

 

Link to comment

I tried to search around, but came up empty.  Is there a summary of what Realtek drivers are in what releases?  I'm currently on 5b6a and want to upgrade to the latest.  b6a shows that I'm running the r8169 driver.  Is this the Linux or Realtek version?

 

I'd like to help test, but I want to know what I'm getting myself into.  My server is headless in the basement, so pulling a monitor and keyboard ahead of time would be wise if there's an Ethernet driver change from b6a.

Link to comment

http://lime-technology.com/forum/index.php?topic=15049.msg141577#msg141577 is the best description of what's in 5b12a.

 

I'm guessing in earlier betas the official Linux drivers were used except for the 5b10/b11/b12 range that tried using the updated Linux kernel driver and then the official Realtek driver (one model was not included). Perhaps the best source of information for what earlier betas did would be in their respective threads.

Link to comment

This line identifies device 'sdb':

Sep  4 19:08:24 Goliath kernel: sd 5:0:0:0: [sdb] 15646720 512-byte logical blocks: (8.01 GB/7.46 GiB)

 

These lines show first error and that flash has been set read-only:

Sep  4 19:08:33 Goliath kernel: FAT-fs (sdb1): error, fat_free_clusters: deleting FAT entry beyond EOF

Sep  4 19:08:33 Goliath kernel: FAT-fs (sdb1): Filesystem has been set read-only

 

More errors:

Sep  4 19:09:19 Goliath kernel: fat_get_cluster: 38 callbacks suppressed

Sep  4 19:09:19 Goliath kernel: FAT-fs (sdb1): error, fat_get_cluster: invalid cluster chain (i_pos 2018181)

Sep  4 19:09:19 Goliath last message repeated 9 times

 

Later this shows it can't be written:

Sep  4 19:11:17 Goliath kernel: write_file: error 30 opening /boot/config/super.dat

Sep  4 19:11:17 Goliath kernel: md: could not write superblock from /boot/config/super.dat

 

 

The main reason I ask for the entire system log is to check the timing of events.  Often someone posts a snippet which is all I need, but sometimes the snippet does not include the first error that triggered the problem.  I also use it to make sure they are running the release they think they are, and to see if there are other things that might be contributing to the problems (like addons).
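
As an illustration of that kind of timing check, here is a minimal Python sketch that reports the first occurrence of a few error signatures in a syslog, in the order they appear. The pattern names and the function first_occurrences are hypothetical, the patterns are copied from lines quoted in this thread, and it assumes the classic "Mon DD HH:MM:SS host" syslog prefix.

```python
import re

# Error signatures taken from lines quoted in this thread; the names on the
# left are made up for this example. Extend the table as needed.
PATTERNS = {
    "flash_fs_error": re.compile(r"FAT-fs \(\w+\): error"),
    "flash_read_only": re.compile(r"Filesystem has been set read-only"),
    "superblock_write_failed": re.compile(r"could not write superblock"),
    "sas_timeout": re.compile(r"BLK_EH_NOT_HANDLED"),
}

# Classic syslog prefix, e.g. "Sep  4 19:08:33 Goliath kernel: ..."
STAMP = re.compile(r"^(\w{3})\s+(\d+)\s+(\d\d:\d\d:\d\d)\s")


def first_occurrences(lines):
    """Map each signature to the timestamp of its first matching line."""
    found = {}
    for line in lines:
        m = STAMP.match(line)
        if not m:
            continue  # skip lines without a syslog timestamp
        for name, pattern in PATTERNS.items():
            if name not in found and pattern.search(line):
                found[name] = " ".join(m.groups())
    return found


# Sample lines copied from the syslog excerpts above.
SAMPLE = [
    "Sep  4 19:08:33 Goliath kernel: FAT-fs (sdb1): error, fat_free_clusters: deleting FAT entry beyond EOF",
    "Sep  4 19:08:33 Goliath kernel: FAT-fs (sdb1): Filesystem has been set read-only",
    "Sep  4 19:09:19 Goliath kernel: FAT-fs (sdb1): error, fat_get_cluster: invalid cluster chain (i_pos 2018181)",
    "Sep  4 19:11:17 Goliath kernel: write_file: error 30 opening /boot/config/super.dat",
    "Sep  4 19:11:17 Goliath kernel: md: could not write superblock from /boot/config/super.dat",
]

hits = first_occurrences(SAMPLE)
for name, stamp in hits.items():
    print(stamp, name)
```

Because the dictionary is filled in file order, the printed list is already chronological, which makes it easy to see whether the flash errors came before or after the first SAS timeout.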

Link to comment

Upgraded to 12a with no issue.  Parity check done.  Going to start data migration from my 4.7 server later on tonight.

 

J

 

Thank you for the report.

 

As of this post, -beta12a has been downloaded 215 times, -beta12 was downloaded 1707 times.  Version 4.7 has been downloaded 14,020 times.  So, as is often the case, what you hear about most on forums are the cases that fail, when the vast majority of cases run fine.

Link to comment

Not sure if this is the place to post this issue, but it didn't show its head until I upgraded from B12 to B12a. One of my drives is now showing 4.07TB of free space, not the 3TB that the drive is. I am posting a screenshot as well as the latest syslog.

Don't get me wrong, if I can squeeze more than 4TB onto a 3TB drive, I want to know how to make the others do the same.

The syslog was too large so I zipped it.

Attachments: Untitled.jpg, Syslog_09-04-2011.zip

Link to comment

In case anyone else was wondering... after carefully going through the change log, I determined the following:

 

5.0b9 and earlier - Linux r8169

5.0b9 and 5.0b10 - Realtek r8168

5.0b11 and 5.0b12 - Linux r8169

5.0b12a - Realtek r8168 and Realtek r8169

 

So upgrading from 5.0b6a to b12a will definitely be a driver change.  Also, those who had trouble with PCI-Express NICs on 5.0b9 and 5.0b10 may be OK on 5.0b12a because of the addition of the Realtek r8169 driver.

Link to comment

What does it say on the standard unRAID webgui?

The same thing as unMENU/mymain.

 

Also, I set the shares, but access to some of the folders is being denied. This one, for instance: "All Movies is available but user account that you are logged on with was denied access."

The other folders I made are accessible, yet I can move or copy just one file before I get blocked from placing more.

I am hoping that once this Parity-Sync in progress is finished, things will go back to normal.

Link to comment

These lines are suspect:

Sep  5 08:15:36 Tower02 kernel: REISERFS error (device md1): reiserfs-2025 reiserfs_cache_bitmap_metadata: bitmap block 593690624 is corrupted: first bit must be 1

Sep  5 08:15:36 Tower02 kernel: REISERFS (device md1): Remounting filesystem read-only

 

I suggest you Stop the array, re-start in Maintenance Mode, and run 'reiserfsck' on disk1, i.e., from the console type:

 

reiserfsck /dev/md1              <-- md1 corresponds to disk1

 

It will prompt you to type Yes (with capital Y, lower e, lower s).  Let this run; it might show no output on the console for a while, or it might spew out all kinds of messages.  You should see the i/o counters incrementing in the webGui.  When it finishes, if it finds corruption it will tell you what to do next; usually it instructs you to re-run with a certain switch such as "--fix-fixable".

 

Wouldn't hurt to check all your data drives while you're at it :)
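
If you want to work out which disks to check from a saved syslog, something like this Python sketch could list them. It assumes, generalizing from the md1/disk1 note above, that mdN always corresponds to diskN; the function name and the regular expression are illustrative only, and it just prints the command rather than running it.

```python
import re

# Matches lines such as:
#   REISERFS error (device md1): reiserfs-2025 ...
REISERFS_ERROR = re.compile(r"REISERFS error \(device md(\d+)\)")


def disks_to_check(lines):
    """Return (disk name, md device) pairs that logged a REISERFS error."""
    numbers = set()
    for line in lines:
        m = REISERFS_ERROR.search(line)
        if m:
            numbers.add(int(m.group(1)))
    # Assumption: mdN is the md device for diskN, as with md1/disk1 above.
    return [("disk%d" % n, "/dev/md%d" % n) for n in sorted(numbers)]


# Sample lines copied from the syslog excerpt above.
SAMPLE = [
    "Sep  5 08:15:36 Tower02 kernel: REISERFS error (device md1): reiserfs-2025 "
    "reiserfs_cache_bitmap_metadata: bitmap block 593690624 is corrupted: first bit must be 1",
    "Sep  5 08:15:36 Tower02 kernel: REISERFS (device md1): Remounting filesystem read-only",
]

for disk, device in disks_to_check(SAMPLE):
    # Print the read-only check command; run it by hand in Maintenance Mode.
    print("%s: reiserfsck %s" % (disk, device))
```

Disks that logged no error will not show up in this list, so it does not replace checking the remaining data drives by hand.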

Link to comment

hehe, I'll move to 12a and see if the BLK issue goes away.  I honestly can't say where/why/when it decides to happen.  The box has been up since yesterday afternoon without it appearing in the log.

 

I've also grabbed a 2-ch sata card so I can remove one of the SAS cards and use onboard+2ch.  I'll test beta 12a first though, before I start tearing out parts.

 

 

 

I've precleared two drives, and have been streaming movies/tv since upgrading to this version and (knock on wood) have not seen the blk error and subsequent failure.

 

root@HBO:/mnt/# uptime

21:54:13 up 1 day,  3:21,  2 users,  load average: 1.86, 1.79, 1.65

 

Load is from preclear.

Link to comment

 Thank you!

I tried as you instructed, but after I type "Yes" (without the quotes!) I get the message "No such file or directory" then it returns to the prompt.

 

  Never mind me, I had a brain freeze and forgot to go into maintenance mode

Link to comment

Not sure whether to post this under b12 or b12a, but since 12a only focused on a NIC driver update, I believe it should be fine in this thread.

 

After running for a week with my existing hardware setup after upgrading to b12, I performed a Parity Check, got no errors, then upgraded the parity drive from a 2TB to a 3TB.  I noticed that unRAID 5 went straight to building parity without performing any "pre-clear" tasks.  I had forgotten to run the manual pre_clear script, but version 4 would always perform its own "pre clear" processing if the pre_clear script hadn't been run.

 

Anyhow, I let it build parity on the new drive, then when completed immediately performed a Parity Check (correction enabled).  Got 406 errors.

 

It's in the process of performing a second Parity Check.  I just realized that I was still able to access my server during the actual parity build as well as the parity check.  I'm not sure I was able to do this under version 4, and now I'm wondering whether accessing the shares while a parity build/check was in progress should not have been allowed.

 

During this second Parity Check I'm definitely not going to access the shares whatsoever (I hadn't stopped the array before commencing the second Parity Check) and see if any errors result.

Link to comment

Syslog PLEASE!!!!

Link to comment
