prostuff1 Posted August 30, 2011
preproman, are you seeing what I saw in beta 10? http://lime-technology.com/forum/index.php?topic=14158.msg138076#msg138076 It looks like that last disk, depending on disk assignment, pushes the already assigned disk out of the array. Tom, I attached a syslog on that post if you want to see what was happening on my box. I have not fixed this, and I currently have my 24th drive (which I'd like to mount outside of the array) hanging out of my machine, because if I power it on, it will push my Hitachi out of the array, therefore screwing it up.
I did not look at the link you posted, but what you describe is exactly what preproman and I were seeing. I had him remove the drive that he precleared and wanted to use outside the array, but could not because it was pushing the 20th data drive out. I know this did not happen on 4.7 but did on 5.0b12. Once I removed the precleared drive I was able to refresh the page a couple of times and pick the other disk up.
dalben Posted August 30, 2011
I've had an intermittent error with b12 that I didn't have with b11. I can start the server and the array will come up as "starting" and nothing will happen. I can't attach. The Shares tab tells me the array needs to start first. I've attached the syslog. It also looks like I may have blown a disk. I have a post in general help with the log files for that issue. Not sure if it's related. Any assistance on what to do with the dead drive would be handy. I don't know if it's a hardware issue or not. tdm_array_starting.zip
jayhawk Posted August 30, 2011
Just upgraded to b12 from 4.7. All seemed to be going well: parity was there, all drives present, array started. I added sickbeard, sab, and a couple of other plugins. I set afp on and set the disk permissions (ran the script to change them). I was watching a show over SMB. It stopped, and the unraid box started spiraling out of control with load; it was up to 18 or 20 sustained. It wasn't cpu, and I couldn't stop the array. I tried killing off all non-essential processes and tried stopping the array; nothing worked. I had to manually restart. On restart it shows "array starting", and unraidd and mdrecoveryd are using about 20% cpu each. It's been going for about 20 minutes now. Is there any hope this thing is going to start, or should I try rolling back to 4.7 and doing a parity check?
http://notbusy.com/syslog - that's all it wrote before I decided to restart. Help?
http://notbusy.com/a/pictures/d/HBO_Main-20110829-204624.jpg
I'd really appreciate some help here guys...
It has done it again this morning. It seems to have something to do with bringing the card back up after a spin down, or with issuing a spindown command? This is from the end of this morning's log. The box is still up, but I'm going to have to power it down again as it's unresponsive to powerdown or shutdown commands. Load is currently at 7, and no cpu is being used.
Aug 30 03:50:39 HBO logger: mover finished
Aug 30 04:10:38 HBO kernel: mdcmd (55): spindown 9
Aug 30 04:26:47 HBO kernel: sas: command 0xf74e2d80, task 0xf76308c0, timed out: BLK_EH_NOT_HANDLED
Aug 30 08:40:31 HBO in.telnetd[2322]: connect from 10.0.1.1 (10.0.1.1)
Aug 30 08:40:38 HBO login[2323]: ROOT LOGIN on '/dev/pts/0' from 'awesome'
I have 3 AOC-SASLP-MV8, on a supermicro board (ipmi2.0 - I'll have to dig up the model if needed). The power supply is a Corsair gold 750 with a single 12v rail. I think I should have sufficient power.
dimaestro Posted August 30, 2011
I had a similar issue in 5b10. The only solution I found was to downgrade. I also have an AOC-SASLP-MV8.
Auggie Posted August 30, 2011
After using AFP yesterday, then switching back to SMB since performance under AFP was abysmal (especially with lots of smaller files numbering in the thousands), I tried logging into the AFP server this morning via OS X's Sidebar, but I get CONNECTION FAILED, both when I try GUEST access and as an authorized user. Unfortunately, there are no entries at all in the SYSLOG regarding any AFP login activity, so I'm not sure how to provide a better snapshot of the environmental conditions or logs of the situation. However, I can access the AFP server via the CONNECT menu or RECENT SERVERS menu.
Speaking of SMB, performance seems to be better than under 4.7, though I have no performance metrics, so it's all just subjective.
jayhawk Posted August 30, 2011
Did you go back to 4.7 or an earlier beta release? I've disabled spindown of all drives, and removed groups. Perhaps if they don't sleep I won't see the issue. I've also disabled afp for now, and I'll test with just samba until I get things sorted out. Parity check is running now, quite fast at 89MB/s. Once that's done, I'll try watching some tv etc. again.
Is it possible to use this method: http://lime-technology.com/forum/index.php?topic=871.0 with beta5 still? I found it the most solid and preferred it to plugins.
Steve
brandon Posted August 30, 2011
I tried upgrading from 4.7 today based on the instructions in the release notes. I deleted the files listed, and overwrote the bzimage/bzroot files. Now on first boot-up I can access the GUI, but it's only partially loaded. For example, the title of the page in the browser is <?=$var['NAME'];?>/<?=$myPage['NAME'];?> and the page header loads up (Lime Tech banner), but nothing else loads. It says "unRAID Server version:", but no number. Attached is my syslog. It seems stuck doing something over and over again. I forgot to disable my plugins, but my version of unMENU is up to date. I can access unMENU at port 8081. I can also log in via Telnet. I'm not at home (logged in remotely), otherwise I'd just pull out the flash USB key and go back to 4.7. Anything I can do? syslog-2011-08-30.txt
gbdesai Posted August 30, 2011
jayhawk wrote: "I have 3 AOC-SASLP-MV8, on a supermicro board (ipmi2.0 - I'll have to dig up the model if needed). The powersupply is a corsair gold 750 single 12v rail. I Think I should have sufficient power."
You don't have a Norco 4224, do you? I had a similar problem when I didn't attach power to both the primary and backup power connectors on the backplanes.
jayhawk Posted August 30, 2011
I think I should have sufficient power now, short of that last molex. I had seen some power issues before that, which is why I ended up splitting the power.
gbdesai wrote: "You don't have a Norco 4224 do you? I had a similar problem when I didn't attach power to both the primary and backup power connectors on the backplanes."
Yes, I do. I recently added power to all but one (the very top plane only has one molex) using splitters. I need to order another set of the stock molex connectors from Corsair, since my PS is modular. The disk that reported the spinup/down problem is not on the one with the single power connection, though.
Joe L. Posted August 30, 2011
Do a search on "short tags". You installed a version of PHP, and it is conflicting with the one unRAID 5.0 uses. The fix is simple.
Joe L.
brandon Posted August 30, 2011
Hi Joe, thanks for the quick reply. I tried what you said in the following thread: http://lime-technology.com/forum/index.php?topic=10840.msg103297#msg103297
But I get this error: "sed: can't read /boot/custom/php/php.ini: No such file or directory"
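The sed error above simply means the file it was told to edit does not exist at that path. As a rough illustration only, here is a minimal sketch that checks for the file before touching it. It assumes the "short tags" fix amounts to setting PHP's `short_open_tag` directive to `On`, which the thread does not spell out; the function name `enable_short_tags` is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: enable PHP short tags in a php.ini, if the file exists.
# Assumption: the "short tags" fix amounts to setting short_open_tag = On.
enable_short_tags() {
    local ini=$1
    if [ ! -f "$ini" ]; then
        # This is the situation behind "sed: can't read ...: No such file".
        echo "php.ini not found at $ini" >&2
        return 1
    fi
    if grep -q '^[[:space:]]*short_open_tag[[:space:]]*=' "$ini"; then
        # An active setting exists: rewrite it via a temp file.
        sed 's/^[[:space:]]*short_open_tag[[:space:]]*=.*/short_open_tag = On/' \
            "$ini" > "$ini.tmp" && mv "$ini.tmp" "$ini"
    else
        # No active setting: append one.
        echo 'short_open_tag = On' >> "$ini"
    fi
}
```

It would be called as `enable_short_tags /boot/custom/php/php.ini`, using the path from the error message. If the file is missing, the function reports that and returns non-zero instead of failing inside sed.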
brandon Posted August 30, 2011
I ended up just uninstalling PHP 5.2 through unMENU, and now it works fine. Thanks!
bonienl Posted August 30, 2011
When I issue the command to spin down the cache drive, which according to the table has index 21, I get an error message.
root@tower:/# /root/mdcmd spindown 21
/root/mdcmd: line 11: echo: write error: No such device or address
The cache drive does spin down on time-out or when pressing the "spin down" button. Am I doing something wrong?
limetech Posted August 30, 2011 Author
The 'mdcmd spindown' command only spins down drives that are part of the array (parity and data drives). The cache drive is spun down using the 'hdparm' command. The reason is that the unraid driver keeps track of which disks are spinning in order to implement "spinup groups".
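To make that split concrete, here is a small dry-run sketch: it only prints the command that would apply to a given target and runs nothing. The function name `spindown_cmd`, the `MAX_ARRAY_SLOT` default of 20, and the `/dev/sdX` placeholder are assumptions for illustration, not anything unRAID ships. `hdparm -y` is the standard option that asks a drive to enter standby.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) the spin-down command for a drive.
# Assumption: array slots are numbered 0..MAX_ARRAY_SLOT; anything outside
# the array (e.g. the cache drive) must be given as a /dev path.
MAX_ARRAY_SLOT=${MAX_ARRAY_SLOT:-20}

spindown_cmd() {
    local target=$1
    case $target in
        /dev/*)
            # Non-array device: hdparm -y requests standby (spin down).
            echo "hdparm -y $target"
            ;;
        ''|*[!0-9]*)
            echo "usage: spindown_cmd <array-slot | /dev/sdX>" >&2
            return 1
            ;;
        *)
            if [ "$target" -le "$MAX_ARRAY_SLOT" ]; then
                # Array member: go through the unraid driver.
                echo "/root/mdcmd spindown $target"
            else
                echo "slot $target is not an array slot; pass its /dev path" >&2
                return 1
            fi
            ;;
    esac
}
```

With these assumptions, `spindown_cmd 9` prints the mdcmd form, `spindown_cmd /dev/sdX` prints the hdparm form, and `spindown_cmd 21` is rejected rather than handed to mdcmd.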
limetech Posted August 30, 2011 Author
Having a slight issue with a disk not showing up in the dropdown list under 5.0b12.
Attach your syslog. It includes the device inventory that unRAID sees.
Sorry about that, forgot to attach it to another post, the pictures in the first one took up all the allowed 192KB worth of space. Anyway, syslog is attached now.
I do not see your syslog attached anywhere. Make sure it's your entire syslog, so zip it up if need be.
Yeah, that was actually my fault. I was helping preproman out with this issue and copied the syslog to the flash drive and then completely failed to attach it to the post. I know he has all 24 slots full on a 4224 case, but I am not sure if he has any drives hanging off the system via USB. I don't think he does, but I am not 100% sure about that.
Fixed in next beta.
JackBauer Posted August 30, 2011
Is there any posted page where "open" issues are being tracked? It would help me determine whether I want to take the plunge. (I have Intel-based ethernet ports.)
dimaestro Posted August 30, 2011
I went back to 4.7. I had assumed the issue was due to moving to the 3.* kernel. Have you tried B11? That uses 2.6.3 kernel. I'm blindly assuming it's a 3.0 kernel issue, but haven't fully tested it out. I don't have a machine I can do a lot of testing on, and I had a parity issue after one of the reboots. If I had enough drives to move everything off of the UnRaid box I'd do some more testing, but sadly I don't have enough drives to store everything offline. I don't know if my issue was identical to yours, but I had load spiraling out of control, and shutdown failed.
limetech Posted August 30, 2011 Author
I think this issue is due to a missing patch that I put into the file "drivers/ata/libata-scsi.c" to increase timeouts. Normally this patch goes into every new kernel update I do, except somehow it didn't make it into version 3.0.3, which -beta12 uses... releasing a -beta13 soon.
jayhawk Posted August 30, 2011
Soon enough that I should hold off on rolling back? If I can roll that driver into a boot image and try it, feel free to attach it. I'm trying to CPIO a new one, but it's not finding my disk/by-label in fstab on boot...
dimaestro Posted August 30, 2011
Any chance you would have forgotten this in B10 as well? That's where I was having the issue. I rolled back at that point and haven't tried a newer build as of yet.
jayhawk Posted August 31, 2011 Share Posted August 31, 2011 I got it again tonight, but it seemed to recover -- or at least hasnt broken everything: Aug 30 20:23:37 HBO kernel: sas: command 0xf3600480, task 0xee282f00, timed out: BLK_EH_NOT_HANDLED Aug 30 20:23:37 HBO kernel: sas: Enter sas_scsi_recover_host Aug 30 20:23:37 HBO kernel: sas: trying to find task 0xee282f00 Aug 30 20:23:37 HBO kernel: sas: sas_scsi_find_task: aborting task 0xee282f00 Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 1818:<7>mv_abort_task() mvi=f76a0000 task=ee282f00 slot=f76b160c slot_idx=x1 Aug 30 20:23:37 HBO kernel: sas: sas_scsi_find_task: querying task 0xee282f00 Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 1747:mvs_query_task:rc= 5 Aug 30 20:23:37 HBO kernel: sas: sas_scsi_find_task: task 0xee282f00 failed to abort Aug 30 20:23:37 HBO kernel: sas: task 0xee282f00 is not at LU: I_T recover Aug 30 20:23:37 HBO kernel: sas: I_T nexus reset for dev 0700000000000000 Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 7 ctrl sts=0x89800. Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 7 irq sts = 0x1001001 Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2226:phy7 Unplug Notice Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 7 ctrl sts=0x199800. Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 7 irq sts = 0x1001081 Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 7 ctrl sts=0x199800. Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 7 irq sts = 0x10000 Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2253:notify plug in on phy[7] Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 1338:port 7 attach dev info is 0 Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 1340:port 7 attach sas addr is 7 Aug 30 20:23:37 HBO kernel: drivers/scsi/mvsas/mv_sas.c 379:phy 7 byte dmaded. 
Aug 30 20:23:37 HBO kernel: sas: sas_form_port: phy7 belongs to port5 already(1)! Aug 30 20:23:39 HBO kernel: drivers/scsi/mvsas/mv_sas.c 1701:mvs_I_T_nexus_reset for device[5]:rc= 0 Aug 30 20:23:39 HBO kernel: sas: I_T 0700000000000000 recovered Aug 30 20:23:39 HBO kernel: sas: sas_ata_task_done: SAS error 8d Aug 30 20:23:39 HBO kernel: ata8: sas eh calling libata port error handler Aug 30 20:23:39 HBO kernel: ata9: sas eh calling libata port error handler Aug 30 20:23:39 HBO kernel: ata10: sas eh calling libata port error handler Aug 30 20:23:39 HBO kernel: ata11: sas eh calling libata port error handler Aug 30 20:23:39 HBO kernel: ata12: sas eh calling libata port error handler Aug 30 20:23:39 HBO kernel: ata13: sas eh calling libata port error handler Aug 30 20:23:39 HBO kernel: ata13.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 t0 Aug 30 20:23:39 HBO kernel: ata13.00: failed command: CHECK POWER MODE Aug 30 20:23:39 HBO kernel: ata13.00: cmd e5/00:00:00:00:00/00:00:00:00:00/40 tag 0 Aug 30 20:23:39 HBO kernel: res 01/04:04:38:6e:f1/00:00:a5:00:00/40 Emask 0x3 (HSM violation) Aug 30 20:23:39 HBO kernel: ata13.00: status: { ERR } Aug 30 20:23:39 HBO kernel: ata13.00: error: { ABRT } Aug 30 20:23:39 HBO kernel: ata13: hard resetting link Aug 30 20:23:39 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 7 ctrl sts=0x89800. Aug 30 20:23:39 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 7 irq sts = 0x1001 Aug 30 20:23:39 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2226:phy7 Unplug Notice Aug 30 20:23:39 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 7 ctrl sts=0x199800. Aug 30 20:23:39 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 7 irq sts = 0x1081 Aug 30 20:23:40 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 7 ctrl sts=0x199800. 
Aug 30 20:23:40 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 7 irq sts = 0x10000 Aug 30 20:23:40 HBO kernel: drivers/scsi/mvsas/mv_sas.c 2253:notify plug in on phy[7] Aug 30 20:23:40 HBO kernel: drivers/scsi/mvsas/mv_sas.c 1338:port 7 attach dev info is 0 Aug 30 20:23:40 HBO kernel: drivers/scsi/mvsas/mv_sas.c 1340:port 7 attach sas addr is 7 Aug 30 20:23:40 HBO kernel: drivers/scsi/mvsas/mv_sas.c 379:phy 7 byte dmaded. Aug 30 20:23:40 HBO kernel: sas: sas_form_port: phy7 belongs to port5 already(1)! Aug 30 20:23:42 HBO kernel: drivers/scsi/mvsas/mv_sas.c 1701:mvs_I_T_nexus_reset for device[5]:rc= 0 Aug 30 20:23:42 HBO kernel: sas: sas_ata_hard_reset: Found ATA device. Aug 30 20:23:42 HBO kernel: ata13.00: configured for UDMA/133 Aug 30 20:23:42 HBO kernel: ata13: EH complete Aug 30 20:23:42 HBO kernel: sas: --- Exit sas_scsi_recover_host Any chance you would have forgotten this in B10 as well? That's where I was having the issue, rolled back at that point and haven't tried a newer build as of yet. I went back to 4.7. I had assumed the issue was due to moving to the 3.* kernel. Have you tried B11? That uses 2.6.3 kernel. I'm blindly assuming it's a 3.0 kernel issue, but haven't fully tested it out - I don't have a machine I can do a lot of testing, and I had parity issue after one of the reboots. If I had enough drives to move everything off of the UnRaid box I'd do some more testing, but sadly I don't have enough drives to store everything offline. I don't know if my issue was identical to yours, but had load spiraling out of control, shutdown failed. did you go back to 4.7 or an earlier beta release? I've disabled spindown of all drives, and removed groups. Perhaps if they dont sleep I'll not see the issue. I've also disabled afp for now and I'll test with just samba until I get things sorted out. Parity check is running now, quite fast 89MB/s - once that's done, I'll try watching some tv etc. again. 
Is it possible to use this method: http://lime-technology.com/forum/index.php?topic=871.0 with beta5 still? I found this the most solid and preferred it to plugins. Steve

I had a similar issue in 5b10. The only solution I found was to downgrade. I also have an AOC-SASLP-MV8.

Just upgraded to b12 from 4.7. All seemed to be going well: parity was there, all drives present, array started. I added sickbeard, sab, a couple of other plugins. I set afp on, set the disk permissions (ran the script to change them). I was watching a show through SMBd. It stopped, and the unraid box started spiraling out of control with load--it was up to 18 or 20 sustained. It wasn't cpu, and I couldn't stop the array. I tried killing off all non-essential processes, tried stopping the array, nothing worked. I had to manually restart. On restart it shows "array starting", and unraidd and mdrecoveryd are using about 20% cpu each. It's been going for about 20 minutes now. Is there a hope this thing is going to start or should I try rolling back to 4.7 and doing a parity check? http://notbusy.com/syslog - that's all it wrote before I decided to restart. Help? http://notbusy.com/a/pictures/d/HBO_Main-20110829-204624.jpg I'd really appreciate some help here guys... It has done it again this morning; it seems to have something to do with bringing the card back up after a spin down or issuing a spindown command? This is from the end of this morning's log. Box is still up, but I'm going to have to power it down again as it's unresponsive to powerdown or shutdown commands. Load is currently at 7 --and no cpu is being used.
Aug 30 03:50:39 HBO logger: mover finished
Aug 30 04:10:38 HBO kernel: mdcmd (55): spindown 9
Aug 30 04:26:47 HBO kernel: sas: command 0xf74e2d80, task 0xf76308c0, timed out: BLK_EH_NOT_HANDLED
Aug 30 08:40:31 HBO in.telnetd[2322]: connect from 10.0.1.1 (10.0.1.1)
Aug 30 08:40:38 HBO login[2323]: ROOT LOGIN on '/dev/pts/0' from 'awesome'

I have 3 AOC-SASLP-MV8, on a supermicro board (ipmi2.0 - I'll have to dig up the model if needed). The power supply is a corsair gold 750 single 12v rail. I think I should have sufficient power.

I think this issue is because of a patch I put into the file "drivers/ata/libata-scsi.c" to increase timeouts. Normally this patch is put into all new kernel updates I do, except somehow it didn't make it into version 3.0.3, which -beta12 uses... releasing a -beta13 soon.
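For anyone who wants to check their own syslog for the same pattern (a spindown command followed some time later by a SAS command timeout), here is a minimal Python sketch. It is only an illustration: the function name, the regular expressions, and the assumed year (2011, taken from the post dates, since syslog timestamps carry no year) are my own assumptions, not anything shipped with unRAID. A match only shows that the two events occurred in that order; it does not confirm that the missing timeout patch is the cause.

```python
import re
from datetime import datetime

# Sample lines copied from the log excerpt quoted above.
SAMPLE_LOG = """\
Aug 30 03:50:39 HBO logger: mover finished
Aug 30 04:10:38 HBO kernel: mdcmd (55): spindown 9
Aug 30 04:26:47 HBO kernel: sas: command 0xf74e2d80, task 0xf76308c0, timed out: BLK_EH_NOT_HANDLED
Aug 30 08:40:31 HBO in.telnetd[2322]: connect from 10.0.1.1 (10.0.1.1)
"""

# Hypothetical patterns, written to match the lines seen in this thread.
SPINDOWN = re.compile(r"mdcmd \(\d+\): spindown (\d+)")
SAS_TIMEOUT = re.compile(r"sas: command .* timed out")


def find_timeouts_after_spindown(log_text, year=2011):
    """Return (disk, seconds_since_spindown) for each SAS timeout
    that follows a spindown command in the log."""
    last_spindown = None  # (disk number, timestamp) of the latest spindown
    results = []
    for line in log_text.splitlines():
        try:
            # The first 15 characters hold the syslog timestamp.
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped syslog line
        match = SPINDOWN.search(line)
        if match:
            last_spindown = (int(match.group(1)), stamp)
        elif SAS_TIMEOUT.search(line) and last_spindown is not None:
            disk, when = last_spindown
            results.append((disk, int((stamp - when).total_seconds())))
    return results


if __name__ == "__main__":
    for disk, seconds in find_timeouts_after_spindown(SAMPLE_LOG):
        print(f"SAS timeout {seconds}s after spindown of disk {disk}")
```

To run it against a real log, read the file into a string and pass that in place of SAMPLE_LOG.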
Thornwood Posted August 31, 2011 Hello, sorry to bother, but is there any way to run the new permissions script from telnet? I would like to see what it is doing and how far it is from finishing. (I know my file structure, so if I can see where it is working I can approximate.) Thank you. Great product.
jbartlett Posted August 31, 2011 Upgraded from 4.7 to b12. Everything seems to be working fine, except that I cannot access the disk2 smb share. The others are fine. I ran the fix permission process and rebooted (twice), no change. I can browse once logged in as root and I can browse via the web gui. The permissions look fine. I could access the disk2 smb share under 4.7. I can see the files on disk2 in the user shares. Syslog attached. syslog.zip
bonienl Posted August 31, 2011 When a new entry is added onto the cache drive (e.g. /films/new-entry), this entry becomes visible under the already existing share 'films', regardless of the setting "Use cache disk" in the share settings. I would expect the setting "No" to exclude cache entries and the setting "Yes" to include cache entries. Is this a correct assumption? The color of the share indicator changes to "yellow" as soon as a top folder (e.g. /films) is created, even when no contents are present. Isn't it more practical to change color only when real content exists?
Joe L. Posted August 31, 2011 The color of the share indicator changes to "yellow" as soon as a top folder (e.g. /films) is created, even when no contents are present. Isn't it more practical to change color only when real content does exist?

It changes to orange when files are present (from what has been described in the past).