May 24, 2012
I'm having some issues, and I apologize for the noobness. unRAID 5.0 rc3 froze and was unresponsive over both HTTP and the shares. I pressed the restart button on the PC, and several drives came up red. I then shut down completely from the unRAID menu and started the server again, and everything was back up. It had been running for about 3-4 days before the freeze. I have always had freezing issues since I upgraded from 4.7 to the 5.x betas for larger hard drive support.
Hardware:
Motherboard: SUPERMICRO MBD-X8SIL-F-O Xeon
RAID card: SUPERMICRO AOC-SASLP-MV8
CPU: INTEL CORE I3 540 3.06G 4M
Memory: 2Gx2 KST KVR1333D3E9SK2/4G
Module: NORCO SS-500
I don't run any plugins. Thank you for the assistance!
Attachment: unraid_log.txt
May 25, 2012
You attached a syslog (thank you!) and are running addon-free (thank you again)! Unfortunately, the syslog does not indicate any problems, other than that it follows a previous crash (transactions being replayed). I would recommend keeping a syslog tail running on the console, to see if anything useful shows up at the time of the crash, if and when it next occurs. Try:
tail -f --lines=120 /var/log/syslog
You should probably try an overnight Memtest too, when it is convenient, just to rule out memory problems.
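Editor's sketch: a tail running on the console or in a telnet window only helps if the screen is still readable after the freeze. One possible complement is to copy new syslog data to a file that survives a hard power-off. The Python below is a minimal sketch of that idea, not something posted in this thread. The function name `append_new_bytes` is hypothetical, a stock unRAID 5 install may not include Python, and the destination path (for example a file on the flash drive) is an assumption you would need to adapt. It does not handle the syslog being rotated or truncated.

```python
import os

def append_new_bytes(src_path, dst_path, offset=0):
    """Copy everything in src_path past `offset` onto the end of dst_path.

    Returns the new offset so the caller can poll repeatedly.
    Hypothetical helper for illustration; not part of unRAID.
    """
    with open(src_path, "rb") as src:
        src.seek(offset)
        data = src.read()
    if data:
        with open(dst_path, "ab") as dst:
            dst.write(data)
            dst.flush()
            # Ask the OS to push the data to the device, so that a freeze
            # loses as little of the log as possible.
            os.fsync(dst.fileno())
    return offset + len(data)
```

To use it, call the function in a loop every few seconds, passing the returned offset back in on the next call, with `src_path` set to `/var/log/syslog` and `dst_path` set to a file on storage that persists across reboots.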
May 28, 2012 Author
I got another syslog with some errors:
sas eh calling libata port error handler
I also ran the command in telnet and attached the output below. I apologize if I ran it incorrectly, since I am kind of a newb. Thanks for the assistance!
Attachments: unraid_error2.txt, output_unraid.txt
May 31, 2012
Those aren't actually errors as such, and your newer syslog does not show any problems either. I think you will have to run that tail and wait for the next crash. Those error handler calls are just how the SAS module sets up the attached drives, and it is not the only module that does it that way.
June 1, 2012 Author
Froze again. It's pretty much like clockwork, freezing up every third night of running. I attached the tail, but it seems to get stuck at the end and I have to Ctrl-C to break out. I really appreciate your help.
Attachment: tail_120.txt
June 1, 2012
I just ran into a situation where unRAID became unresponsive: shares were not accessible, I could not telnet in, the console attached to the server would not function, and the webGui was unresponsive. This happened twice and I was not able to get a log. Let me describe a change I made yesterday that may or may not have any bearing on this situation, and the sequence of events.
I have been running all my shares with the "High-water" allocation method. I wanted to see if the allocation of the files would balance differently, so I changed the allocation method on all my shares to "Most-free". I have the monthly Parity Check enabled in Simple Features, so the parity check would have started early this morning because it is the first of the month.
The unRAID server was frozen this morning. After trying everything I could to avoid a hard power-off, I had no choice. When I restarted unRAID it started a parity check and soon after locked up once again. I powered off, and when unRAID restarted, the parity check started again. I stopped the parity check and changed all the allocations on my shares back to "High-water". I then started the parity check and it has been running fine ever since. It's currently at 70% and still going.
Might be a fluke, but this information might be useful to someone battling the "lock up" problem.
June 2, 2012
I wish I could see more than the tail recorded, specifically what immediately preceded it, to confirm that you have the same mvsas 'BLK_EH_NOT_HANDLED' issue as others do (see http://lime-technology.com/forum/index.php?topic=20529). What is visible, though, looks very much like it. I believe that Tom is shortly going to be releasing a new version to deal with it.
The tail command with the -f option is supposed to be 'stuck': it stays live, displaying any fresh syslog lines. The normal way to break out of it is Ctrl-C, as you found.
DLandon: there has not been any mention of User Shares with this issue, so I don't see a connection with your issue so far. However, by changing the allocation method, it is possible that you changed the physical destination drive to one on an mvsas-based card, which could trigger this issue. Would that be possible with your hardware setup?
June 2, 2012
I don't believe that I have hardware that applies here. I really don't know what to make of the problem I had, but for the moment I'm writing it off as just a fluke.
June 9, 2012 Author
I had a hunch, so I changed the default spin down delay to 0, and so far it has been running smoothly since I last posted. Usually it dies within 3-4 days. I will reply back with an update at the end of the week. Keeping fingers crossed. Poop
June 16, 2012 Author
Upgraded to rcp, and I am allowing the drives to go to sleep. It froze while I was copying some files. I ran the tail command and the following was looping:
Jun 15 21:46:12 Tower kernel: mvsas 0000:01:00.0: Phy7 : No sig fis
Jun 15 21:46:12 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2139:phy7 Attached Device
Jun 15 21:46:12 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 7 ctrl sts=0x89800.
Jun 15 21:46:12 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 7 irq sts = 0x1001001
Jun 15 21:46:12 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2226:phy7 Unplug Notice
Jun 15 21:46:12 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2198:port 7 ctrl sts=0x199800.
Jun 15 21:46:12 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2200:Port 7 irq sts = 0x81
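Editor's sketch: when messages like the ones above loop, it can help to tally which phy the mvsas driver keeps reporting on, since a single phy flapping between attach and unplug points at one drive, cable, or slot. The Python below is a minimal sketch under stated assumptions: the function name `count_phy_events` is hypothetical, the three event strings are taken only from the lines quoted in this post, and each log entry is assumed to sit on one unwrapped line (terminal wrapping such as "Dev ice" would need to be undone first).

```python
import re
from collections import Counter

# Event texts as they appear in the log lines quoted in this thread.
# Other mvsas messages are deliberately ignored in this sketch.
PHY_EVENT = re.compile(
    r"[Pp]hy\s*(\d+)\s*:?\s*(No sig fis|Attached Device|Unplug Notice)"
)

def count_phy_events(lines):
    """Return a Counter keyed by (phy number, event text)."""
    counts = Counter()
    for line in lines:
        match = PHY_EVENT.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts
```

Feeding it the lines of a saved syslog (for example `count_phy_events(open("syslog.txt"))`, where the file name is a placeholder) gives a per-phy tally; if nearly every count lands on one phy number, the drive on that port is the first thing to check.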
July 7, 2012 Author
I ended up replacing the hard drive. It seems to be running better under rc5; time will tell.
July 7, 2012
I ran into the same lock-up problem again on the first of the month, when I was doing a parity check. I think my motherboard has a Realtek NIC. I ended up putting in an Intel NIC and haven't had any issues since. I wonder whether all of these lock-up issues are really a NIC issue that manifests itself in different ways, making it hard to pin down. Try an Intel NIC and see if your issues go away. The Intel NIC appears to be the best supported and most reliable.