chris1259 Posted February 7, 2017

I have a ton of these in my system log:

kernel: 3w-sas: scsi2: ERROR: (0x03:0x0101): Invalid command opcode:opcode=0x4D.

The opcode changes on each line. I have also noticed a lot of LAN errors recently, but I haven't experienced any data corruption that I can see. I first thought the age and speed of the hard drives were causing a file copy to drop from 113MB/s to 20MB/s after it reached 30-40%, but now I am not sure. At one point LAN errors were reaching over 1 million in 24 hours, even when the system was idle. I first noticed this in v6.3.0-rc6. It got better in 6.3.0-rc9 and is now at an all-time low in v6.3.0 Final. I tried swapping out the LAN cables and the switch. The switch may have improved the situation. I tested this server with Windows 2008 R2 before running unRaid on it and I could not find anything wrong. Are there any other logs I can look at?
chris1259 Posted February 7, 2017

https://postimg.org/image/i3iajm8br/

Here is a picture of the LAN error count.
John_M Posted February 7, 2017

Which bonding mode are you using? Go to Tools -> Diagnostics and post the zip file.
chris1259 Posted February 7, 2017

Bonding = Active-Backup. That was the default and I never changed it. Logs attached.

tower-diagnostics-20170207-1836.zip
John_M Posted February 8, 2017

Well, it seems that people were having problems with the nVidia MCP55 chipset three years ago, too. The Ethernet driver was reverse engineered and possibly never worked properly. Do yourself a favour and stick an Intel NIC in there. Even a Realtek one would be better supported.

Your 3ware SAS card is spectacularly unhelpful with its error messages. Maybe they're explained in the manual. I had hoped there might be information leading up to the first of those error messages that might give a clue, but the only error message in plain English was this one:

Feb 6 09:54:37 Tower01 kernel: 3w-sas: scsi2: AEN: INFO (0x04:0x0053): Battery capacity test is overdue:.

Unfortunately, a bug in unRAID 6.3.0's diagnostics means that the SMART reports it generates contain no useful information either. So really, it's just guesswork: check the data cables and power to the disks. Is there some sort of backplane involved? With that number of disks I expect there is.

All your user shares are set to Use Cache: No, so the figures you report are understandable. Writes will start out saturating the gigabit Ethernet until the spare RAM that is used to cache disk I/O is full, then they will slow to the speed of writing to the parity-protected array.
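The write behaviour described above can be roughed out with a small model. This is only a sketch under stated assumptions: the 113 and 20 MB/s rates are the figures reported in the first post, while the file size and the amount of spare RAM are made-up illustrative values, and the function names are hypothetical. It assumes the RAM cache fills at network speed while draining to the array at array speed.

```python
def fast_phase_mb(file_mb, ram_cache_mb, net_mb_s=113.0, array_mb_s=20.0):
    """MB transferred at full network speed before the RAM cache fills."""
    fill_rate = net_mb_s - array_mb_s  # net rate at which the cache grows
    if fill_rate <= 0:
        # The array keeps up with the network, so the cache never fills.
        return float(file_mb)
    seconds_until_full = ram_cache_mb / fill_rate
    return min(float(file_mb), net_mb_s * seconds_until_full)


def copy_time_seconds(file_mb, ram_cache_mb, net_mb_s=113.0, array_mb_s=20.0):
    """Seconds until the sender finishes, ignoring the final cache flush."""
    fast = fast_phase_mb(file_mb, ram_cache_mb, net_mb_s, array_mb_s)
    slow = file_mb - fast  # remainder is limited by the array write speed
    return fast / net_mb_s + slow / array_mb_s


# Assumed example: a 10000 MB file with 3000 MB of spare RAM for caching.
share_fast = fast_phase_mb(10000, 3000) / 10000
print(f"fraction copied at full speed: {share_fast:.0%}")
print(f"estimated copy time: {copy_time_seconds(10000, 3000):.0f} s")
```

With those assumed sizes the slowdown lands in the 30-40% range mentioned in the first post, but the real crossover point depends on how much RAM is actually free for caching.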
chris1259 Posted February 8, 2017

I updated the shares to use cache. I originally turned on cache under Settings -> Global Share Settings. I forgot there was another area to enable it.

The 2 NICs in the server are integrated into the motherboard. If you think it will help, I might be able to stick some PCI NICs in there.

The server does have a backplane, and I noticed that unRaid can't read the temperature of the disks. This might be a limitation of the hardware in the server. The server is a 2U Supermicro CSE-216 H8DME-2.

I will take a look at the manuals and report back anything I find. Thank you for your help.
chris1259 Posted February 12, 2017

I didn't find anything in the manual about this error:

kernel: 3w-sas: scsi2: ERROR: (0x03:0x0101): Invalid command opcode:opcode=0x4D.

It might not be a bug either.

https://bugzilla.redhat.com/show_bug.cgi?id=666416

"Some monitoring utility is sending requests to the controller that it doesn't like."

Or maybe it is?

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/618542

Either way it does not appear to be hurting anything.

I assume the message below has to do with the battery backup on the RAID controller. That is something I have not tested. I will look into that.

Feb 6 09:54:37 Tower01 kernel: 3w-sas: scsi2: AEN: INFO (0x04:0x0053): Battery capacity test is overdue.
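Since the opcode changes from line to line, a tally of which opcodes the controller is rejecting may help narrow down what is sending them. The following is a minimal sketch, not a tested tool: the regular expression is written against the one message format quoted in this thread, and the function names are hypothetical. In the standard SCSI command set, 0x4D is LOG SENSE, which would fit the "monitoring utility" explanation in the Red Hat report.

```python
import re
from collections import Counter

# Matches the message format quoted in this thread; other 3w-sas lines are ignored.
OPCODE_RE = re.compile(
    r"3w-sas: scsi(\d+): ERROR: \(0x03:0x0101\): "
    r"Invalid command opcode:opcode=(0x[0-9A-Fa-f]+)"
)

# Standard SCSI operation codes; only the one seen in this thread is listed.
KNOWN_OPCODES = {0x4D: "LOG SENSE"}


def tally_opcodes(lines):
    """Count rejected opcodes across an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = OPCODE_RE.search(line)
        if match:
            counts[int(match.group(2), 16)] += 1
    return counts


def format_tally(counts):
    """Return one summary string per opcode, most frequent first."""
    rows = []
    for opcode, n in counts.most_common():
        name = KNOWN_OPCODES.get(opcode, "unknown")
        rows.append(f"0x{opcode:02X} {name}: {n}")
    return rows


# Usage (the log path is an assumption; adjust it for your system):
#   with open("/var/log/syslog") as log:
#       print("\n".join(format_tally(tally_opcodes(log))))
```

Opcodes not listed in `KNOWN_OPCODES` are reported as "unknown" rather than guessed at; they can be looked up in the SCSI command reference.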