November 15, 2009 I'm running 4.4.2. Yesterday I switched it on and started moving a large file, about 11 GB, to the server over my gigabit network. Sometime in the middle of this move, the originating PC gave a message saying contact with the server had been lost. I couldn't telnet into the server and the console was blank. So I rebooted the server and it automatically started a parity check, during which time I left it alone, not attempting to read or write anything to the server. After about 12 hours, and I think about 80% of the way into the parity check, the server crashed with a kernel panic message as attached. Unfortunately I have no syslog. I'm concerned because I haven't changed anything on the server recently (nor attempted to) and it has been running very smoothly (mainly just backing up files onto it). In the past I could typically attribute problems to hardware issues, usually some cable coming loose during the installation of a new hard disk, but this time I hadn't been doing anything. Unfortunately I can't interpret the error message. Can anyone advise me what I should do?
November 15, 2009 This looks like the Realtek RTL8169 KP that is fixed in the beta7 kernel and later: a large packet hitting the NIC when it is not initialized for large frames.
November 16, 2009 Author I'm using the old unRaid "starter pack" Intel motherboard, which I think might not use Realtek, or might predate that chip. Also, as I was only doing a parity check and not reading or writing to the server, there was no reason for a large packet to be going through the NIC (and I don't have jumbo frames enabled anywhere on the network, if that's how large frames get created). What beta 7 are you referring to? Is it in the latest unRaid beta release?
November 16, 2009 Unraid 4.5beta7 and later has the Realtek RTL8169 fix. That appears to be an IRQ service routine, same as the Realtek KP. If you don't have a Realtek Gb-based NIC, then it is another issue. I'd boot the server to memtest and just verify it is happy with the RAM modules.
November 16, 2009 This has nothing to do with any NIC - it's a crash in the unraid driver. Is this reproducible? If so, please upgrade to release 4.5-beta11 (latest beta as of this moment) & see if you can make it happen again.
November 18, 2009 Author I suspect it's a hardware issue, because I ran Memtest as suggested. It started running and then, before completing one full round of tests, it froze. I switched off and the next day ran it again, and it froze again (this time after one full round). So I tried re-seating the RAM, but now when I boot up the flash drive shows no activity, there's no video output and the hard disk access light stays on. This happened before when I re-seated the RAM (because I had to remove the RAM to get at the cables to replace a hard drive). I think it might be time to get a new system. I don't really want to try another 12-hour parity check just to have it die on me at the end, but certainly if it happens again I will try the upgrade.