Posted August 31, 2018

Hi all,

My server regularly becomes totally unresponsive, to the point that I cannot even ssh into it, and I'm wondering whether it has to do with the new drives I have added. Before, my largest drive was 4TB; now I have replaced the parity drive and a data drive with 8TB drives. But the process of adding these drives (https://eshop.macsales.com/item/HGST/0S04012/) was problematic, despite running a pre-clear on them. Problematic as in the server becoming unresponsive and needing a hard reboot, even though it wasn't doing anything else.

A little bit of background: the hardware has been in use since the 4.x days, and was running 5.x for the longest time. Earlier this year I did the upgrade to 6.x, and it went without a problem (no docker images, very few plugins). Following the upgrade, I also started to convert some drives to XFS by following the instructions found at https://wiki.unraid.net/File_System_Conversion. I have seen a snag there as well, with the destination drive taking up 3 GB more than the original drive (157GB instead of 154GB).

Here is my hardware:

Model: N/A
M/B: Supermicro - X7SPA-HF
CPU: Intel® Atom™ CPU D510 @ 1.66GHz
HVM: Not Available
IOMMU: Not Available
Cache: 48 kB, 1024 kB
Memory: 4 GB (max. installable capacity 4 GB)
Network: eth0: 1000 Mb/s, full duplex, mtu 1500
eth1: not connected
Kernel: Linux 4.14.49-unRAID x86_64
OpenSSL: 1.0.2o

And here is today's info from the Log button on the main screen after having to hard reboot again, but it doesn't list much in terms of errors:

Aug 31 13:22:02 Zaphod kernel: BTRFS: device fsid 4268c7ad-100b-41ae-a956-7501b9d53230 devid 1 transid 61 /dev/loop3
Aug 31 13:22:02 Zaphod kernel: BTRFS info (device loop3): disk space caching is enabled
Aug 31 13:22:02 Zaphod kernel: BTRFS info (device loop3): has skinny extents
Aug 31 13:22:02 Zaphod root: Resize '/etc/libvirt' of 'max'
Aug 31 13:22:02 Zaphod kernel: BTRFS info (device loop3): new size for /dev/loop3 is 1073741824
Aug 31 13:22:02 Zaphod emhttpd: shcmd (127): /etc/rc.d/rc.libvirt start
Aug 31 13:22:02 Zaphod root: Starting virtlockd...
Aug 31 13:22:02 Zaphod root: Starting virtlogd...
Aug 31 13:22:02 Zaphod root: Starting libvirtd...
Aug 31 13:22:02 Zaphod kernel: tun: Universal TUN/TAP device driver, 1.6
Aug 31 13:22:02 Zaphod kernel: mdcmd (46): check nocorrect
Aug 31 13:22:02 Zaphod kernel: md: recovery thread: check P ...
Aug 31 13:22:02 Zaphod kernel: ip6_tables: (C) 2000-2006 Netfilter Core Team
Aug 31 13:22:02 Zaphod kernel: Ebtables v2.0 registered
Aug 31 13:22:02 Zaphod kernel: md: using 1536k window, over a total of 7814026532 blocks.
Aug 31 13:22:04 Zaphod kernel: virbr0: port 1(virbr0-nic) entered blocking state
Aug 31 13:22:04 Zaphod kernel: virbr0: port 1(virbr0-nic) entered disabled state
Aug 31 13:22:04 Zaphod kernel: device virbr0-nic entered promiscuous mode
Aug 31 13:22:04 Zaphod dhcpcd[1464]: virbr0: new hardware address: 52:54:00:10:2f:6b
Aug 31 13:22:04 Zaphod avahi-daemon[6806]: Joining mDNS multicast group on interface virbr0.IPv4 with address 192.168.122.1.
Aug 31 13:22:04 Zaphod avahi-daemon[6806]: New relevant interface virbr0.IPv4 for mDNS.
Aug 31 13:22:04 Zaphod avahi-daemon[6806]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Aug 31 13:22:04 Zaphod kernel: virbr0: port 1(virbr0-nic) entered blocking state
Aug 31 13:22:04 Zaphod kernel: virbr0: port 1(virbr0-nic) entered listening state
Aug 31 13:22:05 Zaphod dnsmasq[8132]: started, version 2.79 cachesize 150
Aug 31 13:22:05 Zaphod dnsmasq[8132]: compile time options: IPv6 GNU-getopt no-DBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
Aug 31 13:22:05 Zaphod dnsmasq-dhcp[8132]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Aug 31 13:22:05 Zaphod dnsmasq-dhcp[8132]: DHCP, sockets bound exclusively to interface virbr0
Aug 31 13:22:05 Zaphod dnsmasq[8132]: reading /etc/resolv.conf
Aug 31 13:22:05 Zaphod dnsmasq[8132]: using nameserver 192.168.1.1#53
Aug 31 13:22:05 Zaphod dnsmasq[8132]: read /etc/hosts - 2 addresses
Aug 31 13:22:05 Zaphod dnsmasq[8132]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Aug 31 13:22:05 Zaphod dnsmasq-dhcp[8132]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Aug 31 13:22:05 Zaphod kernel: virbr0: port 1(virbr0-nic) entered disabled state
Aug 31 13:26:43 Zaphod ntpd[1527]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Aug 31 13:39:54 Zaphod kernel: md: recovery thread: P incorrect, sector=132077792

Any ideas what the problem could be? Looking at the logs directory of the flash drive, I see no files there newer than March of this year.

Thanks for any pointers,
-Christian
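Since the log above is mostly routine startup noise, one way to skim it is to filter for lines that look like problems. This is only a rough sketch and not something from the thread: the keyword list and the function name `suspicious_lines` are assumptions chosen for illustration, and it expects one syslog entry per line.

```python
import re

# Hypothetical keyword list; adjust to taste. Matching is case-insensitive,
# so "TIME_ERROR" is caught by "error".
SUSPICIOUS = re.compile(r"error|incorrect|fail|timeout", re.IGNORECASE)


def suspicious_lines(log_text):
    """Return the log lines that contain one of the suspicious keywords."""
    return [line for line in log_text.splitlines() if SUSPICIOUS.search(line)]


if __name__ == "__main__":
    sample = "\n".join([
        "Aug 31 13:22:02 Zaphod kernel: mdcmd (46): check nocorrect",
        "Aug 31 13:26:43 Zaphod ntpd[1527]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized",
        "Aug 31 13:39:54 Zaphod kernel: md: recovery thread: P incorrect, sector=132077792",
    ])
    for line in suspicious_lines(sample):
        print(line)
```

Applied to the excerpt above, this would surface only the ntpd TIME_ERROR line and the "P incorrect" parity-check line; a keyword filter like this can of course miss problems that are phrased differently.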
August 31, 2018 Community Expert

Upload the Diagnostics file in a new post (Tools >>> Diagnostics). There may be some clues in those files.

Install the Fix Common Problems plugin and turn on the troubleshooting mode. That will write periodic updates of the syslog to the logs folder/directory on your flash drive. The next time you experience the problem, upload the latest three or four files in the logs folder in another NEW post.
August 31, 2018 Author

OK, here is the diagnostics file. Thanks for the quick reply!

-Christian

zaphod-diagnostics-20180831-1504.zip
August 31, 2018 Author

Well, I installed Community Applications, then Fix Common Problems, but it stays on "Scanning" even after several minutes… that doesn't sound normal? But going back to the previous page and back in allowed me to enable the Troubleshooting mode. Now it's wait and see!

-Christian

Edited August 31, 2018 by chrisb42
August 31, 2018 Community Expert

Do you have a disk 9 on your system? Are any of the disks above 90% full? (Reiserfs has a history of becoming unresponsive as it fills up. Plus, it has not been updated since about 2010 as its chief developer is serving a prison term for killing his wife. Thus, it has not been optimized for the latest large-capacity disks.)
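The 90% question can be answered with a quick script. The following is a minimal sketch, not something posted in this thread: it assumes the array disks are mounted at /mnt/disk1, /mnt/disk2, and so on, and that Python 3 is available on the machine; the function names are made up for illustration.

```python
import shutil
from pathlib import Path


def usage_percent(total, used):
    """Percentage of the filesystem in use; 0.0 for an empty/unknown total."""
    return 100.0 * used / total if total else 0.0


def disks_over_threshold(mount_root="/mnt", pattern="disk*", threshold=90.0):
    """List (mount point, percent used) for mounts fuller than the threshold.

    mount_root and pattern are assumptions about where the data disks are
    mounted; change them to match the actual system.
    """
    results = []
    for mount in sorted(Path(mount_root).glob(pattern)):
        if not mount.is_dir():
            continue
        usage = shutil.disk_usage(mount)
        pct = usage_percent(usage.total, usage.used)
        if pct > threshold:
            results.append((str(mount), round(pct, 1)))
    return results


if __name__ == "__main__":
    for mount, pct in disks_over_threshold():
        print(f"{mount}: {pct}% full")
```

The percentage here is used space divided by total size, so it can differ slightly from what `df` reports, which accounts for reserved blocks.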
September 1, 2018 Author

Yes, disk #9 is the one I added to convert reiserfs to xfs. Is there an issue with having a 9th data disk? I actually added it as disk #10, so I know that's the one I added for conversion of the other drives. And yes, some drives are rather full, which is why I wanted to upgrade some of the smaller drives with bigger ones.
September 1, 2018 Community Expert

There is no problem with not having a disk 9 as far as I know. I was more worried that there was a missing SMART report, which usually means that a disk is off-line.

As I recall, there have been slow/hang issues with Reiserfs-formatted disks when they get too full. I don't recall ever seeing an explanation of why it is happening, but some have said that it is the result of some 'housecleaning' that seems to take forever when the disk is nearly full. Plus, I believe that the processor you have is no powerhouse, and that would aggravate the problem even more.

Edited September 1, 2018 by Frank1940
September 1, 2018 Author

Oh, it's definitely not a speed demon 🙂. I chose the board for its efficiency, being on 24/7 and all and really only being a file server.

Fix Common Problems also pointed out that there may be some issues with my Marvell-based Supermicro AOC-SASLP-MV8 card. I checked, and there is newer firmware for both my board and the controller card, which could also contribute to the issues I have seen since replacing drives. Guess I'll have to pull the server and hook up a monitor and keyboard to it…

Thanks for all the pointers!

-Christian
September 2, 2018 Community Expert

3 hours ago, chrisb42 said: Fix Common Problems also pointed out that there may be some issues with my Marvell-based Supermicro AOC-SASLP-MV8 card.

I would suggest that you google this card (and use the term 'unraid' as a search parameter). I seem to recall that this card has fallen out of favor because it seems to have random issues with recent unRAID releases. I can't remember many details beyond that at this point. But here is one thread that goes into some detail:

https://forums.unraid.net/topic/39003-marvell-disk-controller-chipsets-and-virtualization/

Today, most folks are using the LSI-based cards. They can be found used at very modest prices on eBay. You do need to get one of the ones that are natively in 'IT mode' or, if they have RAID firmware, they have to be 'modded' to have the 'IT mode' firmware installed. If necessary, you can do this install yourself or look for a vendor who supplies cards with it already done.
September 3, 2018 Author

Yes, thanks, I had seen that thread. I have now updated the board's firmware, as well as the controller card firmware (and while I was at it, the IPMI firmware of the mainboard). I will now see whether it's more stable for me or not. I will resume the XFS conversion I started and see what happens.

Thanks again for all the help! 👍

-Christian