Posted September 12, 2011

I have a share defined, which I use as a datastore for my ESXi server. ESXi accesses it via NFS.

Yesterday I tried to run an ESXi command to shrink the disk, which causes a lot of I/O. After a few minutes it failed with an I/O error. Following that, ESXi could no longer connect to the share.

Logging into the server, I now see the following:

Any, and I mean any, access to either /mnt/user or /mnt/disk2 causes the session to lock up solidly. Also, I cannot access the server from the GUI.

Something is running, and chewing CPU, because top shows this:

top - 12:32:39 up 12 days, 19:47, 5 users, load average: 231.11, 229.55, 225.
Tasks: 308 total, 1 running, 307 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id,100.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 99.7%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 7529148k total, 4720476k used, 2808672k free, 357324k buffers
Swap: 0k total, 0k used, 0k free, 4126876k cached

The usage for Cpu0 and Cpu1 never shows more than background chatter, and no task ever shows as using any CPU. So whatever is burning cycles shows up in the load, but not as a task.

The only way I will be able to regain control here is going to be to "throw the big red switch", which I guess will be the ultimate test of recoverability.

Is there any other information I can gather before then, to try to see exactly what is happening here? I've attached my syslog, but unfortunately it really doesn't show anything.

BTW, this is running beta12. I'm not sure whether the problem is related to that, or maybe it is.

Cheers.

syslog.zip
September 12, 2011, Author

A snapshot:

top - 15:52:42 up 12 days, 23:07, 1 user, load average: 303.55, 301.67, 297.8
Tasks: 376 total, 1 running, 375 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id,100.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.3%us, 0.0%sy, 0.0%ni, 0.0%id, 99.7%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7529148k total, 4749736k used, 2779412k free, 358040k buffers
Swap: 0k total, 0k used, 0k free, 4132448k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 1544 root      20   0  2604 1152  756 R    0  0.0   0:00.22 top
    1 root      20   0   828  284  240 S    0  0.0   0:06.87 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.02 kthreadd
    3 root      20   0     0    0    0 S    0  0.0   0:00.18 ksoftirqd/0
    6 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0
    7 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
    9 root      20   0     0    0    0 S    0  0.0   0:07.92 ksoftirqd/1
   11 root       0 -20     0    0    0 S    0  0.0   0:00.00 khelper
  149 root      20   0     0    0    0 D    0  0.0  13:17.57 sync_supers
  151 root      20   0     0    0    0 S    0  0.0   0:00.03 bdi-default
  153 root       0 -20     0    0    0 S    0  0.0   0:00.00 kblockd
  270 root       0 -20     0    0    0 S    0  0.0   0:00.00 ata_sff
  280 root      20   0     0    0    0 S    0  0.0   0:00.00 khubd
  319 eddie     20   0 17280 3184 2408 D    0  0.0   0:00.00 smbd
  385 root       0 -20     0    0    0 S    0  0.0   0:00.00 rpciod
  414 root      20   0     0    0    0 S    0  0.0  11:01.07 kswapd0
  466 eddie     20   0 17280 3216 2440 D    0  0.0   0:00.00 smbd
  474 root      20   0     0    0    0 S    0  0.0   0:00.00 fsnotify_mark
  493 root       0 -20     0    0    0 S    0  0.0   0:00.00 nfsiod
  501 root       0 -20     0    0    0 S    0  0.0   0:00.00 crypto
  502 eddie     20   0 17280 3184 2408 D    0  0.0   0:00.00 smbd
  531 eddie     20   0 17280 3152 2376 D    0  0.0   0:00.00 smbd
  608 root       0 -20     0    0    0 S    0  0.0   0:00.00 cnic_wq
  669 eddie     20   0 17280 3152 2376 D    0  0.0   0:00.00 smbd
  689 root      16  -4  2284  896  496 S    0  0.0   0:00.03 udevd
  707 eddie     20   0 17280 3152 2376 D    0  0.0   0:00.00 smbd
  736 eddie     20   0 17280 3156 2380 D    0  0.0   0:00.00 smbd
  829 root      20   0     0    0    0 S    0  0.0   0:00.00 scsi_eh_0
  830 root      20   0     0    0    0 S    0  0.0   0:00.01 usb-storage
  853 root      20   0     0    0    0 S    0  0.0   0:00.00 scsi_eh_1
  855 root      20   0     0    0    0 S    0  0.0   0:00.00 scsi_eh_2
  857 root      20   0     0    0    0 S    0  0.0   0:00.00 kworker/u:2
  858 root      20   0     0    0    0 S    0  0.0   0:00.00 scsi_eh_3
  859 root      20   0     0    0    0 S    0  0.0   0:00.00 scsi_eh_4
  860 root      20   0     0    0    0 S    0  0.0   0:00.00 kworker/u:3

And here's the same sorted on Time:

top - 15:53:27 up 12 days, 23:08, 1 user, load average: 303.55, 301.94, 298.1
Tasks: 376 total, 1 running, 375 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id,100.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.3%us, 0.0%sy, 0.0%ni, 0.0%id, 99.7%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7529148k total, 4749892k used, 2779256k free, 358040k buffers
Swap: 0k total, 0k used, 0k free, 4132448k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 2827 root      20   0  127m 3400  556 S    0  0.0 234:22.74 shfs
 1695 root      20   0     0    0    0 S    0  0.0 104:56.76 unraidd
 2866 root      20   0     0    0    0 S    0  0.0  32:18.75 nfsd
 2868 root      20   0     0    0    0 D    0  0.0  30:56.03 nfsd
 2870 root      20   0     0    0    0 D    0  0.0  30:21.26 nfsd
 2871 root      20   0     0    0    0 S    0  0.0  30:07.98 nfsd
 2867 root      20   0     0    0    0 D    0  0.0  28:14.33 nfsd
 2865 root      20   0     0    0    0 S    0  0.0  27:44.35 nfsd
 2869 root      20   0     0    0    0 S    0  0.0  27:22.39 nfsd
 2864 root      20   0     0    0    0 D    0  0.0  27:01.87 nfsd
  149 root      20   0     0    0    0 D    0  0.0  13:17.57 sync_supers
  414 root      20   0     0    0    0 S    0  0.0  11:01.07 kswapd0
 1911 root      35  15 93380  29m 5844 S    0  0.4   8:05.87 python
31781 eddie     20   0 16668 4700 3704 D    0  0.1   4:51.81 smbd
 1462 root      20   0 45680 1644 1144 S    0  0.0   2:34.15 emhttp
 1878 root      20   0  181m  20m 9452 S    0  0.3   1:30.70 Plex Media Serv
17441 root      20   0     0    0    0 D    0  0.0   0:23.18 flush-9:2
 6506 root      20   0 16516 3896 3056 S    0  0.1   0:20.00 smbd
 2088 root      20   0  4492 1532 1212 S    0  0.0   0:19.88 ntpd
 1086 bin       20   0  1960  520  432 S    0  0.0   0:18.16 rpc.portmap
 2843 root      20   0  9520 1940 1308 S    0  0.0   0:08.60 nmbd
    9 root      20   0     0    0    0 S    0  0.0   0:07.92 ksoftirqd/1
    1 root      20   0   828  284  240 S    0  0.0   0:06.87 init
 2881 avahi     20   0  2960 1616 1364 S    0  0.0   0:04.44 avahi-daemon
14469 root      20   0     0    0    0 S    0  0.0   0:02.33 kworker/1:2
 1130 root      20   0  1908  584  504 S    0  0.0   0:00.96 crond
 2873 root      20   0  2284  792  560 S    0  0.0   0:00.91 rpc.mountd
  962 root      20   0  1908  620  536 S    0  0.0   0:00.73 syslogd
 1544 root      20   0  2604 1152  756 R    1  0.0   0:00.42 top
11367 root      20   0     0    0    0 D    0  0.0   0:00.26 kworker/1:0
    3 root      20   0     0    0    0 S    0  0.0   0:00.18 ksoftirqd/0
11372 root      20   0     0    0    0 S    0  0.0   0:00.17 kworker/0:0
 2845 root      20   0 16136 3664 2900 S    0  0.0   0:00.10 smbd
  151 root      20   0     0    0    0 S    0  0.0   0:00.03 bdi-default
  689 root      16  -4  2284  896  496 S    0  0.0   0:00.03 udevd

Cheers.
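
On the question of what else can be gathered before a hard reset: on Linux the load average counts tasks in uninterruptible sleep (state D in the S column of top) as well as runnable ones, so a load above 300 with roughly 100% wa and no task using CPU is consistent with many tasks blocked on I/O rather than with anything burning cycles. One thing that may help is a list of every D-state task and the kernel function it is waiting in. The sketch below is only an illustration, not an unRAID tool: the function names are made up here, it assumes Python is available on the server, and it assumes the kernel exposes /proc/<pid>/wchan (some kernels report only 0 there). It reads only /proc, so it should not touch the hung /mnt paths.

```python
"""List tasks in uninterruptible sleep (state D) by reading /proc directly."""
import os


def parse_stat_line(text):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so it is located via the first '(' and the last ')'.
    """
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def read_first_line(path):
    """Return the first line of a file, or '' if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.readline().strip()
    except (IOError, OSError, ValueError):
        # The task may have exited, or the file may be unreadable.
        return ""


def list_blocked(proc_root="/proc"):
    """Return sorted (pid, comm, wchan) tuples for every task in state D."""
    blocked = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return blocked
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        line = read_first_line(os.path.join(proc_root, entry, "stat"))
        if not line:
            continue
        try:
            pid, comm, state = parse_stat_line(line)
        except (ValueError, IndexError):
            continue
        if state == "D":
            wchan = read_first_line(os.path.join(proc_root, entry, "wchan"))
            blocked.append((pid, comm, wchan or "?"))
    return sorted(blocked)


if __name__ == "__main__":
    for pid, comm, wchan in list_blocked():
        print("%6d  %-16s  %s" % (pid, comm, wchan))
```

Run as root from a console session and redirect the output to somewhere that is not on the array (the flash drive, for example) before rebooting. If most entries share the same wait function, that is a reasonable place to start reading, though it does not by itself identify the cause.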