Thomasg Posted September 23, 2020 (edited)

Hello, I have a DL380 Gen8 with 8 disks; 7 are 900GB SAS disks. Every parity check reports 128 errors on every disk. I have rebuilt parity and erased the disks one by one. I have also replaced the onboard controller with an HP H220 6Gb/s PCI-E 3.0 HBA. All with the same result.

I did read that it might help to install drivers if you get read/write errors like this, but I do not know what the best option is. I know the network errors in the logs. I moved to the beta to see if this improved anything.

Attachments: tower-syslog-20200923-1326.zip, tower-diagnostics-20200923-1441.zip
trurl Posted September 23, 2020

54 minutes ago, Thomasg said: "I did read that it might help to install drivers if you get read/write errors like this"

You didn't read it on this forum, because you can't install drivers on Unraid.

Diagnostics already include the syslog since the last reboot, so there is no need to attach it separately. In fact, the syslog you attached only covers the period since it rotated after filling up.

58 minutes ago, Thomasg said: "I know the network errors in the logs."

You should fix that and reboot to give us a cleaner syslog. After filtering all that out there was really nothing left.

Have you done a memtest?

Unrelated, but your docker.img is much larger than necessary and you have reconfigured the shares docker uses. Why?

20G is usually more than enough for docker.img, but you have 100G. Have you had problems filling it? Making it larger won't fix anything; it will just take longer to fill. docker.img should not grow. If it does, some application is misconfigured and is writing into docker.img instead of to mapped storage. A common mistake is application paths that don't correspond to container paths. Linux is case-sensitive.

appdata, domains, and system have files on the array, and in fact are configured to be moved to the array, instead of the default setting, which would have kept them on cache. Having appdata, domains, and system on the array will hurt docker / VM performance because array writes are slowed by parity, and will keep array disks spinning since these files are always open. The default is to have these shares on cache and configured to stay on cache.

These warnings from FCP are related to the things I mentioned about your docker configuration:

Sep 22 05:49:01 Tower root: Fix Common Problems: Warning: Share culturedbakehouse set to not use the cache, but files / folders exist on the cache drive
Sep 22 05:49:01 Tower root: Fix Common Problems: Warning: Share isos set to not use the cache, but files / folders exist on the cache drive
Sep 22 05:49:01 Tower root: Fix Common Problems: Error: Same share (Share) exists in a different case
Sep 22 05:49:01 Tower root: Fix Common Problems: Error: Same share (SHARE) exists in a different case
Sep 22 05:49:01 Tower root: Fix Common Problems: Error: Default docker appdata location is not a cache-only share ** Ignored
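
The "exists in a different case" errors come down to two names that are equal once letter case is ignored. A minimal Python sketch of that check follows. The `find_case_collisions` helper and the sample list are hypothetical; only the names `Share`, `SHARE`, `isos`, and `culturedbakehouse` come from the log lines above, and this is not how Fix Common Problems itself is implemented.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Return groups of names that differ only by letter case.

    Keys are the case-folded name; values are the sorted distinct variants.
    """
    groups = defaultdict(set)
    for name in names:
        groups[name.casefold()].add(name)
    # Keep only the groups with more than one distinct spelling.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical share list, for illustration only.
shares = ["Share", "SHARE", "isos", "culturedbakehouse"]
collisions = find_case_collisions(shares)
print(collisions)
```

Any name reported here would need to be merged into a single spelling, since Linux treats each variant as a separate directory.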
JorgeB Posted September 23, 2020

The syslog is spammed with these:

Sep 23 04:40:12 Tower kernel: bond0: (slave eth0): An illegal loopback occurred on slave
Sep 23 04:40:12 Tower kernel: Check the configuration to verify that all adapters are connected to 802.3ad compliant switch ports
Sep 23 04:40:13 Tower kernel: bond0: (slave eth1): An illegal loopback occurred on slave
Sep 23 04:40:13 Tower kernel: Check the configuration to verify that all adapters are connected to 802.3ad compliant switch ports

Check and fix the network config.
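
Until the bonding configuration is fixed, the repeated messages can be filtered out to see what else the syslog contains. A minimal Python sketch, assuming the syslog has already been read into a list of lines. The `drop_bond_noise` helper is hypothetical; the two marker strings are copied from the log lines above.

```python
# Substrings that identify the repeated bonding messages quoted in this thread.
NOISE_MARKERS = (
    "An illegal loopback occurred on slave",
    "Check the configuration to verify that all adapters are connected "
    "to 802.3ad compliant switch ports",
)


def drop_bond_noise(lines):
    """Return only the lines that do not contain any of the noise markers."""
    return [line for line in lines if not any(marker in line for marker in NOISE_MARKERS)]


# Sample lines taken from the posts in this thread.
sample = [
    "Sep 23 04:40:12 Tower kernel: bond0: (slave eth0): An illegal loopback occurred on slave",
    "Sep 23 04:40:12 Tower kernel: Check the configuration to verify that all adapters are "
    "connected to 802.3ad compliant switch ports",
    "Sep 23 19:44:51 Tower kernel: sd 2:0:0:0: [sdc] tag#6221 Sense Key : 0x5 [current]",
]
remaining = drop_bond_noise(sample)
for line in remaining:
    print(line)
```

This only hides the messages for reading. It does not stop the syslog from filling up, which still needs the network fix.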
Thomasg Posted September 24, 2020 (Author)

I rebooted and ran a parity check. Here are the diagnostics. From Sep 23 19:44:51 onward you can see the errors on the drives.

Attachment: tower-diagnostics-20200924-1103.zip
Thomasg Posted September 24, 2020 (Author)

All of them are read errors at the Sep 23 19:44:51 timestamp. These lines appear just before the errors:

Sep 23 19:44:51 Tower kernel: sd 2:0:0:0: [sdc] tag#6221 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s
Sep 23 19:44:51 Tower kernel: sd 2:0:0:0: [sdc] tag#6221 Sense Key : 0x5 [current]
Sep 23 19:44:51 Tower kernel: sd 2:0:0:0: [sdc] tag#6221 ASC=0x24 ASCQ=0x0
Sep 23 19:44:51 Tower kernel: sd 2:0:0:0: [sdc] tag#6221 CDB: opcode=0x28 28 00 63 a3 8e c8 00 04 00 00
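
For reference, the fields in those kernel lines can be decoded by hand. Opcode 0x28 is the SCSI READ(10) command, sense key 0x5 is ILLEGAL REQUEST, and ASC/ASCQ 0x24/0x00 is "invalid field in CDB". A minimal Python sketch of the decoding follows. The `decode_read10` and `describe_sense` helpers are hypothetical, and the lookup tables cover only the values seen in this thread, not the full SCSI code lists.

```python
# Only the codes that appear in the log above; the real tables are much larger.
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x24, 0x00): "INVALID FIELD IN CDB"}


def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as hex bytes; return (lba, block_count)."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, blocks


def describe_sense(key, asc, ascq):
    """Return a readable description for the sense values known to this sketch."""
    key_name = SENSE_KEYS.get(key, "unknown sense key")
    detail = ASC_ASCQ.get((asc, ascq), "unknown ASC/ASCQ")
    return f"{key_name}: {detail}"


lba, blocks = decode_read10("28 00 63 a3 8e c8 00 04 00 00")
print(f"READ(10) at LBA {lba:#x}, {blocks} blocks")
print(describe_sense(0x5, 0x24, 0x0))
```

The decoding shows that the device rejected a read request as malformed rather than reporting a media error, but it does not by itself say why.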
JorgeB Posted September 24, 2020

Looks more like a power/connection issue. It's also a good idea to update the LSI's firmware, since it's very old.