gray squirrel Posted February 1, 2020
So a couple of days ago my parity drive started spitting out UDMA errors. The count ran up to 5155. I changed the SATA cable, ran a full SMART check and rebuilt the array. Now Fix Common Problems is telling me that there are hardware problems and to post my log file. See attached. The UDMA error count has not increased. A quick scan down the syslog shows lots of "md: disk0 read error, sector=11252103536" etc. Is my shucked 8TB drive screwed? Or do I just acknowledge the error and forget about it?
On another note, can I swap my other 8TB drive to become parity? My array is 2x 8TB and 2x 6TB.
megatron-diagnostics-20200201-2214.zip
Squid Posted February 1, 2020
The hardware errors are because your RAM is going bad:
Jan 30 08:34:39 Megatron kernel: mce: [Hardware Error]: Machine check events logged
Jan 30 08:34:39 Megatron kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
Jan 30 08:34:39 Megatron kernel: EDAC sbridge MC1: CPU 8: Machine Check Event: 0 Bank 5: 8c00004000010090
Jan 30 08:34:39 Megatron kernel: EDAC sbridge MC1: TSC 33ec5ed340a13a
Jan 30 08:34:39 Megatron kernel: EDAC sbridge MC1: ADDR 7a2774f00
Jan 30 08:34:39 Megatron kernel: EDAC sbridge MC1: MISC 2140444486
Jan 30 08:34:39 Megatron kernel: EDAC sbridge MC1: PROCESSOR 0:206d7 TIME 1580373279 SOCKET 1 APIC 20
Jan 30 08:34:39 Megatron kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x7a2774 offset:0xf00 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:1 ha:0 channel_mask:1 rank:1)
3 minutes ago, gray squirrel said: i changed the sata cable,
Reseat it on both ends, and the power cabling. Outside of the CRC errors, it looks good.
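The EDAC summary line in the log above names the DIMM slot that reported the error, so it can help to tally those lines across the whole syslog and see whether one module keeps showing up. A rough sketch in Python, assuming the lines follow the same format as the one quoted in this thread; the function name `tally_memory_errors` and the regular expression are illustrative, not part of Unraid or the kernel:

```python
import re
from collections import Counter

# Assumption: EDAC summary lines look like the one quoted in the thread, e.g.
# "EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (...)"
# CE = corrected error, UE = uncorrected error.
EDAC_LINE = re.compile(r"EDAC MC\d+: (\d+) (CE|UE) .*? on (\S+)")

def tally_memory_errors(lines):
    """Count reported memory errors per (kind, DIMM location)."""
    counts = Counter()
    for line in lines:
        match = EDAC_LINE.search(line)
        if match:
            number, kind, location = match.groups()
            counts[(kind, location)] += int(number)
    return counts

# Two lines taken from the log excerpt in the thread.
sample = [
    "Jan 30 08:34:39 Megatron kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR",
    "Jan 30 08:34:39 Megatron kernel: EDAC MC1: 1 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x7a2774 offset:0xf00 "
    "grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:1 ha:0 "
    "channel_mask:1 rank:1)",
]

for (kind, location), total in tally_memory_errors(sample).items():
    print(kind, location, total)
```

To run it against a real log, pass the lines of the syslog file from the diagnostics zip instead of `sample` (for example `open("syslog.txt")`, where the file name is a placeholder). If every hit lands on the same DIMM location, that is the module to reseat or test first.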
gray squirrel Posted February 1, 2020 (edited)
I have a 8 6 port sata cable, so I just switched to a completely new port and pulled and re-seated the power cable.
RAM going bad, I've never seen that before (but this is the first time I'm using full-on server hardware). Is it worth a re-seat and a memtest run, or is the module screwed?
Amazingly fast response, Squid!!!
Edit: so I shouldn't worry about the disk read errors? Should I swap the parity drive anyway?
Edited February 1, 2020 by gray squirrel