
Missing Data / Increasing UDMA CRC Count on hard drives



Hello. Over the past year I have been getting increasingly frequent reports of the UDMA CRC error count increasing on each hard drive in my server, and eventually the drives would become disabled. I started off by replacing my old WD drives from a previous server with IronWolf 4TB drives, but the problem kept happening even after I had sent the drives back to Seagate for repair. After some research I found that the increasing numbers were not much to worry about and could be down to poor SATA cables or even bends in the cables. I improved my cable management and rebuilt the drive that had been disabled, and all seemed fine.
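For anyone who wants to track this counter between cable changes, here is a minimal Python sketch. It assumes the SMART reports are saved as text in the usual `smartctl -A` attribute-table layout, where attribute ID 199 is the UDMA CRC error count and the raw value is the tenth column. The function names are hypothetical and not part of Unraid or smartmontools.

```python
from typing import Optional


def udma_crc_count(smart_report: str) -> Optional[int]:
    """Return the raw value of SMART attribute 199, or None if it is absent.

    Assumes the standard attribute table: ID, name, flag, value, worst,
    thresh, type, updated, when_failed, raw_value (ten columns).
    """
    for line in smart_report.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "199":
            try:
                return int(fields[9])
            except ValueError:
                return None  # raw value was not a plain integer
    return None


def crc_increased(older_report: str, newer_report: str) -> bool:
    """True only if both reports contain the counter and it has grown."""
    old = udma_crc_count(older_report)
    new = udma_crc_count(newer_report)
    return old is not None and new is not None and new > old
```

Comparing a report saved before reseating a cable with one saved a few days later shows whether the count is still climbing. The raw count never resets, so only the change between reports matters.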

 

Cut to yesterday, when it happened again. This time, however, when I rebuilt the drive the parity check passed but the data isn't visible. Unraid shows the correct drive usage percentage under the 'Main' tab of the GUI, but browsing the drive by clicking the file icon gives me the message 'No Listing: Too many files'. The shares have disappeared from the 'Shares' tab and Windows cannot access them either.

The drive with the problem is drive 4 in my setup, but drive 3 (an old WD) is also giving me the same problems and creating thousands of error lines. I have one spare (brand new IronWolf) drive to swap into the box if needed, but I don't want to try this until I'm sure it will help rather than make things worse. I have attached the zip file from the diagnostics page (as per the 'Read me first' post on this forum) and downloaded all the SMART reports in case they are helpful later.

I am running Unraid 6.9.2 with these plugins: Recycle Bin, Unassigned Devices, Community Applications, Disk Location, Dynamic Active Streams, Dynamix SSD Trim, Fix Common Problems, Preclear Disks, Unassigned Devices Plus, unbalance, User Scripts. My hardware is an ASRock H370M-ITX with an i7-8700 and 32GB of RAM.

Any help is much appreciated. 

fileserver1-diagnostics-20210714-1733.zip


Disk3 dropped offline 1 minute after you started rebuilding disk4, so the rebuilt disk4 will be mostly corrupt. If system notifications are enabled, you would have been notified about the read errors. The syslog cuts off due to log spam. If you were rebuilding on top of the old disk, all data on that disk will be gone, and there will likely be filesystem corruption preventing user shares from working. Reboot and post new diags after array start.


Hi JorgeB. As grateful as I was for your response, I was tearing my hair out thinking I had lost all that data. However, after following your instructions and rebooting the server, all the shares are working and my data is accessible! This is amazing.

I have still posted the diagnostics file in case you can help me prevent this from happening again. Advice on best practices for storing data would certainly be appreciated: should I run two parity disks, or get a bigger case, as my Fractal Design Node 304 is pretty tight?

Regards

fileserver1-diagnostics-20210714-1936.zip

