
System crash after 30 minutes of parity check


Solved by Bodine95


I have been running my Unraid system for a few years and am now moving my two parity drives from 8TB to 14TB drives.

 

Here is what happened. I received an error that one of my drives had failed. I installed the new 14TB replacement, and the parity check started and completed nearly 3 days later. I then had one 8TB and one 14TB parity drive. I then removed the 8TB drive and replaced it with the other 14TB drive, and now the system reboots about 30 minutes into the parity check and comes back up with the array stopped.

I have tried booting in safe mode with no plugins, then starting the array and the parity check. The system still reboots at about 30 minutes. I have tried moving the new parity drive to a different SATA port, and the reboot still happens. I removed my NVIDIA card and the NVIDIA driver, and the reboot still occurs. The system is stable if I start the array and cancel the parity check; it just shows that I have only one working parity drive and one disabled parity drive. Thanks in advance for helping me solve this issue.

 

I am running Unraid version 6.11.5.

 

Here is my system hardware:

HP G7 DL580

4 socket Xeon CPU E7-8837 @ 2.67GHz (32 core)

64 GB DDR3 ECC memory

BIOS P65

 

Drives:

[0:0:0:0]    disk    SanDisk  Ultra Fit 1.00  /dev/sda   30.7GB
[1:0:0:0]    disk    ATA      ST8000VN0022-2EL SC61  /dev/sdj   8.00TB
[1:0:1:0]    disk    ATA      ST14000VN0008-2K SC61  /dev/sdk   14.0TB
[1:0:2:0]    disk    ATA      ST8000VN0022-2EL SC61  /dev/sdl   8.00TB
[1:0:3:0]    disk    ATA      ST8000VN0022-2EL SC61  /dev/sdm   8.00TB
[1:0:4:0]    disk    ATA      ST14000VN0008-2K SC61  /dev/sdn   14.0TB
[1:0:5:0]    disk    ATA      WDC WD40EFRX-68W 0A80  /dev/sdo   4.00TB
[1:0:6:0]    disk    ATA      Samsung SSD 850  2B6Q  /dev/sdp   1.00TB
[1:0:7:0]    disk    ATA      WDC WD40EFRX-68W 0A80  /dev/sdq   4.00TB
[1:0:8:0]    disk    ATA      ST8000VN004-2M21 SC60  /dev/sdr   8.00TB
[1:0:9:0]    disk    ATA      ST4000DM000-1F21 CC54  /dev/sds   4.00TB
[2:1:0:0]    disk    HP       LOGICAL VOLUME   6.40  /dev/sdb    146GB
[2:1:0:1]    disk    HP       LOGICAL VOLUME   6.40  /dev/sdc    146GB
[2:1:0:2]    disk    HP       LOGICAL VOLUME   6.40  /dev/sdd    146GB
[2:1:0:3]    disk    HP       LOGICAL VOLUME   6.40  /dev/sde    500GB
[2:1:0:4]    disk    HP       LOGICAL VOLUME   6.40  /dev/sdf    500GB
[2:1:0:5]    disk    HP       LOGICAL VOLUME   6.40  /dev/sdg    500GB
[2:1:0:6]    disk    HP       LOGICAL VOLUME   6.40  /dev/sdh    500GB
[2:1:0:7]    disk    HP       LOGICAL VOLUME   6.40  /dev/sdi    500GB
[3:0:0:0]    cd/dvd  hp       DVD D  DS8D3SH   HHE7  /dev/sr0   1.07GB

 

I have attached a diagnostics file from my system, taken while the parity check was running. I have no error logs to share, as they are deleted when the system restarts.

 

 

192.168.130.245-diagnostics-20230129-1757.zip

Edited by Bodine95
I deleted the wrong diag file and uploaded the correct file.
Link to comment
Jan 31 08:19:02 192 root: Fix Common Problems: Warning: Deprecated plugin ca.backup2.plg
Jan 31 08:19:10 192 root: Fix Common Problems: Error: Blacklisted plugin speedtest.plg ** Ignored
Jan 31 08:19:19 192 root: Fix Common Problems: Warning: NerdPack.plg Not Compatible with Unraid version 6.11.5

Jan 31 08:19:31 192 root: Fix Common Problems: Warning: Share media set to use pool cache-ssd, but files / folders exist on the cache-raid pool

Why haven't you fixed these? Uninstall NerdPack and install NerdTools if you really need it. That one you really should take care of.
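For anyone triaging a longer log, here is a minimal sketch of pulling these Fix Common Problems entries out and grouping them by severity. The log lines are the four quoted above; the regular expression and the function name `group_by_severity` are just illustrative assumptions about the line format, not part of Unraid or the plugin.

```python
import re

# The four syslog lines quoted in this post.
LOG_LINES = [
    "Jan 31 08:19:02 192 root: Fix Common Problems: Warning: Deprecated plugin ca.backup2.plg",
    "Jan 31 08:19:10 192 root: Fix Common Problems: Error: Blacklisted plugin speedtest.plg ** Ignored",
    "Jan 31 08:19:19 192 root: Fix Common Problems: Warning: NerdPack.plg Not Compatible with Unraid version 6.11.5",
    "Jan 31 08:19:31 192 root: Fix Common Problems: Warning: Share media set to use pool cache-ssd, "
    "but files / folders exist on the cache-raid pool",
]

# Assumed line shape: "... Fix Common Problems: <Severity>: <message>"
PATTERN = re.compile(r"Fix Common Problems: (Warning|Error): (.*)")


def group_by_severity(lines):
    """Return a dict mapping severity ("Warning"/"Error") to a list of messages."""
    grouped = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            grouped.setdefault(match.group(1), []).append(match.group(2))
    return grouped


grouped = group_by_severity(LOG_LINES)
for severity, messages in grouped.items():
    print(f"{severity} ({len(messages)}):")
    for message in messages:
        print(f"  - {message}")
```

To run it against a real file, the same function could be fed the lines of a saved syslog instead of the hard-coded list.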

 

Link to comment

If I don't start the parity check, the system runs without issues. No power issues that I can see. The unit is connected to a UPS, and it shows no issues with power in/out.
 

The issue I have is that for a good year everything worked until I started to increase the parity disks from 8TB to 14TB. The first drive worked without any issue. My issues started when I replaced the second drive. Could the second drive be bad?

Link to comment

No power issues to the drives.

 

I am trying an experiment: I removed the 14TB drive that I used to replace the first 8TB parity drive (which completed successfully). I am now using only the second 14TB drive, and the parity check has run for 13.5 hours and is at 5TB. I will keep everyone updated on whether this completes successfully.

Link to comment

Here is an update on the parity status. The largest drive with data is currently 8TB, so the check has finished reading all the data drives and is now just working through to the end of the parity drive.

 

Total size: 14 TB

Elapsed time: 20 hours, 32 minutes

Current position: 8.05 TB (57.5 %)

Estimated speed: 126.0 MB/sec

Estimated finish: 13 hours, 7 minutes
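As a sanity check on those numbers, here is a small sketch of the arithmetic behind the percentage and the estimated finish. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and that the current speed holds for the rest of the run; the function name `remaining_time` is only illustrative, and this is not necessarily how Unraid computes its estimate.

```python
# Figures copied from the status readout above.
TOTAL_TB = 14.0
POSITION_TB = 8.05
SPEED_MB_S = 126.0


def remaining_time(total_tb, position_tb, speed_mb_s):
    """Return (hours, minutes) left, assuming decimal units and constant speed."""
    remaining_bytes = (total_tb - position_tb) * 1e12
    seconds = int(round(remaining_bytes / (speed_mb_s * 1e6)))
    hours, leftover = divmod(seconds, 3600)
    return hours, leftover // 60


percent = round(POSITION_TB / TOTAL_TB * 100, 1)
hours, minutes = remaining_time(TOTAL_TB, POSITION_TB, SPEED_MB_S)

print(f"Progress: {percent} %")
print(f"Remaining: {hours} hours, {minutes} minutes")
```

Under those assumptions the result lines up with the 57.5 % and 13 hours, 7 minutes shown in the readout.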

Link to comment

Update: it failed. I came to check on it 4 hours later, and the server had rebooted several times and failed the parity check.

 

From the syslog, it looks like it rebooted three times in quick succession.

Line 58133: Feb  2 17:05:04
Line 61109: Feb  2 17:20:24
Line 64085: Feb  2 18:17:12
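To put numbers on how far apart those reboots were, here is a quick sketch that computes the gaps between the three timestamps. The year 2023 is an assumption taken from the dates in the attached diagnostics filenames, since syslog timestamps carry no year.

```python
from datetime import datetime

# Timestamps copied from the syslog lines above.
REBOOT_STAMPS = [
    "Feb  2 17:05:04",
    "Feb  2 17:20:24",
    "Feb  2 18:17:12",
]

ASSUMED_YEAR = 2023  # assumption: not present in the syslog itself


def parse_stamp(stamp, year=ASSUMED_YEAR):
    """Parse a syslog-style 'Mon DD HH:MM:SS' timestamp into a datetime."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


times = [parse_stamp(stamp) for stamp in REBOOT_STAMPS]
gaps = [later - earlier for earlier, later in zip(times, times[1:])]

for earlier, gap in zip(times, gaps):
    print(f"After {earlier:%H:%M:%S}: next reboot {gap} later")
```

The first two reboots are roughly a quarter of an hour apart and the third follows a little under an hour later.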

 

Any ideas would be greatly appreciated.

syslog_8

Link to comment

I changed my server's hardware and started the parity check again. The parity check has reached 9.37TB and is at the point where it failed the last two times. I am hopeful that it passes with the changed hardware.

 

image.png.98718d2fa98fd4f2de5785a94eb8cdb8.png

 

Here is what I am using for my server:

 

ASUSTeK COMPUTER INC. Z97M-PLUS 
American Megatrends Inc., Version 0330
BIOS dated: Thu 29 May 2014 12:00:00 AM PDT

Intel® Core™ i7-4790 CPU @ 3.60GHz

Ram: 32 GiB

PSU: 1000W 

 

I have attached the latest diag file and syslog for everyone's review.

 

 

192.168.130.245-diagnostics-20230205-0705.zip syslog_10

Link to comment
