Bodine95 Posted January 31, 2023

I have been running my Unraid system for a few years and am moving my two parity drives from 8TB to 14TB drives. Here is what happened: I received an error that one of my drives failed. I installed the new 14TB replacement, and the parity check started and completed nearly 3 days later. I then had one 8TB and one 14TB parity drive. I then removed the 8TB drive and replaced it with the other 14TB drive, and now at about 30 minutes the system reboots and starts with the array stopped. I have tried booting in safe mode with no plugins and starting the array and parity check. This still causes the system to reboot at about 30 minutes. I have tried moving the new parity drive to a different SATA port and the reboot still happens. I removed my NVIDIA card and the NVIDIA driver and the reboot still occurs. The system is stable if I start the array and cancel the parity check. It just shows that I have only one working parity drive and one disabled parity drive. Thanks in advance for helping me solve this issue.
I am running Unraid version 6.11.5. Here is my system hardware:

HP G7 DL580
4 socket Xeon CPU E7-8837 @ 2.67GHz (32 core)
64 GB DDR3 ECC memory
BIOS P65

Drives:

[0:0:0:0] disk SanDisk Ultra Fit 1.00 /dev/sda 30.7GB
[1:0:0:0] disk ATA ST8000VN0022-2EL SC61 /dev/sdj 8.00TB
[1:0:1:0] disk ATA ST14000VN0008-2K SC61 /dev/sdk 14.0TB
[1:0:2:0] disk ATA ST8000VN0022-2EL SC61 /dev/sdl 8.00TB
[1:0:3:0] disk ATA ST8000VN0022-2EL SC61 /dev/sdm 8.00TB
[1:0:4:0] disk ATA ST14000VN0008-2K SC61 /dev/sdn 14.0TB
[1:0:5:0] disk ATA WDC WD40EFRX-68W 0A80 /dev/sdo 4.00TB
[1:0:6:0] disk ATA Samsung SSD 850 2B6Q /dev/sdp 1.00TB
[1:0:7:0] disk ATA WDC WD40EFRX-68W 0A80 /dev/sdq 4.00TB
[1:0:8:0] disk ATA ST8000VN004-2M21 SC60 /dev/sdr 8.00TB
[1:0:9:0] disk ATA ST4000DM000-1F21 CC54 /dev/sds 4.00TB
[2:1:0:0] disk HP LOGICAL VOLUME 6.40 /dev/sdb 146GB
[2:1:0:1] disk HP LOGICAL VOLUME 6.40 /dev/sdc 146GB
[2:1:0:2] disk HP LOGICAL VOLUME 6.40 /dev/sdd 146GB
[2:1:0:3] disk HP LOGICAL VOLUME 6.40 /dev/sde 500GB
[2:1:0:4] disk HP LOGICAL VOLUME 6.40 /dev/sdf 500GB
[2:1:0:5] disk HP LOGICAL VOLUME 6.40 /dev/sdg 500GB
[2:1:0:6] disk HP LOGICAL VOLUME 6.40 /dev/sdh 500GB
[2:1:0:7] disk HP LOGICAL VOLUME 6.40 /dev/sdi 500GB
[3:0:0:0] cd/dvd hp DVD D DS8D3SH HHE7 /dev/sr0 1.07GB

I have attached the diagnostics file of my system, taken while my parity check was running. I have no error logs to share as they are deleted upon the system restart.

192.168.130.245-diagnostics-20230129-1757.zip

Edited January 31, 2023 by Bodine95: I deleted the wrong diag file and uploaded the correct file.
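For readers following along, the drive listing above can be grouped by the first number in the bracketed address, which is the SCSI host (controller) each device sits behind. The following is only a rough sketch: it assumes the listing has one device per line in the layout shown in the post (similar to what a tool like lsscsi prints with sizes), and the `group_by_host` helper and its regular expression are illustrative names, not part of Unraid.

```python
import re
from collections import defaultdict

# Assumed layout per line: [host:channel:target:lun] type description /dev/node size
LSSCSI_LINE = re.compile(
    r"^\[(\d+):\d+:\d+:\d+\]\s+(\S+)\s+(.+?)\s+(/dev/\S+)\s+(\S+)$"
)

def group_by_host(lines):
    """Group (node, description, size) tuples by SCSI host number."""
    hosts = defaultdict(list)
    for line in lines:
        match = LSSCSI_LINE.match(line.strip())
        if match:  # lines that do not fit the assumed layout are skipped
            host, _dev_type, description, node, size = match.groups()
            hosts[int(host)].append((node, description, size))
    return dict(hosts)

# Two lines copied from the post above
sample = [
    "[1:0:0:0] disk ATA ST8000VN0022-2EL SC61 /dev/sdj 8.00TB",
    "[2:1:0:0] disk HP LOGICAL VOLUME 6.40 /dev/sdb 146GB",
]

for host, devices in sorted(group_by_host(sample).items()):
    print(f"host {host}:")
    for node, description, size in devices:
        print(f"  {node}  {description}  {size}")
```

Grouped this way, the devices reported as "HP LOGICAL VOLUME" all appear under host 2, separate from the ATA disks under host 1.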
trurl Posted January 31, 2023

Those diagnostics are 2 years old and don't correspond to the descriptions in your post. Try again.

1 hour ago, Bodine95 said:
no error logs to share as they are deleted upon the system restart.

Set up the syslog server.
Bodine95 Posted January 31, 2023

I have enabled syslog logging to the flash drive and started the parity check. I will upload the log once the system crashes and reboots. I have also replaced the diag file with the correct, current one.

192.168.130.245-diagnostics-20230129-1757.zip
trurl Posted January 31, 2023

RAID controllers are NOT recommended for many reasons.
Bodine95 Posted January 31, 2023

Well, after 6 hours and 57 minutes, at 1.42TB, my system crashed. Here is the syslog from the flash drive.

syslog
trurl Posted February 1, 2023

Jan 31 08:19:02 192 root: Fix Common Problems: Warning: Deprecated plugin ca.backup2.plg
Jan 31 08:19:10 192 root: Fix Common Problems: Error: Blacklisted plugin speedtest.plg ** Ignored
Jan 31 08:19:19 192 root: Fix Common Problems: Warning: NerdPack.plg Not Compatible with Unraid version 6.11.5
Jan 31 08:19:31 192 root: Fix Common Problems: Warning: Share media set to use pool cache-ssd, but files / folders exist on the cache-raid pool

Why haven't you fixed these?

Uninstall NerdPack and install NerdTools if you really need it. That one you really should take care of.
Bodine95 Posted February 1, 2023

I have done a lot of work on my server. I have removed all unsupported plugins, and I have removed the RAID controller along with the drives on that controller that were used in Unraid. I have started the parity check and will see if this solves my issues.
Bodine95 Posted February 1, 2023

The system crashed after a few hours of running the parity check. Here is the latest diag file and syslog file.

192.168.130.245-diagnostics-20230201-1302.zip
syslog_6
trurl Posted February 1, 2023

2 hours ago, Bodine95 said:
removed the Raid controller

03:00.0 RAID bus controller [0104]: Hewlett-Packard Company Smart Array G6 controllers [103c:323a] (rev 01)
Subsystem: Hewlett-Packard Company Smart Array P410i [103c:3245]
Kernel driver in use: hpsa
Kernel modules: hpsa
trurl Posted February 1, 2023

Before you complicate things by removing that one: are you sure you don't have a power problem? If you boot up without starting the array, does it still crash?
Bodine95 Posted February 2, 2023

If I don't start the parity check, it runs without issues. No power issues that I can see. The unit is connected to a UPS, and it shows no issues with power in/out. The issue I have is that for a good year everything worked, until I started to increase the parity disks from 8TB to 14TB. The first drive worked without any issue. My issues started when I replaced the second drive. Could the second drive be bad?
trurl Posted February 2, 2023

12 hours ago, Bodine95 said:
No power issues that I can see. The unit is connected to a UPS and it show no issues with power in/out.

I didn't mean power to the server, I meant power inside the server, especially to the disks.
Bodine95 Posted February 2, 2023

No power issues to the drives. I am trying an experiment: I removed the 14TB drive that I used to replace the first 8TB parity drive (which completed successfully). I am now using the second 14TB drive, and the parity check has run for 13.5 hours and is at 5TB. I will keep everyone updated on whether this completes successfully or not.
Bodine95 Posted February 2, 2023

Here is an update on the parity status. The current largest drive with data is 8TB; the check has finished reading all the other data drives and is just moving to the end of the parity drive.

Total size: 14 TB
Elapsed time: 20 hours, 32 minutes
Current position: 8.05 TB (57.5 %)
Estimated speed: 126.0 MB/sec
Estimated finish: 13 hours, 7 minutes
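As a quick sanity check on the status figures above, the percentage and the estimated finish can be recomputed from the total size, the current position, and the reported speed. This is only a sketch, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant speed for the rest of the run; `parity_progress` is an illustrative helper, not an Unraid function.

```python
def parity_progress(total_tb, position_tb, speed_mb_s):
    """Return (percent_done, hours_remaining), assuming decimal TB/MB and constant speed."""
    percent = 100.0 * position_tb / total_tb
    remaining_bytes = (total_tb - position_tb) * 10**12
    seconds = remaining_bytes / (speed_mb_s * 10**6)
    return percent, seconds / 3600.0

# Figures taken from the status post above
percent, hours = parity_progress(total_tb=14.0, position_tb=8.05, speed_mb_s=126.0)
whole_hours = int(hours)
minutes = int((hours - whole_hours) * 60)
print(f"{percent:.1f} % done, about {whole_hours} h {minutes} min remaining")
```

Under those assumptions the numbers agree with the reported 57.5 % and the estimate of roughly 13 hours remaining, so the status display itself looks consistent.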
Bodine95 Posted February 3, 2023

Update: it failed. I came to check on it 4 hours later, and the server had rebooted several times and failed the parity check. From the syslog, it looks like it rebooted three times in quick succession.

Line 58133: Feb 2 17:05:04
Line 61109: Feb 2 17:20:24
Line 64085: Feb 2 18:17:12

Any ideas would be greatly appreciated.

syslog_8
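A list of reboot points like the one above can be pulled out of a mirrored syslog with a short script. The sketch below assumes that each boot begins with a kernel banner line containing "Linux version" and that the file uses the traditional 15-character syslog timestamp; both are assumptions about the log format, and the sample lines are made up for illustration rather than taken from the attached files.

```python
import re

# Assumption: each boot shows up as a kernel banner line containing "Linux version".
BOOT_MARKER = re.compile(r"kernel: .*Linux version")

def find_boots(lines):
    """Return (line_number, timestamp) for each line that looks like a boot banner."""
    boots = []
    for number, line in enumerate(lines, start=1):
        if BOOT_MARKER.search(line):
            # Traditional syslog timestamps ("Mmm dd hh:mm:ss") fill the first 15 characters
            boots.append((number, line[:15]))
    return boots

# Made-up sample lines, not copied from the attached syslog files
sample = [
    "Feb  2 17:04:58 192 emhttpd: illustrative message before the crash",
    "Feb  2 17:05:04 192 kernel: Linux version x.y.z (illustrative banner)",
    "Feb  2 17:05:04 192 kernel: Command line: (illustrative)",
    "Feb  2 17:20:24 192 kernel: Linux version x.y.z (illustrative banner)",
]

for number, stamp in find_boots(sample):
    print(f"Line {number}: {stamp}")
```

To run it against a real log, pass an open file instead of the sample list, for example `find_boots(open("syslog_8", errors="replace"))`. The lines just before each banner are the ones worth reading, since they are the last thing written before the reboot.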
JorgeB Posted February 3, 2023

Unfortunately there's nothing relevant logged, which suggests a hardware problem. Start by using a different PSU, if available.
trurl Posted February 3, 2023

21 hours ago, Bodine95 said:
No power issues to the drives.

How do you know? How many drives per power cable? Splitters?
Bodine95 Posted February 6, 2023

I changed the hardware of my server and started the parity check again. The parity check has reached 9.37TB and is at the point where it failed the last two times. I am hopeful that it passes with the changed hardware. Here is what I am using for my server:

ASUSTeK COMPUTER INC. Z97M-PLUS
American Megatrends Inc., Version 0330
BIOS dated: Thu 29 May 2014 12:00:00 AM PDT
Intel® Core™ i7-4790 CPU @ 3.60GHz
RAM: 32 GiB
PSU: 1000W

I have attached the latest diag file and syslog for everyone's review.

192.168.130.245-diagnostics-20230205-0705.zip
syslog_10
Bodine95 Posted February 6, 2023 (Solution)

Great news: with the different hardware my parity check finished. Thank you to trurl and JorgeB for giving me great advice on what could be wrong.