cdixon

Members
  • Posts

    29

  1. I already rebooted. What can I do to figure out why this happened?
  2. This morning I checked Unraid and noticed that one of my hard drives had been disabled due to 1 read error. After noticing this, I restarted the server, put the array in maintenance mode, and ran a short SMART test. The self-test turned out fine and disk 6 is now being rebuilt from parity. Why did the drive get disabled even though it appears to be healthy? (I included the test results for disk 6 in this post.) Thanks in advance! ST8000AS0002-1NA17Z_Z840Q09L-20210902-1243.txt
  3. tower-diagnostics-20210226-1214.zip I have had my Unraid server up and running for 3 1/2 years, and the other day I ran extended SMART self-tests on all of the drives in my system. I took a look at the raw read error rates for the drives, and the raw values seem like they might be too high, even though the drives passed the health test. Are my drives failing? And considering the test results, at what point should I expect them to fail?
  4. I have an Unraid server that I have had up and running for 3 years. Recently the server has been freezing up frequently, and about once a week I have to reboot it in order to access it. What should I do? I posted the server diagnostics for you guys to look over. Thank you in advance. tower-diagnostics-20201212-2142.zip
  5. Here you go. tower-diagnostics-20200914-1129.zip
  6. I have had an issue with my Unraid server recently where the array occasionally freezes up and I have to reboot the server to get things running normally. Yesterday the array froze up and, of course, I rebooted the server. After the reboot, disk 1 appeared as missing and the array was set to stale configuration. What should I do? syslog.txt
  7. That's the problem. In the settings, I have Error logging set to Syslog and Output file but I am still unable to find the corrupt files with this option.
  8. I ran build and export on one of the disks in my server. During the export process, many files were skipped and many corrupt files were found. I am unable to locate any corrupt files because no log was posted after the export process. After this, I ran verification on disk 4. The process ran for 31 hours and no log was posted. How can I locate the corrupt files?
  9. But why did the verification process take so long and why am I unable to find the corrupt files?
  10. Here you go. ST8000DM004-2CX188_ZCT0KKRJ-20200810-0427.txt
  11. I have been using Dynamix File Integrity to scan for corrupt files on my Unraid server. This week I ran build and export on disk 4 and a problem came up. During the export process, many files were skipped and many corrupt files were found. However, I was unable to find out which files were corrupt, so I ran a verification task on disk 4 afterwards. The verification took 31 hours (the usual for me is 8 hours) and for some reason no log file was created for disk 4. I figured the drive might be failing, so I ran an extended SMART test on disk 4 and got the following results:

      #    Attribute name            Flag    Value  Worst  Threshold  Type      Updated  Failed  Raw value
      1    Raw read error rate       0x000f  100    064    006        Pre-fail  Always   Never   1882704
      3    Spin up time              0x0003  092    092    000        Pre-fail  Always   Never   0
      4    Start stop count          0x0032  099    099    020        Old age   Always   Never   1239
      5    Reallocated sector count  0x0033  100    100    010        Pre-fail  Always   Never   0
      7    Seek error rate           0x000f  082    060    045        Pre-fail  Always   Never   171924857
      9    Power on hours            0x0032  088    088    000        Old age   Always   Never   10530 (142 161 0)
      10   Spin retry count          0x0013  100    100    097        Pre-fail  Always   Never   0
      12   Power cycle count         0x0032  100    100    020        Old age   Always   Never   122
      183  Runtime bad block         0x0032  100    100    000        Old age   Always   Never   0
      184  End-to-end error          0x0032  100    100    099        Old age   Always   Never   0
      187  Reported uncorrect        0x0032  100    100    000        Old age   Always   Never   0
      188  Command timeout           0x0032  100    100    000        Old age   Always   Never   0
      189  High fly writes           0x003a  100    100    000        Old age   Always   Never   0
      190  Airflow temperature cel   0x0022  065    059    040        Old age   Always   Never   35 (min/max 26/41)
      191  G-sense error rate        0x0032  100    100    000        Old age   Always   Never   0
      192  Power-off retract count   0x0032  100    100    000        Old age   Always   Never   50
      193  Load cycle count          0x0032  098    098    000        Old age   Always   Never   5418
      194  Temperature celsius       0x0022  035    041    000        Old age   Always   Never   35 (0 23 0 0 0)
      195  Hardware ECC recovered    0x001a  100    064    000        Old age   Always   Never   1882704
      197  Current pending sector    0x0012  100    100    000        Old age   Always   Never   0
      198  Offline uncorrectable     0x0010  100    100    000        Old age   Offline  Never   0
      199  UDMA CRC error count      0x003e  200    200    000        Old age   Always   Never   0
      240  Head flying hours         0x0000  100    253    000        Old age   Offline  Never   3133 (78 218 0)
      241  Total lbas written        0x0000  100    253    000        Old age   Offline  Never   116040943665
      242  Total lbas read           0x0000  100    253    000        Old age   Offline  Never   314938833407

      Is this drive failing? And how can I find out which files are corrupt?
  12. I have an Unraid server that I have had up and running for over three years. A couple of weeks ago, I attempted to run a parity sync, but one of the drives (Drive 4) had read errors during the process. I ran a test on the drive and it appears to be healthy. I deleted all of the corrupt files that I found on the drive and ran the parity sync again, but the drive still has read problems during the process. I have tried everything I can think of. What should I do to set things back to normal? tower-diagnostics-20200529-1310.zip
  13. I recently ran a Dynamix File Integrity test on a drive that has read errors. According to the log, there are over 90 MKV files with a SHA256 hash key mismatch, but even though the log states that the files are corrupt, they play perfectly. I used a hash file manager to rewrite the hash keys for the files, but this did not work. How can I fix these files?
  15. I have a 40 TB Unraid server set up at home that I use as a media server. The server has five drives used to store the video files and two parity drives. In the three years that I have had the server set up, disk 2 has had some errors now and then, but has never shown any signs of failure or impending failure. The other day I deleted a file from the server over the network and now disk 2 is unmountable. I tried unplugging the SATA cables and plugging them back in again, but that did not work. I then ran a SMART short self-test on disk 2 and these are the results:

      #    Attribute name            Flag    Value  Worst  Threshold  Type      Updated  Failed  Raw value
      1    Raw read error rate       0x000f  117    099    006        Pre-fail  Always   Never   137499552
      3    Spin up time              0x0003  091    090    000        Pre-fail  Always   Never   0
      4    Start stop count          0x0032  097    097    020        Old age   Always   Never   3638
      5    Reallocated sector count  0x0033  100    100    010        Pre-fail  Always   Never   0
      7    Seek error rate           0x000f  084    060    030        Pre-fail  Always   Never   4612661310
      9    Power on hours            0x0032  074    074    000        Old age   Always   Never   22887 (2y, 7m, 10d, 15h)
      10   Spin retry count          0x0013  100    100    097        Pre-fail  Always   Never   0
      12   Power cycle count         0x0032  100    100    020        Old age   Always   Never   204
      183  Runtime bad block         0x0032  100    100    000        Old age   Always   Never   0
      184  End-to-end error          0x0032  100    100    099        Old age   Always   Never   0
      187  Reported uncorrect        0x0032  100    100    000        Old age   Always   Never   0
      188  Command timeout           0x0032  100    099    000        Old age   Always   Never   25770196998
      189  High fly writes           0x003a  100    100    000        Old age   Always   Never   0
      190  Airflow temperature cel   0x0022  071    062    045        Old age   Always   Never   29 (min/max 29/30)
      191  G-sense error rate        0x0032  100    100    000        Old age   Always   Never   0
      192  Power-off retract count   0x0032  081    081    000        Old age   Always   Never   39551
      193  Load cycle count          0x0032  071    071    000        Old age   Always   Never   59687
      194  Temperature celsius       0x0022  029    040    000        Old age   Always   Never   29 (0 16 0 0 0)
      195  Hardware ECC recovered    0x001a  117    099    000        Old age   Always   Never   137499552
      197  Current pending sector    0x0012  100    100    000        Old age   Always   Never   0
      198  Offline uncorrectable     0x0010  100    100    000        Old age   Offline  Never   0
      199  UDMA CRC error count      0x003e  200    200    000        Old age   Always   Never   0
      240  Head flying hours         0x0000  100    253    000        Old age   Offline  Never   5054 (115 152 0)
      241  Total lbas written        0x0000  100    253    000        Old age   Offline  Never   231932062510
      242  Total lbas read           0x0000  100    253    000        Old age   Offline  Never   539967992669

      After this I ran an XFS repair with the -nv option and this is what came up:

      Phase 1 - find and verify superblock...
              - block cache size set to 2960240 entries
      Phase 2 - using internal log
              - zero log...
      zero_log: head block 245530 tail block 245514
              - scan filesystem freespace and inode maps...
      ir_freecount/free mismatch, inode chunk 7/59975168, freecount 31 nfree 7
      agi_freecount 162, counted 193 in ag 7
      agi_freecount 136, counted 137 in ag 6
      agi unlinked bucket 58 is 50974650 in ag 6 (inode=12935876538)
      ir_freecount/free mismatch, inode chunk 5/4960608, freecount 63 nfree 1
      agi_freecount 150, counted 213 in ag 5
      sb_ifree 1150, counted 1246
      sb_fdblocks 24503680, counted 26690340
              - found root inode chunk
      Phase 3 - for each AG...
              - scan (but don't clear) agi unlinked lists...
              - process known inodes and perform inode discovery...
              - agno = 0
              - agno = 1
              - agno = 2
              - agno = 3
              - agno = 4
              - agno = 5
      imap claims in-use inode 10742378880 is free, correcting imap
              - agno = 6
      imap claims in-use inode 12935876540 is free, correcting imap
              - agno = 7
      imap claims in-use inode 15092360749 is free, correcting imap
      imap claims in-use inode 15092360750 is free, correcting imap
      imap claims in-use inode 15092360751 is free, correcting imap
      imap claims in-use inode 15092360752 is free, correcting imap
      imap claims in-use inode 15092360753 is free, correcting imap
      imap claims in-use inode 15092360754 is free, correcting imap
      imap claims in-use inode 15092360755 is free, correcting imap
              - process newly discovered inodes...
      Phase 4 - check for duplicate blocks...
              - setting up duplicate extent list...
              - check for inodes claiming duplicate blocks...
              - agno = 0
              - agno = 2
              - agno = 4
              - agno = 1
              - agno = 5
              - agno = 3
              - agno = 7
              - agno = 6
      No modify flag set, skipping phase 5
      Inode allocation btrees are too corrupted, skipping phases 6 and 7
      No modify flag set, skipping filesystem flush and exiting.

              XFS_REPAIR Summary    Sun Mar 15 17:01:16 2020

      Phase      Start           End             Duration
      Phase 1:   03/15 17:01:12  03/15 17:01:12
      Phase 2:   03/15 17:01:12  03/15 17:01:12
      Phase 3:   03/15 17:01:12  03/15 17:01:16  4 seconds
      Phase 4:   03/15 17:01:16  03/15 17:01:16
      Phase 5:   Skipped
      Phase 6:   Skipped
      Phase 7:   Skipped

      Total run time: 4 seconds

      What should I do to set everything back to normal?
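
A side note on the very large raw values quoted in the SMART results above (raw read error rate, seek error rate, command timeout): on Seagate drives these are commonly reported to be packed counters, not plain error counts. The sketch below assumes that layout, which the posts themselves do not confirm: a 48-bit raw value whose upper 16 bits count errors and whose lower 32 bits count operations for attributes 1 and 7, and three 16-bit words for attribute 188. The function names are made up for illustration.

```python
from __future__ import annotations


def split_error_rate(raw: int) -> tuple[int, int]:
    """Split a 48-bit raw value into (errors, operations).

    Assumption: upper 16 bits = error count, lower 32 bits = operation
    count, as commonly reported for Seagate attributes 1 and 7.
    """
    errors = (raw >> 32) & 0xFFFF
    operations = raw & 0xFFFFFFFF
    return errors, operations


def split_words(raw: int) -> tuple[int, int, int]:
    """Split a 48-bit raw value into three 16-bit words, high to low.

    Assumption: this is how attribute 188 (command timeout) is packed.
    """
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF


if __name__ == "__main__":
    # Raw values copied from the disk 2 and disk 4 results quoted above.
    print(split_error_rate(137499552))   # disk 2, raw read error rate
    print(split_error_rate(4612661310))  # disk 2, seek error rate
    print(split_error_rate(1882704))     # disk 4, raw read error rate
    print(split_words(25770196998))      # disk 2, command timeout
```

If that layout holds, both raw read error rates decode to 0 errors, disk 2's seek error rate decodes to 1 error in 317,694,014 seeks, and disk 2's command timeout decodes to three words of 6 each. This only explains why the raw numbers look alarming; it does not by itself say whether either drive is healthy.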