-
Bizarre Issues After Drive Failures and Parity Rebuild
OK, sounds great. Thank you.
-
Bizarre Issues After Drive Failures and Parity Rebuild
There is a lost+found folder. Is that considered the corruption? Do I need to do something with those files?
-
Bizarre Issues After Drive Failures and Parity Rebuild
as requested . . . Phase 1 - find and verify superblock... - block cache size set to 3026936 entries Phase 2 - using internal log - zero log... zero_log: head block 316672 tail block 316672 - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 data fork in inode 2147495232 claims metadata block 268635728 correcting nextents for inode 2147495232 bad data fork in inode 2147495232 rebuilding inode 2147495232 data fork Metadata corruption detected at 0x489748, inode 0x80002d40 dinode inode 2147495232 data fork rebuild failed, error 117, clearing cleared inode 2147495232 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 6 - agno = 3 - agno = 4 - agno = 1 - agno = 2 - agno = 5 - agno = 7 entry "Bugonia.2025.1080p.AMZN.WEB-DL.DDP5.1.Atmos.H.264-FLUX.mkv" in shortform directory 2147493503 references free inode 2147495232 junking entry "Bugonia.2025.1080p.AMZN.WEB-DL.DDP5.1.Atmos.H.264-FLUX.mkv" in directory inode 2147493503 Phase 5 - rebuild AG headers and trees... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - traversing filesystem ... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - traversal finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify and correct link counts... resetting inode 66987112 nlinks from 4 to 3 resetting inode 6576475519 nlinks from 4 to 3 resetting inode 2185476466 nlinks from 8 to 7 resetting inode 8661071999 nlinks from 5 to 4 resetting inode 2187127208 nlinks from 4 to 3 resetting inode 10737964385 nlinks from 7 to 6 resetting inode 12947217367 nlinks from 5 to 4 resetting inode 15033114733 nlinks from 8 to 7 resetting inode 15033550189 nlinks from 6 to 5 resetting inode 15034299570 nlinks from 3 to 2 resetting inode 15035620842 nlinks from 5 to 4 XFS_REPAIR Summary Sun Dec 21 10:40:57 2025 Phase Start End Duration Phase 1: 12/21 10:40:49 12/21 10:40:50 1 second Phase 2: 12/21 10:40:50 12/21 10:40:50 Phase 3: 12/21 10:40:50 12/21 10:40:53 3 seconds Phase 4: 12/21 10:40:53 12/21 10:40:53 Phase 5: 12/21 10:40:53 12/21 10:40:53 Phase 6: 12/21 10:40:53 12/21 10:40:56 3 seconds Phase 7: 12/21 10:40:56 12/21 10:40:56 Total run time: 7 seconds done
-
Bizarre Issues After Drive Failures and Parity Rebuild
OK, so I decided to give the server a standard reboot and everything has returned to normal function. I can view the contents of disk 5 again without issue, and apparently there are 1.04TB of data on that disk instead of 177GB it had been showing previously. Should I assume the disk is good, or should I still attempt to correct the alleged file system corruption on that disk?
-
Bizarre Issues After Drive Failures and Parity Rebuild
Running the server in Safe Mode I ran the disk Check and received a "File system corruption detected" response. I ran the fix and the result was a dirty log, so I chose to zero the log which resulted in a response of "File system corruption fixed" - cool. I decided to run a follow up Check to verify the fix and I got a "File system corruption detected" response again. Thoughts?
-
Bizarre Issues After Drive Failures and Parity Rebuild
OK, so attempts to start the array in maintenance mode are failing: TL;DR: I'm unable to start the array in Maintenance Mode I have checked the Maintenance Mode box and clicked Start 4x. I believe this is reflected at the bottom this shot from the syslog: Each time I select Start the UNRAID working symbol (I just learned that's apparently called a "throbber"?) pops up for 2 seconds and the Main screen returns to how it was showing the array is not started. Unfortunately Disk 5 seems to agree the array isn't started: System logs are attached if needed calculon-diagnostics-20251219-1222.zip
-
Bizarre Issues After Drive Failures and Parity Rebuild
Thank you, I finished the extended SMART Test and will proceed with the filesystem check SMART test results are attached in case they provide any value calculon-smart-20251219-1115.zip
-
Bizarre Issues After Drive Failures and Parity Rebuild
Ok, Something about writing everything out must have spurred some thought because I decided to compare the cache and array contents. What I discovered is that it seems everything currently residing on the cache drive is only new since the parity rebuild. The difference across between the cache drive download share and what is contained in the download share on the array disks represent are a perfect match to what I was seeing in Krusader during reboots (2 separate shares). More concerning is that I've now noticed that while the "main" page in the UNRAID GUI shows everything is good with Disk 5, when I click the "browse" button at the far left to view the contents, it loads a blank page that simply says "Invalid path" instead of showing the disk contents (177GB worth). I'm thinking my parity rebuild did fail. Disk 5 is the original disk I manually moved the data off via Krusader before rebuilding on a new disk. I'm not sure why I believe I was able to make space on the cache drive using Mover now though, as it seems something is preventing anything from actually moving from cache to array. Perhaps items that I had been deleting were now actually represented as free space as a result of mover? Maybe it was in my head. I'm sure that's irreverent now. I'm guessing despite my array being "Healthy" that it's actually not and I will need to rebuild with a new configuration, and whatever is on Disk 5 is lost? For now I am going to run an Extended SMART test on Disk 5 and wait your direction. A quick test came back without error.
-
Bizarre Issues After Drive Failures and Parity Rebuild
This is a complicated issue, and I'm not sure where to begin, but I'll try my best to describe as briefly, but effectively, as possible. About 2 weeks ago I had 2 drives fail on my dual parity array. I first attempted to fix the perceived root cause of power delivery issues, then proceeded with a parity rebuild. After the rebuild completed (seemingly successfully) about a week ago I've been seeing bizarre behavior that got worse today (described below). Background: Because I have historically had a lot of drives becoming disabled, I decided before proceeding with the parity rebuild to move the files off one of the emulated drives that had comparatively little data, then swap out the drive and rebuild from parity. While moving this data manually from the emulated disk to a good disk via Krusader, I ran into an issue where Krusader appdata wouldn't properly move to the good disk. In retrospect this is probably because I was using Krusader and I couldn't move the app data? Some of that appdata moved, as did all the appdata from the other dockers. Anyway, I don't know if that lead to the issues I figured I should mention it. The other scary thing that happened is that I had another disk issue during the rebuild. I don't remember exactly what happened (there was a lot going on) but I believe I had a separate 3rd drive suddenly show as not present during the parity rebuild. Againn, I'm not sure but something that had me very nervous because there were now 3 drives with issues at the same time. I do recall I paused the rebuild, rebooted the server and that drive returned to normal operation, so I proceeded with the rebuild which eventually completed. Again, seemingly without issue. The current issues: So once the rebuild finished, everything seemed to be operating normally and without issue at first. At some point shortly after though I noticed operations failing because they seemed to not have permissions. First I noticed I would be unable to download something through SABNZB which was giving an error that it was unable to create the necessary folder. I also noticed that the download share in Krusader had different files than what had been in there the day before. At this point I did a server reboot and everything returned (correct files in download share via Krusader & SABNZB able to download again). It was almost like the server was recognizing a separate set of shares (or at least one for downloads). I assumed it was a glitch, but the next day I woke up to alerts that appdata backup had failed. I believe the log for that showed a similar permission issue for the backup as SABNZB had been showing. I restarted the server again and was able to manually run a successful appdata backup though. These issues persisted where I would need to restart my server at least daily. Everything would be running like it historically had, then suddenly it seemed like there was a reconfiguration (or something) and many operations were unable to function. Reboots always fixed it until today. Today I was continuing to download some files while researching potential fixes and the cache drive filled up. I had noticed this has been an issue in the last few days though I previously had been able to manually clear SOME space with Mover and keep operating. Today though, most of the space stayed full so I cancelled the downloads and I rebooted the server, but this time I couldn't get Krusader the Krusader GUI to load. All my other Dockers seem to be working still. I deleted and reinstalled the docker but it still wont load. In fact the Docker logs for Krusader wont come up either. The log window shows up for a split second and vanishes. I assume all these issues are related to some kind of corruption, but I have no idea how to proceed with troubleshooting. There is additional detail I could add but this is already way too long and I think I hit the broad strokes. Logs attached. calculon-diagnostics-20251218-1851.zip
-
Diasabled Disk and Suspected Power Issues
So I was able to swap my HBA SAS expander to a newer 12GB/s version (for what that's worth) and was able to get my drive power split down to 3 HDD per rail. It's been a little over a week and no drive issues yet - fingers crossed. Unfortunately I did have some issues with the parity rebuild, and am now having strange behavior that seems to be worsening. This issue can be closed though.So I was able to swap my HBA SAS expander to a newer 12GB/s version (for what that's worth) and was able to get my drive power split down to 3 HDD per rail. It's been a little over a week and no drive issues yet - fingers crossed. Unfortunately I did have some issues with the parity rebuild, and am now having strange behavior that seems to be worsening. This issue can be closed though.
-
Diasabled Disk and Suspected Power Issues
I did meant it, but it sounds like there may be necessary detail that is still only in my head. I'll try to provide as much as I can here: PSU: EVGA Supernova 750G+ (modular) This has 5 total (3xSATA & 2xPERIF) connectors which I understood to be identical The PSU came with 3x SATA cables which end in 3 connectors, as well as 2x molex cables which also end into 3 connectors What I had done previously was purchase Cable Matters 1-to-1 molex to SATA adapters and attach them to the 2x molex power cables (connected to the PSU PERIF connector) so I was running 5 total cables from the PSU (3 drives per cable). This is when the 1 drive would not show as present after reboot. I'm doing some digging now and it seems the PERIF connector on the PSU end is different in that in may not have the 3.3v pin (makes sense). Since I have a mix of SAS/SATA drives, it's possible the drive wasn't showing without that 3.3v pin? My current request to EVGA is though my product registration for this specific PSU, so the request is hopefully a clear attempt to get cables specific to this power supply. I had read EVGA is a good company and willing to do that. I'm not sure if that's true, but I guess I'll find out. 120-GP-0750-X1.pdf
-
Diasabled Disk and Suspected Power Issues
That's interesting. I actually had previously tried to use the molex->SATA power cables included with my power supply so that I could reduce the amount of splits. Unfortunately every time I powered on the sever one of the drives wouldn't show up (I don't recall exactly how it was represented in the UI, but I think as just not present). After 3 attempts to reseat and double check everything I gave up and went back to the 1 into 5 SATA set up I'm currently running. Obviously I'm still running into disable issues, but at least all the drives are present. Perhaps I should instead see if EVGA will send more molex cables instead.
-
Diasabled Disk and Suspected Power Issues
Oh, I guess I never noticed that. I only recently set up my alerts so that's what caught my attention. Thanks again
-
Diasabled Disk and Suspected Power Issues
OK, this is a new one to me. Yesterday I had stopped my array for a bit, then I restarted it. I believe it was at this time that the I received this alert from the server: Event: Unraid array errors Subject: Notice [CALCULON] - array turned good Description: Array has 0 disks with read errors Can you help me understand? How did the read errors just dissapear? I didn't realize that could happen. Do you have thoughts on what that could potentially mean in this case? The 2 data drives are still dissabled, but it is accurate that no read errors are displayed in the UI. calculon-diagnostics-20251202-0927.zip
-
Diasabled Disk and Suspected Power Issues
Thank you for the clarification Jorge. As always I appreciate it. I am still going to update the HBA to a modern 12Gb version as they are cheap. In addition, I will contact EVGA to see about getting 2 addition SATA cables. I do have 5 SATA/PERIF outputs on my power supply but only 3 SATA cables. This is why I'm running 3-5 drives off each cable. I will also purchase all new splitters. Hopefully getting the down to the 3 drives/PSU cable with new cabling will help.
beatmurph
Members
-
Joined
-
Last visited