Power outage, multiple problems

Hi all. Just suffered a power outage at my house, no UPS. (Learning the hard way... one has since been ordered)

 

When the server came back up I found the following:

1. Disk 1 was missing (wouldn't even show up under unassigned devices)

2. Cache pool showed uncorrectable errors during a scrub

 

Strangely, even though the server had uncleanly shut down, there was no parity check. Maybe because Disk 1 was missing?

 

Anyway, I started the parity check, and when I checked back in after 20 minutes Disk 1 was sitting in the Unassigned Devices (UD) section. I guess it needed time to wake back up after the power cut. All the SMART attributes looked fine, but to make sure it won't die on me during the rebuild I ran a short SMART test (passed) and a long SMART test (still running, 20% left at this point).

 

Now since the server is configured with dual parity I'm not worried about data loss on the array. What I am worried about is the cache pool, since that holds the appdata. I do keep daily backups with the backup plugin, so the worst case is that I lose less than a day's worth of changes. But I wanted to see if I can just keep using the current data. However, the scrub output won't tell me exactly which files are affected.

 

In the logs I see the following:

 

[42493.446568] BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 32, gen 0
[42493.446579] BTRFS error (device dm-3): unable to fixup (regular) error at logical 82579787776 on dev /dev/mapper/nvme0n1p1
[42493.929861] BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 33, gen 0
[42493.929871] BTRFS error (device dm-3): unable to fixup (regular) error at logical 85866598400 on dev /dev/mapper/nvme0n1p1
[42496.080932] BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 34, gen 0
[42496.080942] BTRFS error (device dm-3): unable to fixup (regular) error at logical 100984004608 on dev /dev/mapper/nvme0n1p1
[42501.552701] BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 35, gen 0
[42501.552709] BTRFS error (device dm-3): unable to fixup (regular) error at logical 136209657856 on dev /dev/mapper/nvme0n1p1


But when I run `btrfs inspect-internal logical-resolve <logical_address> /mnt/cache` it returns nothing for all four affected addresses.
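
In case it helps anyone doing the same triage, here is a minimal Python sketch that pulls the logical addresses out of log lines like the ones above and prints one `logical-resolve` command per address. The function names and the `SAMPLE` text are my own for illustration, and the mount point `/mnt/cache` is taken from the command above. It only builds the command strings and does not run anything.

```python
import re

# Matches the "unable to fixup" lines from the scrub output quoted above.
LOGICAL_RE = re.compile(
    r"unable to fixup \(regular\) error at logical (\d+) on dev (\S+)"
)


def extract_logical_addresses(log_text):
    """Return (logical_address, device) tuples in order of appearance."""
    return [(int(m.group(1)), m.group(2)) for m in LOGICAL_RE.finditer(log_text)]


def resolve_commands(log_text, mount_point="/mnt/cache"):
    """Build one logical-resolve command string per unique logical address."""
    unique = []
    for address, _device in extract_logical_addresses(log_text):
        if address not in unique:
            unique.append(address)
    return [
        f"btrfs inspect-internal logical-resolve {address} {mount_point}"
        for address in unique
    ]


# Log lines copied from the post; replace with your own syslog/dmesg text.
SAMPLE = """\
[42493.446568] BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 32, gen 0
[42493.446579] BTRFS error (device dm-3): unable to fixup (regular) error at logical 82579787776 on dev /dev/mapper/nvme0n1p1
[42493.929871] BTRFS error (device dm-3): unable to fixup (regular) error at logical 85866598400 on dev /dev/mapper/nvme0n1p1
[42496.080942] BTRFS error (device dm-3): unable to fixup (regular) error at logical 100984004608 on dev /dev/mapper/nvme0n1p1
[42501.552709] BTRFS error (device dm-3): unable to fixup (regular) error at logical 136209657856 on dev /dev/mapper/nvme0n1p1
"""

if __name__ == "__main__":
    for command in resolve_commands(SAMPLE):
        print(command)
```

Copy the printed commands into a terminal on the server one at a time, so you can see which addresses resolve to a file path and which come back empty.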

 

BTRFS check (read-only) reports that there are no errors.

 

I stopped the Docker service to prevent further damage (before all the checking and triage, so about 15-20 minutes after the server came up) and took a backup (again, with the uncorrectable errors present, so I'm guessing some files in there are corrupt?)

 

At this point this is my plan for the day:

1. Wait for the parity check to complete (hopefully without any errors; if there are any, that's another 14 hours' worth of parity check with write corrections turned on. Is there a way to just fix the errors without rerunning the entire thing?)

2. Stop the array

3. Wipe cache pool (how? Can't find the format button... does it show up when the array is stopped?)

4. Make sure long SMART for dev1 passes

5. Reassign dev1 to Disk 1

6. Start array

7. While rebuilding, restore appdata backup

 

Is the plan okay or should I do something else? Anything I've missed? Feedback would be appreciated!

 

Attached diagnostics.

dipper-diagnostics-20230326-1022.zip

  • Author

The parity check completed without errors, so I stopped the array to begin maintenance.

 

Wiping the cache pool was relatively simple (I was definitely overthinking things). For future reference: unassign the SSD from the pool, run `blkdiscard -f /dev/nvmeXnY` (substituting your own drive's device name; this discards everything on the drive), then reassign it to the pool and erase.
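
Since `blkdiscard` is destructive, a small guard around the command can help avoid pointing it at the wrong device. This is only a sketch: the helper name and the device-name pattern are my own assumptions (it accepts whole NVMe namespaces such as `/dev/nvme0n1` and rejects partitions or anything else), and `/dev/nvme0n1` in the example is a placeholder. It builds the command but does not execute it.

```python
import re
import shlex

# Assumed pattern for a whole NVMe namespace device, e.g. /dev/nvme0n1.
# Partitions (/dev/nvme0n1p1) and other device types are rejected on purpose.
NVME_NAMESPACE_RE = re.compile(r"/dev/nvme\d+n\d+")


def build_blkdiscard_command(device):
    """Return the argument list for the wipe command from the post.

    Raises ValueError if the device does not look like an NVMe namespace.
    """
    if not NVME_NAMESPACE_RE.fullmatch(device):
        raise ValueError(f"refusing to build a wipe command for {device!r}")
    return ["blkdiscard", "-f", device]


if __name__ == "__main__":
    # Placeholder device name; check yours before running the printed command.
    print(shlex.join(build_blkdiscard_command("/dev/nvme0n1")))
```

Printing the command instead of running it leaves a manual checkpoint before anything is wiped.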

 

Once I verified that the long SMART passed for dev1 I re-assigned the drive back to Disk 1 and started the array. Then I rebuilt the Docker images and started restoring the appdata.

 

Hopefully this helps someone else who has the same issue!

  • Author

I also had to change configurations of some Docker apps, because the underlying Docker network changed from 172.18.0.x to 172.17.0.x. Something to keep in mind if some services are not resolving after the rebuild.
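
If you want to hunt down leftover addresses from the old network, something like the following Python sketch could help. It is an illustration only: the function name and the sample config text are made up, and the prefix `172.18.0.` is simply the old range I saw on my server. It scans text you give it and changes nothing.

```python
import re

# Old network prefix from my setup; yours may differ.
OLD_PREFIX = "172.18.0."


def find_stale_addresses(config_text, old_prefix=OLD_PREFIX):
    """Return every IPv4 address in config_text that starts with old_prefix."""
    pattern = re.compile(
        r"(?<![\d.])"            # not in the middle of a longer dotted number
        + re.escape(old_prefix)  # the old network prefix, dots escaped
        + r"\d{1,3}"             # final octet
        + r"(?![\d.])"           # address ends here
    )
    return pattern.findall(config_text)


if __name__ == "__main__":
    # Hypothetical config snippet for illustration.
    sample = "proxy_pass http://172.18.0.5:8080;\nupstream db 172.17.0.2;\n"
    for address in find_stale_addresses(sample):
        print("still points at the old network:", address)
```

Run it over the config text of each app that fails to connect, then update any addresses it reports by hand.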
