Naustradamus

Posts posted by Naustradamus

  1. 20 hours ago, JorgeB said:

    SMART looks OK and this is usually a power/connection problem:

     

    Apr 19 01:06:35 Oracle kernel: sd 11:0:11:0: Power-on or device reset occurred

     

    Replace cables and if the emulated disk11 contents look correct you can rebuild on top.

    Hi, sorry for the delay in responding. I saw your reply while at work and was busy after that. Thanks for the advice! I've replaced the cables and the disk has rebuilt too. I hope it's not the backplane, though: the other 3 drives on it weren't affected, but this was the one farthest from where the power cable plugs in.
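
In case it helps anyone checking whether the resets come back after a cable swap, here is a rough sketch that counts the reset messages per device address in a syslog file. The log path is an assumption on my part (I believe a running server logs to /var/log/syslog), and `count_resets` is just a name I made up.

```shell
#!/bin/bash
# Count "Power-on or device reset occurred" kernel messages per SCSI address.
# count_resets is a made-up helper name; pass it the path to a syslog file.
count_resets() {
  grep 'Power-on or device reset occurred' "$1" \
    | sed -E 's/.*kernel: sd ([0-9:]+): .*/\1/' \
    | sort | uniq -c
}

# Assumed location of the live log; pass the syslog extracted from a
# diagnostics zip as the first argument if the server has been rebooted since.
SYSLOG="${1:-/var/log/syslog}"
if [ -r "$SYSLOG" ]; then
  count_resets "$SYSLOG"
fi
```

Each output line is a count followed by the address, so a device that keeps resetting after the cable swap should stand out.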

     

    Again, very much appreciated!

  2. Hi,

     

    Looking for help/second opinion please.

     

    Last night, an HDD got a read error and was disabled by Unraid, so it is now being emulated. I started an extended SMART test at the time. I'm not able to do much more yet, because this happened during a parity drive rebuild (I changed drives to increase parity capacity).

     

    I shut down all dockers and services for now. The extended SMART test came back with no errors. From what I can understand of the results, I'm planning on restarting the server once the parity is done rebuilding, and keeping an eye on it going forward. That said, I don't understand all aspects of the results, so I thought I'd ask here in case there are other things that indicate it is indeed failing.
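
For reference, this is the console equivalent of the SMART test as far as I understand it. It is only a sketch: `/dev/sdX` is a placeholder and not my actual device, `smart_check` and `run` are names I made up, and by default the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# smart_check is a hypothetical helper; pass the real device as the argument.
smart_check() {
  dev="$1"
  run smartctl -t long "$dev"       # start the extended self-test
  run smartctl -l selftest "$dev"   # self-test log, to read once the test is done
  run smartctl -A "$dev"            # vendor attribute table
}

smart_check "${1:-/dev/sdX}"
```

From what I've read, the attributes people usually look at first are reallocated sectors (5), current pending sectors (197), offline uncorrectable (198) and the UDMA CRC error count (199), the last of which tends to point at cabling rather than the disk itself.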

     

    So below are both the diagnostics made now (sorry, I didn't think of creating one last night when it happened) and the extended SMART test for the drive specifically, in case it's not in the diagnostics.

     

    Any and all advice and help is appreciated!

    oracle-diagnostics-20230419-1411.zip oracle-smart-20230419-1411.zip

  3. Thanks!

     

    Dang, looks like I might be out of luck, cause yeah, error happens again once I start the container.

    To be fair, I don't think the container was the problem, just that due to the setup listed above, it was the only container that accessed the SSD.

     

    For the moment, I'll move the files onto my main cache drive and run the container from there. If that works, I can try to re-format the NVMe and see where it goes from there. Not the vacation I was expecting, oh well. It's only been a little over a year, so it's still under warranty if needed.

     

    Thanks again. In case you notice anything further, I'm adding one more diagnostic, taken after the container runs and causes the issue.

    oracle-diagnostics-20230302-1010.zip

  4. Thanks for replying!
    Sorry for the delay, I had gone to bed.

     

    Posted below. I restarted and kept dockers off, as the error seems to happen after a certain docker starts and attempts to use the NVMe.

     

    For further info, this NVMe is dedicated to Plex. I was trying to get the thumbnails, scroll previews, metadata and so on onto an SSD to make navigation very responsive. Not sure if that info helps, but yeah. I didn't see any errors in the new diagnostics, but most likely I don't know what I'd be looking for.

     

    Thanks again for the support.

    oracle-diagnostics-20230302-0928.zip

  5. So as I've been looking into it, I found this post linking to this.

     

    I'm having a problem with step 2. As dumb as it sounds, I don't know how to create a folder on a single drive outside of a share. I'm assuming it must be done while Unraid is in maintenance mode, but I'm not sure of that either.

     

    On a side note, I found some btrfs commands and tried rescue fix-device-size, which reported no device size related problems.

    Below is the second command I tried, and what it returned.

     

    Quote

    btrfs rescue chunk-recover /dev/nvme0n1p1
    Scanning: DONE in dev0                
    corrupt leaf: root=1 block=1311653888 slot=0, unexpected item end, have 16283 expect 0
    Couldn't read tree root
    open with broken chunk error

     

    I tried clear-space-cache v1 and v2, a long shot as it has nothing to do with the error message I think, but yeah, still the same error on btrfs check.

     

    I'm not sure whether a --repair would fix this type of issue, or whether the NVMe is going bad. Unless someone advises me to try the repair, I think my next step is the restore method, but me being me, I'm not getting the steps needed to get things ready before 'btrfs restore -v'.
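
Writing down the steps as I currently understand them, in case someone can correct me. This is only a sketch: `/mnt/disk1/restore` is an assumed destination on an array disk with enough free space, `restore_to` and `run` are names I made up, and by default the script only prints the commands unless `DRY_RUN=0` is set. From what I've read, `btrfs restore` reads from the unmounted device and does not write to it.

```shell
#!/bin/bash
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# restore_to is a hypothetical helper: source device first, destination second.
restore_to() {
  dev="$1"
  dest="$2"
  run mkdir -p "$dest"                    # plain folder directly on one disk
  run btrfs restore -v -D "$dev" "$dest"  # dry run: only list what would be recovered
  run btrfs restore -v "$dev" "$dest"     # copy the recoverable files out
}

# Device name from the command quoted above; destination path is an assumption.
restore_to /dev/nvme0n1p1 /mnt/disk1/restore
```

As far as I can tell, a top-level folder created like this on an array disk simply shows up as a new user share, so it may not need to be "outside of a share" at all.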

     

    Just updating to where I'm at currently

  6. Hello,

     

    Sorry, I'm a user but not a very technical one. This has just happened and I'm not sure how to resolve it yet.

    The Fix Common Problems plugin has reported that a drive (not my default cache drive but a second one) is read-only. As in the title: "unraid Drive mounted read-only or completely full. Begin Investigation Here"

     

    I think I found in the system log the error it's mentioning, yet I don't know how to decipher it.

     

    I attached the diagnostics zip, as well as a txt file of what I see when I click on the disk log info for the drive.

    I don't think it helps at all, but I made sure the drive was balanced and scrubbed, and also performed a filesystem check (--readonly) with the following result:

    Quote

    [1/7] checking root items
    [2/7] checking extents
    Error reading 1162723328, -1
    Error reading 1162723328, -1
    bad tree block 1162723328, bytenr mismatch, want=1162723328, have=0
    owner ref check failed [1162723328 16384]
    ERROR: errors found in extent allocation tree or chunk allocation
    [3/7] checking free space tree
    [4/7] checking fs roots
    Error reading 1162723328, -1
    Error reading 1162723328, -1
    bad tree block 1162723328, bytenr mismatch, want=1162723328, have=0
    [5/7] checking only csums items (without verifying data)
    Error reading 1162723328, -1
    Error reading 1162723328, -1
    bad tree block 1162723328, bytenr mismatch, want=1162723328, have=0
    Error going to next leaf -5
    [6/7] checking root refs
    [7/7] checking quota groups skipped (not enabled on this FS)
    Opening filesystem to check...
    Checking filesystem on /dev/nvme0n1p1
    UUID: 4ad3bcf9-112e-4303-b54e-ca4ba41c8365 found 287609282560 bytes used, error(s) found
    total csum bytes: 279989224 total tree bytes: 895320064
    total fs tree bytes: 493961216
    total extent tree bytes: 48168960
    btree space waste bytes: 206223021
    file data blocks allocated: 288377303040
     referenced 286613630976



    I was thinking of doing a filesystem check with "--repair", but from what I could find, that is very much advised against. I also wouldn't know how to enter it, because I think it wants an input afterwards to confirm, so I would either need to do it from the console or use something like "--repair -y".
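
Before going anywhere near --repair, this is how I'd expect to rerun the same read-only check from the console and boil the output down to the distinct failing blocks. A sketch only: `bad_blocks` is a made-up helper name and `check.txt` is an assumed file holding the saved output.

```shell
#!/bin/bash
# List each distinct "bad tree block" address found in saved btrfs check output.
# bad_blocks is a made-up helper name; pass it the file with the saved output.
bad_blocks() {
  grep -oE 'bad tree block [0-9]+' "$1" | awk '{print $4}' | sort -un
}

# To produce the file (device name taken from the output quoted above; the
# filesystem should be unmounted first):
#   btrfs check --readonly /dev/nvme0n1p1 > check.txt 2>&1
if [ -r check.txt ]; then
  bad_blocks check.txt
fi
```

Run against the output quoted above, it should print just 1162723328, since every error refers to the same tree block.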

     

    My apologies if this seems an easy thing, I'm just not the most technical. I'm also worried it's the drive: it's a Samsung 970 EVO Plus 1TB, a bit over a year old. I'm planning on adding another for parity this year; I just figured they'd last longer (I did read recently about Samsung drives failing, but thought my model was in the clear).

     

    Thanks in advance for any assistance provided, though I'll thank you again in the replies.

    oracle-diagnostics-20230301-2036.zip nvme0n1p1.txt
