Posts posted by privateer

  1. I'm using a docker folder (directory) rather than an image. I moved my cache contents to the array, changed the cache drive from BTRFS to XFS, and then moved the files back. The process had some hiccups, but I think all the data is back on the cache drive, and no docker-related folders remain on the array. However, when I enable Docker, it shows that I have no containers.

     

    Here is a snap of my docker config:

    [screenshot: Docker configuration settings]

     

    Inside my docker folder on the cache, I have a folder called BTRFS; it is inside the correct folder.

    [screenshot: contents of the docker folder on the cache, showing the BTRFS subfolder]

     

    Does this need to be renamed / is this related to my issue?

     

    Or do I just need to reinstall every one of the containers?
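
    In case it helps, here is roughly what I can check from the terminal (the path is only an example; mine is whatever the Docker settings screenshot above shows as the docker directory). If I understand Docker correctly, the btrfs subfolder is the btrfs storage driver's data, which only works on a BTRFS filesystem, so on an XFS pool Docker would start a fresh overlay2 store and show no containers; please correct me if that's wrong.

    # hypothetical path - substitute the docker directory from the Docker settings page
    ls /mnt/cache/docker
    # I'd expect subfolders like containers/, image/, network/, volumes/ plus a
    # storage-driver folder (btrfs/ from the old pool, overlay2/ once it runs on XFS)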

  2. I've set my shares to move everything from cache to array; however, 13.1 GB is left on the cache and doesn't seem to move.

     

    Docker is off, VMs off. Mover logging enabled.

     

    The mover log shows lots of 'file exists' messages, and when I check the array those files do indeed already exist. However, the log then just shows the mover hanging/freezing, and it never records that the mover completed. I end up having to stop it manually.

     

    With the mover not completing, I'm nervous about pulling/formatting the disk. Any idea what's going on, or when I should feel OK to pull the disk?
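
    For what it's worth, this is the rough check I've been running from the terminal to see what is still on the cache and whether it already exists on the array. It assumes /mnt/user0 is the array-only view of the user shares on my version, so treat it as a sketch rather than anything authoritative:

    # list files left on the cache and flag the ones that already exist on the array
    cd /mnt/cache
    find . -type f | while read -r f; do
        if [ -e "/mnt/user0/$f" ]; then
            echo "already on array: $f"
        else
            echo "cache only: $f"
        fi
    done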

  3. 2 hours ago, trurl said:

    If you did the check on the sd device and not the md device, then that would invalidate parity. Checking the md device keeps parity in sync with the changes.

     

    I checked the md device, per the instructions in the Unraid docs.

     

    2 hours ago, trurl said:

    Did you do the filesystem check from the command line? It sounds like you may have gotten the command wrong and invalidated parity. Better to use the webUI; it will use the correct command.

     

    I used the UI, not the command line.
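
    For reference, this is the check I understand the webUI ran for me from Maintenance mode (the device name is my assumption; on newer releases it would be /dev/md17p1 rather than /dev/md17):

    # read-only check against the md device so parity stays in sync
    xfs_repair -n /dev/md17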

  4. 1 hour ago, trurl said:

    Check filesystem on disk17

     

    Oof.

     

    Phase 1 - find and verify superblock...
    Phase 2 - using internal log
            - zero log...
    ALERT: The filesystem has valuable metadata changes in a log which is being
    ignored because the -n option was used.  Expect spurious inconsistencies
    which may be resolved by first mounting the filesystem to replay the log.
            - scan filesystem freespace and inode maps...
    agf_freeblks 9058410, counted 9058399 in ag 2
    agi_count 1984, counted 2048 in ag 2
    agi_freecount 21, counted 13 in ag 2
    agi_freecount 21, counted 13 in ag 2 finobt
    sb_icount 34880, counted 35744
    sb_ifree 506, counted 455
    sb_fdblocks 603095616, counted 616701578
            - found root inode chunk
    Phase 3 - for each AG...
            - scan (but don't clear) agi unlinked lists...
            - process known inodes and perform inode discovery...
            - agno = 0
    Metadata corruption detected at 0x438a03, xfs_inode block 0xa63aa00/0x4000
    Metadata corruption detected at 0x438a03, xfs_inode block 0xa63aa20/0x4000
    bad CRC for inode 174303744
    bad magic number 0x16 on inode 174303744
    bad version number 0xffffffaa on inode 174303744
    bad next_unlinked 0xc0fc21c3 on inode 174303744
    inode identifier 9179762597904482375 mismatch on inode 174303744
    bad CRC for inode 174303745
    bad magic number 0xccd3 on inode 174303745
    bad version number 0xffffffba on inode 174303745
    inode identifier 3371903399051595482 mismatch on inode 174303745
    bad CRC for inode 174303746
    bad magic number 0xdbe0 on inode 174303746
    bad version number 0xffffffdf on inode 174303746
    inode identifier 50872455103499597 mismatch on inode 174303746
    bad CRC for inode 174303747
    bad magic number 0xdd24 on inode 174303747
    bad version number 0xffffffbd on inode 174303747
    bad next_unlinked 0x9f522d11 on inode 174303747
    inode identifier 9340838863122723239 mismatch on inode 174303747
    bad CRC for inode 174303748
    bad magic number 0x2043 on inode 174303748
    bad version number 0xffffffa2 on inode 174303748
    bad next_unlinked 0xf10ffca3 on inode 174303748
    inode identifier 1184165229794778217 mismatch on inode 174303748
    bad CRC for inode 174303749
    bad magic number 0x66b7 on inode 174303749
    bad version number 0x79 on inode 174303749
    bad next_unlinked 0xb51219b6 on inode 174303749
    inode identifier 14679859918268388760 mismatch on inode 174303749

     

    There are lots more of the bad CRC / bad magic number / bad version number / bad next_unlinked / inode identifier mismatch lines.

     

    Several of these:

    imap claims a free inode 1155669479 is in use, would correct imap and clear inode

     

    A few of these with various folder names:

    entry "[FOLDER NAME]" at block 0 offset 152 in directory inode 6600634561 references free inode 1155669489
    	would clear inode number in entry at offset 152...

     

    These as well:

    entry "[FOLDER NAME]" in shortform directory 32911946759 references free inode 2600817779
    would have junked entry "[FOLDER NAME]" in directory inode 32911946759

     

    Many of both of these:

    disconnected dir inode 4888060274, would move to lost+found
    
    and
    
    would have reset inode 6600634561 nlinks from 164 to 140
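
    Assuming the next step is an actual repair, this is what I'm planning to run from Maintenance mode once I get the go-ahead (device name is my assumption again; please correct me if the flags are wrong):

    # real repair: the same command without -n
    xfs_repair /dev/md17
    # only if it refuses because of the dirty log and the disk won't mount to replay it:
    # xfs_repair -L /dev/md17   (zeroes the log; last resort, can lose recent metadata)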

     

  5. Dec 30 19:50:23 Tower kernel: ata4.00: exception Emask 0x10 SAct 0x80000000 SErr 0x4090000 action 0xe frozen
    Dec 30 19:50:23 Tower kernel: ata4.00: irq_stat 0x00400040, connection status changed
    Dec 30 19:50:23 Tower kernel: ata4: SError: { PHYRdyChg 10B8B DevExch }
    Dec 30 19:50:23 Tower kernel: ata4.00: failed command: READ FPDMA QUEUED
    Dec 30 19:50:23 Tower kernel: ata4.00: cmd 60/20:f8:a0:00:00/00:00:00:02:00/40 tag 31 ncq dma 16384 in
    Dec 30 19:50:23 Tower kernel:         res 40/00:f8:a0:00:00/00:00:00:02:00/40 Emask 0x10 (ATA bus error)
    Dec 30 19:50:23 Tower kernel: ata4.00: status: { DRDY }
    Dec 30 19:50:23 Tower kernel: ata4: hard resetting link
    Dec 30 19:50:26 Tower kernel: ata1: link is slow to respond, please be patient (ready=0)
    Dec 30 19:50:26 Tower kernel: ata2: link is slow to respond, please be patient (ready=0)
    Dec 30 19:50:29 Tower kernel: ata4: link is slow to respond, please be patient (ready=0)
    Dec 30 19:50:30 Tower kernel: ata1: COMRESET failed (errno=-16)
    Dec 30 19:50:30 Tower kernel: ata2: COMRESET failed (errno=-16)
    Dec 30 19:50:30 Tower kernel: ata2: hard resetting link
    Dec 30 19:50:31 Tower kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
    Dec 30 19:50:31 Tower kernel: ata1.00: configured for UDMA/133
    Dec 30 19:50:33 Tower kernel: ata4: COMRESET failed (errno=-16)
    Dec 30 19:50:33 Tower kernel: ata4: hard resetting link
    Dec 30 19:50:33 Tower kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
    Dec 30 19:50:33 Tower kernel: ata2.00: configured for UDMA/133
    Dec 30 19:50:33 Tower kernel: ata2: EH complete
    Dec 30 19:50:38 Tower kernel: ata4: link is slow to respond, please be patient (ready=0)
    Dec 30 19:50:43 Tower kernel: ata4: COMRESET failed (errno=-16)
    Dec 30 19:50:43 Tower kernel: ata4: hard resetting link
    Dec 30 19:50:46 Tower kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
    Dec 30 19:50:46 Tower kernel: ata4.00: configured for UDMA/133
    Dec 30 19:50:46 Tower kernel: ata4: EH complete
    Dec 30 19:50:47 Tower kernel: ata2.00: exception Emask 0x10 SAct 0x700027 SErr 0x4890000 action 0xe frozen
    Dec 30 19:50:47 Tower kernel: ata2.00: irq_stat 0x0c400040, interface fatal error, connection status changed
    Dec 30 19:50:47 Tower kernel: ata2: SError: { PHYRdyChg 10B8B LinkSeq DevExch }
    Dec 30 19:50:47 Tower kernel: ata2.00: failed command: READ FPDMA QUEUED
    Dec 30 19:50:47 Tower kernel: ata2.00: cmd 60/00:00:20:58:7d/04:00:85:01:00/40 tag 0 ncq dma 524288 in
    Dec 30 19:50:47 Tower kernel:         res 40/00:10:60:5d:7d/00:00:85:01:00/40 Emask 0x10 (ATA bus error)
    Dec 30 19:50:47 Tower kernel: ata2.00: status: { DRDY }
    Dec 30 19:50:47 Tower kernel: ata2.00: failed command: READ FPDMA QUEUED
    Dec 30 19:50:47 Tower kernel: ata2.00: cmd 60/40:08:20:5c:7d/01:00:85:01:00/40 tag 1 ncq dma 163840 in
    Dec 30 19:50:47 Tower kernel:         res 40/00:10:60:5d:7d/00:00:85:01:00/40 Emask 0x10 (ATA bus error)
    Dec 30 19:50:47 Tower kernel: ata2.00: status: { DRDY }
    Dec 30 19:50:47 Tower kernel: ata2.00: failed command: READ FPDMA QUEUED
    Dec 30 19:50:47 Tower kernel: ata2.00: cmd 60/d0:10:60:5d:7d/03:00:85:01:00/40 tag 2 ncq dma 499712 in
    Dec 30 19:50:47 Tower kernel:         res 40/00:10:60:5d:7d/00:00:85:01:00/40 Emask 0x10 (ATA bus error)
    Dec 30 19:50:47 Tower kernel: ata2.00: status: { DRDY }
    Dec 30 19:50:47 Tower kernel: ata2.00: failed command: READ FPDMA QUEUED
    Dec 30 19:50:47 Tower kernel: ata2.00: cmd 60/00:28:30:61:7d/04:00:85:01:00/40 tag 5 ncq dma 524288 in
    Dec 30 19:50:47 Tower kernel:         res 40/00:10:60:5d:7d/00:00:85:01:00/40 Emask 0x10 (ATA bus error)

     

    I have seen a bunch of ata-related errors but I'm not sure exactly what's triggering them. I removed the cable attached to the port labeled SATA3_2 on my motherboard but am still getting these issues. I rebooted while attempting to solve this, but couldn't reboot from the GUI and had to use the button. Then Unraid couldn't unmount all the drives, so I forced an unclean shutdown. It came back up and disk 17 fell out of the array with no prior warning.

     

    When I look at the SMART attributes for disk 17, it shows a high raw read error rate and a high seek error rate, but I'm not sure whether that's being caused by bad cables or by some hardware issue other than the disk itself.

     

    I think I may also be triggering it when running the mover, but I can't tell.

     

    Any thoughts?
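
    Here is how I've been pulling the attributes from the console, in case the raw numbers are useful (sdX is a placeholder for whatever device disk 17 currently maps to):

    # full SMART report; the attributes I'm watching are 5 (Reallocated_Sector_Ct),
    # 187 (Reported_Uncorrect), 197 (Current_Pending_Sector), 198 (Offline_Uncorrectable).
    # On some drives the raw Raw_Read_Error_Rate / Seek_Error_Rate values are huge by
    # design and not meaningful on their own.
    smartctl -a /dev/sdX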

    tower-diagnostics-20231230-1952.zip

  6. 47 minutes ago, JorgeB said:

    That disk appears to be failing. You can run an extended SMART test to confirm; if it fails, replace it.

     

    Command "Execute SMART Extended self-test routine immediately in off-line mode" failed: scsi error medium or hardware error (serious)

     

    Is there any way to recover the data on this drive, given that I can't start the array? (I was replacing a parity drive when this died, so there is no parity.) I'm assuming the drive is fried.

     

    Additionally, if this drive is dead, can you confirm how I get the array back up without this disk while keeping the data on the remaining disks?

    Is it Tools -> New Config -> Preserve current assignments (All) that will bring the array back up and preserve the data on the other disks?

  7. I was adding disks to the array, swapping out parity, and physically moving devices around in my box. In the process of all the array stops/starts and reboots, one drive showed up as missing and was listed under Unassigned Devices. The only option for this drive under Unassigned Devices is to format it, which I have not done since I have data on it.

     

    I've swapped the SATA cable to the drive as well as its power cable. No change.

     

    I see the below error:

    Oct 28 14:23:01 Tower kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
    Oct 28 14:23:01 Tower kernel: ata5.00: irq_stat 0x40000001
    Oct 28 14:23:01 Tower kernel: ata5.00: failed command: READ DMA EXT
    Oct 28 14:23:01 Tower kernel: ata5.00: cmd 25/00:08:80:ff:ff/00:00:74:05:00/e0 tag 0 dma 4096 in
    Oct 28 14:23:01 Tower kernel:         res 53/40:08:80:ff:ff/00:00:74:05:00/40 Emask 0x9 (media error)
    Oct 28 14:23:01 Tower kernel: ata5.00: status: { DRDY SENSE ERR }
    Oct 28 14:23:01 Tower kernel: ata5.00: error: { UNC }
    Oct 28 14:23:01 Tower kernel: ata5.00: Read log 0x13 page 0x00 failed, Emask 0x1
    Oct 28 14:23:01 Tower kernel: ata5.00: Read log 0x12 page 0x00 failed, Emask 0x1
    Oct 28 14:23:01 Tower kernel: ata5.00: Read log 0x13 page 0x00 failed, Emask 0x1
    Oct 28 14:23:01 Tower kernel: ata5.00: Read log 0x12 page 0x00 failed, Emask 0x1
    Oct 28 14:23:01 Tower kernel: ata5.00: configured for UDMA/133
    Oct 28 14:23:01 Tower kernel: sd 6:0:0:0: [sdf] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=0s
    Oct 28 14:23:01 Tower kernel: sd 6:0:0:0: [sdf] tag#0 Sense Key : 0x3 [current] 
    Oct 28 14:23:01 Tower kernel: sd 6:0:0:0: [sdf] tag#0 ASC=0x11 ASCQ=0x4 
    Oct 28 14:23:01 Tower kernel: sd 6:0:0:0: [sdf] tag#0 CDB: opcode=0x88 88 00 00 00 00 05 74 ff ff 80 00 00 00 08 00 00
    Oct 28 14:23:01 Tower kernel: I/O error, dev sdf, sector 23437770624 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
    Oct 28 14:23:01 Tower kernel: ata5: EH complete

     

    Any ideas on what's wrong? I'm thinking the disk may be fried, but I'm surprised it just crapped out without throwing any errors beforehand that I was aware of.
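
    In the meantime I was planning to run self-tests on it from the console; is this the right idea? (sdf is taken from the log above, though it may change after a reboot.)

    smartctl -t short /dev/sdf   # quick test, a couple of minutes
    smartctl -t long /dev/sdf    # extended test, several hours
    smartctl -a /dev/sdf         # afterwards, check the self-test log and pending/reallocated sectors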

  8. 2 hours ago, Swarles said:

    Hmm okay strange because this is how the mover tuning plugin runs the scripts.

    The only other thing I can think of without digging deeper is whether they are executable.

    Try the following:

    ls -l "/mnt/user/[FILEPATH]/script1.py"

    and if it is not executable (there is no x in the permissions), try this and run it again:

    chmod +x "/mnt/user/[FILEPATH]/script1.py"

     

     

    How could it not be executable if I can already run it from the CLI?

     

    EDIT: Forgot to answer your question. The permissions are 766, and the ls -l command also shows 766.
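
    For completeness, here is what that looks like on my end, plus what I could change it to if the group/other execute bits turn out to matter (path redacted the same way as above):

    ls -l "/mnt/user/[FILEPATH]/script1.py"      # shows -rwxrw-rw- (766): only the owner has execute
    chmod 755 "/mnt/user/[FILEPATH]/script1.py"  # would give -rwxr-xr-x, if that turns out to matter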

  9. 5 hours ago, Swarles said:

    How are you running your script? Does your script have spaces in the path (this shouldn't cause issues)?

    "./script" or "python3 /script"?

    Can you try the following for me in terminal:

    eval "/script"

    and

    eval /script

     

     

    The scripts are Python scripts. When I test, I type "/mnt/user/appdata/scripts/scriptname.py" and hit enter. The script shows the correct output, and I can confirm it's working by checking that it has the intended effect. The eval commands, with and without quotes, also show the correct output.

     

    There are no spaces in the file path or file name.

     

    Here is what the log looks like when it runs my commands through mover tuning:

    root: ionice -c 2 -n 0 nice -n 0 /usr/local/emhttp/plugins/ca.mover.tuning/age_mover start 0 0 0 '' '' "/mnt/user/[FILEPATH]/script1.py" "/mnt/user/[FILEPATH]/script2.py" '' '' '' '' 55

     

    But when I check, it doesn't have the desired effect. Alternatively, the scripts may be running so fast back to back that I can't tell they ran, since the goal is for script 1 to make a change, then the mover runs, then script 2 undoes the change. This works if I do it manually.
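
    Since I can't tell whether they fire at all, my next test is to point the plugin at a tiny wrapper that logs before and after calling the real script. Something like this, where the wrapper name and log path are just my own choices:

    #!/bin/bash
    # script1_wrapper.sh - log, run the real script, log the exit code
    echo "script1 start $(date)" >> /tmp/mover_scripts.log
    "/mnt/user/[FILEPATH]/script1.py"
    echo "script1 end $(date) rc=$?" >> /tmp/mover_scripts.log

    I'd make the wrapper executable and put its path in the plugin instead of the .py; if nothing shows up in /tmp/mover_scripts.log after a mover run, the plugin never invoked it.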

  10. On 8/21/2023 at 1:40 AM, Swarles said:

    My first guess is that perhaps you are missing the shebang at the beginning of your script. The shebang has to be on line one and point to the location of your Python interpreter. For example:

    #!/usr/bin/python3

    But edit it ^^ to make sure it is your location.

     

    This is my best guess, and you should check that you can otherwise run your script with just "./mnt/user/[rest of path]/1.py" in the terminal. If it does run properly but the shebang doesn't fix it, you will have to go to /tmp/Mover and provide the latest ".log" file in there so I can see what the mover is doing.

     

    I'm not missing it, and the script runs correctly when executed from the CLI.

     

    I can run the before script manually, run the mover, run the after script manually, and everything works. When I put it in the mover tuner it doesn't run, and I can't seem to figure out where to start.
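
    To be concrete, this is roughly the verification I've already done from the terminal (path redacted as above):

    head -1 "/mnt/user/[rest of path]/1.py"   # prints the shebang, e.g. #!/usr/bin/python3
    ls -l "/mnt/user/[rest of path]/1.py"     # execute bit is set
    "/mnt/user/[rest of path]/1.py"           # runs and does what it should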

  11. I'm having issues with the "Script to run before mover" option. I've run the script through the CLI and it works as expected. When it's triggered via the mover tuner, it doesn't seem to have any effect.

     

    Aug 18 11:00:01 Tower root: ionice -c 2 -n 0 nice -n 0 /usr/local/emhttp/plugins/ca.mover.tuning/age_mover start 0 0 0 '' '' "/mnt/user/[rest of path]/1.py" '' '' '' '' '' 55

     

    Any idea what's going on? Is there any additional info I can provide to help?

  12. 3 hours ago, Kaizac said:

    Yes, it should be on "always"; otherwise Sonarr/Radarr won't pick up changes or new downloads you made. Using "manual" is only useful for libraries that never change.

     

    I discovered a new problem! If I attempt to manually import something that's on my cloud drive (Manage Episodes -> Import, needed because of misnamed files), it can't execute the command but keeps the task running.

     

    This is interesting because automatic import works on the cloud drive; the problem only affects files that aren't correctly named and have to be added manually.

     

    Have you encountered this before, or do you have any thoughts?

  13. On 4/8/2023 at 6:20 AM, Kaizac said:

    Glad to hear!

     

    Did the permissions persist now through mounting again?

     

    I still want to advise against using the analyze video options in Sonarr/Radarr. They don't give you anything you should need, and they cost time and API hits. I don't see any advantage to them.

     

    Mono is what Sonarr is built on and depends on, so it could be that it was running jobs, or maybe the Docker container already had errors/crashes; in my experience you often get that error then. Now that it's running well, does stopping it still give you this error?

    I would advise adding UMASK back to your Docker template in case you removed it. That way you stay close to the templates of the Docker repo (linuxserver in this case).

     

    Regarding your mount script not working the first time: it is because it sees the file "mount_running" in /mnt/user/appdata/other/rclone/remotes/$RcloneRemoteName. This is a checker to prevent it from running multiple times simultaneously. So apparently this file doesn't get deleted when your script finishes. Maybe your unmount script doesn't remove the file? Or are you running the script twice during startup of your system/array? Maybe once on startup of the array and again from a cron job?

     

    I would check your rclone scripts, the checker files they use, and the way they are scheduled. Something must be conflicting there.

     

    Do you do anything with the "Rescan Series Folder after Refresh" option in Sonarr? I've set mine to "after manual refresh", but I'm not sure if I needed to do that.

  14. On 4/8/2023 at 6:20 AM, Kaizac said:

    Glad to hear!

     

    Did the permissions persist now through mounting again?

     

    I still want to advise against using the analyze video options in Sonarr/Radarr. They don't give you anything you should need, and they cost time and API hits. I don't see any advantage to them.

     

    Mono is what Sonarr is built on and depends on, so it could be that it was running jobs, or maybe the Docker container already had errors/crashes; in my experience you often get that error then. Now that it's running well, does stopping it still give you this error?

    I would advise adding UMASK back to your Docker template in case you removed it. That way you stay close to the templates of the Docker repo (linuxserver in this case).

     

    Regarding your mount script not working the first time: it is because it sees the file "mount_running" in /mnt/user/appdata/other/rclone/remotes/$RcloneRemoteName. This is a checker to prevent it from running multiple times simultaneously. So apparently this file doesn't get deleted when your script finishes. Maybe your unmount script doesn't remove the file? Or are you running the script twice during startup of your system/array? Maybe once on startup of the array and again from a cron job?

     

    I would check your rclone scripts, the checker files they use, and the way they are scheduled. Something must be conflicting there.

     

    Permissions are now persistent. UMASK was on the container when I had it running. I've left analyze video unchecked and left the permissions option unchecked (default).

     

    I'll work on getting the script to run the first time, but it's lower priority since it does work.
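
    Based on the explanation above, my plan for the first-run issue is simply to look for a stale checker file before the mount script fires (the path is the one from the quote; the remote-name part varies):

    ls -l /mnt/user/appdata/other/rclone/remotes/*/mount_running
    # if one is present while no mount script is actually running, removing it
    # should let the next scheduled run go ahead:
    # rm /mnt/user/appdata/other/rclone/remotes/<RcloneRemoteName>/mount_running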

  15. 6 hours ago, Kaizac said:

    Yes I have it all on 99/100 down to the file.

     

    Can you try the library import without "analyze video" checked? It causes a lot of CPU strain and also API hits, because it reads the file, which is effectively like streaming your files.

     

    I'm still working through some permutations but wanted to add a bit more info. When I try to stop the Sonarr container after it hits the importing issue above, I get an execution error and can't stop the container. Using the Open Files plugin I can see that mono is what's holding Sonarr's files open and keeping it from closing. Here's the listing in case it's helpful.

     

    [screenshot: Open Files listing showing mono processes holding Sonarr files open]
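
    If the screenshot is hard to read, the rough command-line equivalent I can post instead is something like this (the appdata path is my assumption for where the container keeps its files):

    # list processes with files open under the Sonarr appdata folder
    lsof +D /mnt/user/appdata/sonarr 2>/dev/null | grep mono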
