
rkotara

Members
  • Posts

    21
  • Joined

Posts posted by rkotara

  1. After digging into the system a bit, the only odd item I can find is a CRC error logged against one of the SSDs in my cache pool.  Although it's probably a real issue, I'm guessing it didn't cause the problems I saw, since core system functionality should not be impacted by the cache pool.

  2. My Unraid GUI and at least one of my dockers crashed (Plex was still running fine, though).  I was able to log in to the console with a keyboard and monitor, but the "diagnostics" command was not found and even find was returning errors.  I've since rebooted (yes, I know I lost the logs this way) and grabbed a diag.  I wonder if there is anything inherently bad about my setup that might have caused this?  It has happened before, but those times I just rebooted without doing additional checks.  Is there something else I could have done to get better logs?  I searched the forums but didn't find much outside of "diagnostics".  Even powerdown and shutdown didn't work; I had to use "reboot" instead.  Attached is an image of the strange results from /mnt/user0, but /mnt/user seemed OK at the top level.

    tower-diagnostics-20240819-1636.zip
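
    One thing that might help next time, as a rough sketch only: copy the in-RAM syslog somewhere that survives a reboot before rebooting.  This assumes the flash drive is mounted at /boot and that /boot/logs is an acceptable place for the copy; `save_syslog` is a made-up helper name, and if basic commands like find are already failing, cp may fail as well.

```shell
#!/bin/bash
# Hypothetical helper: copy a log file into a directory that survives a reboot.
# The defaults (/var/log/syslog, /boot/logs) are assumptions about the setup.
save_syslog() {
    local src="${1:-/var/log/syslog}"
    local dest_dir="${2:-/boot/logs}"
    local dest
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/syslog-$(date +%Y%m%d-%H%M%S).txt"
    # Print the destination path only if the copy succeeded.
    cp "$src" "$dest" && echo "$dest"
}
```

    Running `save_syslog` with no arguments from the console uses the assumed defaults; both paths can be overridden, e.g. `save_syslog /var/log/syslog /mnt/somewhere`.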

     

  3. Replaced the SATA power cable extension I had on hand (Monoprice 108794) with one from Cable Matters and all is well so far.  No more drops on the drives after 24 hours of running preclear on all 3 drives.  Will update if anything changes.

  4. TLDR: I've had Preclear fail on all 3 drives I just installed in my system using a Dell Perc H310 controller flashed to IT mode.   No SMART errors, does that mean it's the controller?

     

    I just installed a Dell H310 6Gbps SAS HBA (LSI 9211-8i) with IT mode firmware.  I added it to a free PCIe x16 slot and cabled up 3 refurb drives (which is why I want to preclear).  I started Preclear on one, then, seeing how long it would take (days), started the other two.  All three have since failed, and that seems very unlikely to be the drives themselves.  The failures all seem to be read errors at different points.  There are no SMART issues in the data I can see.  The breakout cables are new, and all 3 drives failing seems to point to the controller.  I guess it's a bad controller?  Any insight would be appreciated, thanks!

    preclear_disk_8DJGT6JH_21015.txt

    tower-diagnostics-20240226-1401.zip

    preclear_disk_9JH30AWT_27151.txt

  5. 1 hour ago, Mainfrezzer said:

    Only if you read the release notes, lol, and do what they say to make it work. Bridging is enabled and thus doesn't do much.

    Yes!  Finding this out through other posts, for sure.  Not always a simple click-n-upgrade depending on your customization.

  6. Saw this in your syslog.  I'm just a newbie and do not know your setup, but maybe disk3 having issues could be part of the problem?

     

    Also, I use the plugin "CA Auto Update Applications" to auto-update many of my plugins/dockers that are not critical (you probably don't want to use it for AdGuard in your list).

     

     

    Sep 23 11:43:14 Odyssey root: Fix Common Problems Version 2023.07.29
    Sep 23 11:43:24 Odyssey root: Fix Common Problems: Error: disk3 (WDC_WD30EFRX-68EUZN0_WD-WCC4N7VNY6K8) has read errors
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Plugin unassigned.devices.plg is not up to date
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application AdGuard-Home has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application bazarr has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application Chromium has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application overseerr has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application plexautoskip has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application prowlarr has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application radarr has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application sabnzbd has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application sonarr has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application tautulli has an update available for it
    Sep 23 11:43:25 Odyssey root: Fix Common Problems: Warning: Docker Application UptimeKuma has an update available for it
    Sep 23 11:45:11 Odyssey root: Fix Common Problems: Other Warning: Unassigned Devices Plus not installed
    Sep 23 11:45:11 Odyssey root: Fix Common Problems: Other Warning: Background notifications not enabled
    Sep 23 11:45:11 Odyssey root: Fix Common Problems: Warning: Jumbo Frames detected on eth0 ** Ignored
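
    To pull just the errors out of the noise, something like this should work.  It is only a sketch: it assumes the syslog lives at /var/log/syslog and that Fix Common Problems keeps the "Error:" prefix seen in the lines above; `fcp_errors` is a name I made up.

```shell
#!/bin/bash
# Hypothetical helper: print only the "Error:" lines that Fix Common Problems
# wrote to a syslog file, skipping the "Warning:" lines.
fcp_errors() {
    # -F treats the pattern as a fixed string rather than a regex.
    grep -F 'Fix Common Problems: Error:' "${1:-/var/log/syslog}"
}
```

    Against the log above, this would leave only the disk3 read-error line.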

  7. I have a btrfs cache pool of two SSDs and got an alert about a CRC error on one drive (it shows up in the SMART data as 1 total).  Is there any action to take here if this does not repeat?  From what I've read and from the logs, it seems like the link was reset, the data was written, and I'm good for now.  If it keeps happening, that is when I should reseat cables, etc.  Just wanted to check with the forum whether this is the right understanding.  Thanks!

     

    (log of entire event below)

     

    Aug 15 18:57:56 Tower kernel: ata4.00: exception Emask 0x10 SAct 0x7000000 SErr 0x400100 action 0x6 frozen
    Aug 15 18:57:56 Tower kernel: ata4.00: irq_stat 0x08000008, interface fatal error
    Aug 15 18:57:56 Tower kernel: ata4: SError: { UnrecovData Handshk }
    Aug 15 18:57:56 Tower kernel: ata4.00: failed command: WRITE FPDMA QUEUED
    Aug 15 18:57:56 Tower kernel: ata4.00: cmd 61/20:c0:e0:17:46/00:00:00:00:00/40 tag 24 ncq dma 16384 out
    Aug 15 18:57:56 Tower kernel:         res 40/00:d0:a0:15:46/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
    Aug 15 18:57:56 Tower kernel: ata4.00: status: { DRDY }
    Aug 15 18:57:56 Tower kernel: ata4.00: failed command: WRITE FPDMA QUEUED
    Aug 15 18:57:56 Tower kernel: ata4.00: cmd 61/20:c8:a0:16:46/00:00:00:00:00/40 tag 25 ncq dma 16384 out
    Aug 15 18:57:56 Tower kernel:         res 40/00:d0:a0:15:46/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
    Aug 15 18:57:56 Tower kernel: ata4.00: status: { DRDY }
    Aug 15 18:57:56 Tower kernel: ata4.00: failed command: WRITE FPDMA QUEUED
    Aug 15 18:57:56 Tower kernel: ata4.00: cmd 61/20:d0:a0:15:46/00:00:00:00:00/40 tag 26 ncq dma 16384 out
    Aug 15 18:57:56 Tower kernel:         res 40/00:d0:a0:15:46/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
    Aug 15 18:57:56 Tower kernel: ata4.00: status: { DRDY }
    Aug 15 18:57:56 Tower kernel: ata4: hard resetting link
    Aug 15 18:57:56 Tower kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
    Aug 15 18:57:56 Tower kernel: ata4.00: supports DRM functions and may not be fully accessible
    Aug 15 18:57:56 Tower kernel: ata4.00: supports DRM functions and may not be fully accessible
    Aug 15 18:57:56 Tower kernel: ata4.00: configured for UDMA/133
    Aug 15 18:57:56 Tower kernel: ata4: EH complete
    Aug 15 18:57:56 Tower kernel: ata4.00: Enabling discard_zeroes_data
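
    For keeping an eye on whether the count repeats, here is a minimal sketch that pulls the raw CRC counter out of `smartctl -A` output.  It assumes the drive reports the counter as SMART attribute ID 199 (the name varies by vendor) in the usual ten-column attribute table; `crc_error_count` is a made-up name and /dev/sdX is a placeholder for the actual device.

```shell
#!/bin/bash
# Hypothetical helper: read `smartctl -A` output on stdin and print the raw
# value of attribute 199 (commonly the UDMA/SATA CRC error counter).
# Exits non-zero if the attribute is not present in the input.
crc_error_count() {
    # Column 1 is the attribute ID, column 10 is the start of RAW_VALUE.
    awk '$1 == 199 { print $10; found = 1 } END { exit !found }'
}

# Usage (not run here; replace /dev/sdX with the real device):
#   smartctl -A /dev/sdX | crc_error_count
```

    The counter is cumulative, so the number to watch is whether it goes up from 1, not whether it is non-zero.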

     

  8. 9 hours ago, djgizmo said:

     

    " After I made a custom box with a Rosewill Case, a SuperMicro motherboard, Intel 4770, 16GB of RAM, SSD for cache / VMs/containers.. things seemed a bit better, I’d only have to reboot every 2-3 months.

    ...

    I’ve memtested my ram for 6 hours, all passes and no errors."

    My single-threaded memtest will not complete a full cycle in 6 hours, but it is very accurate at testing.  It might be best to let it run a full cycle and ensure you're using the single-threaded (older) memtest86.  The newer multi-threaded version will miss some RAM issues in my experience.

  9. Seems like a logic issue when the mount point directory exists but nothing is actually mounted on it.  The code in this line only checks for a directory:

    if [[ -d /var/lib/docker_bind ]]; then umount /var/lib/docker_bind || logger -t docker Error: RAM-Disk bind unmount failed while docker stops!; fi\

     

     

    But maybe something like this would work better since it actually checks for a mount:

    if mount | awk '$3 == "/var/lib/docker_bind" { found = 1 } END { exit !found }'; then umount /var/lib/docker_bind || logger -t docker Error: RAM-Disk bind unmount failed while docker stops!; fi\

    Of course, I can't really test this since I'm not using the script yet.  I'm just following along until I upgrade, and will either add this afterward or hope it's part of future "improvements". ;)
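
    The same idea can be written as a small standalone helper that is easier to try out in isolation.  This is an untested-on-Unraid sketch: it assumes the kernel mount table is readable at /proc/self/mounts, and `is_mounted` and `unmount_if_mounted` are names invented here, not part of the RAM-Disk script.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the path is listed as a mount point
# in the given mounts table (defaults to /proc/self/mounts).
is_mounted() {
    local target="$1"
    local table="${2:-/proc/self/mounts}"
    # Field 2 of each mounts line is the mount point.
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$table"
}

# Unmount only when something is really mounted there, and log only
# when the unmount itself fails.
unmount_if_mounted() {
    local target="$1"
    if is_mounted "$target"; then
        umount "$target" || logger -t docker "Error: RAM-Disk bind unmount failed while docker stops!"
    fi
}
```

    With this, `unmount_if_mounted /var/lib/docker_bind` does nothing (and logs nothing) when the directory exists but is not a mount.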

     
