
nomisco

Members
  • Posts

    38
  • Joined

  • Last visited

Posts posted by nomisco

  1. As a further bit of information, I updated to the latest build about two days after this event, and upon reboot, a different disk was disabled, so I repeated the process.

     

    The server has been stable for in excess of a year, so it is troubling that I am suddenly seeing this.

     

    I purchased another LSI 9211 and a breakout cable (as a spare) as I don't want this suddenly failing over the holiday period!

  2. Please can I have some assistance with a disk becoming disabled? Unfortunately I have rebooted since it happened so am probably missing log files. Nothing has changed; the server was sitting idle and it suddenly happened. It had been up for nearly a month.

     

    Obviously I want to avoid any data loss.

     

    I have no idea what I'm doing beyond basic setup so I'd appreciate some explicit guidance.

     

    Thanks!

     

    Edit: I've tried the repair in maintenance mode and it says:

     

    Phase 1 - find and verify superblock...
            - block cache size set to 1117984 entries
    Phase 2 - using internal log
            - zero log...
    zero_log: head block 118210 tail block 118206
    ERROR: The filesystem has valuable metadata changes in a log which needs to
    be replayed.  Mount the filesystem to replay the log, and unmount it before
    re-running xfs_repair.  If you are unable to mount the filesystem, then use
    the -L option to destroy the log and attempt a repair.
    Note that destroying the log may cause corruption -- please attempt a mount
    of the filesystem before doing this.

     

    I have no idea what this means.
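
    For anyone landing here with the same message: as far as I can tell, it is asking for the sequence sketched below. This is only a dry run that prints the commands rather than running them. The function name is hypothetical, and `/dev/mdX` and `/mnt/recovery` are placeholders I made up, not values from my diagnostics.

    ```shell
    # Dry-run sketch: prints the order of operations the xfs_repair message describes.
    # Nothing here touches a disk; the device and mount point are placeholders.
    xfs_log_replay_plan() {
        device="$1"
        mountpoint="$2"
        # 1. Mounting the filesystem replays the pending log.
        echo "mount -t xfs $device $mountpoint"
        # 2. Unmount again before repairing.
        echo "umount $mountpoint"
        # 3. Re-run the repair now that the log is clean.
        echo "xfs_repair $device"
        # 4. Last resort only, if the mount fails: -L destroys the log
        #    and, per the message, may cause corruption.
        echo "xfs_repair -L $device"
    }

    xfs_log_replay_plan /dev/mdX /mnt/recovery
    ```

    The point of the ordering is that `-L` is the final fallback, not the first thing to try.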

     

    unraid-diagnostics-20231202-0827.zip

  3. ljm42: Changed the port to something else.

     

    trurl: Yes, I do recognise the IP address, that's OK. And I'm aware of the corruption in the cache pool. I don't know if it's a failing SSD or not. There have been no further attempts to access the server. The only things which are forwarded on my router are the Plex server (running on unRAID) and the aforementioned port, which I've changed, for the WebUI remote access.

  4. I've just seen this in my log. Never seen anything like this before. Is this someone trying to gain access to my server? Anything I should worry about?

     

    Jan 26 15:43:38 unRAID nginx: 2023/01/26 15:43:38 [crit] 24956#24956: *1977125 SSL_do_handshake() failed (SSL: error:141CF06C:SSL routines:tls_parse_ctos_key_share:bad key share) while SSL handshaking, client: 152.32.141.142, server: 0.0.0.0:443

     

    That IP address is from Nigeria.

     

    Thanks
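
    In case it helps anyone checking their own log for the same thing, here is a rough sketch that counts failed TLS handshakes per client IP. It assumes log lines in the same format as the one I pasted, fed in on stdin; the function name is my own, and the `/var/log/syslog` path in the usage comment is an assumption about where the log lives.

    ```shell
    # Counts failed TLS handshakes per client IP.
    # Reads syslog-style nginx lines on stdin, prints "count ip", busiest first.
    failed_handshake_clients() {
        grep -F 'SSL_do_handshake() failed' |
            sed -n 's/.*client: \([0-9.]*\),.*/\1/p' |
            sort | uniq -c | sort -rn
    }

    # Usage (path is an assumption):
    #   failed_handshake_clients < /var/log/syslog
    ```

    A single address with a handful of hits looks like a passing scanner; the same address repeating over days would worry me more.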

     

  5. 41 minutes ago, JorgeB said:

    Could be SMB, but I can write to Unraid at 1GB/s, normally no one has problems getting line speed with SMB over gigabit, could still be network related despite the clean iperf results, I would try two things, using a different NIC if available and/or creating a new Unraid flash drive and restore only the key and disk assignments, in case it's something to do with your Unraid install.

     

     

    There may be a perfect storm of something in my case with the many recent changes to the SMB implementation. It most certainly used to saturate the Gb network during SMB transfers. I shall do a fresh install in the next day or two and report back.

     

    Thanks for your help.

  6. The disk settings are set to reconstruct write. It is writing to the largest available space, which is about 2TB of space on a 4TB disk. The write speed is still ~50MB/s from the client to unraid, but it appears to buffer in the unraid memory, then dump large chunks to disk before waiting for some buffer to fill on unraid, then repeating. Hopefully the images below give you some idea of the behaviour.

     

    2.jpg

    3.jpg

    4.jpg

    5.jpg

     

    The disk writes in the top image are to the parity and array disk. The cache disk (SSD) is not used during the test.

  7. Bypassed the entire segment of the network, so Win10 > switch > unraid. No change. SMB is still about half the data rate it used to be. I can transfer from the Win10 machine to another on the network at ~110MB/s. The iPerf tests show good performance. SMB is the problem.

     

     

    unraid_client.jpg

    win_sender.jpg

    win10_client.jpg

    win10_server.jpg

  8. 6 hours ago, JorgeB said:

    There are a lot of retries in one of the directions suggesting a network problem.

    I don't believe that the number of retries is of concern when the network is being saturated. 1Gb/s can max out at about 80,000+ packets per second and the network was minimally in use elsewhere: Skype, online gaming etc. (which would have prioritised packets).

     

    I'll substitute a segment of the network with a long cable which will bypass a couple of switches, but because there's a 50% drop in throughput I don't have much hope for that.

     

    Just to add, iPerf shows that the network performance is largely as expected, and I see better when using a different protocol through lancache, so it points to an SMB problem to me.
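
    For reference, the packet-rate figure I quoted comes from a back-of-envelope calculation along these lines. It assumes full-size frames with a standard 1500-byte payload, which is an assumption about the traffic rather than something I measured.

    ```shell
    # Frames per second on a saturated gigabit link, assuming full-size frames.
    # On-the-wire size: 1500 payload + 14 header + 4 FCS + 8 preamble + 12 inter-frame gap.
    link_bps=1000000000
    frame_bytes=1538
    frames_per_second=$(( link_bps / (frame_bytes * 8) ))
    echo "$frames_per_second"   # prints 81274
    ```

    Smaller frames would push the packet rate higher, so this is the low end for a link that is genuinely full.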

  9. I know there are several posts on here about this, and a few appear to have solutions, but none of those apply to my situation. I used to be able to max my 1Gbps network connection when copying from my Windows 10 machine over SMB to unRAID. It would max out at ~ 110MB/s. Now it will max at a consistent half of that - about 52MB/s. I don't know when it first happened, but I'm guessing it's been slow for about 6 months or so.

     

    Don't want to spam this thread with screenshots (so I've added them as an archive). I've tested:

     

    Windows to unRAID using iPerf with both tested as client and server. Results are > 930Mb/s

    Benchmarks of the drives and controller (LSI 9211-8i). Spinners and SSDs. The slowest drive is around 100MB/s at its slowest point.

    Windows to Windows machine on the same network ~ 110MB/s. SSD on both ends.

    Tested with cache SSD and direct to array with no change.

     

    Tested in safe mode. Tested with all dockers stopped. Played with network settings, turbo write etc, but I can see that the disks and network don't appear to be the bottleneck.

     

    I've recently set up Lancache for gaming downloads. Those cache only on an SSD and may contain a lot of small files, but even installing to both machines I am seeing a consistent 75 - 80MB/s.

     

    Screens archive also shows that array parity check speed is consistently ~150MB/s average. Test file is 60GB ISO.

     

    Attached screens.zip shows the screenshots I took whilst running the tests.

     

    Thanks

     

    screens.zip unraid-diagnostics-20230104-1924.zip
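
    Since the figures above mix megabits (iPerf) and megabytes (file copies), here is the conversion I'm using, as a quick sketch. It ignores protocol overhead, so the real ceiling for a file copy is a little lower than the first number.

    ```shell
    # Convert between the two units used in this post (8 bits per byte).
    iperf_mbit=930   # iPerf result in megabits per second
    smb_mbyte=52     # observed SMB copy rate in megabytes per second
    echo "iPerf result as MB/s: $(( iperf_mbit / 8 ))"   # integer division, prints 116
    echo "SMB copy as Mb/s:     $(( smb_mbyte * 8 ))"    # prints 416
    ```

    So the SMB copy is using well under half of what iPerf shows the link can carry.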

  10. Looking for some guidance on how to set up Pushover to send alerts to my Android phone. I've installed the Android app and got a Pushover account, along with the user key. I have no idea what should be in the app token line (Settings > Notifications). Clicking on the link sends me to a page on the Pushover website. All I want to do is simple alerts from the OS, not from Dockers etc.

     

    My Linux and SysAdmin skills are weak. I need an idiot's guide!
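
    For what it's worth, my understanding is that the app token is the API token Pushover issues when you register an application on their site, and is separate from the user key. A minimal sketch for checking that a token and key pair works, using the Pushover messages endpoint: the function name and the two environment variable names are my own placeholders.

    ```shell
    # Sends one test message through the Pushover messages API.
    # PUSHOVER_APP_TOKEN: the API token from the application registered on the Pushover site.
    # PUSHOVER_USER_KEY:  the user key shown on the Pushover account page.
    pushover_send() {
        curl -s \
            --form-string "token=$PUSHOVER_APP_TOKEN" \
            --form-string "user=$PUSHOVER_USER_KEY" \
            --form-string "message=$1" \
            https://api.pushover.net/1/messages.json
    }

    # Usage, after exporting the two variables:
    #   pushover_send "Test from unRAID"
    ```

    If the test message arrives on the phone, the same token should be what the notification settings are asking for.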

  11. Thought it might be worth an update:

     

    Now over two weeks of uptime without anything unusual in the logs. I re-enabled the i915 driver about a week ago and it's being used for Plex HW transcoding - the same as the last couple of years.

     

    I may reinstall the MyServers Plugin to see what happens. I am still running 6.10 RC2.

  12. Unfortunately the server crashed again in the night, just after 3AM. I've attached the log from the flash which shows just over an hour and twenty minutes of errors. The midnight spin up will be for the Plex library scan, then the disks go to sleep until an hour and ten minutes later when I get this massive block of errors until the box dies. I can't see what is causing the error, but it may be more obvious to someone here. Hopefully.

     

    Or perhaps someone can suggest a more verbose logging effort if they have a suspect in mind.

     

    Thanks

     

     

     

    syslog.txt

  13. Thank you Jorge. I only use that driver for Plex transcoding but can do without it. I remember adding some lines to a config file or two, but can't remember how I'd remove them.

     

    I suppose a reboot of the server would be required after they're removed?

     

    Update:  I removed some lines from the 'Go' file:

     

    #enable module for iGPU and perms for the render device
    modprobe i915
    chown -R nobody:users /dev/dri
    chmod -R 777 /dev/dri

     

    And also the relevant lines from the Plex container.

     

    I'll give this a try, and report back either way, but it may be a week! Any other suggestions in the meantime would be welcome.
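
    To confirm after the reboot that the driver really is gone, something like the sketch below should do. It reads `/proc/modules`, which lists one loaded module per line with the name first; the function name is hypothetical, and the optional file argument is only there so it can be tried against a sample file.

    ```shell
    # Reports whether the i915 kernel module is currently loaded.
    # Optional argument: a file in /proc/modules format (defaults to /proc/modules).
    i915_status() {
        modules_file="${1:-/proc/modules}"
        if grep -q '^i915 ' "$modules_file" 2>/dev/null; then
            echo "i915 loaded"
        else
            echo "i915 not loaded"
        fi
    }

    i915_status
    ```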
