  • [6.12.3] crashes on big file transfers


    xxlbug
    • Minor

    Hi,

     

    TLDR: My server crashes when I start large file transfers, with file sizes ranging from KB to GB. This happens with or without Docker enabled. I don't know what to do; any help diagnosing this would be appreciated.

     

    As there was almost no response to my "General Support" post, I'm trying here instead. The previous diagnostics (multiple) are attached to this post.

    I've changed Docker to ipvlan, but I get these crashes even with Docker disabled. This has also been happening for several Unraid versions, and I've replaced the server hardware completely in the meantime, except for my Unraid USB stick.

     

    My main problem is that I have no idea where to go from here to improve the situation. If I don't transfer much, the server runs smoothly for months. Then (e.g. over the last couple of days) I can reproduce crashes with specific operations.

     

    Since there is no syslog entry and no error message afterwards, I'm not sure why the server crashes. There is no screen output to look at (everything is blank) and no remote connection is possible. Is there some way to see kernel panics or similar on a physical monitor?
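
    One way to approach this, as a rough sketch only: if the server can be set up to forward its syslog or kernel messages over UDP to another machine (for example via a remote syslog target or the Linux netconsole module; neither is confirmed in this thread), a small listener on that other machine can keep whatever arrives before the freeze. The port number, file name and function names below are assumptions made for illustration, and a hard lockup may still produce no output at all.

```python
import socket
from datetime import datetime


def open_listener(bind_addr="0.0.0.0", port=5514):
    """Bind a UDP socket. 5514 is an arbitrary unprivileged port (assumption)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def record(sock, log_path, max_messages=None):
    """Append every received datagram to log_path with a local timestamp."""
    count = 0
    with open(log_path, "a", encoding="utf-8") as log:
        while max_messages is None or count < max_messages:
            data, (sender, _port) = sock.recvfrom(65535)
            stamp = datetime.now().isoformat(timespec="seconds")
            text = data.decode("utf-8", errors="replace").rstrip()
            log.write(f"{stamp} {sender} {text}\n")
            log.flush()  # keep the file current in case the sender dies
            count += 1
    return count


# Hypothetical usage on the receiving machine (runs until interrupted):
# record(open_listener(), "unraid-remote.log")
```

    The last lines in the file before the timestamps stop are the ones worth reading after a crash.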

     

    Steps I've taken (in no particular order):

    - Diagnosed several Docker containers separately

    - Switched to ipvlan

    - Switched to the second network interface

    - Changed from a FRITZ!Box setup to UniFi, including new subnets

    - Replaced all hardware over the years

    - Started completely fresh with my Unraid config (I think on 6.10)

    - Disabled Docker and used only rsync to transfer

    - Switched from the standard Unraid filesystem to an encrypted filesystem

     

    I don't use VMs, and I switched over to pure Docker Compose a couple of Unraid versions ago. I've installed it both via a Slackware package and via a script on array start. No difference. Before that I used the Unraid templates and had the same problems on bigger downloads.

     

    Some things I've found interesting:

    - I don't get crashes while processing files purely on the server, such as transcoding 4K with Plex

    - Generating previews from a couple of thousand video files (literally) is also no problem

    - After a crash I can boot the server normally, start the array, let the parity check run, and get no errors, even though the server crashed during a file transfer

     

    Any ideas?

     

     




    User Feedback

    Recommended Comments

    This looks more like a hardware problem. To rule out a config issue, you can redo the flash drive and retest with a stock config: copy only your key, super.dat and the pool folder from the current /config. If it still crashes like that, config problems are ruled out and you can restore the rest.


    Hi @JorgeB

     

    I did some further testing and found something interesting:

    1. If I do large file copies (many files, some small, some many GB) directly on the array, I have no problems (rsync one folder to another, rm the copied data, start again).

    2. (Now it gets interesting.) I ran an iperf3 test: Unraid runs iperf3 in server mode, and the client runs on another machine in the network. I can reliably crash the server within a couple of minutes (the test only runs for 5 minutes at most).

     

    Number 2 happens even when the array has not been started yet. I have also found that it happens on the main network interface (eth0). Since the array is not started, my ipvlan Docker network shouldn't come into play?
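
    To pin down the moment of the freeze during such an iperf3 run, a second machine could probe the server once per second and note the last time it answered. The following is a minimal sketch, not a tested diagnostic tool: the port to probe (for instance the web GUI or SSH port), the log file name and the function names are all assumptions.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host, port, interval=1.0, max_failures=3,
          log_path="heartbeat.log", max_probes=None):
    """Probe until max_failures probes fail in a row; return the last success time."""
    last_ok = None
    failures = 0
    sent = 0
    with open(log_path, "a", encoding="utf-8") as log:
        while failures < max_failures and (max_probes is None or sent < max_probes):
            now = datetime.now().isoformat(timespec="seconds")
            ok = probe(host, port)
            sent += 1
            if ok:
                last_ok, failures = now, 0
            else:
                failures += 1
            log.write(f"{now} {'ok' if ok else 'FAIL'}\n")
            log.flush()
            time.sleep(interval)
    return last_ok


# Hypothetical usage from the client machine while iperf3 is running:
# print("last response at", watch("192.168.1.10", 80))
```

    The returned timestamp can then be compared against the iperf3 client output to see how far into the test the server stopped responding.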

     

    Is there some way to get detailed logs about the network interface while running a test? This could give a hint to the problem.
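
    One possible approach, sketched here under assumptions: Linux exposes per-interface counters under /sys/class/net/<interface>/statistics, and a small script can sample them every second and force each sample to disk so that the last line written before a freeze survives the reset. This assumes Python 3 is available on the server (it may not be on a stock install) and that the log is written to persistent storage; the function names and the counter selection are illustrative only.

```python
import os
import time
from pathlib import Path

# Standard Linux sysfs counter names; extend as needed.
COUNTERS = ("rx_bytes", "tx_bytes", "rx_errors", "tx_errors",
            "rx_dropped", "tx_dropped")


def read_counters(iface="eth0", sysfs_root="/sys/class/net"):
    """Read the selected counters for one interface as integers."""
    stats_dir = Path(sysfs_root) / iface / "statistics"
    return {name: int((stats_dir / name).read_text()) for name in COUNTERS}


def log_counters(log_path, iface="eth0", interval=1.0, samples=None,
                 sysfs_root="/sys/class/net"):
    """Append one timestamped line per sample; samples=None runs until stopped."""
    taken = 0
    with open(log_path, "a", encoding="utf-8") as log:
        while samples is None or taken < samples:
            counters = read_counters(iface, sysfs_root)
            stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
            fields = " ".join(f"{k}={v}" for k, v in counters.items())
            log.write(f"{stamp} {fields}\n")
            log.flush()
            os.fsync(log.fileno())  # push the sample to disk before a freeze
            taken += 1
            time.sleep(interval)
    return taken
```

    Rising rx_errors or rx_dropped values shortly before the log stops would point towards the NIC or its driver, while clean counters up to the freeze would not rule anything out.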

     

    Thank you


    Hey @xxlbug,

     

    this is exactly what I am struggling with. The server freezes completely. Over IPMI I can even see that the login prompt cursor is not blinking anymore and only a power reset brings the system back online.

     

    What I have tried:

    • Enabled syslog to flash -> there is no entry when the server crashes
    • Tried various network / config settings
    • Disabled Docker completely
    • Removed the cache pool device
    • Downgraded from 6.12.4 to 6.11.5
    • Ran memtest
    • Ran disk checks / SMART
    • Changed the cache pool from zfs back to btrfs
    • Removed pure parity drives
    • Updated BIOS / IPMI
    • Connected the server directly to my client PC (no switch / router ... I am using UniFi hardware)
    • Ran in Safe Mode

     

    I can copy files around the server via the GUI (e.g. to a USB drive) without any crash, but transferring them over the network freezes the server. However, I also tried iperf3 after reading your post, and it does not seem to trigger the freeze.

    Edited by EofChris

    I just want to leave a possible solution here that worked for me after days of struggling. I don't know if this is an OS or hardware problem on my side, but turning off the CPU's integrated graphics in the BIOS of my Supermicro X11SSH-LN4F completely solved all problems.

    This, however, means that you can no longer pass the GPU through to your virtualization setup.


