macvlan call traces, regular crashes, gui unresponsiveness, btrfs errors, weird dynamix errors, high usage stats and ......


danioj

.... my journey to unraid stability!

 

I am a long-time user of unraid, since v5 in fact. I have always purchased half-decent server hardware (drives aside), and my use case for the os has never been far outside the box. For my main server, I have always followed the recommended upgrade path without needing to rebuild (apart from a usb failure, which was an easy restore from backup).

 

In recent times I have been having stability issues, for a number of reasons. I was using bonded dual nics, vlans, macvlan, host access and many plugins, I was seeing btrfs errors, and of course I was carrying whatever was left over from my years of tinkering and learning as I upgraded. Here is what I have done to try and regain stability.

 

- started fresh - maintained parity and drive assignments

- redesigned my network configuration and switched from bonded dual nics to a single (non-bridged) eth1 for unraid and a single (non-bridged) eth2 for docker (4 vlans, including the vlan for the same network unraid is on, so I don't use br0 for eth1)

- as a result of above, switched to ipvlan from macvlan

- uploaded (via the file manager plugin) docker templates from my old usb backup and reinstalled docker containers

- switched docker image to xfs

- assigned dockers to the network I wanted and started them

- sorted out my shares (which had been auto discovered) as their config was not there

- created users again

- installed only "what I need" plugins (9, down from 21)

- disabled the VM service (I don't use VMs)

- checked parity

 

Now I am on my way. Everything is working as I need it to. 29 hours in and not a single event in the log to bat an eyelid over. The server is as idle as I have seen it for a long time. No network issues. The gui is snappy. Just the drives spinning up and down. A nice, clean, fresh install.

 

I have this feeling that this is going to work and it only took me 90 minutes (tops) to go from where I was to where I am now.

 

I intend to check in on this post regularly (i.e. monthly) with my stability notes and uptime.

 


Almost a week on from my start-from-scratch "rebuild", and I am so glad I did this.

 

Not a single entry in the log except the nightly trim and backup entries. Outside of that, just the SMART events as the array spins up and down when it is accessed throughout the day.

 

One user-driven restart midweek, following installation of the latest Nvidia driver.

 

Solid as a rock.


Another week goes by and the server is still solid as a rock.

 

Only one odd message to speak of in the syslog this week:

 

Aug 17 07:02:38 unraid nginx: 2023/08/17 07:02:38 [error] 5368#5368: *1417305 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: X.X.X.X, server: , request: "POST /plugins/unassigned.devices/UnassignedDevices.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "host.local"

 

Seems to be something to do with Unassigned Devices. The only weird thing there is, I don't have the plugin installed? So a bit of a WTAF there. It didn't seem to impact stability though.
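For anyone chasing a similar message, here is a rough Python sketch (not anything unraid or the plugin ships) that pulls the useful fields out of an nginx error line like the one above. "Primary script unknown" generally means the PHP file named in the request could not be found, so the request path and the client address are the interesting parts: they show which page was asked for and which machine asked for it. The function name `parse_nginx_error` and the regular expressions are illustrative assumptions, written only against the single line quoted in this post.

```python
import re

# Assumption: these patterns are based only on the one log line quoted above.
REQUEST_RE = re.compile(r'request: "(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')
CLIENT_RE = re.compile(r'client: (?P<client>[^,]+),')

# The syslog line from this post, split across string literals for readability.
SAMPLE = (
    'Aug 17 07:02:38 unraid nginx: 2023/08/17 07:02:38 [error] 5368#5368: '
    '*1417305 FastCGI sent in stderr: "Primary script unknown" while reading '
    'response header from upstream, client: X.X.X.X, server: , '
    'request: "POST /plugins/unassigned.devices/UnassignedDevices.php HTTP/1.1", '
    'upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "host.local"'
)


def parse_nginx_error(line):
    """Return method, path, plugin folder and client for one line, or None."""
    req = REQUEST_RE.search(line)
    if req is None:
        return None  # not an nginx line with a request field
    path = req.group("path")
    parts = path.strip("/").split("/")
    # A path of the form /plugins/<folder>/... names the plugin folder second.
    plugin = parts[1] if len(parts) >= 2 and parts[0] == "plugins" else None
    client = CLIENT_RE.search(line)
    return {
        "method": req.group("method"),
        "path": path,
        "plugin": plugin,
        "client": client.group("client") if client else None,
    }


if __name__ == "__main__":
    print(parse_nginx_error(SAMPLE))
```

If the client address turns out to be another machine on the network, one possible (unconfirmed) explanation is a browser tab still open on a page from the old install, which would fit with the plugin not being present on the server.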

 

I am also noticing that, by reverting to the new build defaults rather than whatever my settings were after evolving over so many years, my server is drawing less power. I can only imagine it is down to the default spin-down settings. I can't remember what my old setup was set to.

 

 

  • 2 weeks later...

I'm now at the beginning of September and things are still rock solid.

 

Re my previous issue, the author of UAD confirmed that there are no components of that plugin installed into the OS by default. He also stated that the UAD error could only occur IF UAD was installed, which it is not. Unresolved but hasn't caused any problems so I have moved on.

 

I see that v6.12.4 has been released, and I am going to go for it. I have no wish to give up my stability, but it is my personal preference to stay on the stable branch for security fixes, upgrades etc.

 

Tune in next week! LOL!

  • 2 weeks later...

Well, I think I will call this thread a wrap.

 

There has been no material change since upgrading to v6.12.4. I obviously jumped the gun and "fixed" the issues myself prior to the unraid "fix" for macvlan issues.

 

All that said, my server is back to its usual rock solid self. All of the other things that were plaguing me have also gone.

 

Starting fresh really did help and I am glad I did it. I can go back to forgetting that the server is there and just using it when needed.


Glad things are working well for you! 

 

Looks like you went with the "2-nic docker segmentation method", which prevents macvlan call traces because it avoids bridging on the nic that is running Docker.

 

The downside is that it requires two nics, and it is a little more complicated than I would want all users to have to go through.

 

For other folks, 6.12.4 has a one-click solution that only needs a single nic, but if you are happy using two nics then there is no reason to change anything.
