Unraid OS version 6.10.0 available


Recommended Posts

Updated my server from 6.8.3 to 6.10.1.

 

No issues that weren't of my own making.

 

I had an old version of PuTTY installed on the computer I was accessing the server from, so SSH would not connect because of the cipher selection. A quick update of PuTTY resolved that.
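For anyone hitting the same thing: 6.10 ships a newer OpenSSH that drops some legacy ciphers and key exchange methods, so old clients can fail to negotiate. A rough way to compare, assuming another machine with OpenSSH and nmap available ("tower" below is a placeholder for your server's hostname):

ssh -Q cipher                              # ciphers your local client supports
ssh -Q kex                                 # key exchange methods your local client supports
nmap -p 22 --script ssh2-enum-algos tower  # algorithms the server now offers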

 

I stopped Docker to switch to IPvlan. Some containers would not start because of old parameters left over from before; a quick adjustment to the Extra Parameters to remove "--mac-address=" resolved that.
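For anyone with the same symptom, it was just a matter of deleting the flag from the affected containers' Extra Parameters (the MAC value below is a made-up example):

# Before (container fails to start after the switch to ipvlan):
#   Extra Parameters: --mac-address=02:42:c0:a8:01:05
# After (ipvlan interfaces share the host NIC's MAC, so a per-container MAC no longer applies):
#   Extra Parameters: (empty)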

 

I then updated all plugins to get versions tailored for the 6.10.x series and had NERD Utils update the packages.

 

I can't think of any additional steps I need to do at this time. Will update further should anything change.

 

 

  • Like 2
Link to comment

I successfully updated from 6.9.2 to 6.10.1.

I was initially concerned when looking at the Main tab, seeing all my drives mounting one by one with 0TB, but after the array finished starting up, everything was fine. So don't be worried if you notice this!

I really like the new dashboard graphs and the schedulable btrfs features. Very exciting!

  • Like 1
Link to comment
On 5/23/2022 at 2:19 AM, tjb_altf4 said:

Possibly related to this linked issue.

If so, you should just need to grab the NVMe ID again and update your VM XML and any passthrough settings (UD or vfio-pci driver binding).

 

The SSD was passed through and working. When booted from an ISO I can access it and run a disk check (no errors), but Windows won't boot. When I clone the entire SSD to a raw image, the system boots without error.
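For anyone curious, the clone was along these lines (a sketch: the device path and destination are examples, so double-check the source device with lsblk before copying anything):

lsblk                                    # identify the passed-through SSD first
qemu-img convert -p -O raw /dev/nvme0n1 /mnt/user/domains/win10/vdisk1.img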

Link to comment
On 5/22/2022 at 4:57 PM, JorgeB said:

 

TL;DR: on a server with a Broadcom NIC that uses the tg3 driver, I would recommend running v6.10.x only with VT-d/IOMMU disabled; otherwise it might in some cases cause serious stability issues, including possible filesystem corruption.

 

 

Another update, since this is an important issue: there's a new case with an IBM/Lenovo X3100 M5 server. This server uses the same NIC driver as the HP, so this appears to confirm the problem is the NIC/NIC driver when IOMMU is enabled. These are the NICs used.

 

Known problematic NICs:

 

HP Microserver Gen8:

[truncated]

 

Wow, thanks for the warning! I updated on the weekend, but hadn't noticed anything wrong. I just checked my logs, and indeed I had the errors mentioned above.

So I rebooted and disabled VT-d.

Do you know how I can check whether I have any data corruption? Is it known to be caused by any specific activities? Would a parity check verify it (and allow me to find what has been corrupted, if it detects any problems)?

Thanks again.

Link to comment
On 5/21/2022 at 7:47 PM, pkoci1 said:

Is anyone else unable to perform a Time Machine backup? It always ends with an error that the backup disk cannot be found.

I just realised that I also can't back up with Time Machine since upgrading to v6.10.

Says "Preparing backup" then... nothing. No error: just doesn't backup.

SMB Multichannel is disabled. Enhanced macOS compat is enabled.

SMB extras is:

#vfs_recycle_start
#Recycle bin configuration
[global]
   syslog only = Yes
   syslog = 0
   logging = 0
   log level = 0 vfs:0
#vfs_recycle_end

 

A while ago, I remember having to delete my backups and start fresh, but I was getting errors that time. I don't really have anything too valuable on my Mac, so I can experiment with deleting the contents of my Time Machine share and starting fresh, but I'd rather hold off to see if anyone has any other ideas.

I've seen a similar issue mentioned here, but looking at the logs, my error just says "no mountable file systems". Also, I'm running macOS 11.6.5.

I have to go pick my child up, but I will create my own support thread soon (and probably delete my old backup directory and start fresh to see if that fixes it).

 

Edited by jademonkee
Link to comment
15 hours ago, jademonkee said:

Do you know how I can check whether I have any data corruption? Is it known to be caused by any specific activities?

It's not yet clear what activities trigger it; issues usually start after a few hours of normal use. The docker image, or any other btrfs filesystem, is usually the first to go, since btrfs is very susceptible to memory corruption issues. If the pools are btrfs, run a scrub; run a parity check for the array if it's xfs, though if errors are detected you can basically only correct them, unless you have pre-existing checksums. Also note that while some corruption is possible, it's not certain: in at least one case btrfs detected some corruption, but after disabling VT-d and running a scrub, no more corruption was found.
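For reference, a scrub can be started from the pool device's page in the GUI or from a terminal; a minimal sketch, assuming the pool is mounted at /mnt/cache:

btrfs scrub start /mnt/cache      # runs in the background
btrfs scrub status /mnt/cache     # progress plus any checksum error counts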

  • Thanks 2
Link to comment
4 hours ago, jademonkee said:

I just realised that I also can't back up with Time Machine since upgrading to v6.10.

Says "Preparing backup" then... nothing. No error: just doesn't backup.

SMB Multichannel is disabled. Enhanced macOS compat is enabled.

SMB extras is:

#vfs_recycle_start
#Recycle bin configuration
[global]
   syslog only = Yes
   syslog = 0
   logging = 0
   log level = 0 vfs:0
#vfs_recycle_end

 

A while ago, I remember having to delete my backups and start fresh, but I was getting errors that time. I don't really have anything too valuable on my Mac, so I can experiment with deleting the contents of my Time Machine share and starting fresh, but I'd rather hold off to see if anyone has any other ideas.

I've seen a similar issue mentioned here, but looking at the logs, my error just says "no mountable file systems". Also, I'm running macOS 11.6.5.

I have to go pick my child up, but I will create my own support thread soon (and probably delete my old backup directory and start fresh to see if that fixes it).

 

 

FYI, I deleted the contents of my Time Machine share and started fresh. It failed to back up the first time, but I hit backup again after about 5 minutes, and now it's backing up. If it fails again I'll create a support thread, but this may yet work.

Link to comment
3 hours ago, jademonkee said:

 

FYI, I deleted the contents of my Time Machine share and started fresh. It failed to back up the first time, but I hit backup again after about 5 minutes, and now it's backing up. If it fails again I'll create a support thread, but this may yet work.

Just curious, what error did TM show on the failed backup?

log show --predicate 'subsystem == "com.apple.TimeMachine"' --info | grep 'upd: (' | cut -c 1-19,140-999

 

Link to comment
On 5/22/2022 at 6:41 AM, kennymc.c said:

Has 6.10 changed the ability of Docker containers to access files owned by root on the host? I noticed some containers were not able to access root-owned 700 or 755 files or directories after the update, like SSH keys or volumes mounted inside /tmp. When I changed the permissions to 755 (777 for directories to write into), they worked again.

Since you can define a Docker volume to be read/write or read-only inside the container, I am wondering why I now have to set additional permissions that were not needed before. After a restart, the sub-directory I created in /tmp for a container ramdisk got set back to 711, which again caused error messages.

Is there a way to prevent this or was this done intentionally for security reasons?

I've been having this issue too, though so far it has only affected 2 of my containers that I'm aware of. My nextcloud and tachidesk-server containers lost access to the mounts outside of appdata.

I've had to go into the Unraid terminal and grant 777 permissions to get them working again.
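Roughly what I ran (the share path is just an example; 777 is the blunt fix that worked here, something tighter may be preferable):

ls -ld /mnt/user/nextcloud          # check the current owner and permissions
chmod -R 777 /mnt/user/nextcloud    # permissive, but restored container access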

Link to comment
7 hours ago, wgstarks said:

Just curious, what error did TM show on the failed backup?

log show --predicate 'subsystem == "com.apple.TimeMachine"' --info | grep 'upd: (' | cut -c 1-19,140-999

 

 

Here's the output from when the backup failed (this is after I deleted the contents of the Time Machine share):

2022-05-24 21:12:51al] Starting manual backup
2022-05-24 21:12:51al] Attempting to mount 'smb://simonchester@percy/sctimemach'
2022-05-24 21:12:52al] Mounted 'smb://simonchester@percy/sctimemach' at '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach'
2022-05-24 21:12:52al] Initial network volume parameters for 'sctimemach' {disablePrimaryReconnect: 0, disableSecondaryReconnect: 0, reconnectTimeOut: 60, QoS: 0x0, attributes: 0x1C}
2022-05-24 21:12:52al] Configured network volume parameters for 'sctimemach' {disablePrimaryReconnect: 0, disableSecondaryReconnect: 0, reconnectTimeOut: 30, QoS: 0x20, attributes: 0x1C}
2022-05-24 21:12:52al] Mountpoint '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach' is still valid
2022-05-24 21:12:52al] Mountpoint '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach' is still valid
2022-05-24 21:12:52al] Creating a sparsebundle using Case-sensitive APFS filesystem
2022-05-24 21:13:10al] Mountpoint '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach' is still valid
2022-05-24 21:13:10al] Failed to read 'file:///Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/710CB9A8-9829-5E7F-B77F-E11F8AB058ED.sparsebundle/com.apple.TimeMachine.MachineID.plist', error: Error Domain=NSCocoaErrorDomain Code=260 "The file “com.apple.TimeMachine.MachineID.plist” couldn’t be opened because there is no such file." UserInfo={NSFilePath=/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/710CB9A8-9829-5E7F-B77F-E11F8AB058ED.sparsebundle/com.apple.TimeMachine.MachineID.plist, NSUnderlyingError=0x7f8cc8422b10 {Error Domain=NSPOSIXErrorDomain Code=2 "No such file or directory"}}
2022-05-24 21:13:10al] Failed to read 'file:///Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/710CB9A8-9829-5E7F-B77F-E11F8AB058ED.sparsebundle/com.apple.TimeMachine.MachineID.bckup', error: Error Domain=NSCocoaErrorDomain Code=260 "The file “com.apple.TimeMachine.MachineID.bckup” couldn’t be opened because there is no such file." UserInfo={NSFilePath=/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/710CB9A8-9829-5E7F-B77F-E11F8AB058ED.sparsebundle/com.apple.TimeMachine.MachineID.bckup, NSUnderlyingError=0x7f8cc842c320 {Error Domain=NSPOSIXErrorDomain Code=2 "No such file or directory"}}
2022-05-24 21:13:20al] Failed to read 'file:///Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/710CB9A8-9829-5E7F-B77F-E11F8AB058ED.sparsebundle/com.apple.TimeMachine.MachineID.plist', error: Error Domain=NSCocoaErrorDomain Code=260 "The file “com.apple.TimeMachine.MachineID.plist” couldn’t be opened because there is no such file." UserInfo={NSFilePath=/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/710CB9A8-9829-5E7F-B77F-E11F8AB058ED.sparsebundle/com.apple.TimeMachine.MachineID.plist, NSUnderlyingError=0x7f8cc8720100 {Error Domain=NSPOSIXErrorDomain Code=2 "No such file or directory"}}
2022-05-24 21:13:20al] Failed to read 'file:///Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/710CB9A8-9829-5E7F-B77F-E11F8AB058ED.sparsebundle/com.apple.TimeMachine.MachineID.bckup', error: Error Domain=NSCocoaErrorDomain Code=260 "The file “com.apple.TimeMachine.MachineID.bckup” couldn’t be opened because there is no such file." UserInfo={NSFilePath=/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/710CB9A8-9829-5E7F-B77F-E11F8AB058ED.sparsebundle/com.apple.TimeMachine.MachineID.bckup, NSUnderlyingError=0x7f8cc84351c0 {Error Domain=NSPOSIXErrorDomain Code=2 "No such file or directory"}}
2022-05-24 21:13:32al] Renamed '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/710CB9A8-9829-5E7F-B77F-E11F8AB058ED.sparsebundle' to '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/Mercury.sparsebundle'
2022-05-24 21:13:32al] Successfully created '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/Mercury.sparsebundle'
2022-05-24 21:13:32al] Checking for runtime corruption on '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/Mercury.sparsebundle'
2022-05-24 21:13:37al] Mountpoint '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach' is still valid
2022-05-24 21:13:37al] Runtime corruption check passed for '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/Mercury.sparsebundle'
2022-05-24 21:13:43al] Mountpoint '/Volumes/Backups of Mercury' is still valid
2022-05-24 21:13:43al] '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/Mercury.sparsebundle' mounted at '/Volumes/Backups of Mercury'
2022-05-24 21:13:43al] Updating volume role for '/Volumes/Backups of Mercury'
2022-05-24 21:13:44al] Mountpoint '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach' is still valid
2022-05-24 21:13:44al] Mountpoint '/Volumes/Backups of Mercury' is still valid
2022-05-24 21:13:44al] Stopping backup to allow volume '/Volumes/Backups of Mercury' to be unmounted.
2022-05-24 21:13:44al] Backup cancel was requested.
2022-05-24 21:13:54al] backupd exiting - cancelation timed out
2022-05-24 21:14:05Management] Initial thermal pressure level 0

 

And here's the output from when I next hit backup and it worked (though I've truncated it at the point it started logging files etc.):

2022-05-24 21:19:55al] Starting manual backup
2022-05-24 21:19:55al] Network destination already mounted at: /Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach
2022-05-24 21:19:55al] Initial network volume parameters for 'sctimemach' {disablePrimaryReconnect: 0, disableSecondaryReconnect: 0, reconnectTimeOut: 30, QoS: 0x20, attributes: 0x1C}
2022-05-24 21:19:55al] Configured network volume parameters for 'sctimemach' {disablePrimaryReconnect: 0, disableSecondaryReconnect: 0, reconnectTimeOut: 30, QoS: 0x20, attributes: 0x1C}
2022-05-24 21:19:56al] Found matching sparsebundle 'Mercury.sparsebundle' with host UUID '710CB9A8-9829-5E7F-B77F-E11F8AB058ED' and MAC address '(null)'
2022-05-24 21:19:57al] Not performing periodic backup verification: no previous backups to this destination.
2022-05-24 21:19:58al] 'Mercury.sparsebundle' does not need resizing - current logical size is 510.03 GB (510,027,366,400 bytes), size limit is 510.03 GB (510,027,366,400 bytes)
2022-05-24 21:19:58al] Mountpoint '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach' is still valid
2022-05-24 21:19:58al] Checking for runtime corruption on '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/Mercury.sparsebundle'
2022-05-24 21:20:02al] Mountpoint '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach' is still valid
2022-05-24 21:20:02al] Runtime corruption check passed for '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/Mercury.sparsebundle'
2022-05-24 21:20:07al] Mountpoint '/Volumes/Backups of Mercury' is still valid
2022-05-24 21:20:07al] '/Volumes/.timemachine/percy/47F34407-EFBA-4499-969E-25DAFCAFA7C4/sctimemach/Mercury.sparsebundle' mounted at '/Volumes/Backups of Mercury'
2022-05-24 21:20:07al] Mountpoint '/Volumes/Backups of Mercury' is still valid
2022-05-24 21:20:08al] Checking identity of target volume '/Volumes/Backups of Mercury'
2022-05-24 21:20:08al] Mountpoint '/Volumes/Backups of Mercury' is still valid
2022-05-24 21:20:09al] Mountpoint '/Volumes/Backups of Mercury' is still valid
2022-05-24 21:20:09al] Backing up to Backups of Mercury (/dev/disk3s1,e): /Volumes/Backups of Mercury
2022-05-24 21:20:09al] Mountpoint '/Volumes/Backups of Mercury' is still valid
2022-05-24 21:20:10Thinning] Starting age based thinning of Time Machine local snapshots on disk '/System/Volumes/Data'
2022-05-24 21:20:10SnapshotManagement] Created Time Machine local snapshot with name 'com.apple.TimeMachine.2022-05-24-212010.local' on disk '/System/Volumes/Data'
2022-05-24 21:20:10al] Declared stable snapshot: com.apple.TimeMachine.2022-05-24-212010.local
2022-05-24 21:20:10SnapshotManagement] Mounted stable snapshot: com.apple.TimeMachine.2022-05-24-212010.local at path: /Volumes/com.apple.TimeMachine.localsnapshots/Backups.backupdb/Mercury/2022-05-24-212010/Macintosh HD — Data source: Macintosh HD — Data
2022-05-24 21:20:10pThinning] No further thinning possible - no thinnable backups
2022-05-24 21:20:13Collection] First backup of source: "Macintosh HD — Data" (device: /dev/disk1s1 mount: '/System/Volumes/Data' fsUUID: F88D7C18-E1EC-4F90-9B71-4A481B580F26 eventDBUUID: 58B7A351-1BA1-4761-A370-5A46FA30AC5D)
2022-05-24 21:20:13Collection] Trusting source modification times for remote backups.
2022-05-24 21:20:13Collection] Found 0 perfect clone families, 0 partial clone families. Zero KB physical space used by clone files. Zero KB shared space.
2022-05-24 21:20:13Collection] Finished collecting events from volume "Macintosh HD — Data"
2022-05-24 21:20:13Collection] Saved event cache at /Volumes/Backups of Mercury/2022-05-24-212010.inprogress/.F88D7C18-E1EC-4F90-9B71-4A481B580F26.eventdb
2022-05-24 21:20:13gProgress] (fsk:0,dsk:0,fsz:1,dsz:0)(1/0)

 

 

FWIW, it looks like my Mac has been backing up successfully overnight. I'll report back if it fails at some point today.

And, for the record, I'm on macOS 11.6.5.

Link to comment
On 5/24/2022 at 11:50 AM, jademonkee said:

Wow, thanks for the warning! I updated on the weekend, but hadn't noticed anything wrong. I just checked my logs, and indeed I had the errors mentioned above.

So I rebooted and disabled VT-d.

Do you know how I can check whether I have any data corruption? Is it known to be caused by any specific activities? Would a parity check verify it (and allow me to find what has been corrupted, if it detects any problems)?

Thanks again.

Does anyone know if this will be fixed in later versions, or is disabling it the proper thing to do? It seems like a band-aid. I'm asking because I don't think it's ready for me to upgrade; there are just too many assorted issues all over here and Reddit. Are the devs making a list of these issues to address, or are we supposed to dive in, update, and fix whatever arises as we find it? Sorry, this is not meant to be condescending; I just don't know how long I should wait it out, or if now is as good as it gets.

Edited by thedinz
  • Like 2
Link to comment
19 hours ago, thedinz said:

Does anyone know if this will be fixed in later versions, or is disabling it the proper thing to do? It seems like a band-aid.

 

This only applies to servers using a NIC that uses the tg3 driver, assuming I'm right and the NIC, or more likely the NIC driver, is the problem when VT-d is enabled. If that's the case, there's not much LT can do; they can't fix the driver. What they can do for now is block loading the tg3 NIC driver if one is found during boot while VT-d/IOMMU is enabled; this should at least protect anyone updating without being aware of the possible issues.
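For anyone who wants that protection now, a manual equivalent would be blacklisting the module from the flash drive; a sketch, assuming your Unraid version reads modprobe.d config from /boot/config at boot (note this takes the NIC down entirely, so you'd need another NIC for network access):

# /boot/config/modprobe.d/tg3.conf
blacklist tg3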

 

Even if the NICs are the problem, it's not clear that all the different NIC models supported by this driver are affected, so it should also be possible to force Unraid to still use it when VT-d is enabled, for anyone who wants to, or who believes they are not affected by this issue and doesn't want to lose VT-d.

 

Link to comment

After flipping through these 9 pages I decided not to upgrade. Again, you need to be a Linux/Unraid guru to fix (basic) stuff. I am not a noob at all, but Limetech may overestimate how much time people are willing to invest to go down the rabbit hole. After months of testing and 10 RCs, networking and things like Time Machine backups should work without a glitch. They don't, as stated here. Probably the best way to deal with this is to wait a couple of weeks for 6.10.x, set up a completely new server, and move all the data over from the old one. With a 128TB server this will take forever, even with 10GbE. I am really looking forward to this...

Link to comment
4 minutes ago, typewriter said:

Probably the best way to deal with this is to wait a couple of weeks for 6.10.x, set up a completely new server, and move all the data over from the old one.

Doesn't seem like the best way to me. Even if you did want to set up new hardware, you can just transfer your drives with their data intact.

Link to comment
30 minutes ago, trurl said:

Doesn't seem like the best way to me. Even if you did want to set up new hardware, you can just transfer your drives with their data intact.

Yes, but that does not change the need to iron out the quirks anyway. Never mind. It's not my intention to be a troll. I like the product. But it's a paid product, so stuff should work. Nobody would have minded another three months of testing to prevent these things from happening. And frankly, I can't understand how nobody at Limetech came across the Time Machine problem - for months. The rolling-release approach, which seems to be nearly as common in the industry as "subscriptions", basically leads to rolling betas that are named "releases". Everybody who has to use tools like Adobe CC knows what I am talking about. This is frustrating and requires far too much of my time.

Edited by typewriter
  • Like 1
Link to comment
20 minutes ago, typewriter said:

Yes, but that does not change the need to iron out the quirks anyway. Never mind. It's not my intention to be a troll. I like the product. But it's a paid product, so stuff should work. Nobody would have minded another three months of testing to prevent these things from happening. And frankly, I can't understand how nobody at Limetech came across the Time Machine problem - for months. The rolling-release approach, which seems to be nearly as common in the industry as "subscriptions", basically leads to rolling betas that are named "releases". Everybody who has to use tools like Adobe CC knows what I am talking about. This is frustrating and requires far too much of my time.

I'm not trying to be mean, but you fail to grasp the sheer number of hardware configurations out there that can cause different issues to present themselves. Not to mention that Limetech is not in control of the Linux kernel, nor most/all of the different drivers used in it. The VT-d and NIC combo issue seems to be specific to that configuration, and I would NOT expect Limetech to have all hardware configs covered; heck, one hardware config might be perfectly fine from one manufacturer while another has issues, purely because of the BIOS.

 

Software like Adobe CC should be much easier to control for, as it is only the software layer; they don't care so much about the OS (while Limetech has to, since they are building an appliance).

 

As for Time Machine... in my experience it has always been finicky on anything other than Apple hardware or USB drives connected to a Mac. I stopped using it with my Mac laptops and desktop a long time ago for that reason.

 

 

Most people are unwilling to test a beta/RC on their machines, but will willingly upgrade to a stable release as soon as it is out. The pool of people willing to test is therefore much, much smaller, and so is the number of different configs available for testing. It is a catch-22 for both Limetech and the person with the server. I don't upgrade any of my machines to the newest release right away; my main box stayed on 6.4 until 6.9.2 was out. I doubt my machines will see 6.10 until Christmas, mostly because they are running beautifully right now and I don't want to touch them.

  • Upvote 1
Link to comment
16 minutes ago, prostuff1 said:

I'm not trying to be mean, but you fail to grasp the sheer number of hardware configurations out there that can cause different issues to present themselves.

Correct. But Unraid is not an open-source product. You have to pay for it. I wouldn't mind if the company stated "our product is ready, but driver x is not working because of the current state of Linux". Fine. Then I could decide not to install if I need that exact driver or feature. My point is that we as users are now figuring out what is not working. This is not our job.

 

And you are right - not too many people want to test beta software. Well, they are not paid for it. It's not their job. Testing by the community can only be an additional, small part of quality assurance.

 

Well, I don't want to argue here - as I said, I like the product and, like you, I will upgrade a few months from now.

Link to comment
1 hour ago, typewriter said:

and things like Time Machine backups should work without a glitch.

Just wanted to point out a newly added Time Machine docker, based on a project whose goal is to provide stable Time Machine backups in a constantly changing environment (I'm paraphrasing). I hesitate to recommend it since I only started using it about an hour ago, but it was fairly easy to get installed and working. You might want to check it out.

  • Like 1
  • Thanks 2
Link to comment
On 5/22/2022 at 11:57 PM, JorgeB said:

 

TL;DR: on a server with a Broadcom NIC that uses the tg3 driver, I would recommend running v6.10.x only with VT-d/IOMMU disabled; otherwise it might in some cases cause serious stability issues, including possible filesystem corruption.

 

 

Another update, since this is an important issue: there's a new case with an IBM/Lenovo X3100 M5 server. This server uses the same NIC driver as the HP, so this appears to confirm the problem is the NIC/NIC driver when IOMMU is enabled.

 

Known problematic NICs:

 

HP Microserver Gen8:

03:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
    DeviceName: NIC Port 1
    Subsystem: Hewlett-Packard Company NC332i Adapter [103c:2133]
    Kernel driver in use: tg3

 

IBM/Lenovo X3100 M5:

06:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme BCM5717 Gigabit Ethernet PCIe [14e4:1655] (rev 10)
    DeviceName: Broadcom 5717
    Subsystem: IBM NetXtreme BCM5717 Gigabit Ethernet PCIe [1014:0490]
    Kernel driver in use: tg3

 

HP ProLiant ML350p Gen8

02:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe [14e4:1657] (rev 01)
    DeviceName: NIC Port 1
    Subsystem: Hewlett-Packard Company NetXtreme BCM5719 Gigabit Ethernet PCIe [103c:3372]
    Kernel driver in use: tg3

 

This driver supports many different NICs; it's unclear for now whether all are affected or just some, and also unclear whether AMD-based servers with AMD-Vi/IOMMU enabled are affected. But for now, on a server with a Broadcom NIC that uses this driver, I would recommend running v6.10.x only with VT-d/IOMMU disabled; otherwise it might in some cases cause serious stability issues, including possible filesystem corruption.
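To check whether your own NIC uses this driver, from the Unraid terminal:

lspci -k | grep -iA3 ethernet    # look for "Kernel driver in use: tg3"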

 

When there is a problem with one of these NICs and VT-d, you should see multiple errors similar to those below in the log not long after booting, usually within the first couple of hours of uptime:

 

May 21 15:53:05 Tower kernel: DMAR: ERROR: DMA PTE for vPFN 0xb0780 already set (to b0780003 not 28dc74801)
May 21 15:53:05 Tower kernel: ------------[ cut here ]------------
May 21 15:53:05 Tower kernel: WARNING: CPU: 1 PID: 557 at drivers/iommu/intel/iommu.c:2408 __domain_mapping+0x2e5/0x390

 

If you see that, stop using the server and disable VT-d/IOMMU ASAP. There's no need to disable VT-x/HVM, i.e., you can still run VMs (but without VT-d/IOMMU you can't pass through any devices to one).

 

For Intel CPUs, VT-d can usually be disabled in the BIOS; alternatively, you can add intel_iommu=off to the syslinux.cfg append line: on the main GUI page click on the flash device and scroll down to "Syslinux Configuration", then add it to the default boot option (the one in green):
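For reference, the edited default entry should look something like this (the stock entry plus the new flag; keep any other append options you already have):

label Unraid OS
  menu default
  kernel /bzimage
  append intel_iommu=off initrd=/bzroot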

 

[screenshot: Syslinux Configuration with intel_iommu=off added to the append line]

 

 

In either case, confirm it's really disabled; you can do that by clicking on "System Information" at the top right of the GUI:

 

[screenshot: System Information showing IOMMU disabled]
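Alternatively, from a terminal you can grep the kernel log; a rough check - with intel_iommu=off in effect you should no longer see DMAR/IOMMU initialization messages:

dmesg | grep -i -e DMAR -e IOMMU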

 

 

 

Original post here:

https://forums.unraid.net/topic/123620-unraid-os-version-6100-available/?do=findComment&comment=1128822

 

 

Just to confirm, this also affects NVMe. While changing the mobo to an Intel B365 and adding an NVMe drive, I also had corruption problems (WD SN550 & Samsung EVO 970, same result). If I use mc or cp to copy files it may not report an error, but rsync will readily show errors during the copy.

 

Applying intel_iommu=off solved this serious problem.

 

PS: Confirmed that even with IOMMU enabled, if I don't use NVMe the system is rock solid. I have applied for an RMA for the mobo; anyway, it's good news that I won't need to swap the mobo after this fix.

 

 

Edited by Vr2Io
Link to comment
5 minutes ago, Vr2Io said:

 

Just to confirm, this also affects NVMe. While changing the mobo to an Intel B365 and adding an NVMe drive, I also had corruption problems (WD SN550 & Samsung EVO 970, same result). If I use mc or cp to copy files it may not report an error, but rsync will readily show errors during the copy.

 

Applying intel_iommu=off solved this serious problem.

 

PS: Confirmed that even with IOMMU enabled, if I don't use NVMe the system is rock solid.

 

 

 

To confirm: which motherboard is this? Does the onboard NIC use the 'tg3' driver?

Link to comment
