[6.7.0-rc1] Potential security risk?


dlandon


I see this in my log:

Jan 23 09:36:52 MediaServer kernel: L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html for details.

Seems to be related to Intel HyperThreading, cpu pinning in VMs, and a malicious VM guest.

 

Here is the /sys/devices/system/cpu/vulnerabilities/l1tf file:

Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable

I think this is a potential issue on Unraid.
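For anyone who wants to check their own server, here is a rough Python sketch that reads the same sysfs file and splits the status line into its parts. The function name `parse_l1tf_status` and the dictionary keys are my own invention for illustration, and the parsing assumes the "Mitigation: ...; VMX: ..., SMT ..." layout shown above; other kernels may report a different string (for example "Not affected").

```python
from pathlib import Path

# Path quoted in the post above.
L1TF_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/l1tf")


def parse_l1tf_status(text):
    """Split the one-line l1tf status into a small dict (hypothetical helper)."""
    text = text.strip()
    result = {"raw": text, "mitigation": None, "vmx": None, "smt_vulnerable": False}
    # Sections are separated by ';', e.g. "Mitigation: ...; VMX: ..., SMT vulnerable"
    for part in text.split(";"):
        part = part.strip()
        if part.startswith("Mitigation:"):
            result["mitigation"] = part[len("Mitigation:"):].strip()
        elif part.startswith("VMX:"):
            flags = [f.strip() for f in part[len("VMX:"):].split(",")]
            result["vmx"] = flags[0]
            result["smt_vulnerable"] = "SMT vulnerable" in flags
    return result


if __name__ == "__main__":
    if L1TF_STATUS.exists():
        print(parse_l1tf_status(L1TF_STATUS.read_text()))
    else:
        print("no l1tf status file on this system")
```

If "smt_vulnerable" comes back true, that matches the "SMT on, data leak possible" warning in the syslog line.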

Link to comment

Maybe it's this simple.

 

The option/parameter is “kvm-intel.vmentry_l1d_flush”, which takes one of always, cond, or never.

 

The parameter can be provided on the kernel command line, as a module parameter when loading the modules and at runtime modified via the sysfs file:

 

/sys/module/kvm_intel/parameters/vmentry_l1d_flush

 

The default is ‘cond’. If ‘l1tf=full,force’ is given on the kernel command line, then ‘always’ is enforced and the kvm-intel.vmentry_l1d_flush module parameter is ignored and writes to the sysfs file are rejected.
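To make the precedence concrete, here is a small Python sketch of the rule as I read it from the text above: ‘l1tf=full,force’ on the kernel command line wins, otherwise the module parameter applies. The helper names `effective_flush_mode` and `set_flush_mode` are made up for this example, and the sysfs file may report values other than the three listed, so treat the output as informational only.

```python
from pathlib import Path

# Paths: the sysfs file is quoted above; /proc/cmdline holds the boot command line.
FLUSH_PARAM = Path("/sys/module/kvm_intel/parameters/vmentry_l1d_flush")
CMDLINE = Path("/proc/cmdline")

VALID_MODES = ("always", "cond", "never")


def effective_flush_mode(cmdline, module_value):
    """Return the flush mode that applies, following the rule described above."""
    # 'l1tf=full,force' enforces 'always' and the module parameter is ignored.
    if "l1tf=full,force" in cmdline.split():
        return "always"
    return module_value.strip()


def set_flush_mode(mode):
    """Write a new mode to the sysfs file at runtime (needs root)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # The kernel rejects this write when 'l1tf=full,force' is in effect.
    FLUSH_PARAM.write_text(mode)


if __name__ == "__main__":
    if FLUSH_PARAM.exists() and CMDLINE.exists():
        print(effective_flush_mode(CMDLINE.read_text(), FLUSH_PARAM.read_text()))
    else:
        print("kvm_intel module parameter not available on this system")
```

Running it as-is only reads the two files; `set_flush_mode` is never called unless you call it yourself.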

 

EDIT: Maybe, as you said, this is a non-issue for Unraid because we are running trusted VMs.

Edited by dlandon
Link to comment
11 minutes ago, dlandon said:

Maybe it's this simple.

Good analysis.  When looking at this last year I ran across something that said this had a pretty significant performance impact; I'll have to try to find that again.  I remember thinking at the time that if someone cared about it, they could add the kernel option.

Link to comment
22 minutes ago, limetech said:

Good analysis.  When looking at this last year I ran across something that said this had a pretty significant performance impact; I'll have to try to find that again.  I remember thinking at the time that if someone cared about it, they could add the kernel option.

As I read the link you posted, that's about where my head exploded.  Definitely above my pay grade.

Link to comment

tl;dr: to fully fix this Intel garbage, users are going to pay a performance price. The price varies wildly with the workload and is basically impossible to predict; however, I have been in conversations where some devops people have seen insane performance drops in edge cases.

 

I would suggest the right way to do this is to fix it by default but document an opt-out for those who want to accept the risk, because it is not possible for normal humans to really understand this from beginning to end.

Link to comment

This is recommended and is the way it is right now on Unraid:

 

"The general recommendation is to enable L1D flush on VMENTER. The kernel defaults to conditional mode on affected processors."

 

Maybe it is best to leave it alone.  I don't know how you would communicate to a novice user what it means and why to turn it on or off.

Link to comment
