[PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs

Peter Zijlstra posted 12 patches 1 week, 5 days ago
Posted by Peter Zijlstra 1 week, 5 days ago
Hi!

At long last, a respin of these patches.

The FASTOPs are special because they rely on RET to preserve RFLAGS, which is a
problem with all the mitigation stuff. Also see things like: ba5ca5e5e6a1
("x86/retpoline: Don't clobber RFLAGS during srso_safe_ret()").

Rework FASTOPs to no longer use RET and side-step the problem of trying to make
the various return thunks preserve RFLAGS for just this one case.

There are two separate instances, test_cc() and fastop(). The first is
basically a SETCC wrapper, which seems like a very complicated (and somewhat
expensive) way to read FLAGS. Instead use the code we already have to emulate
JCC to fully emulate the instruction.
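
To make the idea concrete, here is a minimal sketch of evaluating an x86
condition code directly from a saved flags value in C, with no SETCC stub and
no RET involved. It is not the code from the series: the name test_cc_sketch()
and the SK_FLAG_* macros are made up for illustration. Only the flag bit
positions and the 4-bit condition encoding (bits 3:1 select the predicate,
bit 0 inverts it) are architectural.

```c
#include <stdbool.h>

/* Architectural x86 status-flag bit positions; the macro names are hypothetical. */
#define SK_FLAG_CF (1UL << 0)
#define SK_FLAG_PF (1UL << 2)
#define SK_FLAG_ZF (1UL << 6)
#define SK_FLAG_SF (1UL << 7)
#define SK_FLAG_OF (1UL << 11)

/*
 * Hypothetical helper: evaluate condition code 'condition' (the low
 * nibble of a Jcc/SETcc opcode) against 'flags'.
 */
static bool test_cc_sketch(unsigned int condition, unsigned long flags)
{
	bool sf = flags & SK_FLAG_SF;
	bool of = flags & SK_FLAG_OF;
	bool rc;

	/* Bits 3:1 select the base predicate. */
	switch ((condition & 0xf) >> 1) {
	case 0:		/* O */
		rc = flags & SK_FLAG_OF;
		break;
	case 1:		/* B / C */
		rc = flags & SK_FLAG_CF;
		break;
	case 2:		/* Z / E */
		rc = flags & SK_FLAG_ZF;
		break;
	case 3:		/* BE */
		rc = flags & (SK_FLAG_CF | SK_FLAG_ZF);
		break;
	case 4:		/* S */
		rc = flags & SK_FLAG_SF;
		break;
	case 5:		/* P */
		rc = flags & SK_FLAG_PF;
		break;
	case 6:		/* L */
		rc = sf != of;
		break;
	default:	/* LE */
		rc = (flags & SK_FLAG_ZF) || sf != of;
		break;
	}

	/* Bit 0 inverts the predicate (NO, NB, NZ, A, NS, NP, GE, G). */
	return rc ^ (condition & 1);
}
```

For example, condition 0x4 (Z) is true when ZF is set, and condition 0xf (G)
is true when ZF is clear and SF equals OF.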

That then leaves fastop(), which when marked noinline is guaranteed to exist
only once. As such, CALL+RET isn't needed: we'll always be RETurning to the
same location, so replace it with JMP+JMP.
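
As a rough C-level analogy of the control flow only (the real fastops are
assembly stubs, and this sketch does not model flag preservation), GNU C
computed goto can show the JMP+JMP shape: one jump into the selected operation
and one jump back to a single fixed label that plays the role of the return
address. All names below (fastop_sketch(), SKETCH_*) are hypothetical, and the
"labels as values" extension is GCC/Clang specific.

```c
#include <stdint.h>

/* Hypothetical operation identifiers; not taken from the kernel sources. */
enum sketch_op { SKETCH_ADD, SKETCH_SUB, SKETCH_XOR, SKETCH_NR_OPS };

/*
 * noinline keeps exactly one copy of this function, so there is exactly
 * one "done" label for every operation to jump back to.
 */
__attribute__((noinline))
static uint32_t fastop_sketch(enum sketch_op op, uint32_t dst, uint32_t src)
{
	/* GNU C "labels as values": a table of jump targets. */
	static void *const targets[SKETCH_NR_OPS] = {
		[SKETCH_ADD] = &&do_add,
		[SKETCH_SUB] = &&do_sub,
		[SKETCH_XOR] = &&do_xor,
	};

	goto *targets[op];	/* first JMP: into the operation */

do_add:
	dst += src;
	goto done;		/* second JMP: back to the fixed return point */
do_sub:
	dst -= src;
	goto done;
do_xor:
	dst ^= src;
	goto done;

done:
	return dst;
}
```

Because every operation ends in a direct jump to the same label, no RET (and
hence no return thunk) sits between the operation and the code that consumes
its result.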

My plan is to take the objtool patches through tip/objtool/core, the nospec
patches through tip/x86/core and either stick the fastop patches in that latter
tree if the KVM folks agree, or they can merge the aforementioned two branches
and then stick the patches on top, whatever works for people.
Re: [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs
Posted by Sean Christopherson 1 week, 5 days ago
On Mon, Nov 11, 2024, Peter Zijlstra wrote:
> Hi!
> 
> At long last, a respin of these patches.
> 
> The FASTOPs are special because they rely on RET to preserve RFLAGS, which is a
> problem with all the mitigation stuff. Also see things like: ba5ca5e5e6a1
> ("x86/retpoline: Don't clobber RFLAGS during srso_safe_ret()").
> 
> Rework FASTOPs to no longer use RET and side-step the problem of trying to make
> the various return thunks preserve RFLAGS for just this one case.
> 
> There are two separate instances, test_cc() and fastop(). The first is
> basically a SETCC wrapper, which seems like a very complicated (and somewhat
> expensive) way to read FLAGS. Instead use the code we already have to emulate
> JCC to fully emulate the instruction.
> 
> That then leaves fastop(), which when marked noinline is guaranteed to exist
> only once. As such, CALL+RET isn't needed: we'll always be RETurning to the
> same location, so replace it with JMP+JMP.
> 
> My plan is to take the objtool patches through tip/objtool/core, the nospec
> patches through tip/x86/core and either stick the fastop patches in that latter
> tree if the KVM folks agree, or they can merge the aforementioned two branches
> and then stick the patches on top, whatever works for people.

Unless Paolo objects, I think it makes sense to take the fastop patches through
tip/x86/core.