[v2] x86/kvm/emulate: Avoid RET for FASTOPs

[PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs

Posted by Peter Zijlstra 1 year, 2 months ago

Hi!

At long last, a respin of these patches.

The FASTOPs are special because they rely on RET to preserve FLAGS, which is a
problem with all the mitigation stuff. Also see things like: ba5ca5e5e6a1
("x86/retpoline: Don't clobber RFLAGS during srso_safe_ret()").

Rework FASTOPs to no longer use RET and side-step the problem of trying to make
the various return thunks preserve FLAGS for just this one case.

There are two separate instances, test_cc() and fastop(). The first is
basically a SETCC wrapper, which seems like a very complicated (and somewhat
expensive) way to read FLAGS. Instead use the code we already have to emulate
JCC to fully emulate the instruction.
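
To illustrate what evaluating a condition code in software (rather than by executing SETcc) can look like, here is a minimal sketch. It is not taken from the patches or from KVM: the name eval_cc() and the FLAG_* macros are hypothetical, and only the flag bit positions and the condition-code encoding are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 FLAGS bit positions (macro names are illustrative). */
#define FLAG_CF (1u << 0)
#define FLAG_PF (1u << 2)
#define FLAG_ZF (1u << 6)
#define FLAG_SF (1u << 7)
#define FLAG_OF (1u << 11)

/*
 * Hypothetical helper: evaluate the 4-bit x86 condition code 'cc' (the low
 * nibble of a Jcc/SETcc opcode) against a saved FLAGS value, in plain C.
 * Bits 3:1 select the base condition; bit 0 inverts it.
 */
bool eval_cc(unsigned int cc, uint32_t flags)
{
	bool cf = flags & FLAG_CF;
	bool pf = flags & FLAG_PF;
	bool zf = flags & FLAG_ZF;
	bool sf = flags & FLAG_SF;
	bool of = flags & FLAG_OF;
	bool r;

	switch ((cc & 0xf) >> 1) {
	case 0:  r = of;               break; /* O  / NO  */
	case 1:  r = cf;               break; /* B  / AE  */
	case 2:  r = zf;               break; /* E  / NE  */
	case 3:  r = cf || zf;         break; /* BE / A   */
	case 4:  r = sf;               break; /* S  / NS  */
	case 5:  r = pf;               break; /* P  / NP  */
	case 6:  r = sf != of;         break; /* L  / GE  */
	default: r = zf || (sf != of); break; /* LE / G   */
	}

	/* Odd condition codes are the negation of the preceding even one. */
	return (cc & 1) ? !r : r;
}
```

With this, eval_cc(0x4, flags) answers "would JE be taken?" without any call into an assembly stub, so no RET is involved.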

That then leaves fastop(), which when marked noinline is guaranteed to exist
only once. CALL+RET therefore isn't needed: we'll always be RETurning to the
same location, so replace it with JMP+JMP.
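
As a rough control-flow analogy only (the real fastop stubs are assembly, and C cannot show the FLAGS side of this), the JMP+JMP idea can be sketched with the GNU C labels-as-values extension. Everything below is hypothetical: sketch_fastop(), the sketch_op enum and the three operations are made up for illustration and assume GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical operation IDs, for illustration only. */
enum sketch_op { SKETCH_ADD, SKETCH_SUB, SKETCH_AND, SKETCH_NR_OPS };

/*
 * A single dispatch site. noinline keeps exactly one copy of the function,
 * and therefore of its labels, so every stub has one fixed place to go back
 * to. That is what makes a plain jump a valid substitute for a return.
 */
__attribute__((noinline))
uint32_t sketch_fastop(enum sketch_op op, uint32_t dst, uint32_t src)
{
	/* GNU C extension: a static table of label addresses. */
	static void *const stubs[SKETCH_NR_OPS] = {
		[SKETCH_ADD] = &&do_add,
		[SKETCH_SUB] = &&do_sub,
		[SKETCH_AND] = &&do_and,
	};

	if ((unsigned int)op >= SKETCH_NR_OPS)
		return dst;		/* unknown op: leave dst untouched */

	goto *stubs[op];		/* first JMP: into the stub */

do_add:
	dst += src;
	goto done;			/* second JMP: back out, no RET */
do_sub:
	dst -= src;
	goto done;
do_and:
	dst &= src;
	goto done;

done:
	return dst;
}
```

For example, sketch_fastop(SKETCH_SUB, 5, 3) goes through do_sub and comes back via the done label with 2. Because no stub ever executes a return of its own, there is no return thunk in the path between the operation and the code that consumes its result.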

My plan is to take the objtool patches through tip/objtool/core, the nospec
patches through tip/x86/core and either stick the fastop patches in that latter
tree if the KVM folks agree, or they can merge the aforementioned two branches
and then stick the patches on top, whatever works for people.

Re: [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs

Posted by Sean Christopherson 1 year, 2 months ago

On Mon, Nov 11, 2024, Peter Zijlstra wrote:
> My plan is to take the objtool patches through tip/objtool/core, the nospec
> patches through tip/x86/core and either stick the fastop patches in that latter
> tree if the KVM folks agree, or they can merge the aforementioned two branches
> and then stick the patches on top, whatever works for people.

Unless Paolo objects, I think it makes sense to take the fastop patches through
tip/x86/core.