Mailing List Archive

[PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_*
There's no point having every replacement variant to also specify the
INT3 - just have it once in the base macro. When patching, NOPs will get
inserted, which are fine to speculate through (until reaching the INT3).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
I also wonder whether the LFENCE in IND_THUNK_RETPOLINE couldn't be
replaced by INT3 as well. Of course the effect will be marginal, as the
size of the thunk will still be 16 bytes when including tail padding
resulting from alignment.
---
v2: New.

--- a/xen/arch/x86/indirect-thunk.S
+++ b/xen/arch/x86/indirect-thunk.S
@@ -11,6 +11,8 @@

#include <asm/asm_defns.h>

+.purgem ret
+
.macro IND_THUNK_RETPOLINE reg:req
call 2f
1:
@@ -24,12 +26,10 @@
.macro IND_THUNK_LFENCE reg:req
lfence
jmp *%\reg
- int3 /* Halt straight-line speculation */
.endm

.macro IND_THUNK_JMP reg:req
jmp *%\reg
- int3 /* Halt straight-line speculation */
.endm

/*
@@ -44,6 +44,8 @@ ENTRY(__x86_indirect_thunk_\reg)
__stringify(IND_THUNK_LFENCE \reg), X86_FEATURE_IND_THUNK_LFENCE, \
__stringify(IND_THUNK_JMP \reg), X86_FEATURE_IND_THUNK_JMP

+ int3 /* Halt straight-line speculation */
+
.size __x86_indirect_thunk_\reg, . - __x86_indirect_thunk_\reg
.type __x86_indirect_thunk_\reg, @function
.endm
Re: [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_* [ In reply to ]
On Mon, Sep 28, 2020 at 02:32:24PM +0200, Jan Beulich wrote:
> There's no point having every replacement variant to also specify the
> INT3 - just have it once in the base macro. When patching, NOPs will get
> inserted, which are fine to speculate through (until reaching the INT3).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> I also wonder whether the LFENCE in IND_THUNK_RETPOLINE couldn't be
> replaced by INT3 as well. Of course the effect will be marginal, as the
> size of the thunk will still be 16 bytes when including tail padding
> resulting from alignment.

I think Andrew is the best one to have an opinion on this.

Thanks, Roger.
Re: [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_* [ In reply to ]
On 28/09/2020 13:32, Jan Beulich wrote:
> There's no point having every replacement variant to also specify the
> INT3 - just have it once in the base macro. When patching, NOPs will get
> inserted, which are fine to speculate through (until reaching the INT3).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> I also wonder whether the LFENCE in IND_THUNK_RETPOLINE couldn't be
> replaced by INT3 as well. Of course the effect will be marginal, as the
> size of the thunk will still be 16 bytes when including tail padding
> resulting from alignment.

There are surprising performance implications from the choice of
speculation blocker.  RSB filling in particular had a benefit (up to 6%
iirc) from unrolling the loop.

Any differences here are likely to be marginal, whereas for inline
retpoline, the code volume reduction might easily be the winning factor.

> ---
> v2: New.
>
> --- a/xen/arch/x86/indirect-thunk.S
> +++ b/xen/arch/x86/indirect-thunk.S
> @@ -11,6 +11,8 @@
>
> #include <asm/asm_defns.h>
>
> +.purgem ret

This needs a comment.

~Andrew