Proposal: ld.gold --rosegment

ale+gentoo at clearmind

Jan 27, 2016, 3:30 PM

Post #1 of 6

Hi, as you might know, global read-only data (e.g. the .rodata section)
usually ends up in the same segment as .text. This means that the
contents of .rodata are mapped executable, which is always useful for
an attacker looking for ROP gadgets.

However, the gold linker has a nice option (--rosegment) to place
.rodata and .text in distinct segments, so that read-only data is not
executable.

So: why don't we enable it in Gentoo hardened? I know for sure that
certain packages fail to link with ld.gold (see [1]).

A couple of questions:

* Can we blacklist some packages from being linked using gold? Maybe we
  can provide a package.env file in an overlay/profile listing all
  those that have to use bfd (CFLAGS="-fuse-ld=bfd").
* Does Gentoo have an infrastructure to rapidly test a new option on a
  large set of packages? If not, I might set up something. Scripts to
  orchestrate everything would be useful too.

--
Alessandro Di Federico

[1] https://bugs.gentoo.org/show_bug.cgi?id=269315

Re: Proposal: ld.gold --rosegment

pageexec at freemail

Jan 27, 2016, 5:49 PM

Post #2 of 6

On 28 Jan 2016 at 0:30, Alessandro Di Federico wrote:

> Hi, as you might know, global read-only data (e.g. the .rodata section)
> usually ends up in the same segment as .text. This means that the
> contents of .rodata are mapped executable, which is always useful for
> an attacker looking for ROP gadgets.
>
> However, the gold linker has a nice option (--rosegment) to place
> .rodata and .text in distinct segments, so that read-only data is not
> executable.
>
> So: why don't we enable it in Gentoo hardened?

because it's a useless security measure. for a non-executable .rodata
section to make any sense, the following condition would have to hold:

a bug (or set of bugs) is exploitable if and only if .rodata is executable.

nobody has ever shown that there exists such a bug (or set of bugs) and
in fact there's ample evidence that already executable code contains all
the necessary gadgets an exploit would need. on the other hand breaking
.rodata out into its own PT_LOAD segment will waste disk space, kernel
memory, virtual address space, slow down vma lookup time, etc, for exactly
zero gain in security. why bother?

Re: Proposal: ld.gold --rosegment

ale+gentoo at clearmind

Jan 29, 2016, 7:44 AM

Post #3 of 6

On Thu, 28 Jan 2016 02:49:46 +0100
"PaX Team" <pageexec@freemail.hu> wrote:

> because it's a useless security measure. for a non-executable .rodata
> section to make any sense, the following condition would have to hold:
>
> a bug (or set of bugs) is exploitable if and only if .rodata is
> executable.
>
> nobody has ever shown that there exists such a bug (or set of bugs)
> and in fact there's ample evidence that already executable code
> contains all the necessary gadgets an exploit would need.

With a dirty one-liner run in my `/usr/bin` I've found 956 MiB of .text
and 444 MiB of .rodata. This means .rodata accounts for about a third
of the bytes in which the right gadget could be found.
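
As a quick check of the arithmetic above, here is a minimal Python sketch. The function name `rodata_share` is made up for illustration; the two sizes are the ones measured in `/usr/bin` above. It comes out at roughly 31.7%.

```python
def rodata_share(text_mib: float, rodata_mib: float) -> float:
    """Fraction of the .text + .rodata bytes that belongs to .rodata."""
    total = text_mib + rodata_mib
    if total <= 0:
        raise ValueError("sizes must add up to a positive number")
    return rodata_mib / total


# Sizes reported for /usr/bin in this post, in MiB.
share = rodata_share(956, 444)
print(f"{share:.1%} of the candidate bytes are in .rodata")
```

This only compares raw byte counts; it says nothing about how many usable gadgets each section actually contains.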

I wanted to run a ROP-gadget finder to be able to do a more precise
evaluation (maybe also in terms of type of gadgets found in various
sections) but the two gadget finders I usually use apparently look for
executable *sections*, not segments. But I could work on it.
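
To illustrate the difference, a segment-based scan would start from the program headers rather than the section table. The following is a rough sketch, not what any existing gadget finder does: it handles only 64-bit little-endian ELF, and the helper names `load_segments` and `executable_file_ranges` are hypothetical.

```python
import struct

PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, 56 bytes


def load_segments(image: bytes):
    """Return (offset, vaddr, filesz, memsz, flags) for each PT_LOAD entry."""
    if len(image) < EHDR.size or image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if image[4] != 2 or image[5] != 1:
        raise ValueError("only 64-bit little-endian ELF is handled here")
    fields = EHDR.unpack_from(image, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]

    segments = []
    for i in range(e_phnum):
        (p_type, p_flags, p_offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _align) = PHDR.unpack_from(
            image, e_phoff + i * e_phentsize)
        if p_type != PT_LOAD:
            continue
        # Render the permission bits in "rwx" form, "-" for a missing bit.
        flags = "".join(
            ch if p_flags & bit else "-"
            for ch, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X)))
        segments.append((p_offset, p_vaddr, p_filesz, p_memsz, flags))
    return segments


def executable_file_ranges(image: bytes):
    """File ranges [start, end) belonging to executable PT_LOAD segments."""
    return [(off, off + filesz)
            for off, _vaddr, filesz, _memsz, flags in load_segments(image)
            if "x" in flags]
```

Reading a binary with `open(path, "rb").read()` and passing the bytes to `executable_file_ranges` gives the file ranges a segment-aware gadget search would have to cover.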

In any case, in my experience finding the right gadget is not always
that easy, depending on what you want to do. Depending on the
architecture, syscalls have been proven not that hard to find, but
stealthy attacks might want to avoid unexpected syscalls completely
(SELinux? AppArmor?), maybe they just want to corrupt some data
structure, and then finding the right gadgets might become
challenging, in particular if your target library/program is small.

> on the other hand breaking .rodata out into its own PT_LOAD segment
> will waste disk space, kernel memory, virtual address space, slow
> down vma lookup time, etc, for exactly zero gain in security. why
> bother?

While trying to evaluate the cost of the thing in terms of disk and
memory space I realized that `--rosegment` is only partially effective.

Take a look at the following `readelf -l` of a `--rosegment` hello world
program:

Program Headers:
  Type  Offset   VirtAddr           FileSiz  MemSiz   Flg Align
  LOAD  0x000000 0x0000000000400000 0x00040d 0x00040d R E 0x1000
  LOAD  0x000410 0x0000000000401410 0x000318 0x000318 R   0x1000
  LOAD  0x000728 0x0000000000402728 0x000228 0x000229 RW  0x1000

The wasted disk space is practically zero, and there are 0x410 wasted
bytes of memory due to `--rosegment` (the second `PT_LOAD` is mapped at
0x401410), in addition to the 0x728 which are wasted due to the RW
segment. But the real problem is that the kernel is going to `mmap`
0x1000 bytes for the first `PT_LOAD`, no matter what, and all that data
will be +x, including data which is supposed to be in the second and
third `PT_LOAD`.

This means that `--rosegment` is a fully effective countermeasure only
if the `+x` segment is 0x1000 bytes large. Or from another POV,
`--rosegment` should force the `+x` segment to have a page-sized
dedicated area *in the file*.
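
The page-granularity argument above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in this post (0x1000-byte pages, the kernel mapping whole pages of the file for each `PT_LOAD`); the `Load` class and the `exposed_bytes` function are hypothetical names, not part of any tool.

```python
from dataclasses import dataclass

PAGE_SIZE = 0x1000  # assumed page size, matching the Align column above


@dataclass(frozen=True)
class Load:
    offset: int  # file offset of the segment
    filesz: int  # bytes of the file that belong to it
    flags: str   # readelf-style flags, e.g. "R E", "R", "RW"


def page_span(seg, page=PAGE_SIZE):
    """File range [start, end) covered once the mapping is rounded to pages."""
    start = seg.offset // page * page
    end = -(-(seg.offset + seg.filesz) // page) * page  # round up
    return start, end


def exposed_bytes(segments, page=PAGE_SIZE):
    """For each non-executable segment (keyed by file offset), count the
    bytes that also fall inside the page span of an executable segment.
    Assumes the executable page spans do not overlap each other."""
    exec_spans = [page_span(s, page) for s in segments if "E" in s.flags]
    exposed = {}
    for seg in segments:
        if "E" in seg.flags:
            continue
        lo, hi = seg.offset, seg.offset + seg.filesz
        exposed[seg.offset] = sum(
            max(0, min(hi, b) - max(lo, a)) for a, b in exec_spans)
    return exposed


# The hello world layout from the readelf output above.
HELLO = [Load(0x000, 0x40d, "R E"), Load(0x410, 0x318, "R"),
         Load(0x728, 0x228, "RW")]
print({hex(k): hex(v) for k, v in exposed_bytes(HELLO).items()})
```

For the hello world layout every byte of the second and third segments is reported as exposed, while a layout whose `+x` segment fills whole pages of the file reports zero.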

I'll try to come up with a patch for `ld.gold`.

--
Alessandro Di Federico

Re: Proposal: ld.gold --rosegment

pageexec at freemail

Jan 29, 2016, 9:13 AM

Post #4 of 6

On 29 Jan 2016 at 16:44, Alessandro Di Federico wrote:

> On Thu, 28 Jan 2016 02:49:46 +0100
> "PaX Team" <pageexec@freemail.hu> wrote:
>
> > because it's a useless security measure. for a non-executable .rodata
> > section to make any sense, the following condition would have to hold:
> >
> > a bug (or set of bugs) is exploitable if and only if .rodata is
> > executable.
> >
> > nobody has ever shown that there exists such a bug (or set of bugs)
> > and in fact there's ample evidence that already executable code
> > contains all the necessary gadgets an exploit would need.
>
> With a dirty one-liner run in my `/usr/bin` I've found 956 MiB of .text
> and 444 MiB of .rodata. This means .rodata accounts for about a third
> of the bytes in which the right gadget could be found.

all that is irrelevant i'm afraid. what matters is the simple condition
above. do you know of any bugs that satisfy it? you see, you're asking
for a change that has non-zero costs and for all we know, zero benefits.

> Take a look at the following `readelf -l` of a `--rosegment` hello world
> program:
>
> Program Headers:
>   Type  Offset   VirtAddr           FileSiz  MemSiz   Flg Align
>   LOAD  0x000000 0x0000000000400000 0x00040d 0x00040d R E 0x1000
>   LOAD  0x000410 0x0000000000401410 0x000318 0x000318 R   0x1000
>   LOAD  0x000728 0x0000000000402728 0x000228 0x000229 RW  0x1000
>
> The wasted disk space is practically zero,

for a useless hello world. what is it for real apps? what is it when you
page align section data that go into different segments? what fits in a
single physical page above would end up in 2 or 3 pages, a 100% or 200%
overhead if you really want to play this silly game. but before you care
about the costs of --rosegment you should take a step back and demonstrate
its non-zero benefits.
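
The 100% and 200% figures can be reproduced with a small Python sketch. It is a simplification that ignores the few bytes of padding between segments, and the function names are made up for illustration.

```python
PAGE_SIZE = 0x1000  # assumed page size


def pages(nbytes, page=PAGE_SIZE):
    """Number of pages needed to hold nbytes (rounded up)."""
    return -(-nbytes // page)


def packed_pages(sizes, page=PAGE_SIZE):
    """Pages of file data when segments are laid out back to back."""
    return pages(sum(sizes), page)


def aligned_pages(sizes, page=PAGE_SIZE):
    """Pages when every segment starts on its own page boundary."""
    return sum(pages(size, page) for size in sizes)


def overhead(sizes, page=PAGE_SIZE):
    """Extra pages caused by page alignment, relative to the packed layout."""
    packed = packed_pages(sizes, page)
    if packed == 0:
        raise ValueError("no data to lay out")
    return (aligned_pages(sizes, page) - packed) / packed


# FileSiz values of the three PT_LOAD segments of the hello world.
hello = [0x40d, 0x318, 0x228]
print(f"{overhead(hello[:2]):.0%} with two segments, "
      f"{overhead(hello):.0%} with three")
```

For binaries with larger segments the relative overhead shrinks, since at most one extra page per segment boundary is added.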

> and there are 0x410 wasted bytes of memory due to `--rosegment` (the second
> `PT_LOAD` is mapped at 0x401410), in addition to the 0x728 which are wasted
> due to the RW segment.

there's nothing wasted here, quite the opposite in fact, the linker was
smart enough to pull 3 segments into one physical page which minimizes
page cache waste on the kernel side and disk block usage on the filesystem
side.

> This means that `--rosegment` is a fully effective countermeasure only
> if the `+x` segment is 0x1000 bytes large.

you have yet to demonstrate that it's a countermeasure against anything ;).

Re: Proposal: ld.gold --rosegment

ale+gentoo at clearmind

Jan 29, 2016, 11:23 AM

Post #5 of 6

On Fri, 29 Jan 2016 18:13:23 +0100
"PaX Team" <pageexec@freemail.hu> wrote:

> On 29 Jan 2016 at 16:44, Alessandro Di Federico wrote:
>
> > On Thu, 28 Jan 2016 02:49:46 +0100
> > "PaX Team" <pageexec@freemail.hu> wrote:
> > > nobody has ever shown that there exists such a bug (or set of
> > > bugs) and in fact there's ample evidence that already executable
> > > code contains all the necessary gadgets an exploit would need.

Could you please elaborate on this "ample evidence"? Being so vague
while placing so much burden of proof on my side is a bit unfair :)

I understand, and partly like, your practical approach, but security is
not just about preventing a certain bug from being exploitable.
Nowadays exploits, as you know, often use multiple vulnerabilities,
while in the past a single one was enough. This makes the attacker's
life harder; that's why we have ASLR: it doesn't prevent an exploit, it
just makes it harder to exploit (e.g. you need an additional
vulnerability). That's also why we have RELRO.

A common principle when designing a secure system is trying to reduce
the attack surface as much as possible, to decrease the chances that
an attack is feasible. And that's exactly what the countermeasure I'm
suggesting aims to do.

I don't like to cite my own work, but take for instance the "leakless"
attack [1,2]: finding the various gadgets required to perform the attack
is not a straightforward task. The less executable code there is, the
harder it gets.

Also, there are a couple of tools which require a set of gadgets to
automatically build ROP chains (so called "ROP compilers", e.g. [3]).
Reducing the number of binaries on which they work increases the
security of your system. The same argument applies to more versatile
tools such as nROP [4], which don't require a fixed set of gadgets but
try to play with what they have. In this case we might make their life
harder, and make them produce longer ROP chains, which is a critical
factor in the evaluation of the feasibility of an exploit. Try a run of

    find -name "*.rb" -exec grep "'Space'" {} \;

in your metasploit directory to get an idea of the average space
available for exploits.
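
To turn that grep output into numbers, something along these lines could be used. It is a rough sketch which assumes the modules declare the payload space as a Ruby hash entry of the form `'Space' => 1024`; entries written differently are simply missed. The function names are hypothetical.

```python
import os
import re

# Assumed form of the declaration in a module: 'Space' => <decimal number>
SPACE_RE = re.compile(r"'Space'\s*=>\s*(\d+)")


def space_values(text):
    """All 'Space' values found in the text of one module."""
    return [int(value) for value in SPACE_RE.findall(text)]


def collect(root):
    """Walk root and gather the 'Space' values of every *.rb file."""
    values = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".rb"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                values.extend(space_values(handle.read()))
    return values
```

Calling `collect(".")` from the metasploit directory and feeding the result to `statistics.median` gives a single figure for the typical payload space.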

Also, quite often, developers try to isolate Internet-facing daemons in
small programs (take a look at DJB's qmail). Making them even smaller by
reducing the amount of executable code sounds like a good idea to me.

Not to mention work that tries to remove gadgets at compile time *on
the actual code* [5], ignoring `.rodata` and the like. But Gentoo will
probably never use those, so it's less of an argument :)

> > and there are 0x410 wasted bytes of memory due to `--rosegment`
> > (the second `PT_LOAD` is mapped at 0x401410), in addition to the
> > 0x728 which are wasted due to the RW segment.
>
> there's nothing wasted here, quite the opposite in fact, the linker
> was smart enough to pull 3 segments into one physical page which
> minimizes page cache waste on the kernel side and disk block usage on
> the filesystem side.

You're partly right. The kernel should be smart enough to use a single
physical page for both the 0x400000 and 0x401000 pages, despite having
different permissions, but for sure the last PT_LOAD needs a distinct
page, so its first 0x728 bytes are effectively wasted. But this is an
unrelated issue.

> you have yet to demonstrate that it's a countermeasure against
> anything ;).

First I have to convince myself that the countermeasure I'm suggesting
is fully effective! Which is not the case for how `--rosegment`
currently works. :)

--
Alessandro Di Federico

[1] https://clearmind.me/leakless.pdf
[2] https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/di-frederico
[3] https://github.com/pakt/ropc
[4] http://aurelien.wail.ly/nrop/
[5] https://www.acsac.org/2010/openconf/modules/request.php?module=oc_program&action=view.php&a=&id=121&type=2

Re: Proposal: ld.gold --rosegment

pageexec at freemail

Jan 29, 2016, 3:45 PM

Post #6 of 6

On 29 Jan 2016 at 20:23, Alessandro Di Federico wrote:

> On Fri, 29 Jan 2016 18:13:23 +0100
> "PaX Team" <pageexec@freemail.hu> wrote:
>
> > On 29 Jan 2016 at 16:44, Alessandro Di Federico wrote:
> >
> > > On Thu, 28 Jan 2016 02:49:46 +0100
> > > "PaX Team" <pageexec@freemail.hu> wrote:
> > > > nobody has ever shown that there exists such a bug (or set of
> > > > bugs) and in fact there's ample evidence that already executable
> > > > code contains all the necessary gadgets an exploit would need.
>
> Could you please elaborate on this "ample evidence"? Being so vague
> while placing so much burden of proof on my side is a bit unfair :)

the fact that the real life ROP exploits i've seen out there all make
use of gadgets exclusively from (intended) code and not some data that
happened to be mapped executable. note that i'm not claiming to have
seen all the exploits out there but even if some exploits happen to
use gadgets from non-code pages, it doesn't prove the value of --rosegment
until one also proves that it is impossible to write the same exploits
with gadgets from code pages only. add to this academic work like [1]
and i believe you'll find yourself up against a rather high hill if you
still want to prove the benefit of --rosegment ;).

[1] Q: Exploit Hardening Made Easy
https://www.usenix.org/legacy/event/sec11/tech/full_papers/Schwartz.pdf
https://www.usenix.org/legacy/event/sec11/tech/slides/schwartz.pdf

> I understand, and partly like, your practical approach, but security is
> not just about preventing a certain bug from being exploitable.

my approach is as much practical as theoretical, it is all based on a
very generic and strong threat model, a full categorization of exploit
techniques and corresponding defense mechanisms. i evaluate other defense
mechanisms based on the same threat model and/or just plain common sense
like in this case.

> Nowadays exploits, as you know, often use multiple vulnerabilities,
> while in the past a single one was enough. This makes the attacker's
> life harder; that's why we have ASLR: it doesn't prevent an exploit, it
> just makes it harder to exploit (e.g. you need an additional
> vulnerability).

actually ASLR does prevent exploitation with a bounded probability
in the absence of information leaks. if you're thinking of weak imitations
then sure, you can brute force your way, but that won't work under
grsecurity (you realize this is the gentoo hardened list, right? ;).
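
as a rough illustration of "bounded probability", here is a small Python sketch. it assumes a simplified model that is not spelled out in this thread: the attacker needs to guess one randomized value with a given number of entropy bits, has no information leak, and the layout is re-randomized before every attempt, so guesses are independent. the function name and the numbers in the example are hypothetical.

```python
import math


def brute_force_success(entropy_bits: int, attempts: int) -> float:
    """Probability that at least one of `attempts` independent guesses
    hits a value randomized with `entropy_bits` bits of entropy."""
    if entropy_bits < 0 or attempts < 0:
        raise ValueError("entropy_bits and attempts must be non-negative")
    p_single = 2.0 ** -entropy_bits
    if p_single >= 1.0:
        # No entropy at all: any single attempt succeeds.
        return 1.0 if attempts else 0.0
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(attempts * math.log1p(-p_single))


# Hypothetical figures: 16 bits of entropy, one attempt versus 2**16 attempts.
print(brute_force_success(16, 1))
print(brute_force_success(16, 2 ** 16))
```

in this model the bound is only as good as the cap on the number of attempts, which is why a mechanism that limits repeated attempts matters.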

> That's also why we have RELRO.

yes, RELRO serves a quantifiable and useful purpose, which i have yet
to see about --rosegment.

> A common principle when designing a secure system is trying to reduce
> the attack surface as much as possible, to decrease the chances that
> an attack is feasible. And that's exactly what the countermeasure I'm
> suggesting aims to do.

the number of overall gadgets is not an attack surface. the number of
any particular kind of gadget is one in that if it's 0 then the particular
kind of gadget is unavailable for an exploit (which may or may not prevent
exploitation). you haven't shown that --rosegment can achieve it for
any particular gadget class on any particular binary, let alone all
binaries mapped into a particular process (good luck with glibc btw ;).

> I don't like to cite my own work, but take for instance the "leakless"
> attack [1,2]: finding the various gadgets required to perform the attack
> is not a straightforward task. The less executable code there is, the
> harder it gets.

yes i know your work and i think you chose the very wrong choir to
preach it to :P. for one, your attack assumes a setup without PIEs
and BIND_NOW/RELRO which is kinda the antithesis of hardened gentoo.

second, your attack assumes the ability to hijack control flow for
which we also know the solution (various forms of CFI, i guess i'll
stop at not citing my own work on it to be fair ;).

last but not least, you haven't proved the 'harder' part in any
quantifiable way i can see. like i said at the beginning, just
show me *one* bug which is exploitable when .rodata is executable
and isn't exploitable when .rodata is non-executable.

> > there's nothing wasted here, quite the opposite in fact, the linker
> > was smart enough to pull 3 segments into one physical page which
> > minimizes page cache waste on the kernel side and disk block usage on
> > the filesystem side.
>
> You're partly right. The kernel should be smart enough to use a single
> physical page for both the 0x400000 and 0x401000 pages, despite having
> different permissions, but for sure the last PT_LOAD needs a distinct
> page,

only when a copy-on-write fault hits it, which i'm not sure happens for
a simple thing like hello world (i could certainly write one that would
not need to write there since it's basically a write/exit sequence).