Mailing List Archive

Null pointer dereference warnings when compiling with -O3
Hi,

I've recently started to build with "--extra-cppflags=-O3", to investigate
if incidental loss of transport stream packets can be related to runtime
performance.
To my surprise I do then get a lot of "null pointer dereference" and
"possible null pointer dereference" compilation warnings. This is with
Fedora-32 and gcc 10.2.1.

I have the impression that these warnings are just wrong.

This is for the following reasons:
- The code that is flagged runs already for years and years. If there was a
null pointer dereference I think that somebody would have noticed this. The
word "segfault" comes to the mind.
- There are several warnings about class DTVChannelInfo declared in file
dtvconfparser.h. The interesting bit is that when I change the order of the
class variables, e.g. move "QString m_name" two lines up, that some of the
"null pointer dereference" warnings disappear. This is not logical; the
order of the class members should not determine if code is valid or not. As
far as I know of course.
- When compiled with "-O3 -fno-inline" then all "null pointer
dereference"warnings but one do disappear. This is again not logical to me.
Possibly only when "-fno-inline" is given the "null pointer dereference"
warnings are correct.

Considering this I do not think it is a good idea to rewrite the code to
satisfy the compiler.

One way to deal with this is to remove the "-Wnull-dereference" option from
the standard set of compiler flags.

What I want to achieve is to build fully optimized without getting
warnings, so that if there is a warning it is something that needs to be
looked at.

Anybody an opinion on this?

Kind regards,
Klaas.
Re: Null pointer dereference warnings when compiling with -O3 [ In reply to ]
On Wed, Sep 23, 2020 at 5:59 PM Klaas de Waal <klaas.de.waal@gmail.com> wrote:

> Anybody an opinion on this?

I seem to recall that there were known to be
a number of (possibly) false positives in
gcc 10 that have been fixed in the latest git
(to be gcc 11). I am not sure if/when those
patches might be backported (one of the
fixes was a somewhat larger patch to the
internal trackers) to a future 10.x release.
_______________________________________________
mythtv-dev mailing list
mythtv-dev@mythtv.org
http://lists.mythtv.org/mailman/listinfo/mythtv-dev
http://wiki.mythtv.org/Mailing_List_etiquette
MythTV Forums: https://forum.mythtv.org
Re: Null pointer dereference warnings when compiling with -O3 [ In reply to ]
On Wed, 2020-09-23 at 19:58 +0200, Klaas de Waal wrote:
> Hi,
>
> I've recently started to build with "--extra-cppflags=-O3", to
> investigate if incidental loss of transport stream packets can be
> related to runtime performance.

You could also try " --toolchain=hardened". That will get you O2
optimization and a couple of other things.

> To my surprise I do then get a lot of "null pointer dereference" and
> "possible null pointer dereference" compilation warnings. This is
> with Fedora-32 and gcc 10.2.1.
>
> I have the impression that these warnings are just wrong.
>
> This is for the following reasons:
> - The code that is flagged runs already for years and years. If there
> was a null pointer dereference I think that somebody would have
> noticed this. The word "segfault" comes to the mind.

You are misunderstanding what the compiler is saying. It is telling you
that after inlining and optimization, based on what it can infer from
the code, there are definitely code paths that will lead to a segfault.

Here's a scenario that may help you understand why the compiler is
complaining...

Function A operates on pointer "foo" and has a priori knowledge that
"foo" will never be set to nullptr, so it can dereference the pointer
without testing its validity. Function A calls function B, and function
B (being called from many places) performs a validity test of the
passed in "foo" before using it. This code will compile cleanly when
there is no optimization.

When compiled with optimization, the above code will spew null pointer
dereference warnings about function A. This occurs because the compiler
is allowed to completely embed the source code of function B into
function A to eliminate overhead of calling an extra function. When
processing this new combined function A', the compiler notices the
validity test for the variable "foo" (from the former B), and it
scribbles a note for itself that says "the developer must know that
"foo" can be null because she wrote code that performs a validity
check". When the compiler finds a later dereference of "foo" (from the
former A) and can't find a validity check prior to that dereference, it
prints a warning message.

As a quick rule of thumb: In any given function, if one dereference of
a pointer is protected then they must all be protected. This rule of
thumb explains why the unoptimized two function code compiles cleanly
(each function is self consistent in whether or not it checks the
pointer), and why the optimized one big combined function doesn't
(because it is inconsistent in whether or not it checks the pointer).

If there was a way to embed into the source code the a priori knowledge
that the argument to function A will never be null, then the compiler
shouldn't complain at all about the optimized code. Unfortunately,
here's no way to label that function parameter in standard C++. Adding
an actual null pointer test at the top of function A would codify that
knowledge in a way that the compiler understands.

> - There are several warnings about class DTVChannelInfo declared in
> file dtvconfparser.h. The interesting bit is that when I change the
> order of the class variables, e.g. move "QString m_name" two lines
> up, that some of the "null pointer dereference" warnings disappear.
> This is not logical; the order of the class members should not
> determine if code is valid or not. As far as I know of course.

I agree, that's weird behavior. But then again, this is the optimizer
we're talking about. Everything about the optimizer seems to be a maze
of twisty little passages, all alike.

> - When compiled with "-O3 -fno-inline" then all "null pointer
> dereference"warnings but one do disappear. This is again not logical
> to me. Possibly only when "-fno-inline" is given the "null pointer
> dereference" warnings are correct.

Disabling inlines is disabling one of the major optimizations the
compiler can make.

When you inline one (or more) function(s) into another you are now
compiling a new function which is potentially as large as the sum of
the old function and every function that it calls. The compiler
metadata for this new function will be different, the code flow
decisions of which code path requires a branch to be taken and which
doesn't could change, the choices of which variable goes into which
register could change, etc, etc.

> Considering this I do not think it is a good idea to rewrite the code
> to satisfy the compiler.

I disagree. Its very likely a big rewrite isn't necessary, but all
these warnings should all be investigated to see exactly what is
happening and why the compiler is complaining. All the example
above would need is a validity check of "foo" at the top of function A
and the null dereference warnings will go away. Its very likely that
the solution to your warnings will have a similar fix.

> One way to deal with this is to remove the "-Wnull-dereference"
> option from the standard set of compiler flags.

To quote from the GCC documentation:

Warn if the compiler detects paths that trigger erroneous or
undefined behavior due to dereferencing a null pointer.

Why in the world would you ever want to remove that check? Compiler
and other tool developers have spent a lot of time implementing these
kinds of checks to help other developers improve code robustness,
speed, readability, etc. We should be taking advantage of that effort
wherever we can.

> What I want to achieve is to build fully optimized without getting
> warnings, so that if there is a warning it is something that needs to
> be looked at.

I absolutely agree with this statement! However, I believe that
disabling warnings instead of fixing them is the wrong approach.

David


_______________________________________________
mythtv-dev mailing list
mythtv-dev@mythtv.org
http://lists.mythtv.org/mailman/listinfo/mythtv-dev
http://wiki.mythtv.org/Mailing_List_etiquette
MythTV Forums: https://forum.mythtv.org
Re: Null pointer dereference warnings when compiling with -O3 [ In reply to ]
On Thu, Sep 24, 2020 at 11:34:34AM -0400, David Hampton wrote:
> If there was a way to embed into the source code the a priori knowledge
> that the argument to function A will never be null, then the compiler
> shouldn't complain at all about the optimized code. Unfortunately,
> here's no way to label that function parameter in standard C++. Adding
> an actual null pointer test at the top of function A would codify that
> knowledge in a way that the compiler understands.

My understanding is that if a pointer parameter can never be null,
then that a priori knowledge can be conveyed by passing a reference
instead of a pointer.

David
--
David Engel
david@istwok.net
_______________________________________________
mythtv-dev mailing list
mythtv-dev@mythtv.org
http://lists.mythtv.org/mailman/listinfo/mythtv-dev
http://wiki.mythtv.org/Mailing_List_etiquette
MythTV Forums: https://forum.mythtv.org
Re: Null pointer dereference warnings when compiling with -O3 [ In reply to ]
On Thu, 2020-09-24 at 11:34 -0400, David Hampton wrote:
> On Wed, 2020-09-23 at 19:58 +0200, Klaas de Waal wrote:
> > To my surprise I do then get a lot of "null pointer dereference"
> > and
> > "possible null pointer dereference" compilation warnings. This is
> > with Fedora-32 and gcc 10.2.1.
> >
> > I have the impression that these warnings are just wrong.
> >
> > This is for the following reasons:
> > - The code that is flagged runs already for years and years. If
> > there
> > was a null pointer dereference I think that somebody would have
> > noticed this. The word "segfault" comes to the mind.
>
> You are misunderstanding what the compiler is saying. It is telling
> you that after inlining and optimization, based on what it can infer
> from the code, there are definitely code paths that will lead to a
> segfault.

Hi Klaus,

I spent some time looking at this yesterday, and here's what I believe
is happening. Looking at this warning:

mpeg/atscdescriptors.h: In member function
‘void AvFormatDecoder::ScanATSCCaptionStreams(int)’:
mpeg/atscdescriptors.h:107:39: warning: potential null pointer
dereference [-Wnull-dereference]
107 | { return bool(((Offset(i,-1)[3])) & 1); }
| ~~~~~~~~~~~~~~~~^~~

Line atscdescriptors.h:106-7 is this:

bool Line21Field(int i) const
{ return bool(((Offset(i,-1)[3])) & 1); }

This is getting back a "const unsigned char *" pointer from Offset,
reading the byte at offset 3, and manipulating that value.

Line atscdescriptors.h:125 defines Offset as this:

const unsigned char *Offset(int i, int j) const
{ return m_ptrs[Index(i,j)]; }

The m_ptrs member variable is a QMap, and this is where the problem
occurs. According to the documentation for QMap::operator[]:

If the map contains no item with the key, the function inserts a
default-constructed value into the map with that key, and returns a
reference to it.

The default constructed value for any pointer is nullptr, so the
QMap::operator[] function has a code path that returns nullptr. I
believe that what is happening here is that the optimizer is inlining
this default constructor into AvFormatDecoder::ScanATSCCaptionStreams,
so the compiler now sees a code path where the pointer is set to
nullptr and then the third byte off that pointer is read.

That said, given the way that this function creates, uses, and discards
the CaptionServiceDescriptor object all within a single "for" loop, I
agree with you that it not possible for this code path to be triggered.

Looking at the warning from CaptionServiceDescriptor::toString, I'm not
clear where its called from so don't know if the map is valid.

Looking at the warning from ContentAdvisoryDescriptor::Parse, it's the
two calls to RatedDimensions(i) that have potential nullptr
dereferences. I suspect that this code also can't traverse the nullptr
code path, because it looks like its creating each item Index(i,j+1)
just before the loop tries to reference it.

I'm not sure there's a clean solution here, as every call to Offset()
potentially needs to be checked to see if it returned a null pointer.

If you are convinced that these nullptr dereference code paths can
never happen, you should be able to disable the warnings in
atscdescriptors.h with the following:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wnull-dereference"
foo(b); /* no diagnostic for this one */
#pragma GCC diagnostic pop

David


_______________________________________________
mythtv-dev mailing list
mythtv-dev@mythtv.org
http://lists.mythtv.org/mailman/listinfo/mythtv-dev
http://wiki.mythtv.org/Mailing_List_etiquette
MythTV Forums: https://forum.mythtv.org