GCC 7 PPC64 Uninitialized data issue

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

GCC 7 PPC64 Uninitialized data issue

Carl E. Love-2
Valgrind developers:

The GCC 7 compiler for power has a new optimization that is tripping up
Valgrind.  Basically they are taking the strcmp() function and doing
some inlining of the code.  Here is a snipit of the generated code

return strcmp(str1, str2);
    100004bc:   28 4c 20 7d     ldbrx   r9,0,r9
    100004c0:   28 54 40 7d     ldbrx   r10,0,r10
    100004c4:   51 48 6a 7c     subf.   r3,r10,r9
    100004c8:   1c 00 82 40     bne     100004e4 <main+0x84>
    100004cc:   f8 1b 2a 7d     cmpb    r10,r9,r3
    100004d0:   00 00 aa 2f     cmpdi   cr7,r10,0
    100004d4:   38 00 9e 41     beq     cr7,1000050c <main+0xac>

The inlined code has two load double word instructions (ldbrx inst) that
are partially uninitialized. Following the two double word loads we do a
subf. instruction to subtract the values and set the condition code.
Then we get to the branch instruction (bne) and valgrind flags the
error:

==23948== Conditional jump or move depends on uninitialised value(s)
==23948==    at 0x100004C8: main (bug80497.c:9)
==23948==
==23948== Syscall param exit_group(status) contains uninitialised
byte(s)
==23948==    at 0x41BDEA4: _Exit (_exit.c:31)

The code has some cmpb instructions to make sure they don't actually use
the uninitialized bytes but that doesn't really help Valgrind.  I was
thinking of trying to create a rule to ignore the error but not sure I
can do this as it is inlined code.  It isn't like trying to ignore
errors from a function.  I have looked a little at the suppression rules
and from what I know of them it isn't clear how to write one for this
case were inlined code could show up anywhere.  

Wondering if anyone has thoughts on how to asddress fixing the issue or
                      how to suppress the issue?


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Patrick J. LoPresti-2
This sort of code is supposed to be handled by
"--partial-loads-ok=yes". (Which should be made the default, in my
opinion.)

If that does not work, it is a bug in the partial-loads-ok support.

 - Pat


On Tue, Apr 25, 2017 at 10:36 AM, Carl E. Love <[hidden email]> wrote:

> Valgrind developers:
>
> The GCC 7 compiler for power has a new optimization that is tripping up
> Valgrind.  Basically they are taking the strcmp() function and doing
> some inlining of the code.  Here is a snipit of the generated code
>
> return strcmp(str1, str2);
>     100004bc:   28 4c 20 7d     ldbrx   r9,0,r9
>     100004c0:   28 54 40 7d     ldbrx   r10,0,r10
>     100004c4:   51 48 6a 7c     subf.   r3,r10,r9
>     100004c8:   1c 00 82 40     bne     100004e4 <main+0x84>
>     100004cc:   f8 1b 2a 7d     cmpb    r10,r9,r3
>     100004d0:   00 00 aa 2f     cmpdi   cr7,r10,0
>     100004d4:   38 00 9e 41     beq     cr7,1000050c <main+0xac>
>
> The inlined code has two load double word instructions (ldbrx inst) that
> are partially uninitialized. Following the two double word loads we do a
> subf. instruction to subtract the values and set the condition code.
> Then we get to the branch instruction (bne) and valgrind flags the
> error:
>
> ==23948== Conditional jump or move depends on uninitialised value(s)
> ==23948==    at 0x100004C8: main (bug80497.c:9)
> ==23948==
> ==23948== Syscall param exit_group(status) contains uninitialised
> byte(s)
> ==23948==    at 0x41BDEA4: _Exit (_exit.c:31)
>
> The code has some cmpb instructions to make sure they don't actually use
> the uninitialized bytes but that doesn't really help Valgrind.  I was
> thinking of trying to create a rule to ignore the error but not sure I
> can do this as it is inlined code.  It isn't like trying to ignore
> errors from a function.  I have looked a little at the suppression rules
> and from what I know of them it isn't clear how to write one for this
> case were inlined code could show up anywhere.
>
> Wondering if anyone has thoughts on how to asddress fixing the issue or
>                       how to suppress the issue?
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Valgrind-developers mailing list
> [hidden email]
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Julian Seward-2
In reply to this post by Carl E. Love-2

> The inlined code has two load double word instructions (ldbrx inst) that
> are partially uninitialized. Following the two double word loads we do a
> subf. instruction to subtract the values and set the condition code.

Does it help to run with --expensive-definedness-checks=yes?  That
enables more accurate but more expensive definedness tracking for
subtracts, among other things.

J

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Mark Wielaard-3
In reply to this post by Patrick J. LoPresti-2
On Tue, 2017-04-25 at 11:19 -0700, Patrick J. LoPresti wrote:
> This sort of code is supposed to be handled by
> "--partial-loads-ok=yes". (Which should be made the default, in my
> opinion.)

It already is, according to NEWS, since:

        Release 3.11.0 (22 September 2015)
       
          - The default value for --partial-loads-ok has been changed
        from
            "no" to "yes", so as to avoid false positive errors
        resulting
            from some kinds of vectorised loops.

Cheers,

Mark

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Julian Seward-2
In reply to this post by Patrick J. LoPresti-2
On 25/04/17 20:19, Patrick J. LoPresti wrote:
> This sort of code is supposed to be handled by
> "--partial-loads-ok=yes". (Which should be made the default, in my
> opinion.)

I thought it was the default now.  Isn't it?  It certainly should be.

But I think that's not the issue.  Memcheck isn't complaining about
bad addresses here.  It's complaining that the resulting branch is
on undefined data -- and that undefined data exists exactly because
the partial-loads machinery paints the data acquired from the
invalid areas as undefined.  So the partial-loads machinery is
working perfectly.  IIUC, that is.

J

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Patrick J. LoPresti-2
On Tue, Apr 25, 2017 at 11:42 AM, Julian Seward <[hidden email]> wrote:
>
> I thought it was the default now.  Isn't it?  It certainly should be.
>
> But I think that's not the issue.  Memcheck isn't complaining about
> bad addresses here.  It's complaining that the resulting branch is
> on undefined data -- and that undefined data exists exactly because
> the partial-loads machinery paints the data acquired from the
> invalid areas as undefined.  So the partial-loads machinery is
> working perfectly.  IIUC, that is.

Yes, my mistake. I should have read more carefully.

This looks very hard to fix in general, since Valgrind is correct that
the branch depends on invalid data. Setting partial-loads-ok=no would
not help; it would just move the complaint to the memory access
itself.

Maybe try compiling the application with "-fno-builtin-strcmp"?

 - Pat

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Carl E. Love-2
In reply to this post by Julian Seward-2
> On Tue, 2017-04-25 at 11:19 -0700, Patrick J. LoPresti wrote:
>> This sort of code is supposed to be handled by
>> "--partial-loads-ok=yes". (Which should be made the default, in my
>> opinion.)
>>
>> If that does not work, it is a bug in the partial-loads-ok support.

On Tue, 2017-04-25 at 20:32 +0200, Julian Seward wrote:

> > The inlined code has two load double word instructions (ldbrx inst) that
> > are partially uninitialized. Following the two double word loads we do a
> > subf. instruction to subtract the values and set the condition code.
>
> Does it help to run with --expensive-definedness-checks=yes?  That
> enables more accurate but more expensive definedness tracking for
> subtracts, among other things.
>
> J
>


Julian, Patrick:

So I tried the --partial-loads first.  It didn't change things.

valgrind --partial-loads-ok=yes ./bug80497-gcc7 --track-origins=yes
==32322== Memcheck, a memory error detector
==32322== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==32322== Using Valgrind-3.13.0.SVN and LibVEX; rerun with -h for copyright info
==32322== Command: ./bug80497-gcc7 --track-origins=yes
==32322==
==32322== Conditional jump or move depends on uninitialised value(s)
==32322==    at 0x100004C8: main (bug80497.c:9)
==32322==
==32322== Syscall param exit_group(status) contains uninitialised byte(s)
==32322==    at 0x41BDEA4: _Exit (_exit.c:31)
==32322==    by 0x411520B: __run_exit_handlers (exit.c:98)
==32322==    by 0x40F29A3: generic_start_main.isra.0 (libc-start.c:323)
==32322==    by 0x40F2BB7: (below main) (libc-start.c:102)

I then tried the expensive-definedness-checking

valgrind --partial-loads-ok=yes ./bug80497-gcc7 --track-origins=yes
==32322== Memcheck, a memory error detector
==32322== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==32322== Using Valgrind-3.13.0.SVN and LibVEX; rerun with -h for copyright info
==32322== Command: ./bug80497-gcc7 --track-origins=yes
==32322==
==32322== Conditional jump or move depends on uninitialised value(s)
==32322==    at 0x100004C8: main (bug80497.c:9)
==32322==
==32322== Syscall param exit_group(status) contains uninitialised byte(s)
==32322==    at 0x41BDEA4: _Exit (_exit.c:31)
==32322==    by 0x411520B: __run_exit_handlers (exit.c:98)
==32322==    by 0x40F29A3: generic_start_main.isra.0 (libc-start.c:323)
==32322==    by 0x40F2BB7: (below main) (libc-start.c:102)

I did try recompiling the test case with -fno-builtin-strcmp and running without any
additional Valgrind flags and still got the issue.  The option does not turn off the
optimization of the strcmp.  Even if it did, I am not sure that would be a satisfactory
solution for the user in this case.  I will have to check with them for sure.

Other ideas?

                       Carl Love



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Patrick J. LoPresti-2
On Tue, Apr 25, 2017 at 12:22 PM, Carl E. Love <[hidden email]> wrote:
>
> I did try recompiling the test case with -fno-builtin-strcmp and running without any
> additional Valgrind flags and still got the issue.

Hm. You are sure the warning is still from the application code and
not the C library?

That is... strange. I thought the whole point of -fno-builtin-XXX was
to force the compiler to invoke the C library, forgetting whatever it
thinks it knows about the function. Is it possible strcmp() is
actually "inline" in your glibc headers?

Another idea: I suspect it would be pretty simple to hack the
partial-loads-ok logic to mark the "extra" bytes as defined rather
than undefined. Search memcheck/mc_main.c for "partial-loads-ok"; the
change should be just a line or two in mc_LOADV_128_or_256_slow().

...

Specifically, you can try commenting out this "for" loop on lines
1354-1355 of mc_main.c:

      for (j = szL-1; j >= 0; j--)
         res[j] |= pessim[j];

(Note: It has been a long time since I was in this code, so I could be
completely wrong. If so, hopefully Julian will correct me...)

Of course, this does risk masking real errors. But it should be a
quick way to move forward while exploring other options. And this sort
of partially-valid aligned load is probably not terribly common apart
from this sort of optimization.

That said, -fno-builtin-strcmp really should have caused GCC to emit a
call to the C library strcmp(). I suggest investigating why it didn't.

 - Pat

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

iraisr
2017-04-26 1:21 GMT+02:00 Patrick J. LoPresti <[hidden email]>:
> On Tue, Apr 25, 2017 at 12:22 PM, Carl E. Love <[hidden email]> wrote:
>>
>> I did try recompiling the test case with -fno-builtin-strcmp and running without any
>> additional Valgrind flags and still got the issue.
>
> Hm. You are sure the warning is still from the application code and
> not the C library?

I'd suggest to try gdb+vgdb combo, as described nicely for example here:
http://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver

Start your application with: --vgdb=full --vgdb-error=1
[--vgdb-shadow-registers=yes]
and after attaching with gdb you'll get exact location which is
causing all the trouble.

I.

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Julian Seward-2
In reply to this post by Carl E. Love-2

Carl,

There are various refinements in memcheck/mc_translate.c that are related
to accurate definedness tracking in this kind of case, for
Iop_Cmp{32,64}, Iop_{Add,Sub}{32,64}, and Iop_CmpORD{32,64}{S,U}, the
last of which are PPC64 specials.  But they don't seem to have had any
effect here.

To diagnose this we really need to see the front end translation of the
assembly fragment you showed, plus the uninstrumented, optimised IR
relating to it.  You should be able to get those with --tool=none
--trace-flags=10001000 --trace-notbelow=<whatever>.  Please get those
and then we can see what's going on.  Maybe!

J

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Carl E. Love-2
On Wed, 2017-04-26 at 16:11 +0200, Julian Seward wrote:

> Carl,
>
> There are various refinements in memcheck/mc_translate.c that are related
> to accurate definedness tracking in this kind of case, for
> Iop_Cmp{32,64}, Iop_{Add,Sub}{32,64}, and Iop_CmpORD{32,64}{S,U}, the
> last of which are PPC64 specials.  But they don't seem to have had any
> effect here.
>
> To diagnose this we really need to see the front end translation of the
> assembly fragment you showed, plus the uninstrumented, optimised IR
> relating to it.  You should be able to get those with --tool=none
> --trace-flags=10001000 --trace-notbelow=<whatever>.  Please get those
> and then we can see what's going on.  Maybe!
>
> J
>
Julian:

Here is the test program.

#include <string.h>

int main ()
{
   char str1[15];
   char str2[15];
   strcpy(str1, "abcdef");
   strcpy(str2, "ABCDEF");
   return strcmp(str1, str2);
}

Here is a dump of the Power binary produced by gcc7 for the code we are
interested in.

0000000010000460 <main>:
#include <string.h>

int main ()
{
    10000460:   02 10 40 3c     lis     r2,4098
    10000464:   00 7f 42 38     addi    r2,r2,32512
   char str1[15];
   char str2[15];
   strcpy(str1, "abcdef");
   strcpy(str2, "ABCDEF");
    10000468:   fe ff 22 3d     addis   r9,r2,-2
    1000046c:   40 8a 29 81     lwz     r9,-30144(r9)
    10000470:   fe ff 42 3d     addis   r10,r2,-2
    10000474:   46 8a 4a 89     lbz     r10,-30138(r10)
{
    10000478:   c1 ff 21 f8     stdu    r1,-64(r1)
   strcpy(str1, "abcdef");
    1000047c:   fe ff a2 3c     addis   r5,r2,-2
    10000480:   38 8a a5 80     lwz     r5,-30152(r5)
    10000484:   fe ff c2 3c     addis   r6,r2,-2
    10000488:   3c 8a c6 a0     lhz     r6,-30148(r6)
    1000048c:   fe ff e2 3c     addis   r7,r2,-2
    10000490:   3e 8a e7 88     lbz     r7,-30146(r7)
   strcpy(str2, "ABCDEF");
    10000494:   fe ff 02 3d     addis   r8,r2,-2
    10000498:   44 8a 08 a1     lhz     r8,-30140(r8)
    1000049c:   20 00 21 91     stw     r9,32(r1)
    100004a0:   26 00 41 99     stb     r10,38(r1)
   return strcmp(str1, str2);
    100004a4:   30 00 21 39     addi    r9,r1,48
    100004a8:   20 00 41 39     addi    r10,r1,32
   strcpy(str1, "abcdef");
    100004ac:   30 00 a1 90     stw     r5,48(r1)
    100004b0:   34 00 c1 b0     sth     r6,52(r1)
    100004b4:   36 00 e1 98     stb     r7,54(r1)
   strcpy(str2, "ABCDEF");
    100004b8:   24 00 01 b1     sth     r8,36(r1)
   return strcmp(str1, str2);
    100004bc:   28 4c 20 7d     ldbrx   r9,0,r9
    100004c0:   28 54 40 7d     ldbrx   r10,0,r10
    100004c4:   51 48 6a 7c     subf.   r3,r10,r9
    100004c8:   1c 00 82 40     bne     100004e4 <main+0x84>
    100004cc:   f8 1b 2a 7d     cmpb    r10,r9,r3
    100004d0:   00 00 aa 2f     cmpdi   cr7,r10,0
    100004d4:   38 00 9e 41     beq     cr7,1000050c <main+0xac>
}
    100004d8:   b4 07 63 7c     extsw   r3,r3
    100004dc:   40 00 21 38     addi    r1,r1,64
    100004e0:   20 00 80 4e     blr
   return strcmp(str1, str2);
    100004e4:   00 00 00 39     li      r8,0
    100004e8:   f8 53 23 7d     cmpb    r3,r9,r10
    100004ec:   f8 43 28 7d     cmpb    r8,r9,r8
    100004f0:   38 1b 03 7d     orc     r3,r8,r3
    100004f4:   74 00 63 7c     cntlzd  r3,r3
    100004f8:   08 00 63 38     addi    r3,r3,8
    100004fc:   30 1e 29 79     rldcl   r9,r9,r3,56
    10000500:   30 1e 4a 79     rldcl   r10,r10,r3,56
    10000504:   50 48 6a 7c     subf    r3,r10,r9
    10000508:   d0 ff ff 4b     b       100004d8 <main+0x78>

The loads in question are at addresses 100004bc and 100004c0.  The
optimization loads these partially ininitialized values.  The compiler
uses the cmpb instruction to make sure it really only looks at the valid
bytes, but as we said Valgrind doesn't know all that.

I ran valgrind as:

valgrind --tool=none  --trace-flags=10001000
--trace-notbelow=1408  ./bug80497-gcc7  >  bug80497-debug 2>&1

Took a little playing but it looks like SB 1408 corresponds to the
beginning of main and the above assembly code runs thru SB1409.  I
edited down the valgrind output to just SB1408 and SB1409.  I have
attached it as a file.  I have the complete output if I threw away too
much.  Thanks for your help on this.

                           Carl Love


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers

bug80497-debug-trimmed (28K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Julian Seward-2

The important IR bits are these:

  100004bc:   28 4c 20 7d     ldbrx   r9,0,r9
  100004c0:   28 54 40 7d     ldbrx   r10,0,r10
  // t143 is r9 after the load
  // t166 is r10 after the load

  100004c4:   51 48 6a 7c     subf.   r3,r10,r9
  t69 = Sub64(t143,t166)  // r3
  t167 = 64to8(CmpORD64S(t69,0x0:I64))

  100004c8:   1c 00 82 40     bne     100004e4 <main+0x84>
  if (CmpNE32(Xor32(And32(8Uto32(t167),0x2:I32),0x2:I32),0x0:I32))
    { PUT(1296) = 0x100004E4:I64; exit-Boring }

My best guess is, this is a problem with the instrumentation of the
CmpORD64S.  What that does is to compare t69 against zero, and produce
a 3 bit value like this:

  8 if t69 < 0, 4 if t69 > 0, 2 if t69 == 0

A bit strange but that's how the Power integer condition codes work.
And you can see that the branch condition in the IR tests for 2, meaning
equal.

This is handled by doCmpORD in mc_translate.c.  It special cases the case
where the second arg is zero, but only for the < (8) case: that bit is a
copy of the most significant bit of t69, so it doesn't matter if all the
other bits are undefined.  But it doesn't special case the == (2) case, and
so that bit is regarded as undefined if any bit in t69 is undefined.  But
that's too pessimistic.  In fact the "is t69 == 0" question is a defined
"no" if we can find any bit in t69 which is nonzero and defined.  The
general logic is explained a bit more in the comment further up, "Accurate
interpretation of CmpEQ/CmpNE."

Applying that to this problem would fix it, I suspect.

J

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Carl E. Love-2
In reply to this post by Carl E. Love-2
On Thu, 2017-04-27 at 17:02 +0200, Julian Seward wrote:

> The important IR bits are these:
>
>   100004bc:   28 4c 20 7d     ldbrx   r9,0,r9
>   100004c0:   28 54 40 7d     ldbrx   r10,0,r10
>   // t143 is r9 after the load
>   // t166 is r10 after the load
>
>   100004c4:   51 48 6a 7c     subf.   r3,r10,r9
>   t69 = Sub64(t143,t166)  // r3
>   t167 = 64to8(CmpORD64S(t69,0x0:I64))
>
>   100004c8:   1c 00 82 40     bne     100004e4 <main+0x84>
>   if (CmpNE32(Xor32(And32(8Uto32(t167),0x2:I32),0x2:I32),0x0:I32))
>     { PUT(1296) = 0x100004E4:I64; exit-Boring }
>
> My best guess is, this is a problem with the instrumentation of the
> CmpORD64S.  What that does is to compare t69 against zero, and produce
> a 3 bit value like this:
>
>   8 if t69 < 0, 4 if t69 > 0, 2 if t69 == 0
>
> A bit strange but that's how the Power integer condition codes work.
> And you can see that the branch condition in the IR tests for 2, meaning
> equal.
>
> This is handled by doCmpORD in mc_translate.c.  It special cases the case
> where the second arg is zero, but only for the < (8) case: that bit is a
> copy of the most significant bit of t69, so it doesn't matter if all the
> other bits are undefined.  But it doesn't special case the == (2) case, and
> so that bit is regarded as undefined if any bit in t69 is undefined.  But
> that's too pessimistic.  In fact the "is t69 == 0" question is a defined
> "no" if we can find any bit in t69 which is nonzero and defined.  The
> general logic is explained a bit more in the comment further up, "Accurate
> interpretation of CmpEQ/CmpNE."
>
> Applying that to this problem would fix it, I suspect.

doCmpORD should be calculating the shadow bit values for the LT, EQ and
GE bits.  The LT bit is 1 if the MSB of xxhash is a 1.  I figured the EQ
bit should be 1 if all of the xxhash bits are undefined since we don't
have any valid bits in xx to determine if the valid bits in xx are equal
to zero.  Similarly, the GT bit would be undefined if there were no
defined bits in xxhash.  This implementation generated more warnings
then the original code.  So either my understanding of what doCmpORD is
calculating is wrong or my algorithm for setting the EQ and GT bits is
wrong.  

Just for grins, I had doCmpORD return  all zeros.  The comparison with
uninitialized bits goes away but I get the error "Syscall param
exit_group(status) contains uninitialised byte".  Not sure why the
syscall now has uninitialized data.  

Anyway, am I wrong on what doCmpORD is doing?  Thoughts on how to
determine the Eq and GT bits for this case?  Thanks for the help.

                  Carl


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Carl E. Love-2
On Mon, 2017-05-01 at 12:58 -0700, Carl E. Love wrote:

> On Thu, 2017-04-27 at 17:02 +0200, Julian Seward wrote:
> > The important IR bits are these:
> >
> >   100004bc:   28 4c 20 7d     ldbrx   r9,0,r9
> >   100004c0:   28 54 40 7d     ldbrx   r10,0,r10
> >   // t143 is r9 after the load
> >   // t166 is r10 after the load
> >
> >   100004c4:   51 48 6a 7c     subf.   r3,r10,r9
> >   t69 = Sub64(t143,t166)  // r3
> >   t167 = 64to8(CmpORD64S(t69,0x0:I64))
> >
> >   100004c8:   1c 00 82 40     bne     100004e4 <main+0x84>
> >   if (CmpNE32(Xor32(And32(8Uto32(t167),0x2:I32),0x2:I32),0x0:I32))
> >     { PUT(1296) = 0x100004E4:I64; exit-Boring }
> >
> > My best guess is, this is a problem with the instrumentation of the
> > CmpORD64S.  What that does is to compare t69 against zero, and produce
> > a 3 bit value like this:
> >
> >   8 if t69 < 0, 4 if t69 > 0, 2 if t69 == 0
> >
> > A bit strange but that's how the Power integer condition codes work.
> > And you can see that the branch condition in the IR tests for 2, meaning
> > equal.
> >
> > This is handled by doCmpORD in mc_translate.c.  It special cases the case
> > where the second arg is zero, but only for the < (8) case: that bit is a
> > copy of the most significant bit of t69, so it doesn't matter if all the
> > other bits are undefined.  But it doesn't special case the == (2) case, and
> > so that bit is regarded as undefined if any bit in t69 is undefined.  But
> > that's too pessimistic.  In fact the "is t69 == 0" question is a defined
> > "no" if we can find any bit in t69 which is nonzero and defined.  The
> > general logic is explained a bit more in the comment further up, "Accurate
> > interpretation of CmpEQ/CmpNE."
> >
> > Applying that to this problem would fix it, I suspect.
>

Here is my patch with two attempts to fix the issue.  I think attempt 1
is actually flawed as I think I was returning the value of the condition
code bits not if they were define.  I think fix 2 is the right one, but
I still get the same error plus an additional error about a syscall with
uninitialized data.

                         Carl Love
>From e7ee8ff9ac7efded61e2288698e09781d37e5520 Mon Sep 17 00:00:00 2001
From: Carl Love <[hidden email]>
Date: Wed, 3 May 2017 11:33:22 -0500
Subject: [PATCH] Attempted fix for compare of uninitialized data.

---
 memcheck/mc_translate.c | 148 +++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 128 insertions(+), 20 deletions(-)

diff --git a/memcheck/mc_translate.c b/memcheck/mc_translate.c
index 0b91840..5f08acd 100644
--- a/memcheck/mc_translate.c
+++ b/memcheck/mc_translate.c
@@ -1108,8 +1108,13 @@ static IRAtom* doCmpORD ( MCEnv*  mce,
 {
    Bool   m64    = cmp_op == Iop_CmpORD64S || cmp_op == Iop_CmpORD64U;
    Bool   syned  = cmp_op == Iop_CmpORD64S || cmp_op == Iop_CmpORD32S;
+   IROp   opNOT  = m64 ? Iop_Not64 : Iop_Not32;
    IROp   opOR   = m64 ? Iop_Or64  : Iop_Or32;
+   IROp   opXOR  = m64 ? Iop_Xor64 : Iop_Xor32;
    IROp   opAND  = m64 ? Iop_And64 : Iop_And32;
+   IROp   opEQ   = m64 ? Iop_CmpEQ64 : Iop_CmpEQ32;
+   IROp   opNE   = m64 ? Iop_CmpNE64 : Iop_CmpNE32;
+   IROp   opWIDEN = m64 ? Iop_1Uto64 : Iop_1Uto32;
    IROp   opSHL  = m64 ? Iop_Shl64 : Iop_Shl32;
    IROp   opSHR  = m64 ? Iop_Shr64 : Iop_Shr32;
    IRType ty     = m64 ? Ity_I64   : Ity_I32;
@@ -1119,6 +1124,10 @@ static IRAtom* doCmpORD ( MCEnv*  mce,
 
    IRAtom* threeLeft1 = NULL;
    IRAtom* sevenLeft1 = NULL;
+   IRAtom* lt_bit =  NULL;
+   IRAtom* gt_bit =  NULL;
+   IRAtom* eq_bit =  NULL;
+   IRAtom* tmp_bit =  NULL;
 
    tl_assert(isShadowAtom(mce,xxhash));
    tl_assert(isShadowAtom(mce,yyhash));
@@ -1139,26 +1148,125 @@ static IRAtom* doCmpORD ( MCEnv*  mce,
       /* if yy is zero, then it must be fully defined (zero#). */
       tl_assert(isZero(yyhash));
       threeLeft1 = m64 ? mkU64(3<<1) : mkU32(3<<1);
-      return
-         binop(
-            opOR,
-            assignNew(
-               'V', mce,ty,
-               binop(
-                  opAND,
-                  mkPCastTo(mce,ty, xxhash),
-                  threeLeft1
-               )),
-            assignNew(
-               'V', mce,ty,
-               binop(
-                  opSHL,
-                  assignNew(
-                     'V', mce,ty,
-                     binop(opSHR, xxhash, mkU8(width-1))),
-                  mkU8(3)
-               ))
- );
+#if 0
+      /* Approach 1.  What I did here was calculate the value of the eq and gt
+ bits for the comparison versus zero, i.e. set the condition code
+         result bit. We use the xxhash bits as a mask to get the valid xx bits
+         and then compare that with zero.  The returned value for EQ is
+         NOT(NOT(xxhash) AND (xx)) EQ 0)  return value should be 0 if the bits
+         are defined and equal to zero.  The value for GT is
+         NOT(NOT(xxhash) AND (xx)) NE 0). When comparing to zero NE implies GT.
+
+         THIS APPROACH IS REALLY CALCULATING THE ACTUAL CONDITION CODE VALUES
+         NOT IF BITS ARE VALID
+
+ This approach results in 384 condigional jump/move on uninitialized
+ values instead of just 1 for the test case.
+      */
+      tmp_bit = assignNew('V', mce,ty,
+  binop(opAND,
+ assignNew('V', mce,ty,
+  unop( opNOT,
+ assignNew('V', mce,ty,
+  mkPCastTo(mce,ty, xxhash)))),
+ assignNew('V', mce,ty, xx )));
+
+      /* The comparison gives 1 for EQ which implies value is defined.  But "defined" says the shadow bit must be zero so we have
+         to NOT the result.
+      */
+      eq_bit = assignNew('V', mce,ty,
+ unop(opNOT,
+      assignNew('V', mce,ty,
+ unop(opWIDEN,
+     assignNew('V', mce,Ity_I1,
+       binop(opEQ, tmp_bit,
+     (m64 ? mkU64(0) : mkU32(0))))))));
+
+      /* since we are comparing with zero, NE implies xx must be GT zero */
+      gt_bit = assignNew('V', mce,ty,
+ unop(opNOT,
+      assignNew('V', mce,ty,
+ unop(opWIDEN,
+     assignNew('V', mce,Ity_I1,
+       binop(opNE, tmp_bit,
+     (m64 ? mkU64(0) : mkU32(0))))))));
+
+      // old gt calc assignNew('V', mce,ty, binop(opXOR, lt_bit, assignNew('V', mce,ty, unop(opNOT, eq_bit))));
+      lt_bit = assignNew('V', mce,ty, binop(opSHR, xxhash, mkU8(width-1)));
+
+      return binop( opOR,
+    assignNew(
+      'V', mce,ty,
+      binop(
+    opAND,
+    mkPCastTo(mce,ty, xxhash),
+    threeLeft1
+    )),
+    assignNew(
+      'V', mce,ty,
+      binop( opOR,
+     assignNew(
+       'V', mce,ty,
+       binop( opOR,
+      assignNew(
+ 'V', mce,ty,
+ binop(
+      opSHL, lt_bit,
+      mkU8(3))),
+      assignNew(
+ 'V', mce,ty,
+ binop(
+      opSHL, eq_bit,
+      mkU8(2))))),
+     assignNew(
+       'V', mce,ty,
+       binop(
+     opSHL, gt_bit,
+     mkU8(1))))));
+#else
+   /* Approach 2 is to return 0 for EQ AND GT if there are some xx bits that
+      are defined for use in the comparison.  Or in other words if everything
+      is undefined, there are no valid bits that can be compared. EQ and GT
+      shadow bits are set to 1, i.e. undefined */
+   lt_bit = assignNew('V', mce,ty, binop(opSHR, xxhash, mkU8(width-1)));
+
+   /* Comparison is 1 if TRUE (i.e. bits are all undefined).
+
+    * THIS APPROACH STILL RESULTS IN 1 CONDITIONAL JUMP/MOVE DEPENDS ON
+    * UNINITIALIZED VALUE INSTEAD PLUS "Syscall param exit_group
+    * (status) contains uninitialized byte" WHICH DID NOT OCCUR
+    * BEFORE
+    */
+   tmp_bit = assignNew('V', mce,ty,
+       unop(opWIDEN,
+    assignNew('V', mce,Ity_I1,
+      binop( opEQ, (m64 ? mkU64(0xFFFFFFFFFFFFFFFF) : mkU32(0xFFFFFFFF)),
+     assignNew('V', mce,ty,
+       mkPCastTo(mce,ty,
+ xxhash))))));
+      return binop( opOR,
+    assignNew(
+      'V', mce,ty,
+      binop(
+    opAND,
+    mkPCastTo(mce,ty, xxhash),
+    threeLeft1
+    )),
+    assignNew(
+      'V', mce,ty,
+      binop( opOR,
+     assignNew(
+       'V', mce,ty,
+       binop( opAND,
+      tmp_bit,
+      (m64 ? mkU64(0x6) : mkU32(0x6)))),
+     assignNew(
+       'V', mce,ty,
+       binop(
+     opSHL, lt_bit,
+       mkU8(3))))));
+#endif
+
    } else {
       /* standard interpretation */
       sevenLeft1 = m64 ? mkU64(7<<1) : mkU32(7<<1);
--
2.11.0




------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Julian Seward-2

Hmm, this bug looks suspiciously similar to
https://bugs.kde.org/show_bug.cgi?id=352364

Carl, do you know if it is identical?

J

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers
Reply | Threaded
Open this post in threaded view
|

Re: GCC 7 PPC64 Uninitialized data issue

Carl E. Love-2
On Mon, 2017-05-08 at 16:01 +0200, Julian Seward wrote:
> Hmm, this bug looks suspiciously similar to
> https://bugs.kde.org/show_bug.cgi?id=352364
>
> Carl, do you know if it is identical?
>
> J
>

My understanding is this bug does a comparison against a constant that
is not zero.  The bug I was working on compares with zero so they take
different paths in the CmpORD function.  But, the underlying issues seem
to be similar.

This bug talks about completely redoing the PPC condition code handling
to make it more compatible with the other architectures which presumably
would fix both issues.  Not sure exactly what all would be involved in
doing this.

Any thoughts on the patch I sent?  It didn't seem to work as hoped and
not sure why.

               Carl Love


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-developers