Helgrind detects race with same lock

William Good

Hello,

I am trying to understand this Helgrind output.  It says there is a data race on a read.  However, both threads hold the same lock.  How can this be a race when both threads hold the lock during the access?


==31341== ----------------------------------------------------------------
==31341==
==31341==  Lock at 0x5990828 was first observed
==31341==    at 0x4C31A76: pthread_mutex_init (hg_intercepts.c:779)
==31341==    by 0x4026AF: thread_pool_submit (threadpool.c:85)
==31341==    by 0x402012: qsort_internal_parallel (quicksort.c:142)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==    by 0x402450: thread_work (threadpool.c:233)
==31341==    by 0x4C3083E: mythread_wrapper (hg_intercepts.c:389)
==31341==    by 0x4E42DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==31341==    by 0x5355CEC: clone (in /usr/lib64/libc-2.17.so)
==31341==  Address 0x5990828 is 40 bytes inside a block of size 152 alloc'd
==31341==    at 0x4C2CD95: calloc (vg_replace_malloc.c:711)
==31341==    by 0x4026A1: thread_pool_submit (threadpool.c:84)
==31341==    by 0x402012: qsort_internal_parallel (quicksort.c:142)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==    by 0x40279F: future_get (threadpool.c:112)
==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==    by 0x402450: thread_work (threadpool.c:233)
==31341==    by 0x4C3083E: mythread_wrapper (hg_intercepts.c:389)
==31341==    by 0x4E42DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==31341==    by 0x5355CEC: clone (in /usr/lib64/libc-2.17.so)
==31341==  Block was alloc'd by thread #3
==31341==
==31341== Possible data race during read of size 4 at 0x5990880 by thread #2
==31341== Locks held: 1, at address 0x5990828
==31341==    at 0x4023A9: thread_work (threadpool.c:229)
==31341==    by 0x4C3083E: mythread_wrapper (hg_intercepts.c:389)
==31341==    by 0x4E42DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==31341==    by 0x5355CEC: clone (in /usr/lib64/libc-2.17.so)
==31341==
==31341== This conflicts with a previous write of size 4 by thread #3
==31341== Locks held: 1, at address 0x5990828
==31341==    at 0x4027B3: future_get (threadpool.c:114)
==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==    by 0x40279F: future_get (threadpool.c:112)
==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
==31341==    by 0x40279F: future_get (threadpool.c:112)
==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==  Address 0x5990880 is 128 bytes inside a block of size 152 alloc'd
==31341==    at 0x4C2CD95: calloc (vg_replace_malloc.c:711)
==31341==    by 0x4026A1: thread_pool_submit (threadpool.c:84)
==31341==    by 0x402012: qsort_internal_parallel (quicksort.c:142)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==    by 0x40279F: future_get (threadpool.c:112)
==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
==31341==    by 0x402450: thread_work (threadpool.c:233)
==31341==    by 0x4C3083E: mythread_wrapper (hg_intercepts.c:389)
==31341==    by 0x4E42DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==31341==    by 0x5355CEC: clone (in /usr/lib64/libc-2.17.so)
==31341==  Block was alloc'd by thread #3
==31341==
==31341== ----------------------------------------------------------------



Re: Helgrind detects race with same lock

Philippe Waroquiers
You might have been unlucky and hit a lock that was freed and then
re-used.

See this extract from the mk_LockP_from_LockN comments:
   So we check that each LockN is a member of the admin_locks double
   linked list of all Lock structures.  That stops us prodding around
   in potentially freed-up Lock structures.  However, it's not quite a
   proper check: if a new Lock has been reallocated at the same
   address as one which was previously freed, we'll wind up copying
   the new one as the basis for the LockP, which is completely bogus
   because it is unrelated to the previous Lock that lived there.
   Let's hope that doesn't happen too often.
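
For instance, a pattern like this can trigger it (just a sketch of
the shape, not your code; a static mutex that is destroyed and
re-initialised has the same effect):

#include <pthread.h>
#include <stdlib.h>

int main(void)
{
   pthread_mutex_t *m = malloc(sizeof *m);
   pthread_mutex_init(m, NULL);
   /* ... lock/unlock m from several threads ... */
   pthread_mutex_destroy(m);
   free(m);
   m = malloc(sizeof *m);          /* may well return the same address */
   pthread_mutex_init(m, NULL);    /* a new, unrelated lock */
   return 0;
}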

Do you have a small reproducer for this?
Philippe


On Mon, 2017-05-29 at 17:33 +0000, William Good wrote:

> [original message and Helgrind output snipped; quoted in full above]




Re: Helgrind detects race with same lock

William Good

So it is actually two different locks that just happen to occupy the same address at different times?  Usually, Helgrind indicates when each lock was first observed, but there is no mention of a second lock here.  No, my reproducer is fairly large.




From: Philippe Waroquiers <[hidden email]>
Sent: Monday, May 29, 2017 5:20 PM
To: William Good
Cc: [hidden email]
Subject: Re: [Valgrind-users] Helgrind detects race with same lock

> [quoted reply snipped; see Philippe's message above]






Re: Helgrind detects race with same lock

Philippe Waroquiers
On Wed, 2017-05-31 at 18:26 +0000, William Good wrote:
> So it is actually two different locks that just happen to occupy the
> same address at different times?  Usually, helgrind indicates when
> each lock was first observed but there is no mention of a second lock.
To verify this hypothesis, you might run with -v -v -v.
Each time a lock is pthread_mutex_init-ed, you should see a line
such as:
client request: code 48470103,  addr 0x5400040,  len 0
The request code corresponds to the client-request enum defined
in helgrind.h: 0x103 = 256 + 3, which is
    _VG_USERREQ__HG_PTHREAD_MUTEX_INIT_POST

If you see such a line twice with the same addr, that indicates
two initialisations of a mutex at the same address, and the
comment quoted in my previous mail makes me believe Helgrind
does not handle that very cleanly.
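
Concretely, something like this should show them (the program
name is just a placeholder):

  valgrind --tool=helgrind -v -v -v ./your_prog 2>&1 \
      | grep 'client request: code 48470103'

If the same addr appears on two of those lines, the mutex at
that address was initialised twice.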


>   No, my reproducer is fairly large.
That is not a surprise :).
If the problem is effectively linked to the re-creation of
another mutex at the same address, then I think a small
reproducer should be easy to write.

But let's first confirm you see two initialisations.

You might also try with --tool=drd to see if DRD confirms
the race condition.
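
For example (program name again just a placeholder):

  valgrind --tool=drd ./your_prog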

Philippe

> [rest of quoted message snipped]




Re: Helgrind detects race with same lock

Philippe Waroquiers
On Wed, 2017-05-31 at 23:54 +0200, Philippe Waroquiers wrote:
> If the problem is effectively linked to re-creation of
> another mutex at the same addr, then i think a small
> reproducer should be easy to write.

The small program below reproduces the behaviour you have
seen: a race condition is reported between two threads while
Helgrind reports that they are holding the same lock.

But in reality, the lock was destroyed and re-created.
Helgrind falsely believes it is the same lock (and falsely
reports the re-creation as the first time the lock was
observed).

So it is probable that what you see is a similar case.  In the
absence of any synchronisation other than this lock (falsely
presented as a common one), you might have a real race
condition.

Philippe
 
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>

pthread_mutex_t mx1;   /* protects x */
int x = 0;

void* child_fn ( void* arg )
{
   int r;
   int destroy;
   printf("child_fn\n");
   r= pthread_mutex_lock(&mx1);  assert(!r);
   x += 1;               /* both threads access x while holding mx1 */
   destroy = x == 1;     /* only the first thread through destroys the lock */
   r= pthread_mutex_unlock(&mx1);  assert(!r);
   if (destroy) {
      /* Destroy mx1 and re-create it at the same address.  The new
         mutex is unrelated to the old one, so as far as Helgrind can
         tell, the second thread's access to x is not ordered after
         the first thread's. */
      printf("destroy/recreate mx1\n");
      r= pthread_mutex_destroy(&mx1);  assert(!r);
      r= pthread_mutex_init(&mx1, NULL);  assert(!r);
   }
   printf("child_fn returning ...\n");
   return NULL;
}

void* child_fn2 ( void* arg )
{
   sleep (20);           /* ensure this thread runs well after child1 */
   child_fn ( arg );
   return NULL;
}

int main ( int argc, char** argv )
{
   pthread_t child1, child2;
   int r;

   r= pthread_mutex_init(&mx1, NULL);  assert(!r);
   printf("creating threads\n");
   r= pthread_create(&child1, NULL, child_fn, NULL);  assert(!r);
   r= pthread_create(&child2, NULL, child_fn2, NULL);  assert(!r);
   printf("sleeping 5\n");
   sleep (5);

   printf("joining child1\n");
   r= pthread_join(child1, NULL);  assert(!r);
   printf("joining child2\n");
   r= pthread_join(child2, NULL);  assert(!r);
   printf("end\n");

   return 0;
}
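
To see the report, compile and run it under Helgrind (the file
name here is just an example):

  gcc -g -pthread lock_reuse.c -o lock_reuse
  valgrind --tool=helgrind ./lock_reuse

Helgrind then reports a possible data race on x, with both
accesses made while a lock at the same address (mx1) was held.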


