Callgrind not detecting threads


Mike Lui
I posted this on the valgrind-dev mailing list a while back but didn't receive a response. Hopefully some users have some input.

I'm working on a project that leverages Callgrind to generate VEX IR traces. I'm using Valgrind 3.12.0.
I also use Callgrind's infrastructure to detect when Valgrind switches thread contexts, but I'm getting unexpected behavior.

It looks like the best place to detect a thread context switch in Callgrind is in CLG_(setup_bbcc) in bbcc.c (line 561):

  /* This is needed because thread switches can not reliable be tracked
   * with callback CLG_(run_thread) only: we have otherwise no way to get
   * the thread ID after a signal handler returns.
   * This could be removed again if that bug is fixed in Valgrind.
   * This is in the hot path but hopefully not to costly.
   */
  tid = VG_(get_running_tid)();
#if 1
  /* CLG_(switch_thread) is a no-op when tid is equal to CLG_(current_tid).
   * As this is on the hot path, we only call CLG_(switch_thread)(tid)
   * if tid differs from the CLG_(current_tid).
   */
  if (UNLIKELY(tid != CLG_(current_tid)))
     CLG_(switch_thread)(tid);

The above is called for every instrumented basic block.
I've noticed strange behavior: a thread switch is not always detected.

To investigate further, I modified the above as follows:
- if (UNLIKELY(tid != CLG_(current_tid)))
+ if (UNLIKELY(tid != CLG_(current_tid))) {
     CLG_(switch_thread)(tid);
+    VG_(printf)("Thread switched to: %d\n", tid);
+ }

  • With this change, I ran the PARSEC 3.0 blackscholes benchmark with 4 threads on the input_test.tar input, expecting to see five thread IDs printed (1-5: one master and four worker threads).
  • With default flags, all 5 threads are printed.
  • With --fair-sched=yes, the last thread (5) is often not printed.
  • I confirmed this by printing VG_(get_running_tid)() at every instrumented basic block.
  • I also confirmed it with Callgrind's --separate-threads=yes, which then produces only 4 per-thread output files instead of 5.
  • I know the thread switch happened; otherwise the application would have failed.
This does not happen on every run, but it happens on the majority of runs. I also noticed that if I put a print statement in the blackscholes worker thread, the unexpected behavior shows up far less often. My guess is that it has to do with a thread exiting very quickly because it has too little work to do.

Is this considered a bug? If not, how do I detect every change of the Valgrind thread context? I saw an older mailing-list thread on this topic, but I'm not sure whether there has been any progress since.

$ uname -a
Linux ubuntu-VirtualBox 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Steps to reproduce:
mkdir detect_thread_switch && cd detect_thread_switch
parsec-3.0/bin/parsecmgmt -a build -p blackscholes -c gcc-pthreads
tar xf parsec-3.0/pkgs/apps/blackscholes/inputs/input_test.tar

# MAKE THE CHANGE TO bbcc.c TO PRINT THREAD ID ON THREAD SWITCH

cd valgrind-3.12.0 && ./autogen.sh && ./configure
make -j4 && cd ..

# WILL SHOW THREADS 1-5
valgrind-3.12.0/vg-in-place --tool=callgrind parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes 4 in_4.txt prices.txt

# MAY HAVE TO RUN SEVERAL TIMES IN SUCCESSION, WILL EVENTUALLY BE MISSING THREAD 5
valgrind-3.12.0/vg-in-place --fair-sched=yes --tool=callgrind parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes 4 in_4.txt prices.txt

--------------------------------------------------------
Some example output with default flags:

==3382== Callgrind, a call-graph generating cache profiler
==3382== Copyright (C) 2002-2015, and GNU GPL'd, by Josef Weidendorfer et al.
==3382== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==3382== Command: parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes 4 in_4.txt prices.txt
==3382== 
==3382== For interactive control, run 'callgrind_control -h'.
PARSEC Benchmark Suite Version 3.0-beta-20150206
Num of Options: 4
Num of Runs: 100
Size of data: 160
Thread switched to: 4
Thread switched to: 3
Thread switched to: 2
Thread switched to: 1
Thread switched to: 5
Thread switched to: 4
Thread switched to: 1
==3382== 
==3382== Events    : Ir
==3382== Collected : 569502
==3382== 
==3382== I   refs:      569,502

With --fair-sched=yes:
==3375== Callgrind, a call-graph generating cache profiler
==3375== Copyright (C) 2002-2015, and GNU GPL'd, by Josef Weidendorfer et al.
==3375== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==3375== Command: parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes 4 in_4.txt prices.txt
==3375== 
==3375== For interactive control, run 'callgrind_control -h'.
PARSEC Benchmark Suite Version 3.0-beta-20150206
Num of Options: 4
Num of Runs: 100
Size of data: 160
Thread switched to: 2
Thread switched to: 1
Thread switched to: 3
Thread switched to: 2
Thread switched to: 1
Thread switched to: 4
Thread switched to: 3
Thread switched to: 1
Thread switched to: 4
Thread switched to: 2
Thread switched to: 1
Thread switched to: 2
Thread switched to: 1
==3375== 
==3375== Events    : Ir
==3375== Collected : 569505
==3375== 
==3375== I   refs:      569,505

Thanks!


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/valgrind-users
Re: Callgrind not detecting threads

Philippe Waroquiers
On Wed, 2017-06-21 at 19:17 +0000, Mike Lui wrote:

> I posted this on the valgrind-dev mailing list a while back but didn't
> receive a response. Hopefully some users have some input.
>
>
> I'm working on a project that leverages Callgrind to generate VEX IR
> traces. I'm using Valgrind 3.12.0.
> I also use Callgrind's infrastructure to detect when Valgrind switches
> thread contexts, however I'm getting unexpected behavior.
>
>
> It looks like the best place to detect a thread context switch
> in Callgrind is in CLG_(setup_bbcc) in bbcc.c  (line 561):
>
>
>   /* This is needed because thread switches can not reliable be
> tracked
>    * with callback CLG_(run_thread) only: we have otherwise no way to
> get
>    * the thread ID after a signal handler returns.
>    * This could be removed again if that bug is fixed in Valgrind.
>    * This is in the hot path but hopefully not to costly.
>    */
>   tid = VG_(get_running_tid)();
> #if 1
>   /* CLG_(switch_thread) is a no-op when tid is equal to
> CLG_(current_tid).
>    * As this is on the hot path, we only call CLG_(switch_thread)(tid)
>    * if tid differs from the CLG_(current_tid).
>    */
>   if (UNLIKELY(tid != CLG_(current_tid)))
>      CLG_(switch_thread)(tid);
>
>
> The above is called every instrumented basic block.
> I've noticed strange behavior, where a thread switch would not always
> be detected.
> I detected the unexpected behavior with the following modifications:
>
>
> To investigate further, I modified the above:
> - if (UNLIKELY(tid != CLG_(current_tid)))
>
> + if (UNLIKELY(tid != CLG_(current_tid))) {
>      CLG_(switch_thread)(tid);
> +    VG_(printf)("Thread switched to: %d\n", tid);
> + }
>
Adding --trace-sched=yes might help to understand what happens.
Also, --trace-signals=yes might be useful if the problem is signal
related.

Philippe




Re: Callgrind not detecting threads

Mike Lui
It appears that thread IDs are reused. That is, if thread 2 exits and another thread starts, the new thread is given ID 2 again.

Is there any way to detect this during instrumentation, or is this a limitation of the Valgrind tools?
When trying to separate the work of each thread, having VG_(get_running_tid)() report non-unique IDs is troublesome.

Mike 

Re: Callgrind not detecting threads

John Reiser
On 06/21/2017 Mike Lui wrote:
> It appears the thread ids are reused. That is, if thread 2 exits, and another starts, then thread 2 is reused.
>
> Is there anyway to detect this during instrumentation, or is this a limitation of the Valgrind tools?
> When trying to separate the work of each thread, having VG_(get_running_tid)() report non-unique ID's is troublesome.

C'mon.  The tools re-use thread IDs because that is the easiest
and most efficient way to track the running threads.
If no re-use, then the next easiest way is to increment forever.
Then the threadID cannot be an 'unsigned short', and there must be
an adaptive hash table (or other non-trivial lookup mechanism)
from threadID to internal thread structure, and the hash table
must allow frequent deletions.

Modify the source to suit yourself.  You will see how
un-worthwhile the modifications are for the existing use cases.

Re: Callgrind not detecting threads

Mike Lui
Hi John, I appreciate you taking the time to comment. I agree with your assessments and wanted to comment on a few points. 

> C'mon.  The tools re-use thread IDs because that is the easiest
> and most efficient way to track the running threads.
> If no re-use, then the next easiest way is to increment forever.
> Then the threadID cannot be an 'unsigned short', and there must be
> an adaptive hash table (or other non-trivial lookup mechanism)
> from threadID to internal thread structure, and the hash table
> must allow frequent deletions.

To clarify, I am not suggesting a change to make every thread hash to a unique ID.
I was asking whether there are any best-known methods for detecting a new thread in the Valgrind scheduler. For example, it looks like tools such as Callgrind rely on calling VG_(get_running_tid)() at every basic block to detect different threads. Because IDs are reused, this method can produce misleading information by conflating metadata from different threads.

Notably, Linux approaches PID generation by incrementing until the max ID is reached, and then wrapping around and reusing any available IDs. Although I don't know the internals of how the kernel achieves this, it can be naively tracked with an 8 KB bit vector for a 16-bit thread ID (2^16 bits = 8192 bytes).
I'm unaware of how Valgrind currently tracks thread IDs.

> Modify the source to suit yourself.  You will see how
> un-worthwhile the modifications are for the existing use cases.

Again, I'm not contending for Valgrind internal changes.  I would contend, however, that being able to reliably detect when a thread starts/stops within instrumented code is valuable.

I've seen the track_{start,stop}_client_code callbacks suggested, but I've also seen that VG_(get_running_tid)() is supposed to be more reliable. e.g. http://www.mail-archive.com/valgrind-users@.../msg03441.html 

Thank you again,
Mike

Re: Callgrind not detecting threads

Philippe Waroquiers
Helgrind maintains a unique thread number by intercepting pthread_create,
because Helgrind needs to be able to talk about terminated threads
and so cannot use the tid: the tid is reused, and a new thread gets
the lowest free tid value.

See evh__pre_thread_ll_create in hg_main.c, hooked up with
 VG_(track_pre_thread_ll_create)( evh__pre_thread_ll_create );

With this, your tool could maintain an array indexed by tid that
maps to a unique thread number for every thread ever created (even if
it has died since then).

Philippe


Re: Callgrind not detecting threads

Mike Lui
The implementation in helgrind looks like a good way to go about this. This is what I'm looking for. 

Thanks!
Mike
 