This is fncdump.info, produced by makeinfo version 4.0 from fncdump.texi. START-INFO-DIR-ENTRY * fncdump: (fncdump). Profiling your program's execution END-INFO-DIR-ENTRY This file documents the profiler called fncdump. Copyright (C) 2001, Yannick Perret. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions.  File: fncdump.info, Node: Top, Next: Introduction, Up: (dir) Profiling a Program: Where Does It Spend Its Time? ************************************************** This manual describes the profiler called `fncdump', and how you can use it to determine which parts of a program are taking most of the execution time. We assume that you know how to write, compile, and execute programs. `fncdump' was written by Yannick Perret. Note: this info file is based on `gprof''s one. Thanks to authors. * Menu: * Introduction:: What profiling means, and why it is useful. * Compiling:: How to compile your program for profiling. * Executing:: Executing your program to generate profile data * Invoking:: How to run `fncdump', and its options * Output:: Interpreting `fncdump''s output * Details:: Details of how profiling is done (`fnccheck')  File: fncdump.info, Node: Introduction, Next: Compiling, Prev: Top, Up: Top Introduction to Profiling ************************* Profiling allows you to learn where your program spent its time and which functions called which other functions while it was executing. This information can show you which pieces of your program are slower than you expected, and might be candidates for rewriting to make your program execute faster. It can also tell you which functions are being called more or less often than you expected. This may help you spot bugs that had otherwise been unnoticed. Since the profiler uses information collected during the actual execution of your program, it can be used on programs that are too large or too complex to analyze by reading the source. However, how your program is run will affect the information that shows up in the profile data. If you don't use some feature of your program while it is being profiled, no profile information will be generated for that feature. `fncdump' profiler is divided into two part. The first one is the `fnccheck' library, which needs to be linked with the program to be profiled. The second part is the `fncdump' program itself which extract and display profile information for your program. Profiling has several steps: * You must compile and link your program with profiling enabled. *Note Compiling::. * You must execute your program to generate a profile data file. *Note Executing::. * You must run `fncdump' to analyze the profile data. *Note Invoking::. The next three chapters explain these steps in greater detail. Several forms of output are available from the analysis. The "flat profile" shows how much time your program spent in each function, and how many times that function was called. If you simply want to know which functions burn most of the cycles, it is stated concisely here. *Note Flat Profile::. The "call graph" shows, for each function, which functions called it or which other functions it called. *Note Call Graph::. To better understand how profiling works, you may wish to read a description of its implementation. *Note Implementation::.  File: fncdump.info, Node: Compiling, Next: Executing, Prev: Introduction, Up: Top Compiling a Program for Profiling ********************************* The first step in generating profile information for your program is to compile and link it with profiling enabled. Remember that `fncdump' can only be used with `gcc' V2.95.2 or higher. To compile a source file for profiling, specify the `-finstrument-functions -g' options when you run the compiler (This is in addition to the options you normally use). To link the program for profiling, add the `fnccheck' library to your link command (`-lfnccheck'). Here are examples: gcc -finstrument-functions -g -c myprog.c utils.c gcc -o myprog myprog.o utils.o -lfnccheck The `-finstrument-function -g' options also works with a command that both compiles and links: gcc -o myprog myprog.c utils.c -finstrument-functions -g -lfnccheck If you compile only some of the modules of the program with `-finstrument-function -g', you can still profile the program, but you won't get complete information about the modules that were compiled without `-finstrument-function -g'. The only information you get is the time spend in them, seen as local time for the calling functions.  File: fncdump.info, Node: Executing, Next: Invoking, Prev: Compiling, Up: Top Executing the Program ********************* Once the program is compiled for profiling, you must run it in order to generate the information that `fncdump' needs. Simply run the program as usual, using the normal arguments, file names, etc. The program should run normally, producing the same output as usual. It will, however, run somewhat slower than normal because of the time spent collecting and the writing the profile data. The way you run the program--the arguments and input that you give it--may have a dramatic effect on what the profile information shows. The profile data will describe the parts of the program that were activated for the particular input you use. For example, if the first command you give to your program is to quit, the profile data will show the time used in initialization and in cleanup, but not much else. Your program will write the profile data into a file called `fnccheck.out' just before exiting. If there is already a file called `fnccheck.out', its contents are overwritten. You can specify an other file name by setting the shell variable `FNCCHK_OUTFILE' with the needed file name. If your program uses `fork()' function, two processes will have to dump their data. In this case, the main process will dump into `fnccheck.out' or whatever name you set, and the child(s) process(es) will dump into `fnccheck.out_PID_OF_CHILD'. You can prevent childs to dump by using the shell variable `FNCCHK_NOCHILD'. The `fnccheck.out' is written when returning from `main' or by calling `exit'. Some signals are catched to perform file writing on kills, at the condition that the program don't redirect them. The `fnccheck.out' file is written in the program's _current working directory_ at the time it exits. This means that if your program calls `chdir', the `fnccheck.out' file will be left in the last directory your program `chdir''d to. If you don't have permission to write in this directory, the file is not written, and you will get an error message. Several shell variables can be used to change profile behavior: * `FNCCHK_STOP': if set, all profiling treatments are deactivated. It is useful to run a compiled program in a normal way. * `FNCCHK_OUTFILE': if set, replaces the profile data file name (default is `./fnccheck.out'). * `FNCCHK_NOCHILD': if set, childs of the process are not dumped to file at all. * `FNCCHK_STACKSIZE': set the default stack size used in the profiler. Default value is 1000000. * `FNCCHK_NBFUNC': set the default function table size used in the profiler. Default value is 1024. * `FNCCHK_DYNAMIC': the profiler default behavior is to generate an error if stack is full or if the maximum number of functions is reached. If `FNCCHK_DYNAMIC' is set, reallocation of these tables is performed during execution, but it generates bias in time computation. * `FNCCHK_DEBUG': ORed values of debug options. Current values are: 2: display collisions for the hash table at the end. 4: details on each fnccheck functions. 8: details on treatments. Note: you have to compile the library with debug activated.  File: fncdump.info, Node: Invoking, Next: Output, Prev: Executing, Up: Top `fncdump' Command Summary ************************* After you have a profile data file `fnccheck.out', you can run `fncdump' to interpret the information in it. The `fncdump' program prints a flat profile and a call graph on standard output. Typically you would redirect the output of `fncdump' into a file with `>'. You run `fncdump' like this: fncdump [OPTIONS] EXECUTABLE-FILE [> OUTFILE] Here square-brackets indicate optional arguments. * Menu: * Output Options:: Controlling `fncdump''s output style * Misc Options:: Other options * Symspecs:: Specifying functions to include or exclude  File: fncdump.info, Node: Output Options, Next: Misc Options, Up: Invoking Output Options ============== These options specify which of several output formats `fncdump' should produce. `-avg' `-average' The `-avg' option causes `fncdump' to display an estimated over-cost time of `fnccheck' library, expressed in time per function. `-sfile ' If the `-sfile' option is given, `fncdump' uses `' as profile data file instead of default `./fnccheck.out'. `-sort / +sort ' The `-sort' specify how `fncdump' sorts the flat profile. Available sorts are: 1: sorted by local time. 2: sorted by total time (default). 3: sorted by number of calls. 4: sorted by functions name. 5: not sorted (execution order). The `+sort' option reverse the sort order. `-calls / +calls' This option causes `fncdump' to display calls list of each function, instead of displaying list of callers of each functions. The `+calls' option display both the two lists. `-no-minmax' This option causes `fncdump' not to display MIN and MAX time list. This list shows for each function the MIN and MAX total time spend in it. `-func-details' The `-func-details' option causes `fncdump' to print a list of each function, with its corresponding file and line number. This option can't be used with `-nm' option. `-call-details / -calls-details' The `-call-details' option causes `fncdump' to display file and line number of callers site in calls list. This option can't be used with `-nm' option. `-fullname' Used with `-func-details' or `-call-details', causes `fncdump' to print full pathnames for files instead of basenames. `-call-graph' This option causes `fncdump' to output the call-graph of the profiled program, using `VCG' format (a GPL call-graph displayer). No other information is then displayed. `-cycles' This option causes `fncdump' to add the detected cycles during execution. Two kind of cycles can be displayed. Simple recursions (a function calls itself), and cycles, in which case the begin and the end of the cycle are displayed. `-sum-profile 'f1.out' 'f2.out' 'fr.out'' This option causes `fncdump' to add the profile data files `f1.out' and `f2.out' contents, and to generates a new profile data file called `fr.out'. If `fr.out' is `-', `stdout' is used. If you sum files from different programs or different compilations, the new profile data file will have very few sence. Note: you can use the same name for `fr.out' as for `f1.out' or `f2.out' (overwrite is possible).  File: fncdump.info, Node: Misc Options, Next: Symspecs, Prev: Output Options, Up: Invoking Misc Options ============ These options control various aspects of the `fncdump' behavior. `-nm' This option indicates to `fncdump' to use `nm' program to extract function names from symbol addresses, instead of using library `bfd' from binutils. In this case, file and line number of functions are not available. `-addr2line' This option causes `fncdump' to use `addr2line' program to extract function names from symbol addresses, instead of using library `bfd' from binutils. Note: if library `bfd' is not available on your system, `fncdump' can be compiled without using it. In this case, default method is `addr2line'. `--help / -help / --h / -h' Causes `fncdump' to display a help message, with options list . `--version /-version / --v / -v' Causes `fncdump' to display its version number. `--misc' Causes `fncdump' to display author, contact and bug report information. `--details' Cause `fncdump' to display explanations on available options, output contents and data profile file.  File: fncdump.info, Node: Symspecs, Prev: Misc Options, Up: Invoking Symspecs ======== These options specify how to set or remove functions from `fncdump' display. `-only ' The `-only' option set which functions are to be displayed in the `fncdump' functions list. `' is a comma separated list of function names. Only these functions will be used to generate profile, call-graph... `-not ' The `-not' option set which functions are to be removed from the `fncdump' function list. `' is a comma separated list of function names. All these functions will be removed from functions list to generate profile, call-graph... `-propagate' This option indicates that functions set or removed (using `-only' or `-not' options) are propagated to childs. It means that if a function `f()' calls a function `g()', using options `-only f -propagate' will make both `f()' and `g()' to be used in functions list. In the same way, using options `-not f -propagate' will make both `f()' and `g()' to be removed from functions list. `-rpropagate' This option indicates that functions set or removed (using `-only' or `-not' options) are propagated to callers. It means that if a function `g()' is called by a function `f()', using options `-only g -rpropagate' will make both `f()' and `g()' to be used in functions list. In the same way, using options `-not g -rpropagate' will make both `f()' and `g()' to be removed from functions list. `-no-spontaneous' The `-no-spontaneous' option causes `fncdump' to not display functions that seem to be called spontaneously. This can happen if you profile functions that are called by non-profiled functions. Note: this option not apply to `main()' function. `-no-unresolved' The `-no-unresolved' option causes `fncdump' to not display functions for which names are not find in the executable. This can happen if you give an invalid executable name to `fncdump', or if you profile functions which are loaded at execution time (using `dlopen(...)' from `dlfcn.h'. In this case the functions addresses are not available in executable. The default behavior of `fncdump' is to replace unresolved function names by their symbol address (i.e. `<0x8049178>'). `-real-maxtime' The `-real-time' option causes `fncdump' to compute total execution time using final functions list (after `-not' or `-only' options applied). The total execution time is used to compute time percentages in flat profile display. WARNING: this option is desactivated in version 1.2, due to changes in memory representation of call-graph. It will be activated again in future release.  File: fncdump.info, Node: Output, Next: Details, Prev: Invoking, Up: Top Interpreting `fncdump''s Output ******************************* `fncdump' first displays the cycles detected during execution, if the option `-cycles' is activated. `fncdump' then displays various informations about the profiled program. `the executable name' This is the name of the profiled program. `Total execution time' Most of the time, this time is the total execution time of `main()' function. It can be different if `main()' is not in a profiled object, or if `main()' has been removed from functions list using `-not' or `-only' options. In this case the execution time is followed by the message `(not main() time)'. `The clock type used' Indicates which kind of clock was used to compute time. Possible values are `real clock time', which indicates that the 'user' clock is used; the second possible value is `system time', which indicates that CPU process time is used. `number of realloc' Indicates how many reallocations where performed during profile. This number is always zero if `FNCCHK_DYNAMIC' is not set. If this number is not zero, bias can be present in times (but it is not possible to know where the bias is). Then the flat profile is displayed (see below). After the flat profile, some other informations are displayed: `number of unresolved functions not shown' It is the number of hidden functions due to option `-no-unresolved'. `number of `-not/-only' functions not shown' It is the number of hidden functions due to options `-not' and `-only'. `final stack size' It is the effective stack size used during execution. `number of functions' It is the total number of functions. Then the MIN/MAX time table is displayed. For each function, its max and min total execution time is displayed. This table is not displayed if option `-no-minmax' is used. The final data displayed is the call-graph (see below). * Menu: * Flat Profile:: The flat profile shows how much time was spent executing directly in each function. * Call Graph:: The call graph shows which functions called which others, and how much time each function used when its subroutine calls are included.  File: fncdump.info, Node: Flat Profile, Next: Call Graph, Up: Output The Flat Profile ================ The flat profile gives for each function various informations, such as local time spend in the function, total execution time, number of calls... Here is a example of flat profile on a small program: | local | total | | | | sec. | % | sec. | % | calls |t. sec/call| name |-----------|-----|-----------|-----|----------|-----------|-------- | 0.068976| 0.3| 23.331874|100.0| 1| 23.331874| main | 5.056025| 21.7| 10.078856| 43.2| 4| 2.519714| recurs_a | 5.022831| 21.5| 8.822895| 37.8| 4| 2.205724| recurs_b | 4.037846| 17.3| 8.077933| 34.6| 4| 2.019483| recurs_1s | 5.105893| 21.9| 5.105893| 21.9| 1| 5.105893| test | 4.040068| 17.3| 4.040068| 17.3| 4| 1.010017| s1 | 0.000097| 0.0| 0.000180| 0.0| 32| 0.000006| recurs | 0.000020| 0.0| 0.000030| 0.0| 1| 0.000030| f3 | 0.000010| 0.0| 0.000010| 0.0| 3| 0.000003| f1 | 0.000006| 0.0| 0.000006| 0.0| 1| 0.000006| small The first two columns display the local time spend in each function. Local time is the total execution time of the function minus the time spend in (profiled) child functions. The second column is the percentage of this time regards to the total execution time. The next two columns display the total time spend in each function. The second column is the percentage of this time regards to the total execution time. The fifth column shows the number of calls during execution for each function. The next column indicate the average time per call, computed as the total execution time divided by the number of calls. The last column displays the corresponding function name.  File: fncdump.info, Node: Call Graph, Prev: Flat Profile, Up: Output The Call Graph ============== Call-graph shows for each function the list of its callers, or its childs if using option `-calls'. Here is a example of call-graph for a small program: 'main' [0] spontaneously called. 'recurs_a' [1] called by: [0], [2], 'recurs_b' [2] called by: [1], 'recurs_1s' [3] called by: [0], [3], 'test' [4] called by: [0], 's1' [5] called by: [3], 'recurs' [6] called by: [0], [6], 'f3' [7] called by: [0], 'f1' [8] called by: [7], 'small' [9] called by: [0], If option `-calls' is used, callers are replaced by list of child functions. For the same example it becomes: 'main' [0] calls: [1], [3], [4], [6], [7], [9], 'recurs_a' [1] calls: [2], 'recurs_b' [2] calls: [1], 'recurs_1s' [3] calls: [3], [5], 'test' [4] calls: nobody. 's1' [5] calls: nobody. 'recurs' [6] calls: [6], 'f3' [7] calls: [8], 'f1' [8] calls: nobody. 'small' [9] calls: nobody. If option `-call-details' is used, file and line number of each function and each caller is added to output. Here is an example for the same program: 'main' [0] (essai.c:128) spontaneously called. 'recurs_a' [1] (essai.c:68) called by: [0](essai.c:132), [2](essai.c:94), 'recurs_b' [2] (essai.c:82) called by: [1](essai.c:78), 'recurs_1s' [3] (essai.c:56) called by: [0](essai.c:134), [3](essai.c:61), 'test' [4] (essai.c:121) called by: [0](essai.c:139), 's1' [5] (essai.c:16) called by: [3](essai.c:58), 'recurs' [6] (essai.c:48) called by: [0](essai.c:130), [6](essai.c:51), 'f3' [7] (essai.c:40) called by: [0](essai.c:142), 'f1' [8] (essai.c:22) called by: [7](essai.c:41), 'small' [9] (essai.c:113) called by: [0](essai.c:145),  File: fncdump.info, Node: Details, Prev: Output, Up: Top Details of Profiling ******************** More details about how `fnccheck' library perform its job. * Menu: * Implementation:: How the library collects profiling information * File Format:: Format of `fnccheck.out' files * Internals:: `fnccheck''s internal operation * Debugging:: Using debug while executing programs  File: fncdump.info, Node: Implementation, Next: File Format, Up: Details Implementation of Profiling =========================== The `fnccheck' library uses a new feature of `gcc' V2.95.2 (or higher) called `-finstrument-functions'. This option makes `gcc' to add to every functions a call at enter and at exit. The `fnccheck' library gives these two functions called `void __cyg_profile_func_enter(void *this_fn, void *call_site)' and `void __cyg_profile_func_exit(void *this_fn, void *call_site)'. The `fnccheck' library manages a stack of function calls and a list of functions. For each of them, the number of calls, the time spend in them (local and total) is computed. Moreover a list of callers is associated to functions in order to manage call-graph. At exit time, all these data are dumped into a file `./fnccheck.out' in order to be used with `fncdump' program. Note: the `fnccheck' library only works with symbol addresses of functions, and not with names. Names are extracted from executable by `fncdump'.  File: fncdump.info, Node: File Format, Next: Internals, Prev: Implementation, Up: Details Profiling Data File Format ========================== The profile data file `./fnccheck.out' format is: HEADER # of elements f1_ptr #calls sec usec locsec locusec minsec minusec maxsec maxusec f2_ptr #calls sec usec locsec locusec minsec minusec maxsec maxusec ... # of realloc stack_size func_table_size nb_of_callers_f1 caller1_ptr rc1_ptr caller2_ptr rc2_ptr ... nb_of_callers_f2 caller1_ptr rc1_ptr caller2_ptr rc2_ptr ... ... time_mode exec_name HEADER : identifier (including file-format version) of stat-files # el : number of functions fi_ptr : symbol of the function nb_calls : number of calls for this function nb_sec : total time in the function nb_usec : corresponding usec _locsec : local time in the function _locusec : corresponding usec min_sec : minimum time spend in the function _subusec : corresponding usec min_sec : maximum time spend in the function _subusec : corresponding usec #realloc : number of 'realloc' performed during execution stack_size: final stack size used during execution func_tbl_s: number of function stored during execution nbofcaller: number of callers for the corresp. function calleri_pt: symbol of ith caller for the function rci_ptr : symbol of caller of the function time_mode : the FNCCHK_TIME type exec_name : corresp. exec name (not implemented)  File: fncdump.info, Node: Internals, Next: Debugging, Prev: File Format, Up: Details `fnccheck''s Internal Operation =============================== Here are the treatments performed by `fnccheck' library for each enter/exit of functions: At the first call: Initializations of hash-tables, options, stack. Trapping exits with `atexit()' and `signal()' functions. When entering a function: * check if the function is still registered, and register it if it is not the case. * add the call-site to callers list for this function * store the function and its entering time in the stack * compute the additional local time for the calling function. It is the time between the last active time of this function minus the current time. When exiting a function: * compute the total execution time (difference between entering time and current time) * increment number of calls * compute the additional local time for the calling function. It is the time between the last active time of this function minus the current time. * check if we are exiting `main()' function, and is yes call the dumping function. When leaving the program: Creates the profile data file and dumps data into it. Freed allocated tables and remove treatments (because in some cases, in particular for `C++' programs, calls to exit treatments can occur AFTER exiting from `main()'.  File: fncdump.info, Node: Debugging, Prev: Internals, Up: Details Debugging during execution -------------------------- In order to allow debug during execution, you have to compile the library `fnccheck' using target `dfnccheck'. This allows you to use the shell variable `FNCCHK_DEBUG'. Valid values are: 2 display collision in hash table; 4 details on `fnccheck' library functions; 8 details on `fnccheck' library treatments.  Tag Table: Node: Top847 Node: Introduction1753 Node: Compiling3968 Node: Executing5232 Node: Invoking8522 Node: Output Options9240 Node: Misc Options11941 Node: Symspecs13123 Node: Output15960 Node: Flat Profile18375 Node: Call Graph20320 Node: Details22619 Node: Implementation23041 Node: File Format24087 Node: Internals25670 Node: Debugging27136  End Tag Table