Profiling using gprof on Teensy 4

I created a project to partially support gprof-style profiling of applications on Teensy 4. This shows how much time is spent in each function. In theory, this could also work on Teensy 3, but it doesn't have enough memory to do it well so I didn't implement that yet. This is the repository: https://github.com/ftrias/TeensyProf

It only supports the histogram feature of gprof. The Teensyduino ARM cross-compiler does not support the more advanced features (but it could! ...if Paul includes the gprof executable and libc_p.a). You will need to make a few changes to the Arduino setup, as described in the README.md file.

Overview
-------------

The profiler samples the current instruction every 1 millisecond. That is, every millisecond it checks which function is currently running and increments a counter for it. Over time, this gives an approximate measure of how much time each function consumes.
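To make the sampling idea concrete, here is a minimal desktop-side sketch in Python. It is not the library's code: the function names, addresses, and the 4-byte bin width are made up for illustration, and the real implementation may bin samples differently.

```python
def make_histogram(samples, low_pc, high_pc, bin_bytes=4):
    """Count sampled program-counter values into fixed-width address bins."""
    nbins = (high_pc - low_pc + bin_bytes - 1) // bin_bytes
    bins = [0] * nbins
    for pc in samples:
        if low_pc <= pc < high_pc:  # ignore samples outside the code range
            bins[(pc - low_pc) // bin_bytes] += 1
    return bins


def time_per_function(bins, low_pc, functions, bin_bytes=4, tick_ms=1):
    """Sum the bins covered by each function; one sample is worth tick_ms."""
    result = {}
    for name, (start, end) in functions.items():
        first = (start - low_pc) // bin_bytes
        last = (end - low_pc + bin_bytes - 1) // bin_bytes
        result[name] = sum(bins[first:last]) * tick_ms
    return result


# Hypothetical address ranges (start, end) for two functions.
FUNCTIONS = {"loop": (0x100, 0x140), "fft": (0x140, 0x200)}

# Ten pretend 1 ms samples: three land in loop(), seven in fft().
SAMPLES = [0x104, 0x108, 0x13C,
           0x140, 0x150, 0x160, 0x170, 0x180, 0x1A0, 0x1FC]

bins = make_histogram(SAMPLES, low_pc=0x100, high_pc=0x200)
profile = time_per_function(bins, 0x100, FUNCTIONS)
print(profile)  # {'loop': 3, 'fft': 7}
```

With only ten samples the numbers are very coarse; the estimate only becomes meaningful after the program has run long enough to collect thousands of samples.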

On desktops, gprof will also keep track of which function calls which, but that is not supported on Teensy 4.

After a while, a file with the function counters is sent out over the serial port. A Python program listening on that port captures it and writes out "gmon.out". gprof then cross-references this file with the original executable to generate a table of execution times.
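For readers curious what a histogram-only "gmon.out" contains, the following is a rough sketch that builds one in Python. It follows the GNU gmon version 1 layout as I understand it from glibc's gmon_out.h (a 20-byte header followed by a tagged histogram record), assuming 32-bit little-endian addresses. Whether TeensyProf emits exactly this layout is an assumption; gprof also accepts an older BSD-style header.

```python
import struct

GMON_TAG_TIME_HIST = 0  # record tag for a time histogram


def build_gmon(low_pc, high_pc, bins, prof_rate=1000):
    """Return the bytes of a histogram-only gmon.out (assumed GNU v1 layout)."""
    # File header: magic cookie, version 1, 12 spare bytes.
    header = struct.pack("<4sI12x", b"gmon", 1)
    # Histogram record: tag, address range, bin count, samples per second,
    # a 15-byte dimension name and a 1-byte abbreviation.
    hist_hdr = struct.pack(
        "<BIIII15sc",
        GMON_TAG_TIME_HIST, low_pc, high_pc, len(bins), prof_rate,
        b"seconds", b"s",
    )
    # Counters are 16-bit, so saturate rather than overflow.
    counters = struct.pack("<%dH" % len(bins),
                           *(min(c, 0xFFFF) for c in bins))
    return header + hist_hdr + counters


data = build_gmon(low_pc=0x100, high_pc=0x200, bins=[1, 2, 3])
print(len(data))  # 59
```

A prof_rate of 1000 samples per second matches the 1 millisecond sampling interval described above.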

The output you show is generated by the Linux/Mac/Win executable that is listening on a serial port, right?

I should have explained more. When compilation finishes, the build copies the ELF file to /tmp/build.elf. The program then uploads and runs as normal. When profiling ends, the Teensy (using the library) sends specially formatted data to the serial port. A Python program on the desktop listens on that port, detects the special codes, and writes out the gmon.out file. It then runs gprof on it, cross-referencing the /tmp/build.elf binary.
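The last step, running gprof against the saved ELF file, could look roughly like the sketch below. The executable name is an assumption: depending on your toolchain it may need to be an ARM-aware build such as `arm-none-eabi-gprof` rather than plain `gprof`. The `-b` (brief) and `-p` (flat profile) options are standard gprof flags; the flat profile is the part that histogram-only data can fill in.

```python
import shutil
import subprocess


def gprof_command(elf_path="/tmp/build.elf", gmon_path="gmon.out",
                  gprof="gprof"):
    """Build the gprof command line for a brief flat profile."""
    return [gprof, "-b", "-p", elf_path, gmon_path]


def run_gprof(elf_path="/tmp/build.elf", gmon_path="gmon.out",
              gprof="gprof"):
    """Run gprof and return its report text, or None if gprof is not found."""
    cmd = gprof_command(elf_path, gmon_path, gprof)
    if shutil.which(cmd[0]) is None:
        return None
    done = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return done.stdout
```

Calling `run_gprof(gprof="arm-none-eabi-gprof")` would then return the report as a string, provided that executable is on your PATH.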

Modify `imxrt1062.ld` in the directory Arduino.../Contents/Java/hardware/teensy/avr/cores/teensy4. All references to `.text.itcm` must be changed to `.text`. Gprof expects the code to be in a segment named `.text`.
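If you would rather not edit the linker script by hand, a small script can do the substitution. This is only a sketch of the literal replacement described above; it keeps a backup copy (the `.orig` suffix is my own choice) and you still need to point it at the `imxrt1062.ld` in your own installation.

```python
from pathlib import Path


def retarget_text_section(script_text):
    """Replace every reference to .text.itcm with .text."""
    return script_text.replace(".text.itcm", ".text")


def patch_linker_script(path, backup_suffix=".orig"):
    """Patch the file in place, saving the original once as a backup.

    Returns the number of references that were replaced.
    """
    script = Path(path)
    original = script.read_text()
    patched = retarget_text_section(original)
    if patched != original:
        backup = script.with_name(script.name + backup_suffix)
        if not backup.exists():  # never overwrite an earlier backup
            backup.write_text(original)
        script.write_text(patched)
    return original.count(".text.itcm")
```

Running it a second time is harmless: nothing matches any more, so the file is left untouched and the function returns 0.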

It works similarly to my last attempt. The main difference is that it also implements the call hierarchy and thus you can see which function calls which and the total cumulative time used by functions. This is very helpful.

Setup is a little more involved and requires modifying Teensyduino files `boards.txt`, `platform.txt` and `imxrt1062.ld`. See README.md for details.

`imxrt1062.ld` is modified to put all the code in a segment called `.text`. gprof is hard-coded to expect this because it's an almost universal convention. Teensy 3 does this. But for some reason, Teensy 4 does not.

Please post here if you try. I'm curious to know how it works for people.

Wanted to try in steps and started with the following (I'm using a makefile and VS Code, not the Arduino IDE):

Board: Teensy 3.2
Changed/added compile flags: -O0 -pg

Result:
Compiling a test sketch, your library and linking does work without problems.
But: The program seems to badly mess up something. The Teensy vanishes from the USB bus and a simple blink does not work anymore. Need to press the program button to revive the Board.

If I link with -pg it complains about the missing libc_p (which was expected, I think).

Any ideas?

One theory: I believe this happens when you use the "-pg" option in the link stage. This is why the "boards.txt" file adds a new parameter "profile" with "-O0 -pg" and then only adds it in "platform.txt" in the compile stage of C++ files. If you just add it to the "optimize" parameter in "boards.txt" it won't work.

Ok, writes weird stuff on Serial after the waiting period.
Next step will be to generate a file out of this. Don't want to install python for that. Is it just binary data which needs to be stored to a file?

You can look at TeensyFile.cpp for the very simple format used.

You can also configure it to write out the file using hexadecimal and then use something like hex2bin to convert it to a file.
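If you do end up with Python available after all, the hex-to-binary step is only a few lines. This sketch assumes the captured text contains nothing but hex digits and whitespace; any start or end markers the library adds around the data (I have not checked the exact framing) would need to be stripped first.

```python
def hex_to_bytes(text):
    """Convert captured hex text (whitespace allowed) into raw bytes."""
    digits = "".join(text.split())  # drop spaces and line breaks
    return bytes.fromhex(digits)    # raises ValueError on non-hex input


def save_hex_capture(text, out_path="gmon.out"):
    """Write the decoded bytes to a file and return how many were written."""
    data = hex_to_bytes(text)
    with open(out_path, "wb") as out:
        out.write(data)
    return len(data)


print(hex_to_bytes("67 6d 6f 6e\n01000000"))  # b'gmon\x01\x00\x00\x00'
```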

One remark:
It would be great if one could pass a Stream into the file generation instead of the hardcoded Serial as output. I more or less always use Serial for PC/Teensy command communication, so having the file spit out on USB Serial is rather troublesome.

Very interesting. You can, of course, add your own implementation of the TeensyProf_open(), TeensyProf_close() and TeensyProf_write() functions to send the output to where you want. They're very simple and emulate the standard library open(), close() and write(). I have included 4 implementations: SD card, Serial using a binary format, Serial using hex, and Midi using Sysex messages.