rvprof
A lightweight profiling library for RISC-V applications, inspired by vftrace. rvprof provides function-level timing analysis with minimal overhead and automatic instrumentation support.
Features
- Automatic Function Profiling: Uses Cygnus function hooks inserted via -finstrument-functions for zero-code-change profiling
- Manual Region Profiling: Explicit profiling regions for fine-grained control
- Symbol Resolution: ELF parsing for human-readable function names
- Timing Precision: Nanosecond-resolution timing with optional cycle counter support
- Memory Tracking: Built-in profiler overhead monitoring
- Stack Analysis: Complete call stack tracking and analysis
- Multi-language Support: Native C/C++ and Fortran APIs
- Configurable Output: Environment variable control and customizable reports
Quick Start
1. Build the Library
make
This creates the static library librvprof.a.
2. Automatic Profiling (Recommended)
Compile your program with function instrumentation:
# C/C++
gcc -finstrument-functions your_program.c -L. -lrvprof -o your_program
# Fortran
gfortran -finstrument-functions your_program.f90 -L. -lrvprof -o your_program
Run your program normally - profiling happens automatically:
./your_program
# Creates: your_program_rvprof.log
3. Manual Profiling
For additional manual control, mark profiling regions explicitly:
C/C++:
#include "rvprof.h"
int main() {
    rvprof_region_begin("initialization");
    // your initialization code
    rvprof_region_end("initialization");

    rvprof_region_begin("computation");
    // your computation code
    rvprof_region_end("computation");

    return 0;
}
Fortran:
program example
  use rvprof
  implicit none

  call rvprof_region_begin("initialization")
  ! your initialization code
  call rvprof_region_end("initialization")

  call rvprof_region_begin("computation")
  ! your computation code
  call rvprof_region_end("computation")
end program example
If the code is compiled with -finstrument-functions, the user-defined regions show up in addition to the automatic ones.
Configuration
Control rvprof behavior with environment variables:
Variable | Default | Description |
---|---|---|
RVPROF_OUTPUT | <program>_rvprof.log | Output filename |
RVPROF_DISABLE_HOOKS | 0 | Set to 1 to disable automatic function hooks |
RVPROF_DISABLE_MERGE | 0 | Set to 1 to keep separate entries for the same function called from different callers |
Examples
# Custom output file
RVPROF_OUTPUT=my_profile.txt ./your_program
# Disable automatic profiling, use manual regions only
RVPROF_DISABLE_HOOKS=1 ./your_program
# Keep separate entries for functions called from different contexts
RVPROF_DISABLE_MERGE=1 ./your_program
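The variables above could be read at startup roughly as follows. This is a minimal sketch, not rvprof's actual code: the struct `rvprof_config_sketch` and the helpers `env_flag` and `load_config_sketch` are hypothetical names, and the sketch assumes a flag counts as set only when its value is exactly `1`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings structure; rvprof's internal layout may differ. */
struct rvprof_config_sketch {
    char output[256];   /* report filename */
    int disable_hooks;  /* 1 = ignore automatic function hooks */
    int disable_merge;  /* 1 = keep per-caller entries separate */
};

/* Returns 1 when the variable is set to exactly "1", otherwise 0
 * (assumption: other values are treated as "not set"). */
int env_flag(const char *name)
{
    const char *value = getenv(name);
    return value != NULL && strcmp(value, "1") == 0;
}

/* Fills cfg from the environment, falling back to the documented defaults. */
void load_config_sketch(struct rvprof_config_sketch *cfg,
                        const char *program_name)
{
    const char *out = getenv("RVPROF_OUTPUT");

    if (out != NULL && out[0] != '\0')
        snprintf(cfg->output, sizeof cfg->output, "%s", out);
    else
        snprintf(cfg->output, sizeof cfg->output, "%s_rvprof.log",
                 program_name);

    cfg->disable_hooks = env_flag("RVPROF_DISABLE_HOOKS");
    cfg->disable_merge = env_flag("RVPROF_DISABLE_MERGE");
}
```

Reading the environment once at initialization keeps `getenv` lookups out of the per-call hot path.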
Understanding the Output
rvprof generates a detailed text report with multiple sections:
Function Profile Table
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
| Calls | t_excl[s] | t_excl[%] | t_incl[s] | t_incl[%] | excl_cycles   | incl_cycles   | Function                                    | Caller                                      | STID |
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
|     1 |     2.450 |      89.2 |     2.750 |     100.0 |    2450000000 |    2750000000 | main                                        | ---                                         |    1 |
|   100 |     0.250 |       9.1 |     0.300 |      10.9 |     250000000 |     300000000 | compute_heavy                               | main                                        |    2 |
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
Columns explained:
- Calls: Number of times function was called
- t_excl[s]: Exclusive time (time spent in function itself, excluding children)
- t_excl[%]: Exclusive time as percentage of total program runtime
- t_incl[s]: Inclusive time (time spent in function including all children)
- t_incl[%]: Inclusive time as percentage of total program runtime
- excl_cycles/incl_cycles: Cycle counts (if RISC-V cycle counter available)
- Function: Function name (resolved from symbols or address)
- Caller: Calling function (or "various" if called from multiple places)
- STID: Stack ID referencing the call stack table
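The relationship between the inclusive and exclusive columns can be sketched in a few lines of C. The helper names `exclusive_time` and `percent_of_total` are hypothetical and only illustrate the arithmetic; they are not part of the rvprof API. With the sample table above, main has an inclusive time of 2.750 s and one direct child (compute_heavy, 0.300 s inclusive), which leaves 2.450 s of exclusive time.

```c
#include <stddef.h>

/* Exclusive time = inclusive time minus the inclusive time of all
 * direct children. */
double exclusive_time(double inclusive, const double *child_inclusive,
                      size_t n_children)
{
    double t = inclusive;
    for (size_t i = 0; i < n_children; i++)
        t -= child_inclusive[i];
    return t;
}

/* Share of the total program runtime in percent; returns 0 when the
 * total is not positive to avoid dividing by zero. */
double percent_of_total(double t, double total)
{
    return total > 0.0 ? 100.0 * t / total : 0.0;
}
```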
Call Stack Analysis
Global call stacks:
--------------------------------------------------------------------
STID Call stack
--------------------------------------------------------------------
STID1 main
STID2 main<compute_heavy
STID3 main<compute_heavy<helper_function
--------------------------------------------------------------------
Shows the complete call path for each stack ID, with < separating stack levels.
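A stack string in this notation can be produced by joining the frame names with `<`. The function below is an illustrative sketch under that assumption; `format_call_stack` is a hypothetical name and the frame order (root first) simply mirrors the sample listing.

```c
#include <stdio.h>
#include <string.h>

/* Joins frame names root-first with '<', e.g. "main<compute_heavy".
 * Returns 0 on success, or -1 if the buffer is too small. */
int format_call_stack(const char *const *frames, size_t depth,
                      char *buf, size_t buf_size)
{
    size_t used = 0;

    if (buf_size == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < depth; i++) {
        /* Prefix every frame except the first with the separator. */
        int n = snprintf(buf + used, buf_size - used, "%s%s",
                         i ? "<" : "", frames[i]);
        if (n < 0 || (size_t)n >= buf_size - used)
            return -1;  /* output would have been truncated */
        used += (size_t)n;
    }
    return 0;
}
```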
Performance Summary
Summary:
Total execution time: 2.750 seconds
Total cycles: 2750000000 cycles (informational)
Timer resolution: nanosecond timer
Number of functions: 15
Number of stacks: 8
Function hooks: enabled
Symbol table: 142 symbols loaded (ELF parsing)
Region merging: enabled
Memory footprint: 45.2 KB
API Reference
C/C++ API
// Manual region profiling
void rvprof_region_begin(const char* name);
void rvprof_region_end(const char* name);
// Automatic hooks (called by -finstrument-functions)
void __cyg_profile_func_enter(void *this_fn, void *call_site);
void __cyg_profile_func_exit(void *this_fn, void *call_site);
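To illustrate how the automatic hooks work, here is a minimal stand-alone sketch of the two functions GCC calls when code is built with -finstrument-functions. It only counts calls and tracks nesting depth; the `sketch_` counters are hypothetical and rvprof records far more than this. Do not link it together with librvprof, which provides these symbols itself.

```c
/* Hypothetical bookkeeping for illustration only. */
unsigned long sketch_enter_count;
unsigned long sketch_exit_count;
int sketch_current_depth;
int sketch_max_depth;

/* The hooks themselves must not be instrumented, otherwise they would
 * call themselves recursively. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;    /* address of the function being entered */
    (void)call_site;  /* address of the call instruction's site */
    sketch_enter_count++;
    if (++sketch_current_depth > sketch_max_depth)
        sketch_max_depth = sketch_current_depth;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    sketch_exit_count++;
    sketch_current_depth--;
}
```

A real profiler would take a timestamp in each hook and use `this_fn` as the key for its per-function statistics.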
Fortran API
! Manual region profiling
subroutine rvprof_region_begin(name)
subroutine rvprof_region_end(name)
Performance Considerations
- Overhead: rvprof is designed for minimal overhead (~1-5% typical), but the cost can become significant for functions that are called very frequently. Consider adding __attribute__((no_instrument_function)) to functions that should not be profiled.
- Memory: Scales with number of unique functions and call stacks
- Accuracy: Nanosecond timing resolution; cycle counter when available
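As an illustration of the overhead advice, the sketch below excludes a small, frequently called helper from instrumentation while leaving its caller profiled. The function names are hypothetical; the attribute is the standard GCC one.

```c
/* Tiny helper called once per element: excluded from instrumentation,
 * so the enter/exit hooks are never invoked for it. */
static inline __attribute__((no_instrument_function))
double square(double x)
{
    return x * x;
}

/* Still instrumented when built with -finstrument-functions. */
double sum_of_squares(const double *values, int n)
{
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += square(values[i]);
    return total;
}
```

Time spent in the excluded helper is then attributed to the exclusive time of its instrumented caller.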
Symbol Resolution
rvprof automatically attempts to resolve function addresses to names by:
- Parsing the ELF symbol table from /proc/self/exe
- Handling Position-Independent Executables (PIE)
- Falling back to address display if symbols unavailable
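The lookup and fallback steps above can be sketched as follows. This is an assumption-laden illustration, not rvprof's implementation: `sym_sketch` and `resolve_name_sketch` are hypothetical names, the symbol table is a plain in-memory array rather than parsed ELF data, and PIE handling is reduced to subtracting a caller-supplied load base.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical symbol entry: start and size as they would appear in an
 * ELF symbol table (relative to the load base for a PIE). */
struct sym_sketch {
    uintptr_t start;
    uintptr_t size;
    const char *name;
};

/* Writes the function name covering runtime_addr into out, or the raw
 * address in hex when no symbol matches. Pass load_base = 0 for a
 * non-PIE executable. */
void resolve_name_sketch(const struct sym_sketch *table, size_t count,
                         uintptr_t runtime_addr, uintptr_t load_base,
                         char *out, size_t out_size)
{
    uintptr_t addr = runtime_addr - load_base;

    for (size_t i = 0; i < count; i++) {
        if (addr >= table[i].start &&
            addr - table[i].start < table[i].size) {
            snprintf(out, out_size, "%s", table[i].name);
            return;
        }
    }
    /* Fallback: no symbol found, display the address instead. */
    snprintf(out, out_size, "0x%llx", (unsigned long long)runtime_addr);
}
```

A linear scan keeps the sketch short; a sorted table with binary search would be the usual choice once the symbol count grows.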
Troubleshooting
Common Issues
"No symbols loaded"
- Compile with debug symbols: add the -g flag
- Ensure the executable has a symbol table: objdump -t your_program
"Cycle counter unavailable"
- Normal on some RISC-V implementations
- Timing still works with nanosecond precision
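The two time sources can be sketched as below. This assumes the nanosecond timer is based on the POSIX `clock_gettime` call and that the cycle counter is read with the RISC-V `rdcycle` instruction; the source does not confirm either detail, and `now_ns` and `read_cycles` are hypothetical names. On some systems user-mode access to the cycle counter is disabled and the instruction traps, so a real implementation would need to probe for availability first.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds, independent of any cycle counter. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Cycle count on 64-bit RISC-V; returns 0 on other targets, which the
 * caller can treat as "cycle counter unavailable". */
uint64_t read_cycles(void)
{
#if defined(__riscv) && defined(__riscv_xlen) && __riscv_xlen == 64
    uint64_t cycles;
    __asm__ volatile("rdcycle %0" : "=r"(cycles));
    return cycles;
#else
    return 0;
#endif
}
```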
Missing function names
- Some optimized builds strip symbols
- Functions may be inlined by compiler
Debugging
rvprof prints initialization messages to stderr. Redirect stderr to a file to inspect them:
# Capture rvprof's initialization messages
./your_program 2>debug.log
Building from Source
Requirements
- RISC-V C compiler toolchain (clang/gcc)
- RISC-V Fortran compiler (flang/gfortran) for Fortran support
- Make
Build Options
# Default build
make
# Clean build
make clean
# Debug build
make CFLAGS="-g -O0 -DDEBUG"
Integration Examples
CMake Integration
# In your CMakeLists.txt
add_library(rvprof STATIC IMPORTED)
set_target_properties(rvprof PROPERTIES IMPORTED_LOCATION /path/to/librvprof.a)
target_link_libraries(your_target rvprof)
target_compile_options(your_target PRIVATE -finstrument-functions)
Makefile Integration
RVPROF_DIR = /path/to/rvprof
CFLAGS += -finstrument-functions -I$(RVPROF_DIR)
LDFLAGS += -L$(RVPROF_DIR) -lrvprof
your_program: your_program.o
	$(CC) $< $(LDFLAGS) -o $@
License
This software is provided under the MIT license. See LICENSE for details.
Contributing
rvprof is designed for RISC-V systems, but the core profiling logic is portable. Contributions are welcome for:
- Additional architecture support
- Output format improvements
- Performance optimizations
- Additional analysis features
Acknowledgments
Inspired by vftrace from the SX-Aurora project.