Add README

Patrick Lipka 2025-08-18 18:03:01 +02:00
parent 4611874301
commit 41a7eccad3
1 changed files with 279 additions and 1 deletions

README.md

@@ -1,3 +1,281 @@
# rvprof
RISC-V profiling tool, inspired by vftrace
A lightweight profiling library for RISC-V applications, inspired by [vftrace](https://github.com/SX-Aurora/Vftrace). rvprof provides function-level timing analysis with minimal overhead and automatic instrumentation support.
## Features
- **Automatic Function Profiling**: Uses the `__cyg_profile_func_enter`/`__cyg_profile_func_exit` hooks that the compiler inserts with `-finstrument-functions`, so no source changes are needed
- **Manual Region Profiling**: Explicit profiling regions for fine-grained control
- **Symbol Resolution**: ELF parsing for human-readable function names
- **Timing Precision**: Nanosecond-resolution timing with optional cycle counter support
- **Memory Tracking**: Built-in profiler overhead monitoring
- **Stack Analysis**: Complete call stack tracking and analysis
- **Multi-language Support**: Native C/C++ and Fortran APIs
- **Configurable Output**: Environment variable control and customizable reports
## Quick Start
### 1. Build the Library
```bash
make
```
This creates the static library `librvprof.a`.
### 2. Automatic Profiling (Recommended)
Compile your program with function instrumentation:
```bash
# C/C++
gcc -finstrument-functions your_program.c -L. -lrvprof -o your_program
# Fortran
gfortran -finstrument-functions your_program.f90 -L. -lrvprof -o your_program
```
Run your program normally - profiling happens automatically:
```bash
./your_program
# Creates: your_program_rvprof.log
```
### 3. Manual Profiling
For manual control over profiling regions, either on its own or in addition to automatic profiling:
**C/C++:**
```c
#include "rvprof.h"
int main() {
    rvprof_region_begin("initialization");
    // your initialization code
    rvprof_region_end("initialization");

    rvprof_region_begin("computation");
    // your computation code
    rvprof_region_end("computation");

    return 0;
}
```
**Fortran:**
```fortran
program example
    use rvprof
    implicit none

    call rvprof_region_begin("initialization")
    ! your initialization code
    call rvprof_region_end("initialization")

    call rvprof_region_begin("computation")
    ! your computation code
    call rvprof_region_end("computation")
end program example
```
If the code is compiled using `-finstrument-functions`, the user-defined regions will show up in addition to the automatic ones.
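
How a begin/end pair can time a region is sketched below. This is an illustration, not rvprof's implementation: the `demo_region_begin`/`demo_region_end` names, the fixed-size stack, and the choice of `clock_gettime(CLOCK_MONOTONIC)` are all assumptions.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* exposes clock_gettime under strict -std= */
#endif
#include <assert.h>
#include <string.h>
#include <time.h>

#define DEMO_MAX_DEPTH 64  /* hypothetical limit on nested regions */

/* One open region: its name and the time it was entered. */
struct demo_region {
    const char *name;
    struct timespec start;
};

static struct demo_region demo_stack[DEMO_MAX_DEPTH];
static int demo_depth = 0;

/* Opens a region; returns 0 on success, -1 if nesting is too deep. */
int demo_region_begin(const char *name)
{
    assert(name != NULL);
    if (demo_depth >= DEMO_MAX_DEPTH)
        return -1;
    demo_stack[demo_depth].name = name;
    clock_gettime(CLOCK_MONOTONIC, &demo_stack[demo_depth].start);
    demo_depth++;
    return 0;
}

/* Closes the innermost region; returns the elapsed seconds, or -1.0 if
 * the name does not match the region that was opened last. */
double demo_region_end(const char *name)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    assert(name != NULL);
    if (demo_depth == 0 || strcmp(demo_stack[demo_depth - 1].name, name) != 0)
        return -1.0;
    demo_depth--;
    const struct timespec *s = &demo_stack[demo_depth].start;
    return (double)(now.tv_sec - s->tv_sec)
         + (double)(now.tv_nsec - s->tv_nsec) / 1e9;
}
```

The sketch assumes that regions nest, so it rejects an end call whose name differs from the most recently opened region. Whether rvprof enforces the same rule is not stated in this README.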
## Configuration
Control rvprof behavior with environment variables:
| Variable | Default | Description |
|----------|---------|-------------|
| `RVPROF_OUTPUT` | `<program>_rvprof.log` | Output filename |
| `RVPROF_DISABLE_HOOKS` | `0` | Set to `1` to disable automatic function hooks |
| `RVPROF_DISABLE_MERGE` | `0` | Set to `1` to keep separate entries for same function from different callers |
### Examples
```bash
# Custom output file
RVPROF_OUTPUT=my_profile.txt ./your_program
# Disable automatic profiling, use manual regions only
RVPROF_DISABLE_HOOKS=1 ./your_program
# Keep separate entries for functions called from different contexts
RVPROF_DISABLE_MERGE=1 ./your_program
```
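
A library can pick up settings like these with `getenv`. The sketch below shows one way to do it; the `demo_env_flag` and `demo_output_name` helpers are hypothetical, and treating any value other than `1` as "off" is an assumption, since the table only documents `0` and `1`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 when the variable is set to exactly "1", otherwise 0. */
int demo_env_flag(const char *name)
{
    assert(name != NULL);
    const char *value = getenv(name);
    return value != NULL && strcmp(value, "1") == 0;
}

/* Writes the log filename into buf: RVPROF_OUTPUT if it is set and
 * non-empty, otherwise "<program>_rvprof.log". Returns 0 on success,
 * -1 if buf is too small. */
int demo_output_name(const char *program, char *buf, size_t size)
{
    const char *override = getenv("RVPROF_OUTPUT");
    int n;
    if (override != NULL && override[0] != '\0')
        n = snprintf(buf, size, "%s", override);
    else
        n = snprintf(buf, size, "%s_rvprof.log", program);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

Reading the environment once at startup and caching the result keeps `getenv` out of the per-call path of the hooks.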
## Understanding the Output
rvprof generates a detailed text report with multiple sections:
### Function Profile Table
```
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
| Calls | t_excl[s] | t_excl[%] | t_incl[s] | t_incl[%] | excl_cycles   | incl_cycles   | Function                                    | Caller                                      | STID |
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
|     1 |     2.450 |      89.1 |     2.750 |     100.0 |    2450000000 |    2750000000 | main                                        | ---                                         |    1 |
|   100 |     0.250 |       9.1 |     0.300 |      10.9 |     250000000 |     300000000 | compute_heavy                               | main                                        |    2 |
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
```
**Columns explained:**
- **Calls**: Number of times function was called
- **t_excl[s]**: Exclusive time (time spent in function itself, excluding children)
- **t_excl[%]**: Exclusive time as percentage of total program runtime
- **t_incl[s]**: Inclusive time (time spent in function including all children)
- **t_incl[%]**: Inclusive time as percentage of total program runtime
- **excl_cycles/incl_cycles**: Exclusive and inclusive cycle counts (if the RISC-V cycle counter is available)
- **Function**: Function name (resolved from symbols or address)
- **Caller**: Calling function (or "various" if called from multiple places)
- **STID**: Stack ID referencing the call stack table
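
The relation between the exclusive and inclusive columns can be written out as a short calculation. The helper names below are hypothetical and the code only restates the column definitions above; it is not taken from rvprof.

```c
#include <assert.h>
#include <stddef.h>

/* Exclusive time = inclusive time minus the inclusive time of each
 * direct child. */
double demo_exclusive(double t_incl, const double *child_incl, size_t n)
{
    assert(child_incl != NULL || n == 0);
    double t = t_incl;
    for (size_t i = 0; i < n; i++)
        t -= child_incl[i];
    return t;
}

/* Share of the total runtime in percent; 0 if the total is not positive. */
double demo_percent(double t, double t_total)
{
    return t_total > 0.0 ? 100.0 * t / t_total : 0.0;
}
```

With the sample table, `main` has an inclusive time of 2.750 s and one direct child, `compute_heavy`, with an inclusive time of 0.300 s, which gives an exclusive time of 2.750 - 0.300 = 2.450 s. The inclusive share of `compute_heavy` is 100 * 0.300 / 2.750, about 10.9%.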
### Call Stack Analysis
```
Global call stacks:
--------------------------------------------------------------------
STID Call stack
--------------------------------------------------------------------
STID1 main
STID2 main<compute_heavy
STID3 main<compute_heavy<helper_function
--------------------------------------------------------------------
```
Shows the complete call paths for each stack ID, with `<` separating stack levels.
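
A call path in this notation can be built by joining the frame names with `<`. The sketch below assumes the frames are stored outermost first, matching the order in the sample above; `demo_format_stack` is a hypothetical helper, not part of the rvprof API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Joins frame names, outermost first, with '<' between levels.
 * Returns 0 on success, -1 if buf is too small. */
int demo_format_stack(const char *const *frames, size_t n,
                      char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    size_t used = 0;
    buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int w = snprintf(buf + used, size - used, "%s%s",
                         i ? "<" : "", frames[i]);
        if (w < 0 || (size_t)w >= size - used)
            return -1;  /* output would have been truncated */
        used += (size_t)w;
    }
    return 0;
}
```

Called with the frames `main`, `compute_heavy`, and `helper_function`, it produces the string shown for `STID3`.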
### Performance Summary
```
Summary:
Total execution time: 2.750 seconds
Total cycles: 2750000000 cycles (informational)
Timer resolution: nanosecond timer
Number of functions: 15
Number of stacks: 8
Function hooks: enabled
Symbol table: 142 symbols loaded (ELF parsing)
Region merging: enabled
Memory footprint: 45.2 KB
```
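
The `Region merging` line reflects the `RVPROF_DISABLE_MERGE` setting. One plausible reading of merging is sketched below: entries for the same function are folded together, and a caller that differs between them is reported as "various". The `demo_entry` layout and `demo_merge` function are assumptions made for illustration and may not match rvprof's internal logic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical profile entry, loosely modelled on the report columns. */
struct demo_entry {
    const char *function;
    const char *caller;
    unsigned long calls;
    double t_excl;  /* seconds */
};

/* Merges entries with the same function name in place and returns the
 * new entry count. Callers that differ are replaced by "various". */
size_t demo_merge(struct demo_entry *e, size_t n)
{
    assert(e != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j;
        for (j = 0; j < out; j++)
            if (strcmp(e[j].function, e[i].function) == 0)
                break;
        if (j == out) {          /* first entry for this function */
            e[out++] = e[i];
        } else {                 /* fold into the existing entry */
            e[j].calls += e[i].calls;
            e[j].t_excl += e[i].t_excl;
            if (strcmp(e[j].caller, e[i].caller) != 0)
                e[j].caller = "various";
        }
    }
    return out;
}
```

The linear search keeps the sketch short; a profiler with many distinct functions would more likely use a hash table keyed on the function address.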
## API Reference
### C/C++ API
```c
// Manual region profiling
void rvprof_region_begin(const char* name);
void rvprof_region_end(const char* name);
// Automatic hooks (called by -finstrument-functions)
void __cyg_profile_func_enter(void *this_fn, void *call_site);
void __cyg_profile_func_exit(void *this_fn, void *call_site);
```
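
To show what the compiler-inserted hooks receive, here is a minimal stand-alone pair that only counts calls and tracks nesting depth. It is a sketch for experimenting on its own, not rvprof's implementation, and it should not be linked together with `librvprof.a`, which provides these symbols itself. The `demo_*` counters are hypothetical.

```c
#include <assert.h>

/* Hypothetical counters; a real profiler would also record timestamps. */
static unsigned long demo_enter_count = 0;
static int demo_hook_depth = 0;
static int demo_hook_max_depth = 0;

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;    /* address of the function being entered */
    (void)call_site;  /* address it was called from */
    demo_enter_count++;
    if (++demo_hook_depth > demo_hook_max_depth)
        demo_hook_max_depth = demo_hook_depth;
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    assert(demo_hook_depth > 0);  /* every exit pairs with an earlier enter */
    demo_hook_depth--;
}
```

When a program is compiled with `-finstrument-functions`, the compiler emits a call to the enter hook at the start of every instrumented function and a call to the exit hook before it returns, passing the function's address and the call site.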
### Fortran API
```fortran
! Manual region profiling
subroutine rvprof_region_begin(name)
subroutine rvprof_region_end(name)
```
### Performance Considerations
- **Overhead**: rvprof is designed for low overhead (typically about 1-5%), but the cost can become significant when a program makes a very large number of calls to short functions. Consider adding `__attribute__((no_instrument_function))` to functions that should not be profiled.
- **Memory**: Scales with number of unique functions and call stacks
- **Accuracy**: Nanosecond timing resolution; cycle counter when available
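
The attribute mentioned under Overhead is placed on the function definition or declaration. The example below uses hypothetical function names: a small helper that is called in a tight loop is excluded, while its caller is still instrumented.

```c
#include <assert.h>
#include <stddef.h>

/* Small, frequently called helper: excluded from instrumentation so
 * that -finstrument-functions adds no enter/exit calls around it. */
__attribute__((no_instrument_function))
static inline double demo_square(double x)
{
    return x * x;
}

/* Instrumented as usual; its time then includes the helper's work. */
double demo_sum_of_squares(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += demo_square(v[i]);
    return sum;
}
```

GCC also accepts `-finstrument-functions-exclude-function-list=` and `-finstrument-functions-exclude-file-list=` on the command line, which avoids touching the source.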
### Symbol Resolution
rvprof automatically attempts to resolve function addresses to names by:
1. Parsing the ELF symbol table from `/proc/self/exe`
2. Handling Position-Independent Executables (PIE)
3. Falling back to displaying the raw address if symbols are unavailable
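
The first two steps can be illustrated with a small sketch that reads the ELF header of the running executable and distinguishes a PIE (`ET_DYN`) from a fixed-address executable (`ET_EXEC`). It uses the Linux `<elf.h>` definitions; the `demo_*` helpers are hypothetical, and how rvprof obtains the load base is not covered here.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reads the ELF identification bytes and e_type from path.
 * Returns the e_type value (e.g. ET_EXEC or ET_DYN), or -1 if the file
 * cannot be read or is not an ELF file. Assumes the file uses the
 * host's byte order, which holds for /proc/self/exe. */
int demo_elf_type(const char *path)
{
    assert(path != NULL);
    unsigned char ident[EI_NIDENT];
    uint16_t type;  /* e_type follows e_ident in both ELF classes */
    int result = -1;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    if (fread(ident, 1, sizeof ident, f) == sizeof ident &&
        memcmp(ident, ELFMAG, SELFMAG) == 0 &&
        fread(&type, sizeof type, 1, f) == 1)
        result = (int)type;
    fclose(f);
    return result;
}

/* Maps a runtime address back to the address stored in the symbol
 * table. For a PIE the load base must be subtracted; for a
 * fixed-address executable the base is zero. */
uint64_t demo_symbol_address(uint64_t runtime_addr, uint64_t load_base)
{
    return runtime_addr - load_base;
}
```

For a PIE, the addresses passed to the hooks are runtime addresses, so they only match symbol table entries after the load base has been subtracted.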
## Troubleshooting
### Common Issues
**"No symbols loaded"**
- Compile with debug symbols by adding the `-g` flag
- Check that the executable has a symbol table: `objdump -t your_program`
**"Cycle counter unavailable"**
- Normal on some RISC-V implementations
- Timing still works with nanosecond precision
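
The two time sources can be combined as in the sketch below, which reads a nanosecond clock everywhere and the cycle counter only on 64-bit RISC-V. The function names are hypothetical, and the sketch does not detect a counter that is disabled for user mode, which a real profiler has to handle.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* exposes clock_gettime under strict -std= */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in nanoseconds. */
uint64_t demo_now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Cycle count on 64-bit RISC-V, 0 elsewhere. The rdcycle instruction
 * can trap in user mode if the platform does not expose the counter. */
uint64_t demo_cycles(void)
{
#if defined(__riscv) && __riscv_xlen == 64
    uint64_t c;
    __asm__ volatile ("rdcycle %0" : "=r"(c));
    return c;
#else
    return 0;
#endif
}
```

Keeping the nanosecond clock as the primary source means the report stays usable when the cycle columns cannot be filled in.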
**Missing function names**
- Some optimized builds strip symbols
- Functions may be inlined by compiler
### Debugging
Capture rvprof's diagnostic messages in a file for inspection:
```bash
# rvprof prints initialization messages to stderr
./your_program 2>debug.log
```
## Building from Source
### Requirements
- RISC-V C/C++ compiler toolchain (clang or gcc)
- RISC-V Fortran compiler (flang/gfortran) for Fortran support
- Make
### Build Options
```bash
# Default build
make
# Clean build
make clean
# Debug build
make CFLAGS="-g -O0 -DDEBUG"
```
## Integration Examples
### CMake Integration
```cmake
# In your CMakeLists.txt
add_library(rvprof STATIC IMPORTED)
set_target_properties(rvprof PROPERTIES IMPORTED_LOCATION /path/to/librvprof.a)
target_link_libraries(your_target rvprof)
target_compile_options(your_target PRIVATE -finstrument-functions)
```
### Makefile Integration
```makefile
RVPROF_DIR = /path/to/rvprof
CFLAGS += -finstrument-functions -I$(RVPROF_DIR)
LDFLAGS += -L$(RVPROF_DIR) -lrvprof
your_program: your_program.o
	$(CC) $< $(LDFLAGS) -o $@
```
## License
This software is provided under the MIT license. See [LICENSE](LICENSE) for details.
## Contributing
rvprof is designed for RISC-V systems, but the core profiling logic is portable. Contributions are welcome in the following areas:
- Additional architecture support
- Output format improvements
- Performance optimizations
- Additional analysis features
## Acknowledgments
Inspired by [vftrace](https://github.com/SX-Aurora/Vftrace) from the SX-Aurora project.