Compare commits
2 Commits
6ba13bbeda
...
41a7eccad3
Author | SHA1 | Date |
---|---|---|
|
41a7eccad3 | |
|
4611874301 |
22
LICENSE
|
@ -1,5 +1,21 @@
|
|||
Copyright (C) 2025 by patrick patrick.lipka@posteo.de
|
||||
MIT License
|
||||
|
||||
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted.
|
||||
Copyright (c) 2025 Patrick Lipka
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in all
|
||||
copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE.
|
280
README.md
|
@ -1,3 +1,281 @@
|
|||
# rvprof
|
||||
|
||||
riscv profling tool, inspired by vftrace
|
||||
A lightweight profiling library for RISC-V applications, inspired by [vftrace](https://github.com/SX-Aurora/Vftrace). rvprof provides function-level timing analysis with minimal overhead and automatic instrumentation support.
|
||||
|
||||
## Features
|
||||
|
||||
- **Automatic Function Profiling**: Uses the `__cyg_profile_func_enter`/`__cyg_profile_func_exit` hooks that the compiler inserts with `-finstrument-functions`, so no code changes are needed
|
||||
- **Manual Region Profiling**: Explicit profiling regions for fine-grained control
|
||||
- **Symbol Resolution**: ELF parsing for human-readable function names
|
||||
- **Timing Precision**: Nanosecond-resolution timing with optional cycle counter support
|
||||
- **Memory Tracking**: Built-in profiler overhead monitoring
|
||||
- **Stack Analysis**: Complete call stack tracking and analysis
|
||||
- **Multi-language Support**: Native C/C++ and Fortran APIs
|
||||
- **Configurable Output**: Environment variable control and customizable reports
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Build the Library
|
||||
|
||||
```bash
make
```
|
||||
|
||||
This creates the static library `librvprof.a`.
|
||||
|
||||
### 2. Automatic Profiling (Recommended)
|
||||
|
||||
Compile your program with function instrumentation:
|
||||
|
||||
```bash
# C/C++
gcc -finstrument-functions your_program.c -L. -lrvprof -o your_program

# Fortran
gfortran -finstrument-functions your_program.f90 -L. -lrvprof -o your_program
```
|
||||
|
||||
Run your program normally - profiling happens automatically:
|
||||
|
||||
```bash
./your_program
# Creates: your_program_rvprof.log
```
|
||||
|
||||
### 3. Manual Profiling
|
||||
|
||||
For (additional) manual control over profiling regions:
|
||||
|
||||
**C/C++:**
|
||||
```c
#include "rvprof.h"

int main(void) {
    rvprof_region_begin("initialization");
    // your initialization code
    rvprof_region_end("initialization");

    rvprof_region_begin("computation");
    // your computation code
    rvprof_region_end("computation");

    return 0;
}
```
|
||||
|
||||
**Fortran:**
|
||||
```fortran
program example
    use rvprof
    implicit none

    call rvprof_region_begin("initialization")
    ! your initialization code
    call rvprof_region_end("initialization")

    call rvprof_region_begin("computation")
    ! your computation code
    call rvprof_region_end("computation")

end program example
```
|
||||
|
||||
If the code is compiled using `-finstrument-functions`, the user-defined regions will show up in addition to the automatic ones.
|
||||
|
||||
## Configuration
|
||||
|
||||
Control rvprof behavior with environment variables:
|
||||
|
||||
| Variable | Default | Description |
|----------|---------|-------------|
| `RVPROF_OUTPUT` | `<program>_rvprof.log` | Output filename |
| `RVPROF_DISABLE_HOOKS` | `0` | Set to `1` to disable automatic function hooks |
| `RVPROF_DISABLE_MERGE` | `0` | Set to `1` to keep separate entries for same function from different callers |
|
||||
|
||||
### Examples
|
||||
|
||||
```bash
# Custom output file
RVPROF_OUTPUT=my_profile.txt ./your_program

# Disable automatic profiling, use manual regions only
RVPROF_DISABLE_HOOKS=1 ./your_program

# Keep separate entries for functions called from different contexts
RVPROF_DISABLE_MERGE=1 ./your_program
```
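
As a rough illustration of how a library can read settings like these, the sketch below uses `getenv`. It is not rvprof's actual implementation: the helper names `env_flag_enabled` and `resolve_output_name` are hypothetical, and treating only the exact value `1` as "enabled" is an assumption based on the table above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 only if the variable is set to exactly "1". */
int env_flag_enabled(const char *name) {
    const char *value = getenv(name);
    return value != NULL && strcmp(value, "1") == 0;
}

/* Hypothetical helper: writes the report filename into buf.
 * Uses RVPROF_OUTPUT if it is set and non-empty, otherwise falls back
 * to the documented default "<program>_rvprof.log". */
void resolve_output_name(const char *program, char *buf, size_t len) {
    const char *override = getenv("RVPROF_OUTPUT");
    if (override != NULL && override[0] != '\0') {
        snprintf(buf, len, "%s", override);
    } else {
        snprintf(buf, len, "%s_rvprof.log", program);
    }
}
```

With no variables set, `resolve_output_name("your_program", ...)` yields `your_program_rvprof.log`, matching the default in the table.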
|
||||
|
||||
## Understanding the Output
|
||||
|
||||
rvprof generates a detailed text report with multiple sections:
|
||||
|
||||
### Function Profile Table
|
||||
|
||||
```
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
| Calls | t_excl[s] | t_excl[%] | t_incl[s] | t_incl[%] | excl_cycles | incl_cycles | Function | Caller | STID |
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
| 1 | 2.450 | 89.2 | 2.750 | 100.0 | 2450000000 | 2750000000 | main | --- | 1 |
| 100 | 0.250 | 9.1 | 0.300 | 10.9 | 250000000 | 300000000 | compute_heavy | main | 2 |
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
```
|
||||
|
||||
**Columns explained:**
|
||||
- **Calls**: Number of times function was called
|
||||
- **t_excl[s]**: Exclusive time (time spent in function itself, excluding children)
|
||||
- **t_excl[%]**: Exclusive time as percentage of total program runtime
|
||||
- **t_incl[s]**: Inclusive time (time spent in function including all children)
|
||||
- **t_incl[%]**: Inclusive time as percentage of total program runtime
|
||||
- **excl_cycles/incl_cycles**: Cycle counts (if RISC-V cycle counter available)
|
||||
- **Function**: Function name (resolved from symbols or address)
|
||||
- **Caller**: Calling function (or "various" if called from multiple places)
|
||||
- **STID**: Stack ID referencing the call stack table
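
The relationship between the exclusive and inclusive columns can be sketched as follows: exclusive time is the inclusive time minus the inclusive time of the direct children. This is an illustration of the column definitions only, not rvprof's internal code; the function names are hypothetical.

```c
#include <stddef.h>

/* Exclusive time = inclusive time minus the inclusive time of all
 * direct children (their own children are already contained in it). */
double exclusive_time(double inclusive,
                      const double *child_inclusive, size_t n_children) {
    double excl = inclusive;
    for (size_t i = 0; i < n_children; i++) {
        excl -= child_inclusive[i];
    }
    return excl;
}

/* Share of the total runtime in percent; returns 0 if total is not positive. */
double percent_of_total(double t, double total) {
    return total > 0.0 ? 100.0 * t / total : 0.0;
}
```

Applied to the sample table, `main` has an inclusive time of 2.750 s and one child with 0.300 s inclusive, which gives the 2.450 s exclusive time shown in its row.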
|
||||
|
||||
### Call Stack Analysis
|
||||
|
||||
```
Global call stacks:
--------------------------------------------------------------------
STID Call stack
--------------------------------------------------------------------
STID1 main
STID2 main<compute_heavy
STID3 main<compute_heavy<helper_function
--------------------------------------------------------------------
```
|
||||
|
||||
Shows the complete call paths for each stack ID, with `<` separating stack levels.
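
A call path in this notation can be produced by joining the frame names, outermost first, with `<`. The following is a minimal sketch of that formatting step; `format_call_stack` is a hypothetical helper, not part of the rvprof API.

```c
#include <stdio.h>
#include <string.h>

/* Joins frame names (outermost first) with '<' into buf.
 * Returns 0 on success, -1 if buf is too small for the full path. */
int format_call_stack(const char *const *frames, size_t n,
                      char *buf, size_t len) {
    size_t used = 0;
    if (len == 0) {
        return -1;
    }
    buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        /* Prefix every frame except the first with the separator. */
        int w = snprintf(buf + used, len - used, "%s%s",
                         i ? "<" : "", frames[i]);
        if (w < 0 || (size_t)w >= len - used) {
            return -1; /* output would have been truncated */
        }
        used += (size_t)w;
    }
    return 0;
}
```

For the frames `main`, `compute_heavy`, `helper_function` this produces the string listed as `STID3` above.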
|
||||
|
||||
### Performance Summary
|
||||
|
||||
```
Summary:
Total execution time: 2.750 seconds
Total cycles: 2750000000 cycles (informational)
Timer resolution: nanosecond timer
Number of functions: 15
Number of stacks: 8
Function hooks: enabled
Symbol table: 142 symbols loaded (ELF parsing)
Region merging: enabled
Memory footprint: 45.2 KB
```
|
||||
|
||||
## API Reference
|
||||
|
||||
### C/C++ API
|
||||
|
||||
```c
// Manual region profiling
void rvprof_region_begin(const char* name);
void rvprof_region_end(const char* name);

// Automatic hooks (called by -finstrument-functions)
void __cyg_profile_func_enter(void *this_fn, void *call_site);
void __cyg_profile_func_exit(void *this_fn, void *call_site);
```
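
To show how the automatic hooks work in principle: with `-finstrument-functions`, the compiler inserts a call to `__cyg_profile_func_enter` at each function entry and to `__cyg_profile_func_exit` at each exit. The sketch below is a minimal stand-in that only tracks call depth and call count. It is not rvprof's implementation, and the state variables and the accessors `hook_current_depth` and `hook_total_calls` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical bookkeeping; a real profiler would also record timestamps. */
static int  current_depth = 0;
static long total_calls   = 0;

/* The hooks themselves must not be instrumented, otherwise they would
 * call themselves recursively. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    (void)this_fn;   /* address of the function being entered */
    (void)call_site; /* address the function was called from */
    total_calls++;
    current_depth++;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    (void)this_fn;
    (void)call_site;
    if (current_depth > 0) {
        current_depth--;
    }
}

__attribute__((no_instrument_function))
int hook_current_depth(void) { return current_depth; }

__attribute__((no_instrument_function))
long hook_total_calls(void) { return total_calls; }
```

Do not link a stand-in like this together with `librvprof.a`, since the library provides its own definitions of both hooks.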
|
||||
|
||||
### Fortran API
|
||||
|
||||
```fortran
! Manual region profiling
subroutine rvprof_region_begin(name)
subroutine rvprof_region_end(name)
```
|
||||
|
||||
|
||||
### Performance Considerations
|
||||
|
||||
- **Overhead**: rvprof is designed for minimal overhead (~1-5% typical), but the cost can become significant when small functions are called very frequently. Consider adding `__attribute__((no_instrument_function))` to functions that should not be profiled.
|
||||
- **Memory**: Scales with number of unique functions and call stacks
|
||||
- **Accuracy**: Nanosecond timing resolution; cycle counter when available
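
As an example of the overhead advice above, a small helper that is called in a tight loop can be excluded from instrumentation with the GCC/Clang attribute `no_instrument_function`. The functions `square` and `sum_of_squares` are made up for illustration.

```c
#include <stddef.h>

/* Tiny, frequently called helper: excluded from -finstrument-functions
 * so that no enter/exit hook runs on every call. */
__attribute__((no_instrument_function))
double square(double x) {
    return x * x;
}

/* Still instrumented: appears in the profile as a single entry whose
 * time includes all the calls to square(). */
double sum_of_squares(const double *values, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += square(values[i]);
    }
    return sum;
}
```

The time spent in the excluded helper is then attributed to its caller instead of showing up as a separate row.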
|
||||
|
||||
### Symbol Resolution
|
||||
|
||||
rvprof automatically attempts to resolve function addresses to names by:
|
||||
1. Parsing the ELF symbol table from `/proc/self/exe`
|
||||
2. Handling Position-Independent Executables (PIE)
|
||||
3. Falling back to address display if symbols unavailable
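
The lookup behind these steps can be sketched as follows, assuming the symbols have already been read from the ELF file into a table sorted by address. The `symbol_t` layout and `resolve_symbol` are hypothetical and only illustrate the PIE adjustment and the address fallback. They are not rvprof's internals.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory symbol entry (address as stored in the ELF file). */
typedef struct {
    uintptr_t   addr;
    size_t      size;
    const char *name;
} symbol_t;

/* Looks up runtime_addr in a table sorted by addr (non-overlapping entries).
 * load_base is the address the executable was mapped at: 0 for a non-PIE
 * binary, the mapping base for a PIE. If no symbol covers the address, the
 * runtime address is formatted as hex into fallback and returned. */
const char *resolve_symbol(const symbol_t *table, size_t n,
                           uintptr_t runtime_addr, uintptr_t load_base,
                           char *fallback, size_t len) {
    uintptr_t file_addr = runtime_addr - load_base;
    size_t lo = 0, hi = n;
    while (lo < hi) { /* binary search over [lo, hi) */
        size_t mid = lo + (hi - lo) / 2;
        if (file_addr < table[mid].addr) {
            hi = mid;
        } else if (file_addr >= table[mid].addr + table[mid].size) {
            lo = mid + 1;
        } else {
            return table[mid].name;
        }
    }
    snprintf(fallback, len, "0x%lx", (unsigned long)runtime_addr);
    return fallback;
}
```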
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
**"No symbols loaded"**
|
||||
- Compile with debug symbols: add `-g` flag
|
||||
- Ensure executable has symbol table: `objdump -t your_program`
|
||||
|
||||
**"Cycle counter unavailable"**
|
||||
- Normal on some RISC-V implementations
|
||||
- Timing still works with nanosecond precision
|
||||
|
||||
**Missing function names**
|
||||
- Some optimized builds strip symbols
|
||||
- Functions may be inlined by compiler
|
||||
|
||||
### Debugging
|
||||
|
||||
rvprof prints its initialization messages to stderr. Redirect stderr to a file to inspect them:
```bash
./your_program 2>debug.log
```
|
||||
|
||||
## Building from Source
|
||||
|
||||
### Requirements
|
||||
- RISC-V C toolchain (clang or gcc)
|
||||
- RISC-V Fortran compiler (flang/gfortran) for Fortran support
|
||||
- Make
|
||||
|
||||
### Build Options
|
||||
```bash
# Default build
make

# Clean build
make clean

# Debug build
make CFLAGS="-g -O0 -DDEBUG"
```
|
||||
|
||||
## Integration Examples
|
||||
|
||||
### CMake Integration
|
||||
|
||||
```cmake
# In your CMakeLists.txt
add_library(rvprof STATIC IMPORTED)
set_target_properties(rvprof PROPERTIES IMPORTED_LOCATION /path/to/librvprof.a)

target_link_libraries(your_target rvprof)
target_compile_options(your_target PRIVATE -finstrument-functions)
```
|
||||
|
||||
### Makefile Integration
|
||||
|
||||
```makefile
RVPROF_DIR = /path/to/rvprof
CFLAGS += -finstrument-functions -I$(RVPROF_DIR)
LDFLAGS += -L$(RVPROF_DIR) -lrvprof

your_program: your_program.o
	$(CC) $< $(LDFLAGS) -o $@
```
|
||||
|
||||
## License
|
||||
|
||||
This software is provided under the MIT license. See [LICENSE](LICENSE) for details.
|
||||
|
||||
## Contributing
|
||||
|
||||
rvprof is designed for RISC-V systems but the core profiling logic is portable. Contributions welcome for:
|
||||
- Additional architecture support
|
||||
- Output format improvements
|
||||
- Performance optimizations
|
||||
- Additional analysis features
|
||||
|
||||
## Acknowledgments
|
||||
|
||||
Inspired by [vftrace](https://github.com/SX-Aurora/Vftrace) from the SX-Aurora project.
|