Add README
parent 4611874301
commit 41a7eccad3
280
README.md
# rvprof

A lightweight profiling library for RISC-V applications, inspired by [vftrace](https://github.com/SX-Aurora/Vftrace). rvprof provides function-level timing analysis with minimal overhead and automatic instrumentation support.

## Features

- **Automatic Function Profiling**: Uses the `__cyg_profile_func_enter` and `__cyg_profile_func_exit` hooks that the compiler inserts with `-finstrument-functions`, so no source changes are needed
- **Manual Region Profiling**: Explicit profiling regions for fine-grained control
- **Symbol Resolution**: ELF parsing for human-readable function names
- **Timing Precision**: Nanosecond-resolution timing with optional cycle counter support
- **Memory Tracking**: Built-in monitoring of the profiler's own memory overhead
- **Stack Analysis**: Complete call stack tracking and analysis
- **Multi-language Support**: Native C/C++ and Fortran APIs
- **Configurable Output**: Environment variable control and customizable reports

## Quick Start

### 1. Build the Library

```bash
make
```

This creates the static library `librvprof.a`.

### 2. Automatic Profiling (Recommended)

Compile your program with function instrumentation:

```bash
# C/C++
gcc -finstrument-functions your_program.c -L. -lrvprof -o your_program

# Fortran
gfortran -finstrument-functions your_program.f90 -L. -lrvprof -o your_program
```

Run your program as usual; profiling happens automatically:

```bash
./your_program
# Creates: your_program_rvprof.log
```

### 3. Manual Profiling

For additional manual control, mark profiling regions explicitly:

**C/C++:**
```c
#include "rvprof.h"

int main(void) {
    rvprof_region_begin("initialization");
    // your initialization code
    rvprof_region_end("initialization");

    rvprof_region_begin("computation");
    // your computation code
    rvprof_region_end("computation");

    return 0;
}
```

**Fortran:**
```fortran
program example
    use rvprof
    implicit none

    call rvprof_region_begin("initialization")
    ! your initialization code
    call rvprof_region_end("initialization")

    call rvprof_region_begin("computation")
    ! your computation code
    call rvprof_region_end("computation")

end program example
```

If the code is also compiled with `-finstrument-functions`, the user-defined regions appear in the report alongside the automatically profiled functions.

## Configuration

Control rvprof behavior with environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `RVPROF_OUTPUT` | `<program>_rvprof.log` | Output filename |
| `RVPROF_DISABLE_HOOKS` | `0` | Set to `1` to disable automatic function hooks |
| `RVPROF_DISABLE_MERGE` | `0` | Set to `1` to keep separate entries for the same function when it is called from different callers |

### Examples

```bash
# Custom output file
RVPROF_OUTPUT=my_profile.txt ./your_program

# Disable automatic profiling, use manual regions only
RVPROF_DISABLE_HOOKS=1 ./your_program

# Keep separate entries for functions called from different contexts
RVPROF_DISABLE_MERGE=1 ./your_program
```

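As a rough illustration of how a library could read these variables at startup, here is a minimal sketch in C. The `rvprof_settings` struct and the `env_flag` and `read_settings` helpers are hypothetical names chosen for this example; rvprof's actual internals may look different. The sketch assumes a flag counts as set only when its value is exactly `1`, as the table suggests.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings struct; rvprof's real internals may differ. */
struct rvprof_settings {
    const char *output;   /* NULL means "use <program>_rvprof.log" */
    int disable_hooks;    /* nonzero when RVPROF_DISABLE_HOOKS=1 */
    int disable_merge;    /* nonzero when RVPROF_DISABLE_MERGE=1 */
};

/* Treat a variable as enabled only when it is set to exactly "1". */
int env_flag(const char *name)
{
    const char *value = getenv(name);
    return value != NULL && strcmp(value, "1") == 0;
}

/* Read the three documented variables; unset variables keep their defaults. */
struct rvprof_settings read_settings(void)
{
    struct rvprof_settings s;
    s.output = getenv("RVPROF_OUTPUT");
    s.disable_hooks = env_flag("RVPROF_DISABLE_HOOKS");
    s.disable_merge = env_flag("RVPROF_DISABLE_MERGE");
    return s;
}
```

Reading the environment once at initialization keeps the per-call hook path free of `getenv` lookups.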
## Understanding the Output

rvprof generates a detailed text report with multiple sections:

### Function Profile Table

```
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
| Calls | t_excl[s] | t_excl[%] | t_incl[s] | t_incl[%] | excl_cycles   | incl_cycles   | Function                                    | Caller                                      | STID |
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
|     1 |     2.450 |      89.2 |     2.750 |     100.0 |    2450000000 |    2750000000 | main                                        | ---                                         |    1 |
|   100 |     0.250 |       9.1 |     0.300 |      10.9 |     250000000 |     300000000 | compute_heavy                               | main                                        |    2 |
+-------+-----------+-----------+-----------+-----------+---------------+---------------+---------------------------------------------+---------------------------------------------+------+
```

**Columns explained:**
- **Calls**: Number of times the function was called
- **t_excl[s]**: Exclusive time (time spent in the function itself, excluding children)
- **t_excl[%]**: Exclusive time as a percentage of total program runtime
- **t_incl[s]**: Inclusive time (time spent in the function including all children)
- **t_incl[%]**: Inclusive time as a percentage of total program runtime
- **excl_cycles/incl_cycles**: Cycle counts (if the RISC-V cycle counter is available)
- **Function**: Function name (resolved from symbols, or the raw address)
- **Caller**: Calling function (or "various" if called from multiple places)
- **STID**: Stack ID referencing the call stack table

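The relationship between the exclusive and inclusive columns can be sketched in a few lines of C. The `timing` struct and both helper functions are hypothetical and only illustrate the arithmetic; they are not part of the rvprof API.

```c
#include <stddef.h>

/* Hypothetical record for one profiled function; not rvprof's actual type. */
struct timing {
    double incl_seconds;   /* time in the function including its callees */
};

/* Exclusive time = inclusive time minus the inclusive time of direct callees. */
double exclusive_seconds(const struct timing *fn,
                         const struct timing *callees, size_t n_callees)
{
    double excl = fn->incl_seconds;
    for (size_t i = 0; i < n_callees; i++)
        excl -= callees[i].incl_seconds;
    return excl;
}

/* Share of total runtime, as used for the t_excl[%] and t_incl[%] columns. */
double percent_of_total(double seconds, double total_seconds)
{
    return total_seconds > 0.0 ? 100.0 * seconds / total_seconds : 0.0;
}
```

With the sample table above, `main` has an inclusive time of 2.750 s and its only callee `compute_heavy` has 0.300 s, so the exclusive time of `main` is 2.750 - 0.300 = 2.450 s.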
### Call Stack Analysis

```
Global call stacks:
--------------------------------------------------------------------
STID Call stack
--------------------------------------------------------------------
STID1 main
STID2 main<compute_heavy
STID3 main<compute_heavy<helper_function
--------------------------------------------------------------------
```

This table lists the complete call path for each stack ID, with `<` separating stack levels.

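To make the notation concrete, the following sketch builds such a call-path string from a list of frame names, outermost frame first as in the sample above. `format_call_stack` is a hypothetical helper written for this example, not an rvprof function.

```c
#include <stdio.h>

/* Join frame names, outermost first, with '<' between stack levels.
 * Returns 0 on success, -1 if the output buffer is too small. */
int format_call_stack(const char *const *frames, size_t n_frames,
                      char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < n_frames; i++) {
        /* No separator before the first frame. */
        int written = snprintf(out + used, out_size - used, "%s%s",
                               i == 0 ? "" : "<", frames[i]);
        if (written < 0 || (size_t)written >= out_size - used)
            return -1;   /* truncated or failed */
        used += (size_t)written;
    }
    return 0;
}
```

Passing the frames `main`, `compute_heavy`, `helper_function` yields the `STID3` line from the sample output.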
### Performance Summary

```
Summary:
Total execution time: 2.750 seconds
Total cycles: 2750000000 cycles (informational)
Timer resolution: nanosecond timer
Number of functions: 15
Number of stacks: 8
Function hooks: enabled
Symbol table: 142 symbols loaded (ELF parsing)
Region merging: enabled
Memory footprint: 45.2 KB
```

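The summary reports a nanosecond timer alongside an informational cycle count. One plausible way to obtain both values is sketched below; this is an assumption about the general technique, not a description of rvprof's source. `now_ns` and `now_cycles` are hypothetical names. Note that `rdcycle` may trap in user mode on systems where the kernel does not expose the counter, which is why a real implementation would need a fallback.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds, via POSIX clock_gettime. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Cycle counter on 64-bit RISC-V; returns 0 on other targets. */
uint64_t now_cycles(void)
{
#if defined(__riscv) && __riscv_xlen == 64
    uint64_t cycles;
    __asm__ volatile("rdcycle %0" : "=r"(cycles));
    return cycles;
#else
    return 0;
#endif
}
```

A region's duration is then the difference between two `now_ns()` readings taken at its start and end.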
## API Reference

### C/C++ API

```c
// Manual region profiling
void rvprof_region_begin(const char* name);
void rvprof_region_end(const char* name);

// Automatic hooks (called by -finstrument-functions)
void __cyg_profile_func_enter(void *this_fn, void *call_site);
void __cyg_profile_func_exit(void *this_fn, void *call_site);
```

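For readers unfamiliar with `-finstrument-functions`, the sketch below shows what a bare-bones pair of hooks can look like. It is a standalone illustration that only tracks nesting depth and prints addresses; it is not rvprof's implementation, and it must not be linked together with `librvprof.a`, which already defines these symbols. `NO_INSTRUMENT`, `call_depth`, and `hook_depth` are names invented for this example.

```c
#include <stdio.h>

/* The hooks themselves must not be instrumented, or they would recurse. */
#define NO_INSTRUMENT __attribute__((no_instrument_function))

static int call_depth = 0;   /* current nesting level of instrumented calls */

/* Called by compiler-inserted code on entry to every instrumented function. */
NO_INSTRUMENT void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    fprintf(stderr, "%*senter %p\n", 2 * call_depth, "", this_fn);
    call_depth++;
}

/* Called by compiler-inserted code just before an instrumented function returns. */
NO_INSTRUMENT void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    call_depth--;
    fprintf(stderr, "%*sexit  %p\n", 2 * call_depth, "", this_fn);
}

/* Accessor for the current depth (illustrative only). */
NO_INSTRUMENT int hook_depth(void)
{
    return call_depth;
}
```

A profiler would record a timestamp in each hook instead of printing, and use `this_fn` as the key for its per-function statistics.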
### Fortran API

```fortran
! Manual region profiling
subroutine rvprof_region_begin(name)
subroutine rvprof_region_end(name)
```

### Performance Considerations

- **Overhead**: rvprof is designed for minimal overhead (~1-5% typical), but the overhead can become significant for programs that make a very large number of short calls. Consider adding `__attribute__((no_instrument_function))` to functions that should not be profiled.
- **Memory**: Scales with the number of unique functions and call stacks
- **Accuracy**: Nanosecond timing resolution; cycle counter when available

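As an example of excluding a hot function, the sketch below marks a small helper with `no_instrument_function` while leaving its caller instrumented. The function names `axpy_element` and `axpy` are made up for illustration.

```c
/* Hot helper called once per element: excluded from instrumentation. */
__attribute__((no_instrument_function))
static inline double axpy_element(double a, double x, double y)
{
    return a * x + y;
}

/* Still instrumented when built with -finstrument-functions,
 * so the report shows one entry for the whole loop. */
void axpy(double a, const double *x, double *y, int n)
{
    for (int i = 0; i < n; i++)
        y[i] = axpy_element(a, x[i], y[i]);
}
```

GCC also accepts `-finstrument-functions-exclude-function-list=` and `-finstrument-functions-exclude-file-list=` on the command line, which avoids editing the source.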
### Symbol Resolution

rvprof automatically attempts to resolve function addresses to names by:
1. Parsing the ELF symbol table from `/proc/self/exe`
2. Handling Position-Independent Executables (PIE)
3. Falling back to displaying the raw address if symbols are unavailable

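The lookup step can be pictured as follows, assuming the ELF symbols have already been loaded into an array sorted by address. The `symbol` struct, the `resolve_name` function, and the `load_base` parameter are hypothetical and serve only to illustrate the three steps above; the actual rvprof code may be organized differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical in-memory symbol entry; the table is sorted by ascending addr. */
struct symbol {
    uintptr_t addr;      /* start address as stored in the ELF file */
    uintptr_t size;      /* size of the function in bytes */
    const char *name;
};

/* Resolve a runtime address to a function name.
 * load_base is the PIE load offset (0 for a non-PIE executable).
 * If no symbol covers the address, the hex address is written instead. */
void resolve_name(const struct symbol *table, size_t n, uintptr_t load_base,
                  uintptr_t runtime_addr, char *out, size_t out_size)
{
    uintptr_t file_addr = runtime_addr - load_base;
    size_t lo = 0, hi = n;

    /* Binary search: lo ends up one past the last symbol starting <= file_addr. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].addr <= file_addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo > 0 && file_addr - table[lo - 1].addr < table[lo - 1].size) {
        snprintf(out, out_size, "%s", table[lo - 1].name);
        return;
    }
    /* Fallback: no matching symbol, show the raw address. */
    snprintf(out, out_size, "0x%llx", (unsigned long long)runtime_addr);
}
```

For a PIE binary, the load offset has to be subtracted before the lookup because the symbol table stores link-time addresses, while the hooks receive runtime addresses.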
## Troubleshooting

### Common Issues

**"No symbols loaded"**
- Make sure the executable is not stripped (avoid `-s` and `strip`); compiling with `-g` additionally keeps debug information
- Check that the executable has a symbol table: `objdump -t your_program`

**"Cycle counter unavailable"**
- This is normal on some RISC-V implementations
- Timing still works with nanosecond precision

**Missing function names**
- Some optimized builds strip symbols
- Functions may be inlined by the compiler

### Debugging

rvprof prints initialization messages to stderr. Capture them in a file for inspection:
```bash
# Redirect rvprof's stderr messages to debug.log
./your_program 2>debug.log
```

## Building from Source

### Requirements
- RISC-V C toolchain (GCC or Clang)
- RISC-V Fortran compiler (gfortran or flang) for Fortran support
- Make

### Build Options
```bash
# Default build
make

# Clean build
make clean

# Debug build
make CFLAGS="-g -O0 -DDEBUG"
```

## Integration Examples

### CMake Integration

```cmake
# In your CMakeLists.txt
add_library(rvprof STATIC IMPORTED)
set_target_properties(rvprof PROPERTIES IMPORTED_LOCATION /path/to/librvprof.a)

target_link_libraries(your_target rvprof)
target_compile_options(your_target PRIVATE -finstrument-functions)
```

### Makefile Integration

```makefile
RVPROF_DIR = /path/to/rvprof
CFLAGS += -finstrument-functions -I$(RVPROF_DIR)
LDFLAGS += -L$(RVPROF_DIR) -lrvprof

your_program: your_program.o
	$(CC) $< $(LDFLAGS) -o $@
```

## License

This software is provided under the MIT license. See [LICENSE](LICENSE) for details.

## Contributing

rvprof is designed for RISC-V systems, but the core profiling logic is portable. Contributions are welcome for:
- Additional architecture support
- Output format improvements
- Performance optimizations
- Additional analysis features

## Acknowledgments

Inspired by [vftrace](https://github.com/SX-Aurora/Vftrace) from the SX-Aurora project.