This project provides a benchmarking framework for parallel computing kernels. Kernel execution can be parallelized with either OpenMP or Eventify, so that the two can be compared for the FlexFMM collaborative project.
The application is designed to make adding kernels and parallelization strategies as easy as possible.
## Features
- **Kernel Registry**: A registry that allows the user to register and execute different computational kernels easily.
- **Parallelization Strategies**: Two strategies for parallelizing the execution of kernel loops:
  - **OpenMP**: Uses OpenMP directives to parallelize the outermost loop.
  - **Eventify**: Uses the Eventify tasking system for parallelism.
- **Kernel Execution**: Kernels such as **STREAM TRIAD** and **DAXPY** are implemented, and their execution can be timed and compared across different parallelization strategies.
- **Eventify**: Ensure that the Eventify library is properly installed and the environment variable `EVENTIFY_ROOT` points to the root directory of the Eventify installation.
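For orientation, the two kernels named in the feature list are conventionally defined as `a[i] = b[i] + s * c[i]` (STREAM TRIAD) and `y[i] = alpha * x[i] + y[i]` (DAXPY). The following is a minimal serial sketch of those definitions; the function names and signatures are illustrative assumptions and not the project's actual code.

```cpp
#include <cstddef>
#include <vector>

// STREAM TRIAD: a[i] = b[i] + scalar * c[i]
// (hypothetical signature, for illustration only)
void stream_triad(std::vector<double>& a,
                  const std::vector<double>& b,
                  const std::vector<double>& c,
                  double scalar) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = b[i] + scalar * c[i];
    }
}

// DAXPY: y[i] = alpha * x[i] + y[i]
// (hypothetical signature, for illustration only)
void daxpy(double alpha,
           const std::vector<double>& x,
           std::vector<double>& y) {
    for (std::size_t i = 0; i < y.size(); ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}
```

Both loops are independent across `i`, which is what makes them suitable for comparing parallelization strategies on the outermost loop.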
## Building the Project
To build the project, run:
```
make
```
This will compile the source files and generate an executable called `benchmark` in the `bin/` directory.
The executable takes three positional arguments, in this order:
- `<kernel_name>`: The name of the kernel to run. Example: `stream_triad`
- `<strategy>`: The parallelization strategy to use. Available options: `omp` (for OpenMP) and `eventify` (for Eventify).
- `<num_threads_or_tasks>`: The number of threads or tasks to use for parallel execution. Its meaning depends on the strategy: the number of threads for OpenMP, the number of tasks for Eventify.
### Example:
To run the `stream_triad` kernel with the OpenMP strategy using 4 threads:
```
./bin/benchmark stream_triad omp 4
```
To run the `daxpy` kernel with the Eventify strategy using 8 tasks:
```
./bin/benchmark daxpy eventify 8
```
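A command line of this shape could be validated along the following lines. This is only a sketch: the `Options` struct and `parse_args` function are hypothetical names introduced here, and the project's actual argument handling may differ.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical container for the three positional arguments.
struct Options {
    std::string kernel;    // e.g. "stream_triad"
    std::string strategy;  // "omp" or "eventify"
    int workers;           // threads (OpenMP) or tasks (Eventify)
};

// Returns true and fills `out` if the arguments are well formed.
bool parse_args(int argc, const char* const* argv, Options& out) {
    if (argc != 4) return false;  // program name + 3 arguments

    const std::string strategy = argv[2];
    if (strategy != "omp" && strategy != "eventify") return false;

    // Parse the worker count and reject trailing junk or non-positive values.
    char* end = nullptr;
    const long n = std::strtol(argv[3], &end, 10);
    if (end == argv[3] || *end != '\0' || n <= 0) return false;

    out.kernel = argv[1];
    out.strategy = strategy;
    out.workers = static_cast<int>(n);
    return true;
}
```

A `main` function would call `parse_args(argc, argv, options)` and print a usage message when it returns `false`.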
### Error Handling
- If an invalid kernel name is provided, the program will print an error message and list available kernels.
- The number, types, and initialization of arguments can be chosen freely.
- You only need to provide the loop body (or the inner loops of a loop nest). The outer loop, with induction variable `int i`, is already defined by the parallelization strategy.
- `a`, `b`, and `c` are the vectors used for the operation.
- `prepare` initializes these vectors and fills them with random values using the `initialize_vector` function.
- `execute` contains the vector product logic, where each element in vector `a` is computed as the product of corresponding elements in vectors `b` and `c`.
2. **Register the Kernel**:
- The new kernel should be automatically registered when the `initialize_registry` function is called. This is done dynamically through the registry.
3. **Use the Kernel**:
- Once you have added the kernel to the registry, you can run it just like the existing kernels using the `./bin/benchmark` command. For example:
```
./bin/benchmark vector_product omp 4
```
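The steps above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the `Kernel` struct, the `registry()` and `find_kernel` helpers, the default value of `VECTOR_SIZE`, and the signatures given to `initialize_vector` and `initialize_registry` are all hypothetical. Only the names `prepare`, `execute`, `initialize_vector`, `initialize_registry`, `VECTOR_SIZE`, and `vector_product` come from this document.

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <map>
#include <random>
#include <string>
#include <vector>

#ifndef VECTOR_SIZE
#define VECTOR_SIZE 1024  // assumed default; the project sets its own value
#endif

// Hypothetical kernel record; the project's real type may differ.
struct Kernel {
    std::function<void()> prepare;     // allocations and data initialization
    std::function<void(int)> execute;  // loop body for one outer index i
};

// Assumed stand-in for the project's initialize_vector helper.
void initialize_vector(std::vector<double>& v, std::size_t n) {
    static std::mt19937 gen(42);  // fixed seed so runs are reproducible
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    v.resize(n);
    for (double& x : v) x = dist(gen);
}

std::vector<double> a, b, c;  // vectors used by vector_product

// Hypothetical registry: maps a kernel name to its prepare/execute pair.
std::map<std::string, Kernel>& registry() {
    static std::map<std::string, Kernel> kernels;
    return kernels;
}

void initialize_registry() {
    registry()["vector_product"] = Kernel{
        [] {
            initialize_vector(a, VECTOR_SIZE);
            initialize_vector(b, VECTOR_SIZE);
            initialize_vector(c, VECTOR_SIZE);
        },
        // Only the loop body: the strategy supplies the outer loop over i.
        [](int i) { a[i] = b[i] * c[i]; }};
}

// Looks up a kernel; on failure prints an error and the available names.
const Kernel* find_kernel(const std::string& name) {
    auto it = registry().find(name);
    if (it != registry().end()) return &it->second;
    std::cerr << "Unknown kernel '" << name << "'. Available kernels:\n";
    for (const auto& entry : registry()) {
        std::cerr << "  " << entry.first << '\n';
    }
    return nullptr;
}
```

Keeping `execute` as a function of the outer index `i` is one way to let every strategy reuse the same kernel body without the kernel knowing how the loop is distributed.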
### Notes on Adding Kernels:
- Kernels must be registered with a **name** (e.g., `"vector_product"`) and should include the corresponding **allocations and data initialization** (`prepare`) and **kernel logic** (`execute`).
- For now, every kernel must consist of at least an outer loop.
- The kernel’s execution should be parallelizable with every available strategy (currently `omp` for OpenMP and `eventify` for Eventify). You can add more strategies by extending the `strategy` namespace.
- The `VECTOR_SIZE` preprocessor macro defines the size of the input data and should be appropriate for the kernel you are implementing.