Abstract
Thread-level parallelism (TLP) is a common way to obtain parallelism where instruction-level parallelism (ILP) is insufficient. Hardware multithreading is a prevalent micro-architectural technique for tolerating long-latency events such as memory accesses, mispredictions, and accelerator latencies: it uses otherwise idle cycles and avoids stalling the CPU. Multithreaded architectures are widely used in processors and embedded edge devices to improve performance. This work proposes a new 2-way superscalar in-order multithreading (MT) micro-architecture applied to a superscalar RISC-V processor. It also proposes efficient prefetch-scheduling and issue-scheduling algorithms for simultaneous multithreading (SMT) processors. The proposed micro-architecture was implemented on an FPGA. The results show that the proposed prefetch-scheduling and issue-scheduling algorithms outperform the 2-way superscalar fine-grained multithreaded (FGMT) and coarse-grained multithreaded (CGMT) approaches. The proposed architecture is evaluated using the standard MiBench [16] and RISC-V [17] benchmarks, demonstrating an average improvement in processor core utilization, measured as instructions per cycle (IPC), of up to 52% with four threads.
| Original language | English |
|---|---|
| Journal | IEEE Access |
| DOIs | |
| State | Accepted/In press - 1 Jan 2025 |
Keywords
- Hardware Multithreading
- In-order pipeline
- Issue scheduling
- RISC-V
- Superscalar
- Prefetch scheduling
ASJC Scopus subject areas
- General Computer Science
- General Materials Science
- General Engineering