RISC-V: custom instruction and its simulation (repost)
Agenda
This article shows how to add a new instruction to RISC-V and simulate it.
These topics are covered along the way:
- Whole GNU riscv toolchain installation;
- Implementation of a new instruction for spike RISC-V ISA simulator;
- Manual instruction encoding in C/C++;
- Custom instruction simulation (with visible output);
- [riscv32-]GCC plugin development;
You may find the associated repository useful.
Many things can go wrong. Be prepared to fix issues yourself as they come up.
The final result is very rewarding, I promise.
Toolchain installation
Choose an installation directory. Call it RISCV.
Add these lines to your ~/.bashrc:
# Directory which will contain everything we need.
export RISCV_HOME=~/riscv-home
# $RISCV will point to toolchain install location.
export RISCV="${RISCV_HOME}/riscv"
export PATH="${PATH}:${RISCV}/bin"
Run mkdir -p "${RISCV_HOME}" "${RISCV}".
Use the 1_install/2_download-repos script to clone all required repositories.
If you wish to save some time and traffic, avoid a recursive clone of the toolchain repository. Instead, clone the sub-modules by hand. You may exclude “riscv-glibc”.
Be warned: I have not tested a partial toolchain build; caveat emptor.
Satisfy the GNU toolchain prerequisites by installing all required packages. In addition, spike requires the device-tree-compiler package.
We choose:
- RISCV32 over RISCV64
- newlib over glibc
Repositories must be built in this order:
- riscv-gnu-toolchain
- riscv-fesvr, riscv-pk
- riscv-isa-sim
You can use the 1_install/3_build-repos script as a guideline.
To check the installation, use 1_install/4_check-install.
Custom instruction description
Within the framework of this article, we will implement a mac instruction.
rv32im has mul and add instructions; mac combines them.
It is defined as a0 := a0 + a1 * a2 (an ordinary 3-address instruction whose destination also serves as the accumulator).
# Without mac (preserve registers):
mv t0, a0 # addi t0, a0, 0
mul a0, a1, a2
add a0, a0, t0
# With mac:
mac a0, a1, a2
Adding the “mac” instruction to rv32im
To add an instruction to the simulator:
1. Describe the instruction’s functional behavior;
2. Add the opcode and opcode mask to “riscv/opcodes.h”;
The first step is accomplished by adding a riscv/insns/mac.h file:
/* file "$RISCV_HOME/riscv-isa-sim/riscv/insns/mac.h" */
// 'M' extension means we require integer mul/div standard extension.
require_extension('M');
// RD = RD + RS1 * RS2
reg_t tmp = sext_xlen(RS1 * RS2);
WRITE_RD(sext_xlen(READ_REG(insn.rd()) + tmp));
For the second step, we use riscv-opcodes.
cd "${RISCV_HOME}/riscv-opcodes"
echo -e "mac rd rs1 rs2 31..25=1 14..12=0 6..2=0x1A 1..0=3\n" >> opcodes
make install
It turns out there is a third step which is not documented. A new entry must be added to the riscv_insn_list.
sed -i 's/riscv_insn_list = \\/riscv_insn_list = mac\\/g' \
"${RISCV_HOME}/riscv-isa-sim/riscv/riscv.mk.in"
Rebuild the simulator.
cd "${RISCV_HOME}/riscv-isa-sim/build"
sudo make install
Testing the brand new rv32im instruction
At this stage:
- The compiler knows nothing about mac. It cannot emit that instruction;
- The assembler knows nothing about mac. We cannot use mac in inline assembly;
Our last resort is manual encoding.
#include <stdio.h>
// Needed to verify results.
int mac_c(int a, int b, int c) {
a += b * c; // Semantically, it is "mac"
return a;
}
// Should not be inlined, because we expect arguments
// in particular registers.
__attribute__((noinline))
int mac_asm(int a, int b, int c) {
asm __volatile__ (".word 0x02C5856B\n");
return a;
}
int main(int argc, char** argv) {
int a = 2, b = 3, c = 4;
printf("%d =?= %d\n", mac_c(a, b, c), mac_asm(a, b, c));
}
Save the test program as test_mac.c.
riscv32-unknown-elf-gcc test_mac.c -O1 -march=rv32im -o test_mac
spike --isa=RV32IM "${RISCV_PK}" test_mac
You should see 14 =?= 14 printed to stdout.
If the result differs, riscv32-unknown-elf-gdb can help you troubleshoot.
Mac encoding explained
Be sure to look at the official specifications if you aim for precise descriptions.
mac will mimic the mul encoding, but use a different opcode.
# file "riscv-opcodes/opcodes"
#                                    differs
#                                      |
#                                      v
mac rd rs1 rs2 31..25=1 14..12=0 6..2=0x1A 1..0=3
mul rd rs1 rs2 31..25=1 14..12=0 6..2=0x0C 1..0=3
#   ^  ^   ^   ^        ^        ^         ^
#   |  |   |   |        |        |         |
#   |  |   |   |        |        |         |
#   |  |   |   |        |        |         also opcode 2 bits
#   |  |   |   |        |        opcode 5 bits
#   |  |   |   |        funct3 3 bits
#   |  |   |   funct7 7 bits
#   |  |   rs2 (src2) 5 bits
#   |  rs1 (src1) 5 bits
#   dest 5 bits
The actual encoding orders these components differently, and the opcode is really a single 7-bit segment.
5 bits per register operand means that we have 32 addressable registers.
# Encoding used for "mac a0, a1, a2"
0x02C5856B [base 16]
==
10110001011000010101101011 [base 2]
==
00000010110001011000010101101011 [base 2]
# Group by related bit chunks:
0000001 01100 01011 000 01010 1101011
^       ^     ^     ^   ^     ^
|       |     |     |   |     |
|       |     |     |   |     opcode (6..2=0x1A 1..0=3)
|       |     |     |   dest (10 : a0)
|       |     |     funct3 (14..12=0)
|       |     src1 (11 : a1)
|       src2 (12 : a2)
funct7 (31..25=1)
Plugin vs patch
There are two ways to extend GCC:
- Patch GCC itself
- Write a loadable plugin for GCC
Prefer plugins to GCC patches whenever possible.
The GCC wiki “plugins” page describes the advantages in its “Background” section.
In this guide, both methods will be covered.
Useful links:
- Simple GCC plugin series of posts
- GCC plugins manual
GCC “rv32imMac” plugin
TODO