RV32I

RISC-V Base Integer Instruction Set

Technical Reference for Emulator Developers

Version 1.0 · Covers RV32I (32-bit base integer ISA)

0.0 - Table of Contents

0.1 - Using This Document
1.0 - About RISC-V
2.0 - RV32I Specifications
2.1 - Memory
2.2 - Registers
2.3 - Program Counter
2.4 - Instruction Formats
2.5 - Immediate Encoding
3.0 - RV32I Instructions
3.1 - Integer Register-Register
ADD SUB AND OR XOR SLL SRL SRA SLT SLTU
3.2 - Integer Register-Immediate
ADDI ANDI ORI XORI SLTI SLTIU SLLI SRLI SRAI
3.3 - Upper Immediate
LUI AUIPC
3.4 - Loads
LW LH LB LHU LBU
3.5 - Stores
SW SH SB
3.6 - Branches
BEQ BNE BLT BGE BLTU BGEU
3.7 - Jumps
JAL JALR
3.8 - System
ECALL EBREAK
4.0 - Opcode Map
5.0 - Emulator Implementation Notes
5.1 - Fetch-Decode-Execute Loop
5.2 - Field Extraction in C
5.3 - Sign Extension

0.1 - Using This Document [TOC]

This document is a technical reference for the RISC-V RV32I base integer instruction set, written specifically for people building an emulator in C. It covers only the 32-bit base integer ISA (RV32I). Extensions such as M (multiply/divide), A (atomics), F (float), and D (double) are not covered here.

Throughout this document, the following notation is used:

x[rd] — the destination register
x[rs1] — the first source register
x[rs2] — the second source register
imm — a sign-extended immediate value
PC — the Program Counter
M[addr] — memory at address addr
sext(x) — sign-extend x to 32 bits
All integer values are in hexadecimal unless otherwise stated

1.0 - About RISC-V [TOC]

RISC-V (pronounced "risk five") is a free and open instruction set architecture (ISA) developed at UC Berkeley, first released in 2010. Unlike x86 or ARM, the RISC-V ISA specification is publicly available and not encumbered by patents or licensing fees, making it ideal for education, research, and custom hardware design.

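RISC-V follows the Reduced Instruction Set Computer (RISC) philosophy: a small number of simple instructions, fixed-width instruction encoding, and a large register file. The ISA does not specify how many cycles an instruction takes; that is a property of each hardware implementation. The base integer ISA, called RV32I, is small: the current ratified specification defines 40 instructions, while older versions counted 47 because they still included FENCE.I and the six CSR instructions, which have since moved to the Zifencei and Zicsr extensions. This makes it a good architecture for building a first CPU emulator.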

The RISC-V ISA is modular. The base integer ISA (I) can be extended with standard extensions:

Extension	Name	Adds
M	Multiply	MUL, DIV, REM and their unsigned variants
A	Atomic	Atomic memory operations for multi-core
F	Float	Single-precision floating-point (32-bit)
D	Double	Double-precision floating-point (64-bit)
C	Compressed	16-bit compressed instructions

This document covers RV32I only. Implement this base set before adding any extensions.

2.0 - RV32I Specifications [TOC]

This section describes the memory model, register file, program counter, and instruction encoding of RV32I.

2.1 - Memory [TOC]

RV32I uses a flat, byte-addressable memory space of 2³² bytes (4 GB). Addresses are 32-bit unsigned integers. All memory accesses are performed through load and store instructions — there are no memory-to-memory operations.

For an emulator, memory is implemented as a simple byte array:

#define MEM_SIZE (1024 * 1024)   /* 1 MB is enough to run small programs */
uint8_t memory[MEM_SIZE];

A typical memory layout when loading a program looks like this:

+---------------------------+= 0xFFFFFFFF  top of address space
|                           |
|       (unmapped)          |
|                           |
+---------------------------+
|                           |
|        Stack              |  grows downward from high address
|           |               |
|           v               |
|                           |
|           ^               |
|           |               |
|        Heap               |  grows upward
|                           |
+---------------------------+= depends on program
|       .data / .bss        |  initialized and uninitialised globals
+---------------------------+
|       .text               |  program instructions (read-only)
+---------------------------+= entry point (from ELF header, e.g. 0x00010000)
|                           |
|      (reserved)           |
|                           |
+---------------------------+= 0x00000000  bottom of address space

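RISC-V is a little-endian architecture: the least-significant byte of a multi-byte value is stored at the lowest address. x86 and x86-64 hosts use the same byte order, so copying the bytes directly (for example with memcpy) yields the right value there. A pointer cast such as *(uint32_t *)&memory[addr] is best avoided, because it can violate C's alignment and aliasing rules; assembling the value byte by byte works on any host.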
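A byte-by-byte access that does not depend on the host's byte order can be sketched as follows. The helper names and their (array, address) signatures are assumptions made for this illustration (Section 5.1 calls a mem_read32 that takes a CPU pointer instead), and bounds checking is left to the caller.

```c
#include <stdint.h>

/* Hypothetical helper: read a 32-bit little-endian value at addr.
 * The caller must make sure addr + 3 is inside the array.        */
static uint32_t mem_read32(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}

/* Hypothetical helper: store value at addr, least-significant byte first. */
static void mem_write32(uint8_t *mem, uint32_t addr, uint32_t value)
{
    mem[addr]     = (uint8_t)(value);
    mem[addr + 1] = (uint8_t)(value >> 8);
    mem[addr + 2] = (uint8_t)(value >> 16);
    mem[addr + 3] = (uint8_t)(value >> 24);
}
```

The same pattern with two bytes gives the 16-bit variants needed by LH, LHU, and SH.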

2.2 - Registers [TOC]

RV32I has 32 general-purpose integer registers, each 32 bits wide, named x0 through x31. Register x0 is special: it is hardwired to the value zero. Any write to x0 is silently discarded. Any read from x0 always returns 0x00000000.

Each register also has an ABI (Application Binary Interface) name used in assembly language. The ABI names reflect the register's conventional purpose when calling functions:

Register	ABI Name	Description	Preserved across call?
x0	zero	Hardwired zero — reads always return 0, writes ignored	— (constant)
x1	`ra`	Return address — where to jump back after a function call	No (caller saves)
x2	`sp`	Stack pointer — top of the current stack frame	Yes (callee saves)
x3	`gp`	Global pointer	—
x4	`tp`	Thread pointer	—
x5	`t0`	Temporary / alternate link register	No
x6–x7	`t1–t2`	Temporaries	No
x8	`s0 / fp`	Saved register / frame pointer	Yes
x9	`s1`	Saved register	Yes
x10–x11	`a0–a1`	Function arguments / return values	No
x12–x17	`a2–a7`	Function arguments	No
x18–x27	`s2–s11`	Saved registers	Yes
x28–x31	`t3–t6`	Temporaries	No

In your emulator, implement two wrapper functions for register access that enforce the x0 rule:

uint32_t reg_read(CPU *cpu, uint32_t r)              { return (r == 0) ? 0 : cpu->regs[r]; }
void     reg_write(CPU *cpu, uint32_t r, uint32_t v) { if (r != 0) cpu->regs[r] = v; }
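The wrappers above take a CPU pointer, but the CPU type itself is not defined in this document. A minimal sketch might look like the following. The field names regs and pc match how they are used elsewhere in this reference; cpu_reset and its entry parameter are assumptions added for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumed minimal CPU state: the register file and the program counter. */
typedef struct {
    uint32_t regs[32];  /* x0..x31; regs[0] is never written */
    uint32_t pc;        /* program counter                    */
} CPU;

uint32_t reg_read(CPU *cpu, uint32_t r)              { return (r == 0) ? 0 : cpu->regs[r]; }
void     reg_write(CPU *cpu, uint32_t r, uint32_t v) { if (r != 0) cpu->regs[r] = v; }

/* Hypothetical helper: clear all registers and set the entry point. */
void cpu_reset(CPU *cpu, uint32_t entry)
{
    memset(cpu, 0, sizeof *cpu);
    cpu->pc = entry;
}
```

Routing every register access through these two functions means no instruction handler has to special-case x0.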

2.3 - Program Counter [TOC]

The Program Counter (PC) is a separate 32-bit register, not part of the general-purpose register file. It holds the address of the instruction currently being executed.

After most instructions the PC advances by 4 (one instruction = 4 bytes). Branch and jump instructions set the PC to a computed target address instead. The PC must always be aligned to a 4-byte boundary. Misaligned PC values cause an instruction-address-misaligned exception (which you can ignore in a basic emulator).

2.4 - Instruction Formats [TOC]

Every RV32I instruction is exactly 32 bits wide. There are six instruction formats. The opcode field is always in bits [6:0] regardless of format. The other fields change position depending on the format.

Bit range
31–25	24–20	19–15	14–12	11–7	6–0	Format
funct7	rs2	rs1	funct3	rd	opcode	R-type
imm[11:0]		rs1	funct3	rd	opcode	I-type
imm[11:5]	rs2	rs1	funct3	imm[4:0]	opcode	S-type
imm[12,10:5]	rs2	rs1	funct3	imm[4:1,11]	opcode	B-type
imm[31:12]				rd	opcode	U-type
imm[20,10:1,11,19:12]				rd	opcode	J-type

Field sizes:

Field	Bits	Description
`opcode`	7	Identifies the instruction group
`rd`	5	Destination register (0–31)
`funct3`	3	Sub-opcode — distinguishes instructions within a group
`rs1`	5	First source register (0–31)
`rs2`	5	Second source register (0–31)
`funct7`	7	Second sub-opcode — used in R-type and shift I-type
`imm`	12–20	Immediate value, format-dependent
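As a worked example of the R-type layout, the sketch below packs the fields of ADD x10, x11, x12 into an instruction word. The function name encode_r is an assumption made for this illustration; an encoder like this is mainly useful for writing emulator tests by hand.

```c
#include <stdint.h>

/* Hypothetical helper: assemble an R-type word from its fields. */
uint32_t encode_r(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                  uint32_t funct3, uint32_t rd, uint32_t opcode)
{
    return ((funct7 & 0x7F) << 25)   /* bits [31:25] */
         | ((rs2    & 0x1F) << 20)   /* bits [24:20] */
         | ((rs1    & 0x1F) << 15)   /* bits [19:15] */
         | ((funct3 & 0x07) << 12)   /* bits [14:12] */
         | ((rd     & 0x1F) << 7)    /* bits [11:7]  */
         |  (opcode & 0x7F);         /* bits [6:0]   */
}

/* ADD x10, x11, x12: encode_r(0x00, 12, 11, 0x0, 10, 0x33) yields 0x00C58533 */
/* SUB x10, x11, x12: encode_r(0x20, 12, 11, 0x0, 10, 0x33) yields 0x40C58533 */
```

The two results differ only in bit 30, which is the funct7 bit that separates ADD from SUB.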

2.5 - Immediate Encoding [TOC]

Immediate values are embedded in the instruction word. The sign bit of every immediate is always placed at bit 31 of the instruction, which allows for fast sign extension in hardware. Some formats (B and J) have their immediate bits shuffled to simplify hardware layout — you must reassemble them in the correct order in your emulator.

Immediate extraction in C for each format:

/* I-type: 12-bit signed immediate, bits [31:20], sign-extended */
int32_t imm_i = (int32_t)instr >> 20;

/* S-type: 12-bit signed immediate, split across [31:25] and [11:7] */
int32_t imm_s = (((int32_t)instr >> 20) & ~0x1F) | ((instr >> 7) & 0x1F);

/* B-type: 13-bit signed immediate, bit 0 always 0 (halfword aligned) */
/*   bit[12] = instr[31]   bit[11] = instr[7]                          */
/*   bit[10:5] = instr[30:25]   bit[4:1] = instr[11:8]                 */
int32_t imm_b = ((int32_t)(instr & 0x80000000) >> 19)
              | ((instr & 0x00000080) << 4)
              | ((instr >> 20) & 0x7E0)
              | ((instr >> 7)  & 0x1E);

/* U-type: 20-bit immediate in upper bits, lower 12 bits are zero */
uint32_t imm_u = instr & 0xFFFFF000;

/* J-type: 21-bit signed immediate, bit 0 always 0                     */
/*   bit[20] = instr[31]   bit[19:12] = instr[19:12]                   */
/*   bit[11] = instr[20]   bit[10:1] = instr[30:21]                    */
int32_t imm_j = ((int32_t)(instr & 0x80000000) >> 11)
              | (instr & 0x000FF000)
              | ((instr >> 9)  & 0x800)
              | ((instr >> 20) & 0x7FE);

Note: B-type and J-type immediates always have their lowest bit equal to 0, because branch and jump offsets are encoded in multiples of 2 bytes. (The encoding leaves room for the 16-bit instructions of the C extension; in plain RV32I, targets must still be 4-byte aligned.) Bit 0 is never stored in the instruction word and is always implied to be 0, which is why branch offsets are always even numbers.
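To check the extraction code against known encodings, the expressions above can be wrapped in small functions. The function names below are assumptions for illustration, and the code relies on arithmetic right shift of negative values, as discussed in Section 5.3.

```c
#include <stdint.h>

/* I-type: bits [31:20], sign-extended. */
int32_t imm_i(uint32_t instr) { return (int32_t)instr >> 20; }

/* S-type: imm[11:5] from bits [31:25], imm[4:0] from bits [11:7]. */
int32_t imm_s(uint32_t instr)
{
    return (int32_t)((((int32_t)instr >> 20) & ~0x1F) | ((instr >> 7) & 0x1F));
}

/* B-type: 13-bit signed offset, bit 0 implied zero. */
int32_t imm_b(uint32_t instr)
{
    return (int32_t)(((int32_t)(instr & 0x80000000) >> 19)
                   | ((instr & 0x00000080) << 4)
                   | ((instr >> 20) & 0x7E0)
                   | ((instr >> 7)  & 0x1E));
}

/* J-type: 21-bit signed offset, bit 0 implied zero. */
int32_t imm_j(uint32_t instr)
{
    return (int32_t)(((int32_t)(instr & 0x80000000) >> 11)
                   | (instr & 0x000FF000)
                   | ((instr >> 9)  & 0x800)
                   | ((instr >> 20) & 0x7FE));
}

/* Sample encodings:
 *   0xFFF00013  ADDI x0, x0, -1   imm_i = -1
 *   0xFE512E23  SW   x5, -4(x2)   imm_s = -4
 *   0xFE000EE3  BEQ  x0, x0, -4   imm_b = -4
 *   0xFFDFF06F  JAL  x0, -4       imm_j = -4
 *   0x008000EF  JAL  x1, 8        imm_j =  8
 */
```

Negative offsets are the most useful test cases, since they exercise both the bit reordering and the sign extension.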

3.0 - RV32I Instructions [TOC]

The 39 RV32I instructions listed in the table of contents are described below, grouped by instruction type. The memory-ordering instruction FENCE is not covered. Each entry shows the assembly syntax, the opcode and function codes needed to identify it, and the operation it performs.

3.1 - Integer Register-Register [TOC]

These instructions take two source registers and write a result to a destination register. All use opcode 0x33 (R-type). The funct3 field selects the operation; funct7 distinguishes ADD/SUB and SRL/SRA.

ADD rd, rs1, rs2 Add

opcode=0x33 funct3=0x0 funct7=0x00

x[rd] = x[rs1] + x[rs2]
Adds the values in rs1 and rs2 and stores the result in rd. Overflow is ignored — the result wraps around modulo 2³².

SUB rd, rs1, rs2 Subtract

opcode=0x33 funct3=0x0 funct7=0x20

x[rd] = x[rs1] - x[rs2]
Subtracts rs2 from rs1. Same opcode as ADD — distinguished by funct7. Overflow wraps.

AND rd, rs1, rs2 Bitwise AND

opcode=0x33 funct3=0x7 funct7=0x00

x[rd] = x[rs1] & x[rs2]
Performs bitwise AND of rs1 and rs2.

OR rd, rs1, rs2 Bitwise OR

opcode=0x33 funct3=0x6 funct7=0x00

x[rd] = x[rs1] | x[rs2]
Performs bitwise OR of rs1 and rs2.

XOR rd, rs1, rs2 Bitwise XOR

opcode=0x33 funct3=0x4 funct7=0x00

x[rd] = x[rs1] ^ x[rs2]
Performs bitwise exclusive OR of rs1 and rs2.

SLL rd, rs1, rs2 Shift Left Logical

opcode=0x33 funct3=0x1 funct7=0x00

x[rd] = x[rs1] << (x[rs2] & 0x1F)
Shifts rs1 left by the shift amount held in the lower 5 bits of rs2. Zeros are shifted into the lower bits.

SRL rd, rs1, rs2 Shift Right Logical

opcode=0x33 funct3=0x5 funct7=0x00

x[rd] = x[rs1] >> (x[rs2] & 0x1F) (unsigned)
Shifts rs1 right logically by the lower 5 bits of rs2. Zeros are shifted into the upper bits (unsigned shift).

SRA rd, rs1, rs2 Shift Right Arithmetic

opcode=0x33 funct3=0x5 funct7=0x20

x[rd] = x[rs1] >> (x[rs2] & 0x1F) (signed)
Shifts rs1 right arithmetically. The sign bit is replicated into the upper bits. In C: (int32_t)rs1 >> shamt.

SLT rd, rs1, rs2 Set Less Than

opcode=0x33 funct3=0x2 funct7=0x00

x[rd] = ((int32_t)x[rs1] < (int32_t)x[rs2]) ? 1 : 0
Sets rd to 1 if rs1 is less than rs2 (signed comparison), 0 otherwise.

SLTU rd, rs1, rs2 Set Less Than Unsigned

opcode=0x33 funct3=0x3 funct7=0x00

x[rd] = (x[rs1] < x[rs2]) ? 1 : 0 (unsigned)
Same as SLT but treats both operands as unsigned. Note: SLTU rd, x0, rs2 sets rd to 1 if rs2 is nonzero.
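The ten register-register operations can be handled in one function. The following is a sketch, assuming the exec_rtype signature used in Section 5.1 and a minimal CPU struct that this document does not itself define. Combining funct7 and funct3 into a single switch key is a design choice of this example, not a requirement.

```c
#include <stdint.h>

typedef struct { uint32_t regs[32]; uint32_t pc; } CPU;  /* assumed layout */

static uint32_t reg_read(CPU *cpu, uint32_t r)              { return r ? cpu->regs[r] : 0; }
static void     reg_write(CPU *cpu, uint32_t r, uint32_t v) { if (r) cpu->regs[r] = v; }

/* Sketch of an R-type handler (opcode 0x33). Unknown funct3/funct7
 * combinations are silently ignored here; a real emulator may trap. */
void exec_rtype(CPU *cpu, uint32_t rd, uint32_t rs1, uint32_t rs2,
                uint32_t funct3, uint32_t funct7)
{
    uint32_t a = reg_read(cpu, rs1);
    uint32_t b = reg_read(cpu, rs2);
    uint32_t shamt = b & 0x1F;          /* only the low 5 bits of rs2 count */
    uint32_t result;

    switch ((funct7 << 3) | funct3) {
        case 0x000: result = a + b;                               break; /* ADD  */
        case 0x100: result = a - b;                               break; /* SUB  */
        case 0x001: result = a << shamt;                          break; /* SLL  */
        case 0x002: result = ((int32_t)a < (int32_t)b) ? 1 : 0;   break; /* SLT  */
        case 0x003: result = (a < b) ? 1 : 0;                     break; /* SLTU */
        case 0x004: result = a ^ b;                               break; /* XOR  */
        case 0x005: result = a >> shamt;                          break; /* SRL  */
        case 0x105: result = (uint32_t)((int32_t)a >> shamt);     break; /* SRA  */
        case 0x006: result = a | b;                               break; /* OR   */
        case 0x007: result = a & b;                               break; /* AND  */
        default:    return;
    }
    reg_write(cpu, rd, result);
}
```

Keeping the operands in uint32_t makes ADD and SUB wrap modulo 2³² without any extra code, since unsigned overflow is well defined in C.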

3.2 - Integer Register-Immediate [TOC]

These instructions take one source register and a 12-bit signed immediate. All use opcode 0x13 (I-type). The immediate is sign-extended to 32 bits before the operation.

ADDI rd, rs1, imm Add Immediate

opcode=0x13 funct3=0x0

x[rd] = x[rs1] + sext(imm)
One of the most frequently used instructions. Used to set registers (ADDI rd, x0, imm), copy registers (ADDI rd, rs1, 0), and adjust pointer offsets. There is no SUBI: use a negative immediate instead.

ANDI rd, rs1, imm AND Immediate

opcode=0x13 funct3=0x7

x[rd] = x[rs1] & sext(imm)
Bitwise AND with sign-extended immediate. Used for masking bits.

ORI rd, rs1, imm OR Immediate

opcode=0x13 funct3=0x6

x[rd] = x[rs1] | sext(imm)
Bitwise OR with sign-extended immediate. Used for setting bits.

XORI rd, rs1, imm XOR Immediate

opcode=0x13 funct3=0x4

x[rd] = x[rs1] ^ sext(imm)
Bitwise XOR with sign-extended immediate. Note: XORI rd, rs1, -1 (imm=0xFFF) inverts all bits of rs1 (bitwise NOT).

SLTI rd, rs1, imm Set Less Than Immediate

opcode=0x13 funct3=0x2

x[rd] = ((int32_t)x[rs1] < sext(imm)) ? 1 : 0
Sets rd to 1 if rs1 is less than the signed immediate, 0 otherwise.

SLTIU rd, rs1, imm Set Less Than Immediate Unsigned

opcode=0x13 funct3=0x3

x[rd] = (x[rs1] < (uint32_t)sext(imm)) ? 1 : 0
Unsigned version of SLTI. The immediate is sign-extended first, then treated as unsigned. Note: SLTIU rd, rs1, 1 sets rd to 1 if rs1 equals zero.

SLLI rd, rs1, shamt Shift Left Logical Immediate

opcode=0x13 funct3=0x1 funct7=0x00

x[rd] = x[rs1] << shamt
Shifts rs1 left by the 5-bit immediate shift amount (shamt, bits [24:20]). Zeros fill the lower bits. The upper 7 bits of the immediate field (bits [31:25]) must be 0x00.

SRLI rd, rs1, shamt Shift Right Logical Immediate

opcode=0x13 funct3=0x5 funct7=0x00

x[rd] = x[rs1] >> shamt (unsigned)
Shifts rs1 right logically. Zeros fill the upper bits.

SRAI rd, rs1, shamt Shift Right Arithmetic Immediate

opcode=0x13 funct3=0x5 funct7=0x20

x[rd] = (int32_t)x[rs1] >> shamt (signed)
Shifts rs1 right arithmetically. The sign bit replicates into the upper bits. Distinguished from SRLI by funct7=0x20.
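A handler for the register-immediate group can follow the same shape as the R-type one. This is a sketch, assuming the exec_itype signature from Section 5.1 and a minimal CPU struct not defined in this document. Note how the shift instructions reuse the immediate field: the low 5 bits are the shift amount and the upper 7 bits act as funct7.

```c
#include <stdint.h>

typedef struct { uint32_t regs[32]; uint32_t pc; } CPU;  /* assumed layout */

static uint32_t reg_read(CPU *cpu, uint32_t r)              { return r ? cpu->regs[r] : 0; }
static void     reg_write(CPU *cpu, uint32_t r, uint32_t v) { if (r) cpu->regs[r] = v; }

/* Sketch of a handler for opcode 0x13. Invalid encodings are ignored. */
void exec_itype(CPU *cpu, uint32_t rd, uint32_t rs1, uint32_t funct3, uint32_t instr)
{
    uint32_t a      = reg_read(cpu, rs1);
    int32_t  imm    = (int32_t)instr >> 20;   /* sign-extended 12-bit immediate */
    uint32_t shamt  = (instr >> 20) & 0x1F;   /* bits [24:20]                   */
    uint32_t funct7 = (instr >> 25) & 0x7F;   /* bits [31:25], shifts only      */
    uint32_t result;

    switch (funct3) {
        case 0x0: result = a + (uint32_t)imm;                break; /* ADDI  */
        case 0x1: result = a << shamt;                       break; /* SLLI  */
        case 0x2: result = ((int32_t)a < imm) ? 1 : 0;       break; /* SLTI  */
        case 0x3: result = (a < (uint32_t)imm) ? 1 : 0;      break; /* SLTIU */
        case 0x4: result = a ^ (uint32_t)imm;                break; /* XORI  */
        case 0x5: result = (funct7 == 0x20)
                         ? (uint32_t)((int32_t)a >> shamt)          /* SRAI  */
                         : a >> shamt;                       break; /* SRLI  */
        case 0x6: result = a | (uint32_t)imm;                break; /* ORI   */
        case 0x7: result = a & (uint32_t)imm;                break; /* ANDI  */
        default:  return;
    }
    reg_write(cpu, rd, result);
}
```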

3.3 - Upper Immediate [TOC]

These two instructions place a 20-bit immediate in bits [31:12] of a 32-bit value: LUI writes that value to a register directly, and AUIPC adds it to the PC first. They are used together with I-type instructions to construct 32-bit constants and addresses.

LUI rd, imm Load Upper Immediate

opcode=0x37 U-type

x[rd] = imm << 12 (lower 12 bits zeroed)
Places the 20-bit immediate into bits [31:12] of rd and zeros the lower 12 bits. Used to load large constants: follow with ADDI to set the lower 12 bits.

/* Load 0xDEADB000 into x5 */
LUI  x5, 0xDEADB     /* x5 = 0xDEADB000 */
ADDI x5, x5, 0x000   /* optionally set low 12 bits */
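One detail matters when the low 12 bits are nonzero: ADDI sign-extends its immediate, so a constant whose bit 11 is set needs a LUI immediate that is one higher than the constant's upper 20 bits. The split can be sketched as follows; the function name split_hi_lo is an assumption made for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: split a 32-bit constant into the 20-bit LUI
 * immediate and the 12-bit signed ADDI immediate that rebuild it.  */
void split_hi_lo(uint32_t value, uint32_t *hi20, int32_t *lo12)
{
    /* Adding 0x800 rounds the upper part up when bit 11 is set,
     * which cancels the borrow caused by ADDI's sign extension. */
    *hi20 = ((value + 0x800u) >> 12) & 0xFFFFFu;

    *lo12 = (int32_t)(value & 0xFFFu);
    if (*lo12 >= 0x800)
        *lo12 -= 0x1000;            /* map 0x800..0xFFF to -2048..-1 */
}

/* For 0xDEADBEEF this gives hi20 = 0xDEADC and lo12 = -273, i.e.
 *   LUI  x5, 0xDEADC      (x5 = 0xDEADC000)
 *   ADDI x5, x5, -273     (x5 = 0xDEADBEEF)                       */
```

Assemblers perform the same split for the %hi and %lo relocation operators, so an emulator only ever sees the adjusted values.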

AUIPC rd, imm Add Upper Immediate to PC

opcode=0x17 U-type

x[rd] = PC + (imm << 12)
Adds the 20-bit immediate (shifted left 12) to the current PC and stores the result in rd. Used for position-independent code to compute PC-relative addresses.

3.4 - Loads [TOC]

Load instructions read from memory at address x[rs1] + sext(imm) and write the result to rd. All use opcode 0x03 (I-type). The funct3 field selects the width and sign treatment.

LW rd, imm(rs1) Load Word

opcode=0x03 funct3=0x2

x[rd] = M[x[rs1] + sext(imm)][31:0]
Loads a 32-bit word from memory. The address should be 4-byte aligned; a basic emulator can ignore alignment.

LH rd, imm(rs1) Load Halfword

opcode=0x03 funct3=0x1

x[rd] = sext(M[x[rs1] + sext(imm)][15:0])
Loads a 16-bit halfword and sign-extends it to 32 bits. In C: (int32_t)(int16_t)mem_read16(...).

LB rd, imm(rs1) Load Byte

opcode=0x03 funct3=0x0

x[rd] = sext(M[x[rs1] + sext(imm)][7:0])
Loads a byte and sign-extends it to 32 bits. In C: (int32_t)(int8_t)mem_read8(...).

LHU rd, imm(rs1) Load Halfword Unsigned

opcode=0x03 funct3=0x5

x[rd] = M[x[rs1] + sext(imm)][15:0] (zero-extended)
Loads a 16-bit halfword and zero-extends it to 32 bits (no sign extension).

LBU rd, imm(rs1) Load Byte Unsigned

opcode=0x03 funct3=0x4

x[rd] = M[x[rs1] + sext(imm)][7:0] (zero-extended)
Loads a byte and zero-extends it to 32 bits.
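The five loads differ only in width and in how the value is extended, which the sketch below makes explicit. It assumes the exec_load signature from Section 5.1, the global memory array from Section 2.1, and a minimal CPU struct not defined in this document. For simplicity it rejects any address within the last 3 bytes of memory, even for byte loads.

```c
#include <stdint.h>

#define MEM_SIZE (1024 * 1024)
static uint8_t memory[MEM_SIZE];

typedef struct { uint32_t regs[32]; uint32_t pc; } CPU;  /* assumed layout */

static uint32_t reg_read(CPU *cpu, uint32_t r)              { return r ? cpu->regs[r] : 0; }
static void     reg_write(CPU *cpu, uint32_t r, uint32_t v) { if (r) cpu->regs[r] = v; }

/* Sketch of a load handler (opcode 0x03). */
void exec_load(CPU *cpu, uint32_t rd, uint32_t rs1, uint32_t funct3, uint32_t instr)
{
    uint32_t addr = reg_read(cpu, rs1) + (uint32_t)((int32_t)instr >> 20);
    uint32_t value;

    if (addr > MEM_SIZE - 4) return;   /* out of range: ignored in this sketch */

    uint32_t b0   = memory[addr];
    uint32_t half = b0 | ((uint32_t)memory[addr + 1] << 8);

    switch (funct3) {
        case 0x0: value = (uint32_t)(int32_t)(int8_t)b0;        break; /* LB  */
        case 0x1: value = (uint32_t)(int32_t)(int16_t)half;     break; /* LH  */
        case 0x2: value = half
                        | ((uint32_t)memory[addr + 2] << 16)
                        | ((uint32_t)memory[addr + 3] << 24);   break; /* LW  */
        case 0x4: value = b0;                                   break; /* LBU */
        case 0x5: value = half;                                 break; /* LHU */
        default:  return;
    }
    reg_write(cpu, rd, value);
}
```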

3.5 - Stores [TOC]

Store instructions write a register value to memory at address x[rs1] + sext(imm). All use opcode 0x23 (S-type). There is no destination register. The S-type immediate is split across two fields — see Section 2.5 for how to reassemble it.

SW rs2, imm(rs1) Store Word

opcode=0x23 funct3=0x2

M[x[rs1] + sext(imm)] = x[rs2][31:0]
Stores the 32-bit value of rs2 to memory.

SH rs2, imm(rs1) Store Halfword

opcode=0x23 funct3=0x1

M[x[rs1] + sext(imm)] = x[rs2][15:0]
Stores the lower 16 bits of rs2 to memory.

SB rs2, imm(rs1) Store Byte

opcode=0x23 funct3=0x0

M[x[rs1] + sext(imm)] = x[rs2][7:0]
Stores the lowest byte of rs2 to memory.
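A store handler can treat the three widths uniformly by writing one byte at a time. This is a sketch, assuming the exec_store signature from Section 5.1, the global memory array from Section 2.1, and a minimal CPU struct not defined in this document. Out-of-range addresses are silently ignored here.

```c
#include <stdint.h>

#define MEM_SIZE (1024 * 1024)
static uint8_t memory[MEM_SIZE];

typedef struct { uint32_t regs[32]; uint32_t pc; } CPU;  /* assumed layout */

static uint32_t reg_read(CPU *cpu, uint32_t r) { return r ? cpu->regs[r] : 0; }

/* Sketch of a store handler (opcode 0x23). */
void exec_store(CPU *cpu, uint32_t rs1, uint32_t rs2, uint32_t funct3, uint32_t instr)
{
    /* S-type immediate: imm[11:5] from bits [31:25], imm[4:0] from bits [11:7] */
    int32_t  imm  = (int32_t)((((int32_t)instr >> 20) & ~0x1F) | ((instr >> 7) & 0x1F));
    uint32_t addr = reg_read(cpu, rs1) + (uint32_t)imm;
    uint32_t v    = reg_read(cpu, rs2);

    /* funct3 0/1/2 selects SB/SH/SW, i.e. 1, 2 or 4 bytes */
    uint32_t size = (funct3 == 0x0) ? 1 : (funct3 == 0x1) ? 2 : (funct3 == 0x2) ? 4 : 0;

    if (size == 0 || addr > MEM_SIZE - size) return;  /* invalid width or out of range */

    for (uint32_t i = 0; i < size; i++)
        memory[addr + i] = (uint8_t)(v >> (8 * i));   /* least-significant byte first */
}
```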

3.6 - Branches [TOC]

Branch instructions compare two registers and conditionally add a signed offset to the PC. All use opcode 0x63 (B-type). The branch target is PC + sext(imm), where imm is a 13-bit signed offset. The offset is always even (bit 0 is implied zero). If the branch is not taken, execution continues at PC + 4 as normal.

Important for emulators: branch instructions set the PC themselves. In your cpu_step() function, the branch handler must return immediately after setting the PC — do not add 4 after returning from the branch handler.

BEQ rs1, rs2, imm Branch if Equal

opcode=0x63 funct3=0x0

if (x[rs1] == x[rs2]) PC += sext(imm) else PC += 4
Branches if rs1 equals rs2.

BNE rs1, rs2, imm Branch if Not Equal

opcode=0x63 funct3=0x1

if (x[rs1] != x[rs2]) PC += sext(imm) else PC += 4
Branches if rs1 does not equal rs2. Used to implement loops.

BLT rs1, rs2, imm Branch if Less Than

opcode=0x63 funct3=0x4

if ((int32_t)x[rs1] < (int32_t)x[rs2]) PC += sext(imm) else PC += 4
Signed comparison. Branches if rs1 < rs2.

BGE rs1, rs2, imm Branch if Greater or Equal

opcode=0x63 funct3=0x5

if ((int32_t)x[rs1] >= (int32_t)x[rs2]) PC += sext(imm) else PC += 4
Signed comparison. Branches if rs1 >= rs2.

BLTU rs1, rs2, imm Branch if Less Than Unsigned

opcode=0x63 funct3=0x6

if (x[rs1] < x[rs2]) PC += sext(imm) else PC += 4 (unsigned)
Unsigned version of BLT.

BGEU rs1, rs2, imm Branch if Greater or Equal Unsigned

opcode=0x63 funct3=0x7

if (x[rs1] >= x[rs2]) PC += sext(imm) else PC += 4 (unsigned)
Unsigned version of BGE.
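All six branches share the same structure: evaluate a condition, then add either the offset or 4 to the PC. The sketch below assumes the exec_branch signature from Section 5.1 and a minimal CPU struct not defined in this document. Treating the two unused funct3 values as "not taken" is a simplification of this example.

```c
#include <stdint.h>

typedef struct { uint32_t regs[32]; uint32_t pc; } CPU;  /* assumed layout */

static uint32_t reg_read(CPU *cpu, uint32_t r) { return r ? cpu->regs[r] : 0; }

/* Sketch of a branch handler (opcode 0x63). It always updates the PC
 * itself, so the caller must not add 4 afterwards.                    */
void exec_branch(CPU *cpu, uint32_t rs1, uint32_t rs2, uint32_t funct3, uint32_t instr)
{
    uint32_t a = reg_read(cpu, rs1);
    uint32_t b = reg_read(cpu, rs2);
    int32_t imm = (int32_t)(((int32_t)(instr & 0x80000000) >> 19)
                          | ((instr & 0x00000080) << 4)
                          | ((instr >> 20) & 0x7E0)
                          | ((instr >> 7)  & 0x1E));
    int taken;

    switch (funct3) {
        case 0x0: taken = (a == b);                      break; /* BEQ  */
        case 0x1: taken = (a != b);                      break; /* BNE  */
        case 0x4: taken = ((int32_t)a <  (int32_t)b);    break; /* BLT  */
        case 0x5: taken = ((int32_t)a >= (int32_t)b);    break; /* BGE  */
        case 0x6: taken = (a <  b);                      break; /* BLTU */
        case 0x7: taken = (a >= b);                      break; /* BGEU */
        default:  taken = 0;                             break; /* funct3 2, 3 unused */
    }
    cpu->pc += taken ? (uint32_t)imm : 4;
}
```

The value 0xFFFFFFFF shows why the signed and unsigned variants both exist: it is less than 1 for BLT but greater than 1 for BLTU.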

3.7 - Jumps [TOC]

Jump instructions unconditionally transfer control to a target address. They also write the return address (PC + 4) into a register, enabling function calls. Like branches, jump handlers must set the PC themselves and return without adding 4.

JAL rd, imm Jump and Link

opcode=0x6F J-type

x[rd] = PC + 4; PC += sext(imm)
Saves the address of the next instruction into rd (the return address), then jumps to PC + offset. The offset is a 21-bit signed value. Used for function calls: JAL ra, function_name. If rd is x0, this is a plain unconditional jump (no link).

JALR rd, rs1, imm Jump and Link Register

opcode=0x67 funct3=0x0 I-type

t = PC + 4; PC = (x[rs1] + sext(imm)) & ~1; x[rd] = t
Jumps to an address stored in a register (plus an optional offset). The lowest bit of the target is forced to 0. The target must be computed from the old value of rs1 before rd is written, because rd and rs1 may be the same register. Used to return from functions: JALR x0, ra, 0 (jump to return address, discard link).

Note: Bit 0 of the computed target is cleared (& ~1), as the ISA requires. Always implement this masking.
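A JALR handler might look like the following sketch, which assumes the exec_jalr signature from Section 5.1 and a minimal CPU struct not defined in this document. It reads rs1 before writing rd, so an instruction such as JALR ra, ra, 0 (where both are the same register) still jumps to the old value of ra.

```c
#include <stdint.h>

typedef struct { uint32_t regs[32]; uint32_t pc; } CPU;  /* assumed layout */

static uint32_t reg_read(CPU *cpu, uint32_t r)              { return r ? cpu->regs[r] : 0; }
static void     reg_write(CPU *cpu, uint32_t r, uint32_t v) { if (r) cpu->regs[r] = v; }

/* Sketch of a JALR handler (opcode 0x67). Sets the PC itself. */
void exec_jalr(CPU *cpu, uint32_t rd, uint32_t rs1, uint32_t instr)
{
    int32_t  imm    = (int32_t)instr >> 20;   /* I-type immediate */

    /* Compute the target first: rd may be the same register as rs1. */
    uint32_t target = (reg_read(cpu, rs1) + (uint32_t)imm) & ~(uint32_t)1;

    reg_write(cpu, rd, cpu->pc + 4);          /* link: address of next instruction */
    cpu->pc = target;
}
```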

3.8 - System [TOC]

ECALL Environment Call

opcode=0x73 funct3=0x0 imm=0x000

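Makes a request to the execution environment (operating system or emulator). The ISA itself does not say where the request number and arguments go; that is fixed by the environment. The convention used in this document puts the syscall number in register a0 (x10) and the arguments in a1–a7 (x11–x17).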

Common syscall numbers used in RISC-V programs:

a0 value	Syscall	Arguments
1	print integer	a1 = integer to print
4	print string	a1 = address of null-terminated string
10	exit	—
93	exit (Linux number)	a1 = exit code (real Linux passes the number in a7 and the exit code in a0)

EBREAK Environment Break

opcode=0x73 funct3=0x0 imm=0x001

Transfers control to the debugger. In a basic emulator, treat this the same as ECALL with a0=10 (halt the emulator) or simply print a debug message and stop.
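The syscall table above could be implemented along these lines. This is a sketch only: the halted and exit_code fields are hypothetical additions to the CPU struct (this document does not define how the emulator stops), and the handler covers ECALL but not EBREAK, which the caller would have to tell apart by the immediate field.

```c
#include <stdint.h>
#include <stdio.h>

#define MEM_SIZE (1024 * 1024)
static uint8_t memory[MEM_SIZE];

typedef struct {
    uint32_t regs[32];
    uint32_t pc;
    int      halted;     /* hypothetical: set when the program exits */
    uint32_t exit_code;  /* hypothetical: value passed to exit       */
} CPU;

/* Sketch of an ECALL handler using the a0 = number, a1 = argument convention. */
void exec_ecall(CPU *cpu)
{
    uint32_t num = cpu->regs[10];  /* a0 */
    uint32_t arg = cpu->regs[11];  /* a1 */

    switch (num) {
        case 1:   /* print integer */
            printf("%d", (int)(int32_t)arg);
            break;
        case 4:   /* print null-terminated string; stop at end of memory */
            for (uint32_t a = arg; a < MEM_SIZE && memory[a] != 0; a++)
                putchar(memory[a]);
            break;
        case 10:  /* exit */
            cpu->halted = 1;
            cpu->exit_code = 0;
            break;
        case 93:  /* exit with code */
            cpu->halted = 1;
            cpu->exit_code = arg;
            break;
        default:
            fprintf(stderr, "unhandled ecall %u\n", (unsigned)num);
            break;
    }
}
```

The main loop would then run cpu_step() until halted becomes nonzero.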

4.0 - Opcode Map [TOC]

Quick reference: opcode → instruction group. Use this as your switch statement guide.

Opcode (hex)	Format	Instructions	Identified by
`0x33`	R	ADD SUB AND OR XOR SLL SRL SRA SLT SLTU	funct3, funct7
`0x13`	I	ADDI ANDI ORI XORI SLTI SLTIU SLLI SRLI SRAI	funct3, (funct7 for shifts)
`0x03`	I	LB LH LW LBU LHU	funct3
`0x23`	S	SB SH SW	funct3
`0x63`	B	BEQ BNE BLT BGE BLTU BGEU	funct3
`0x6F`	J	JAL	— (only one)
`0x67`	I	JALR	funct3=0
`0x37`	U	LUI	— (only one)
`0x17`	U	AUIPC	— (only one)
`0x73`	I	ECALL EBREAK	imm (0=ECALL, 1=EBREAK)

5.0 - Emulator Implementation Notes [TOC]

This section contains practical notes for implementing the emulator in C.

5.1 - Fetch-Decode-Execute Loop [TOC]

The CPU executes one instruction per call to cpu_step(). The main loop calls this function repeatedly until a halt condition is reached.

void cpu_step(CPU *cpu) {

    /* 1. FETCH — read 4 bytes from memory at current PC */
    uint32_t instr = mem_read32(cpu, cpu->pc);

    /* 2. DECODE — extract common fields */
    uint32_t opcode = instr        & 0x7F;
    uint32_t rd     = (instr >> 7) & 0x1F;
    uint32_t funct3 = (instr >> 12) & 0x7;
    uint32_t rs1    = (instr >> 15) & 0x1F;
    uint32_t rs2    = (instr >> 20) & 0x1F;
    uint32_t funct7 = (instr >> 25) & 0x7F;

    /* 3. EXECUTE — dispatch to handler */
    switch (opcode) {
        case 0x33: exec_rtype (cpu, rd, rs1, rs2, funct3, funct7); break;
        case 0x13: exec_itype (cpu, rd, rs1, funct3, instr);       break;
        case 0x03: exec_load  (cpu, rd, rs1, funct3, instr);       break;
        case 0x23: exec_store (cpu, rs1, rs2, funct3, instr);      break;
        case 0x63: exec_branch(cpu, rs1, rs2, funct3, instr);      return;
        case 0x6F: exec_jal   (cpu, rd, instr);                    return;
        case 0x67: exec_jalr  (cpu, rd, rs1, instr);               return;
        case 0x37: exec_lui   (cpu, rd, instr);                    break;
        case 0x17: exec_auipc (cpu, rd, instr);                    break;
        case 0x73: exec_ecall (cpu);                               break;  /* PC must advance past ECALL */
        default:
            printf("Unknown opcode 0x%02X at PC=0x%08X\n", opcode, cpu->pc);
            return;
    }

    /* 4. ADVANCE PC (branches/jumps return early and skip this) */
    cpu->pc += 4;
}

5.2 - Field Extraction in C [TOC]

All instruction fields are extracted with two operations: right-shift to bring the field down to bit 0, then AND with a mask to clear all bits above the field.

/* Pattern: (instr >> START_BIT) & MASK */
/* Mask for N bits = (1 << N) - 1       */

/* 7-bit mask  = 0x7F  (bits 6:0)  — used for opcode, funct7  */
/* 5-bit mask  = 0x1F  (bits 4:0)  — used for rd, rs1, rs2    */
/* 3-bit mask  = 0x7   (bits 2:0)  — used for funct3           */

uint32_t opcode = instr        & 0x7F;   /* bits [6:0]   */
uint32_t rd     = (instr >> 7) & 0x1F;  /* bits [11:7]  */
uint32_t funct3 = (instr >> 12) & 0x7;  /* bits [14:12] */
uint32_t rs1    = (instr >> 15) & 0x1F; /* bits [19:15] */
uint32_t rs2    = (instr >> 20) & 0x1F; /* bits [24:20] */
uint32_t funct7 = (instr >> 25) & 0x7F; /* bits [31:25] */
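The shift-and-mask pattern can also be wrapped in a single helper that takes the bit range as written in the format table. The name bits and its argument order are assumptions made for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: extract bits [hi:lo] of word (0 <= lo <= hi <= 31). */
static inline uint32_t bits(uint32_t word, unsigned hi, unsigned lo)
{
    unsigned width = hi - lo + 1;

    /* A 32-bit shift of a 32-bit value is undefined in C, so the
     * full-width case gets its own mask.                          */
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);

    return (word >> lo) & mask;
}

/* With instr = 0x00C58533 (ADD x10, x11, x12):
 *   bits(instr,  6,  0) is 0x33   opcode
 *   bits(instr, 11,  7) is 10     rd
 *   bits(instr, 19, 15) is 11     rs1
 *   bits(instr, 24, 20) is 12     rs2                              */
```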

5.3 - Sign Extension [TOC]

RISC-V immediates are signed. A 12-bit immediate can represent values from -2048 to +2047. When you use it in a 32-bit addition, you must first sign-extend it — that is, fill the upper 20 bits with copies of the immediate's sign bit (bit 11).

The simplest way to sign-extend in C is to cast the instruction to int32_t before shifting right. Arithmetic right shift on a signed type replicates the sign bit:

/* Sign-extend a 12-bit I-type immediate */
int32_t imm = (int32_t)instr >> 20;
/* The cast to int32_t makes >> an arithmetic shift.     */
/* Bit 31 of instr is the sign bit of the 12-bit         */
/* immediate, so the shift copies it into all of the     */
/* upper bits automatically.                             */

/* Example: instr = 0xFFF00013  (ADDI x0, x0, -1) */
/* (int32_t)0xFFF00013 >> 20  =  0xFFFFFFFF  =  -1 */

Note: Right-shifting a negative signed value (signed_val >> n) is implementation-defined in C, but GCC, Clang, and MSVC all implement it as an arithmetic shift, so relying on it is reasonable for emulator development with those compilers.
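If relying on implementation-defined behaviour is undesirable, sign extension can also be written with plain arithmetic. The sketch below uses the common XOR-and-subtract technique; the two-argument form of sext is an assumption of this example (the notation in Section 0.1 takes a single argument).

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of x (1 <= bits <= 32) without
 * right-shifting a negative value.                               */
int32_t sext(uint32_t x, unsigned bits)
{
    uint32_t sign = UINT32_C(1) << (bits - 1);   /* mask of the sign bit   */

    if (bits < 32)
        x &= (sign << 1) - 1u;                   /* drop anything above it */

    /* Flipping the sign bit and subtracting it maps values with the
     * sign bit set to negatives. The 64-bit arithmetic keeps every
     * intermediate result in range.                                  */
    return (int32_t)((int64_t)(x ^ sign) - (int64_t)sign);
}
```

For an I-type immediate this would be called as sext(instr >> 20, 12), which gives the same result as the cast-and-shift form on the compilers named above.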