Register File

The nu+ register file is composed of a scalar register file and a vector register file, each containing 64 registers.

The scalar register file has 64 registers. The first 58 are general purpose registers, while the remaining 6 are special purpose registers. Each scalar register stores up to 32 bits of data. The nu+ architecture also supports 64-bit data, which is stored in a pair of contiguous registers.
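The pairing of contiguous registers for 64-bit data can be sketched as follows. This is only an illustrative model: the names `split_u64`, `write64` and `read64` are hypothetical, and the choice of placing the low half in the lower-numbered register is an assumption that the text above does not confirm.

```python
MASK32 = 0xFFFFFFFF

def split_u64(value):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    value &= 0xFFFFFFFFFFFFFFFF
    return value & MASK32, (value >> 32) & MASK32

def join_u64(low, high):
    """Rebuild a 64-bit value from its two 32-bit halves."""
    return ((high & MASK32) << 32) | (low & MASK32)

def write64(regs, index, value):
    """Store a 64-bit value in registers index and index + 1."""
    low, high = split_u64(value)
    regs[index] = low       # assumption: low half in the lower-numbered register
    regs[index + 1] = high  # assumption: high half in the next register

def read64(regs, index):
    """Read a 64-bit value back from a register pair."""
    return join_u64(regs[index], regs[index + 1])

# Hypothetical model of the scalar register file: 64 entries of 32 bits.
regs = [0] * 64
write64(regs, 4, 0x1122334455667788)
```

After the call, `regs[4]` and `regs[5]` each hold one 32-bit half, and `read64(regs, 4)` returns the original value.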

The vector register file has 64 general purpose registers. Each vector register stores up to 512 bits of data, organized as either 16 x 32-bit or 8 x 64-bit elements.
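The two views of a 512-bit vector register can be illustrated with a short sketch. The helper names `lanes32` and `lanes64` are hypothetical, and the little-endian lane layout is an assumption made only for the example.

```python
import struct

VECTOR_BYTES = 64  # 512 bits

def lanes32(raw):
    """Interpret 64 raw bytes as 16 unsigned 32-bit lanes (little-endian assumed)."""
    return list(struct.unpack("<16I", raw))

def lanes64(raw):
    """Interpret the same 64 raw bytes as 8 unsigned 64-bit lanes."""
    return list(struct.unpack("<8Q", raw))

# Fill a register image with the lane indices 0..15 as 32-bit values.
raw = struct.pack("<16I", *range(16))
```

The same 64 bytes yield 16 elements through `lanes32` and 8 elements through `lanes64`; under the assumed layout each 64-bit lane covers two adjacent 32-bit lanes.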

Data Types

The following table summarizes the data types available in nu+. The Type column gives the C/C++ type name, the LLVM Type column gives the name used in LLVM, and the Register column shows the kind of register that holds a value of that type.

Type	LLVM Type	Register	Notes
bool	i1	scalar (32 bits)	It is expanded to 32 bits
char	i8	scalar (32 bits)	It is expanded to 32 bits
short	i16	scalar (32 bits)	It is expanded to 32 bits
int	i32	scalar (32 bits)
float	f32	scalar (32 bits)
long long int	i64	scalar (64 bits)
double	f64	scalar (64 bits)
vec16i8, vec16u8	v16i8	vector (16 x 32 bits)	It is expanded to 32 bits vector
vec16i16, vec16u16	v16i16	vector (16 x 32 bits)	It is expanded to 32 bits vector
vec16i32, vec16u32	v16i32	vector (16 x 32 bits)
vec16f32	v16f32	vector (16 x 32 bits)
vec8i8, vec8u8	v8i8	vector (8 x 64 bits)	It is expanded to 64 bits vector
vec8i16, vec8u16	v8i16	vector (8 x 64 bits)	It is expanded to 64 bits vector
vec8i32, vec8u32	v8i32	vector (8 x 64 bits)	It is expanded to 64 bits vector
vec8f32	v8f32	vector (16 x 32 bits)	It is considered as a 16 elements vector
vec8i64, vec8u64	v8i64	vector (8 x 64 bits)
vec8f64	v8f64	vector (8 x 64 bits)
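The "expanded to 32 bits" notes in the table can be sketched as below. The table does not say whether the expansion is a sign or a zero extension, so both are shown; the function names are hypothetical and the choice between the two is assumed to follow the signedness of the source type.

```python
def sign_extend(value, bits, width=32):
    """Sign-extend the low `bits` bits of value to `width` bits."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits              # negative in two's complement
    return value & ((1 << width) - 1)

def zero_extend(value, bits):
    """Zero-extend the low `bits` bits of value."""
    return value & ((1 << bits) - 1)

def expand_vector(elements, bits, signed):
    """Expand every narrow element of a vector to a 32-bit lane."""
    if signed:
        return [sign_extend(e, bits) for e in elements]
    return [zero_extend(e, bits) for e in elements]
```

For instance, a signed 8-bit element with the bit pattern `0x80` would occupy its 32-bit lane as `0xFFFFFF80`, while the unsigned interpretation stays `0x80`.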

Instructions Format

The nu+ instructions have a fixed length of 32 bits. They are grouped into seven types:

The R type includes the logical and arithmetic operations and memory operations.

The I type includes the logical and arithmetic operations between a register operand and an immediate operand.

The MOVEI type includes the operations that load an immediate operand into a register.

The C type is used for control operations and for synchronization instructions.

The JR type includes jump instructions.

The M type includes the instructions used to access memory.

The M-poly type is used for memory instructions that use a polyhedral access pattern.

R type instructions

RR (Register to Register) has a destination register and two source registers.
RI (Register Immediate) has a destination register, one source register and an immediate encoded in the instruction word.

Mnemonic	Opcode	Meaning	Operation
or	1	or	Rd = Ra | Rb
and	2	and	Rd = Ra & Rb
xor	3	xor	Rd = Ra ^ Rb
add	4	addition	Rd = Ra + Rb
sub	5	subtraction	Rd = Ra - Rb
mull	6	multiplication	Rd = Ra * Rb
mulh	7	high multiply	Rd = high half of (Ra * Rb)
mulhu	8	high multiply unsigned	Rd = high half of (Ra * Rb), unsigned
ashr	9	arithmetic shift right	Rd = Ra >> Rb (sign bit replicated)
shr	10	shift right	Rd = Ra >> Rb (zero filled)
shl	11	shift left	Rd = Ra << Rb
clz	12	count leading zeros
ctz	13	count trailing zeros
shuffle	24	vector shuffle	Rd[i] = Ra[Rb[i]]
getlane	25	Get lane from vector	Rd = Ra[Rb]
move	32	move register	Rd = Ra
add_f	33	floating point add	Rd = Ra + Rb
sub_f	34	floating point sub	Rd = Ra - Rb
mul_f	35	floating point multiplication	Rd = Ra * Rb
div_f	36	floating point division	Rd = Ra / Rb
sext8	43	sign extend 8 bits
sext16	44	sign extend 16 bits
sext32	45	sign extend 32 bits
f32tof64	46	cast float to double
f64tof32	47	cast double to float
i32tof32	48	cast integer to float
f32toi32	49	cast float to integer
cmpeq	14	compare equal	Rd = Ra == Rb
cmpne	15	compare not equal	Rd = Ra != Rb
cmpgt	16	compare greater than	Rd = Ra > Rb
cmpge	17	compare greater or equal	Rd = Ra >= Rb
cmplt	18	compare less than	Rd = Ra < Rb
cmple	19	compare less or equal	Rd = Ra <= Rb
cmpgt_u	20	unsigned compare greater than	Rd = Ra > Rb
cmpge_u	21	unsigned compare greater or equal	Rd = Ra >= Rb
cmplt_u	22	unsigned compare less than	Rd = Ra < Rb
cmple_u	23	unsigned compare less or equal	Rd = Ra <= Rb
cmpeq_fp	37	floating point compare equal	Rd = Ra == Rb
cmpne_fp	38	floating point compare not equal	Rd = Ra != Rb
cmpgt_fp	39	floating point compare greater than	Rd = Ra > Rb
cmpge_fp	40	floating point compare greater or equal	Rd = Ra >= Rb
cmplt_fp	41	floating point compare less than	Rd = Ra < Rb
cmple_fp	42	floating point compare less or equal	Rd = Ra <= Rb
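Several rows of the table share the same C-like expression although they differ in behaviour (for example `mull`, `mulh` and `mulhu`, or `ashr` and `shr`). The following sketch spells out one plausible reading for 32-bit scalar operands. It is not taken from the nu+ implementation: the shift amount is assumed to be in the range 0..31, and the results of `clz` and `ctz` for a zero input (32 here) are assumptions.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mull(a, b):
    """Low 32 bits of the product."""
    return (to_signed(a) * to_signed(b)) & MASK32

def mulh(a, b):
    """High 32 bits of the signed 64-bit product."""
    return ((to_signed(a) * to_signed(b)) >> 32) & MASK32

def mulhu(a, b):
    """High 32 bits of the unsigned 64-bit product."""
    return (((a & MASK32) * (b & MASK32)) >> 32) & MASK32

def ashr(a, n):
    """Arithmetic shift right: the sign bit is replicated."""
    return (to_signed(a) >> n) & MASK32

def shr(a, n):
    """Logical shift right: zeros are shifted in."""
    return (a & MASK32) >> n

def clz(a):
    """Count leading zeros (32 for a zero input, by assumption)."""
    return 32 - (a & MASK32).bit_length()

def ctz(a):
    """Count trailing zeros (32 for a zero input, by assumption)."""
    a &= MASK32
    return 32 if a == 0 else (a & -a).bit_length() - 1

def shuffle(ra, rb):
    """Vector shuffle: Rd[i] = Ra[Rb[i]]."""
    return [ra[i] for i in rb]

def getlane(ra, index):
    """Extract one lane of a vector: Rd = Ra[Rb]."""
    return ra[index]
```

With these definitions, `ashr(0x80000000, 4)` keeps the sign bit set while `shr(0x80000000, 4)` does not, and `shuffle` with the index vector `[3, 2, 1, 0]` reverses a four-element vector.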

I type instructions

Mnemonic	Opcode	Meaning	Operation
ori	1	or	Rd = Ra | Imm
andi	2	and	Rd = Ra & Imm
xori	3	xor	Rd = Ra ^ Imm
addi	4	addition	Rd = Ra + Imm
subi	5	subtraction	Rd = Ra - Imm
mulli	6	multiplication	Rd = Ra * Imm
mulhi	7	high multiply	Rd = high half of (Ra * Imm)
mulhui	8	high multiply unsigned	Rd = high half of (Ra * Imm), unsigned
ashri	9	arithmetic shift right	Rd = Ra >> Imm (sign bit replicated)
shri	10	shift right	Rd = Ra >> Imm (zero filled)
shli	11	shift left	Rd = Ra << Imm
getlane	25	Get lane from vector	Rd = Ra[Imm]

MOVEI type instructions

MVI (Move Immediate) has a destination register and a 16-bit immediate encoded in the instruction word.

Mnemonic	Opcode	Meaning	Operation
moveil	0	move the 16 least significant bits	Rd = Ra & 0xFFFF
moveih	1	move the 16 most significant bits	Rd = (Ra >> 16) & 0xFFFF
movei	2	move the 16 least significant bits with zero extension	Rd = (Rd ^ Rd) | (Ra & 0xFFFF)
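Because the immediate is only 16 bits wide, a full 32-bit constant has to be built in two steps. The sketch below shows one way this could work. The semantics are an interpretation of the table rather than a confirmed description: `moveil` is assumed to replace the lower half of Rd, `moveih` the upper half, and `movei` to write the zero-extended immediate. The helper `load_constant` is hypothetical.

```python
def moveil(rd, imm):
    """Assumed: replace the 16 least significant bits of Rd, keep the rest."""
    return (rd & 0xFFFF0000) | (imm & 0xFFFF)

def moveih(rd, imm):
    """Assumed: replace the 16 most significant bits of Rd, keep the rest."""
    return (rd & 0x0000FFFF) | ((imm & 0xFFFF) << 16)

def movei(imm):
    """Assumed: Rd receives the 16-bit immediate with zero extension."""
    return imm & 0xFFFF

def load_constant(value):
    """Build a 32-bit constant with a movei followed by a moveih."""
    rd = movei(value & 0xFFFF)
    return moveih(rd, value >> 16)
```

Under these assumptions, `load_constant(0xDEADBEEF)` first writes `0x0000BEEF` and then fills in the upper half.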

C type instructions

Mnemonic	Opcode	Meaning
barrier_core	0	barrier across all the nu+ cores
barrier_thread	1	barrier across all the threads of a core
flush	2	flush a cache line to the system memory

JR type instructions

J type instructions

M type instructions

MEM (Memory Instruction) has a destination/source register field: for a load, the first register is the destination register; for a store, it contains the value to store. In both cases the base address register and the immediate follow. The effective memory address is the sum of the base address and the immediate.

Mnemonic	Opcode	Meaning	Operation
loadXD_s8	0	load 1 byte with sign extension	Rd = [Rbase + Offset]
loadXD_s16	1	load 2 bytes with sign extension	Rd = [Rbase + Offset]
load32D	2	load 1 word	Rd = [Rbase + Offset]
loadXD_u8	4	load 1 byte with zero extension	Rd = [Rbase + Offset]
loadXD_u16	5	load 2 bytes with zero extension	Rd = [Rbase + Offset]
load64D_s32	2	load 1 word sign-extended to 1 double-word	Rd = [Rbase + Offset]
load64D_u32	6	load 1 word zero-extended to 1 double-word	Rd = [Rbase + Offset]
load64D	3	load 1 double-word	Rd = [Rbase + Offset]
loadD_vYi8	7	load a vector of Y bytes with sign extension	Rd = [Rbase + Offset]
loadD_vYi16	8	load a vector of Y 2 bytes with sign extension	Rd = [Rbase + Offset]
loadD_vYi32	9	load a vector of Y words with sign extension	Rd = [Rbase + Offset]
loadD_v8i64	10	load a vector of 8 double-words	Rd = [Rbase + Offset]
loadD_vYu8	11	load a vector of Y bytes with zero extension	Rd = [Rbase + Offset]
loadD_vYu16	12	load a vector of Y 2 bytes with zero extension	Rd = [Rbase + Offset]
loadD_vYu32	13	load a vector of Y words with zero extension	Rd = [Rbase + Offset]
loadD_g_32	16	load 16 words from different memory addresses	Rd[i] = [Rbase[i]]
storeXD_8	32	store 1 byte	[Rbase + Offset] = Rs
storeXD_16	33	store 2 bytes	[Rbase + Offset] = Rs
store32D	34	store 1 word	[Rbase + Offset] = Rs
store64D_32	34	store 1 word	[Rbase + Offset] = Rs
store64D	35	store 1 double-word	[Rbase + Offset] = Rs
storeD_vYi8	32	store Y bytes	[Rbase + Offset] = Rs
storeD_vYi16	33	store Y 2 bytes	[Rbase + Offset] = Rs
storeD_vYi32	34	store Y words	[Rbase + Offset] = Rs
storeD_v8i64	35	store 8 double-words	[Rbase + Offset] = Rs
storeD_s_32	42	store 16 words to different memory addresses	[Rbase[i]] = Rs[i]
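The addressing rule and the difference between sign and zero extension on loads can be modelled with a small sketch. Everything here is illustrative: memory is a plain `bytearray`, the byte order is assumed to be little-endian, addresses are assumed to be 32 bits wide, and the function names are hypothetical. The gather and scatter helpers mirror the `Rd[i] = [Rbase[i]]` and `[Rbase[i]] = Rs[i]` rows.

```python
def effective_address(base, offset):
    """Effective address = base register + immediate offset (32-bit wrap assumed)."""
    return (base + offset) & 0xFFFFFFFF

def load(memory, base, offset, size, signed):
    """Load `size` bytes and extend the result to 32 bits."""
    addr = effective_address(base, offset)
    raw = bytes(memory[addr:addr + size])
    return int.from_bytes(raw, "little", signed=signed) & 0xFFFFFFFF

def store(memory, base, offset, size, value):
    """Store the low `size` bytes of value."""
    addr = effective_address(base, offset)
    low = value & ((1 << (8 * size)) - 1)
    memory[addr:addr + size] = low.to_bytes(size, "little")

def gather32(memory, addresses):
    """Load one word per lane, each from its own address."""
    return [load(memory, a, 0, 4, False) for a in addresses]

def scatter32(memory, addresses, values):
    """Store one word per lane, each to its own address."""
    for a, v in zip(addresses, values):
        store(memory, a, 0, 4, v)

memory = bytearray(64)
store(memory, 8, 4, 1, 0x1F0)          # one byte, 0xF0, written at address 12
scatter32(memory, [0, 16, 32], [1, 2, 3])
```

Reading address 12 back as a signed byte gives a value with the upper bits set, while the unsigned load leaves them clear; `gather32(memory, [32, 0])` returns the scattered words in the requested lane order.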
