Coherence Injection

From NaplesPU Documentation
Revision as of 21:44, 18 December 2018 by Andrea

This guide explains how to set up a Vivado project to test coherence injection and how to use the testbench correctly.

Setup Vivado project

Install Vivado

First, install a licensed version of Vivado. This guide assumes Vivado 2018.2.2 with a WebPACK license; this free license is sufficient for simulation.

Get nu+

Next, clone the latest version of the nu+ repository with git and check out the coherence_testing branch. If the file tb_coherence_injection.sv exists, you are on the right branch.
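The branch check above can be scripted. The following is a minimal sketch, not part of the nu+ tooling: the helper name has_coherence_tb is hypothetical, and the repository URL is left as a placeholder because this guide does not state it.

```shell
# Typical sequence (the repository URL is not given in this guide):
#   git clone <repository-url> nuplus
#   cd nuplus
#   git checkout coherence_testing

# Hypothetical helper: succeeds if tb_coherence_injection.sv exists
# anywhere below the directory passed as first argument.
has_coherence_tb() {
    [ -n "$(find "$1" -type f -name tb_coherence_injection.sv 2>/dev/null | head -n 1)" ]
}

# Check the current directory.
if has_coherence_tb .; then
    echo "tb_coherence_injection.sv found: this looks like the right branch"
else
    echo "tb_coherence_injection.sv not found: check out coherence_testing first"
fi
```

Searching the whole tree avoids assuming where the testbench file lives inside the repository.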

Run Vivado

Open a terminal window and move into the nu+ repository directory. Run ls (or dir on Windows); you should see these folders:

  • src
  • toolchain
  • tools

Next, source the Vivado settings and run the project setup script (tools/vivado/setup_project.sh on Linux, tools/vivado/setup_project.tcl on Windows). The following sections show how to do this on Linux and on Windows.

Linux

Go into the directory of the cloned repository:

cd ~/nuplus

Create the folder that will hold a sample kernel image and copy the kernel into it:

mkdir -p software/kernels && cp -R "path/to/sample-kernel/mmsc" software/kernels

Now source Vivado settings:

source ~/Xilinx/Vivado/2018.2/settings64.sh

Finally, set up the project and wait for Vivado to launch:

tools/vivado/setup_project.sh -k mmsc -c 3 -t $(( 16#F )) -m gui

About the parameters:

  • 3 is the bitmask (one bit per core) that selects the active cores: 3 is binary 11, so 2 cores are active;
  • $(( 16#F )) is the bitmask (one bit per thread) that selects the active threads of each core: F is binary 00001111, so 4 threads are active;
  • gui is the mode in which Vivado runs (you can use batch mode if you prefer).

However, these parameters are ignored in practice, because in the coherence testbench the nu+ core is physically disconnected from the cache controller.
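Although the coherence testbench ignores them, the mask arguments follow a simple rule: each set bit enables one core or one thread. The sketch below counts the set bits of a mask. The helper name count_active is hypothetical and not part of the nu+ scripts, and it writes the value F as 0xF, which equals the 16#F form used in the command above but is also accepted by plain POSIX shells.

```shell
# Hypothetical helper: print how many bits are set in a non-negative mask.
# The function body runs in a subshell, so its variables stay local.
count_active() (
    mask=$1
    count=0
    while [ "$mask" -gt 0 ]; do
        count=$(( count + (mask & 1) ))   # add the lowest bit
        mask=$(( mask >> 1 ))             # move on to the next bit
    done
    echo "$count"
)

count_active 3            # core mask: binary 11, prints 2
count_active $(( 0xF ))   # thread mask: binary 1111, prints 4
```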

Windows

We assume Windows 10 as the OS, with Git for Windows installed.

Go into the directory of the cloned repository:

cd C:\nuplus\

Create the folder that will hold a sample kernel image and copy the kernel into it:

mkdir software\kernels

xcopy /s /i "path/to/sample-kernel" C:\nuplus\software\kernels

Now source Vivado settings:

C:\Xilinx\Vivado\2018.2\settings64.bat

Finally, set up the project and wait for Vivado to launch:

vivado -nolog -nojournal -mode gui -source tools/vivado/setup_project.tcl -tclargs mmsc 15 3

About the parameters:

  • 15 is the bitmask (one bit per thread) that selects the active threads of each core: 15 is binary 00001111, so 4 threads are active;
  • 3 is the bitmask (one bit per core) that selects the active cores: 3 is binary 11, so 2 cores are active;
  • gui is the mode in which Vivado runs (you can use batch mode if you prefer).

However, these parameters are ignored in practice, because in the coherence testbench the nu+ core is physically disconnected from the cache controller.

Understanding the Testbench

The tb_coherence_injection.sv testbench injects miss requests directly into the cache controller, bypassing the nu+ core.

[Figure: Tb coherence injection schema.jpg]

This is done by conditionally assigning the cache controller's input signals at compile time. The following is the instantiation of the l1d_cache module, which implements the cache controller:

// file: src/mc/tile/tile_nuplus.sv

l1d_cache #(
	.TILE_ID( TILE_ID ),
	.CORE_ID( CORE_ID ) )
u_l1d_cache (
	.clk                               ( clk       ),
	.reset                             ( reset     ),

`ifndef COHERENCE_INJECTION

	.ldst_instruction 	( ldst_instruction 	),
	.ldst_address 		( ldst_address 		),
	.ldst_miss 			( ldst_miss 		),
	.ldst_evict 		( ldst_evict 		),
	.ldst_cache_line 	( ldst_cache_line 	),
	.ldst_flush 		( ldst_flush 		),
	.ldst_dinv 			( ldst_dinv 		),
	.ldst_dirty_mask 	( ldst_dirty_mask 	),

`else

	.ldst_instruction 	( ldst_instruction_inject 	),
	.ldst_address 		( ldst_address_inject 		),
	.ldst_miss 			( ldst_miss_inject 			),
	.ldst_evict 		( ldst_evict_inject 		),
	.ldst_cache_line 	( ldst_cache_line_inject 	),
	.ldst_flush 		( ldst_flush_inject 		),
	.ldst_dinv 			( ldst_dinv_inject 			),
	.ldst_dirty_mask 	( ldst_dirty_mask_inject 	),

`endif
	
	// other signal assignments
	...
);

When the COHERENCE_INJECTION symbol is defined (see src/include/nuplus_user_defines.sv), the Cache Controller inputs that carry core requests are connected to signals with the _inject suffix instead of the Load-Store Unit outputs. These _inject signals are left unconnected by default and are meant to be driven by the injection tasks. Beware: if the injection testbench is enabled but the setup_injection_channel task is never called, the Cache Controller receives no input stimulus and remains idle.

In this way, the testbench can attach to these signals and send coherence requests directly to the cache controller, pretending to be the Load-Store Unit. This is the job of the setup_injection_channel task, which contains statements of this kind:

	...
	assign u_nuplus_noc.NOC_ROW_GEN[0].NOC_COL_GEN[0].TILE_NUPLUS_INST.u_tile_nuplus.ldst_instruction_inject = t_00_ldst_instruction_inject;
	...

Using the Testbench

Now it is time to use the testbench to stress the cache controller with purpose-built requests.

Generate and submit a random coherence request

Before a coherence request can be submitted, it must be created. This is done with the function create_random_coherence_request, which populates a tb_coherence_request_t with random values using the $urandom system function.

The struct tb_coherence_request_t has five self-explanatory fields:

typedef struct packed {
	tb_request_type_t 		request_type;
	dcache_address_t		address;
	requestor_t				requestor;
	dcache_line_t			cache_line;
	dcache_store_mask_t 	dirty_mask;
} tb_coherence_request_t;

Once the coherence request has been created, it can be submitted to a cache controller. This is done by the submit_request task, which consists of two parts:

  1. Based on the request type, it sets the injection signals accordingly;
  2. Based on the requestor, it decides which tile to inject the request into.

Thread emulation logic

To emulate the behaviour of a nu+ core, the thread scheduling uses stalling logic that prevents a thread from submitting a new request while one of its previous requests is still in progress. After a thread submits a request, the corresponding bit of a stall bitmask is set to enforce this. De-stalling logic is therefore needed, and it is implemented in a separate process:

always_ff @( posedge clk ) begin : TILE0_WAKEUP
	if(t_00_wakeup)
	begin
		t_00_stall_mask[t_00_wakeup_thread_id] <= 1'b0;
		$display("[Time %t] [TESTBENCH] [TILE0] Waking up thread %d", $time(), t_00_wakeup_thread_id);
	end
end
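The interplay between stalling and waking up can be illustrated outside the simulator. The following shell sketch is only a behavioural model of a single tile's stall mask, held as an integer; the function names submit, wakeup and can_submit are hypothetical and do not appear in the testbench.

```shell
# Behavioural model of one tile's stall mask (bit i = thread i is stalled).
stall_mask=0

# Succeeds if thread $1 has no request in flight.
can_submit() { [ $(( (stall_mask >> $1) & 1 )) -eq 0 ]; }

# Thread $1 submits a request: set its stall bit.
submit() { stall_mask=$(( stall_mask | (1 << $1) )); }

# Wake-up for thread $1: clear its stall bit, as the wake-up process does.
wakeup() { stall_mask=$(( stall_mask & ~(1 << $1) )); }

submit 2
can_submit 2 || echo "thread 2 is stalled, no new request allowed"
wakeup 2
can_submit 2 && echo "thread 2 may submit again"
```

In this model a thread can have at most one outstanding request, which is the property the stall mask is meant to guarantee.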

Termination logic

The testbench implements termination logic. It terminates when one of the following events occurs:

  • One of the nu+ tiles has reached the target number of requests. This number is given by the parameter TB_REQUEST_NUMBER;
  • One of the nu+ tiles has all of its threads stalled (that is, the stall_mask contains all ones) and a certain number of subsequent clock ticks has elapsed. This number is given by the parameter TB_DEADLOCK_THRESHOLD.

The first case counts as a success, while the second indicates a deadlock of an entire nu+ core.
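The two termination conditions can be summarised as a small decision function. This is a sketch of the rule as described here, not the testbench code: the function name tb_status, the argument order, the numeric values in the usage lines, and the use of "greater than or equal" for both comparisons are all assumptions.

```shell
# tb_status REQUESTS_DONE TB_REQUEST_NUMBER STALL_MASK THREAD_COUNT \
#           STALLED_TICKS TB_DEADLOCK_THRESHOLD
# Prints "success", "deadlock" or "running" for a single tile.
tb_status() (
    all_stalled=$(( (1 << $4) - 1 ))   # mask with one bit set per thread
    if [ "$1" -ge "$2" ]; then
        echo success                   # request target reached
    elif [ "$3" -eq "$all_stalled" ] && [ "$5" -ge "$6" ]; then
        echo deadlock                  # every thread stalled for too long
    else
        echo running
    fi
)

tb_status 100 100 0 4 0 1000      # prints success
tb_status 10 100 15 4 1000 1000   # prints deadlock
tb_status 10 100 5 4 5000 1000    # prints running
```

The last call shows that a long wait alone does not count as a deadlock: at least one thread is still free to submit requests.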


Attachments: File:Sample-kernel.zip