Practical 9 Pipelined hazard resolution
Objectives
This section is not a list of tasks for you to do. It is a list of skills you will have or things you will know after you complete the practical.
Following completion of this practical you should be able to:
- Implement data hazard resolution in a pipelined processor by employing write-before-read, data forwarding, and stalls.
- Implement control hazard resolution by flushing the pipeline.
- Use waveform diagrams to debug a pipelined processor implementation with instructions in all stages of the pipe.
- Use Verilog test benches and a testing framework to test a processor implementation.
Guidelines
- Because you will be iteratively adding functionality to one processor module, we strongly recommend that you periodically add and commit your progress to git as a backup.
Your Tasks
Follow this sequence of instructions to complete the practical.
You will do all of this practical in your practical-pipe repository.
1 Run the hazards tests
During this practical, you will gradually be fixing data hazards for R-types until you've fixed them all; then you will look at other types of hazards to fix.
- To begin, open the file test_asm/datahaz/test_datahaz_x2.asm and read the test code (and comments) provided.
- On the worksheet, identify potential dependencies in the code, create a pipeline diagram for the code, and mark how you intend to solve the data hazards (forwarding and write-before-read).
- Open the tb_Pipe_hazards.v test bench and scroll to the bottom. Notice that there is a sequence of test tasks commented out, much like in the last practical, and that the first one (test_no_hazard_detection()) is the only one uncommented.
- Scroll up to the implementation of test_no_hazard_detection() and observe that it (like many other tasks) simply checks that the final states of the registers are correct. Answer the question in the worksheet about these tests.
- Review check_data_hazard_general() to ensure it will work with your pipelined processor implementation. It uses the same shortcuts in pipeline_test_tools.vh that you may have edited for Practical 8, so hopefully there won't be much to change.
- Open the ModelSim project you created for Practical 8 and add tb_Pipe_hazards.v to the project. Compile it and simulate this test bench. Fix any bugs or errors until test_no_hazard_detection() passes. (Note that you may not pass this test if you have already implemented the write-then-read behavior. Consider your answer to the 1.4 question on the worksheet.)
2 Write then read
- Once you've passed the check_data_hazard_general() tests, comment it out in the test bench's main initial block.
- In that same initial block, uncomment the test_write_then_read_hazard_detection() task and the call to CLEAR_PIPE() that follows it. (See the comments in that block.)
- Compile and run the test bench in ModelSim. It might fail if you've not implemented write-before-read in your datapath. That's ok!
- Figure out how to make your reg file write before it reads.
  - hint: consider when you should write to the register file so it can be read at the right time (but before the pipeline stage registers get written).
- Once you get this test to pass, answer the next question on the worksheet:
  Describe the process you plan to follow to incrementally address data hazards in your pipeline for R-type instructions. If you're not sure what process to follow, review the comments in the ASM file (test_datahaz_x2.asm) and the test bench (tb_Pipe_hazards.v).
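One common way to get write-before-read is sketched below. It is only a sketch under stated assumptions: the pipeline stage registers update on the rising clock edge, registers are 32 bits wide, and the module and port names (regfile_wbr, we, wd, and so on) are made up for illustration, not taken from the provided code. The register file writes on the falling edge, so a value written by the instruction in WB is already visible to the instruction in ID before ID_EX latches its operands.

```verilog
// Hypothetical register file: names and widths are assumptions for illustration.
module regfile_wbr (
    input  wire        clk,
    input  wire        we,        // write enable from the WB stage
    input  wire [4:0]  rs1,       // first read address (ID stage)
    input  wire [4:0]  rs2,       // second read address (ID stage)
    input  wire [4:0]  rd,        // write address (WB stage)
    input  wire [31:0] wd,        // write data (WB stage)
    output wire [31:0] rd1,
    output wire [31:0] rd2
);
    reg [31:0] regs [0:31];
    integer i;

    // Start from a known state in simulation.
    initial begin
        for (i = 0; i < 32; i = i + 1)
            regs[i] = 32'b0;
    end

    // Write in the middle of the cycle (falling edge); x0 is never written.
    always @(negedge clk) begin
        if (we && rd != 5'd0)
            regs[rd] <= wd;
    end

    // Combinational reads see the new value during the second half of the
    // cycle, before the next rising edge captures them in ID_EX.
    assign rd1 = (rs1 == 5'd0) ? 32'b0 : regs[rs1];
    assign rd2 = (rs2 == 5'd0) ? 32'b0 : regs[rs2];
endmodule
```

An alternative with the same effect is to keep a rising-edge write and bypass the write data straight to a read port whenever the read address equals the write address.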
3 Data forwarding
- Uncomment the next test in the test bench (test_WB_to_EX_fwd()), then compile and run the test bench again.
- On the worksheet, write some pseudocode that describes how you will detect the need to forward data to one of the two register operands (A or B) when an instruction in EX needs data from WB.
- Create a forwarding unit and add it to your Processor.
  - SUGGESTION: connect some outputs from the MEM_WB pipeline stage register and from the ID_EX pipeline stage register to determine whether the hazard exists, then create an output that will control a mux to use forwarded data (from MEM_WB) or the standard data from the EX cycle.
  - Get the first forwarding (WB -> EX) working before you try to address the other conditions.
- Handle WB -> EX forwarding
  - uncomment the test for this in the test bench, and update your forwarding unit and Processor accordingly.
- Handle MEM -> EX forwarding
  - uncomment the test for this in the test bench, and update your forwarding unit and Processor accordingly.
- At the end of this step, your test bench should run and pass the following tests, in sequence:
  test_write_then_read_hazard_detection()
  test_WB_to_EX_fwd()
  test_MEM_to_EX_fwd()
- Add, commit, and push your code changes to git. Be sure to add your assembled versions of the asm files.
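The forwarding conditions described in this section can be sketched as a small combinational module. This is a minimal illustration, not the required design: every port name and the 2-bit mux encoding are assumptions, and your stage registers may expose different signals. It follows the usual textbook rule that the producer must write a register, that register must not be x0, and it must match the source register of the instruction in EX.

```verilog
// Hypothetical forwarding unit: port names and mux encoding are assumptions.
module forwarding_unit (
    input  wire [4:0] id_ex_rs1,         // source registers of the instruction in EX
    input  wire [4:0] id_ex_rs2,
    input  wire [4:0] ex_mem_rd,         // destination of the instruction in MEM
    input  wire       ex_mem_reg_write,
    input  wire [4:0] mem_wb_rd,         // destination of the instruction in WB
    input  wire       mem_wb_reg_write,
    output reg  [1:0] forward_a,         // 00: ID_EX value, 01: from MEM_WB, 10: from EX_MEM
    output reg  [1:0] forward_b
);
    always @(*) begin
        // Default: no forwarding.
        forward_a = 2'b00;
        forward_b = 2'b00;

        // WB -> EX forwarding.
        if (mem_wb_reg_write && mem_wb_rd != 5'd0 && mem_wb_rd == id_ex_rs1)
            forward_a = 2'b01;
        if (mem_wb_reg_write && mem_wb_rd != 5'd0 && mem_wb_rd == id_ex_rs2)
            forward_b = 2'b01;

        // MEM -> EX forwarding is checked last so that the newer result
        // wins when both stages will write the same register.
        if (ex_mem_reg_write && ex_mem_rd != 5'd0 && ex_mem_rd == id_ex_rs1)
            forward_a = 2'b10;
        if (ex_mem_reg_write && ex_mem_rd != 5'd0 && ex_mem_rd == id_ex_rs2)
            forward_b = 2'b10;
    end
endmodule
```

Building it in the order the tests are uncommented works well: start with only the WB -> EX checks, get test_WB_to_EX_fwd() passing, then add the MEM -> EX checks.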
4 lw stall
- Examine the test asm file test_datahaz_lw.asm, then assemble it.
- Construct a hazard unit module.
- Handle the stall when lw is in EX and the next instruction will use its rd value (see page 322 in the textbook).
  - hint: disable writing to the IF_ID register and PC when stalling.
  - hint: you need to insert a bubble in ID_EX (hint: add a flush capability to the stage register that puts all zeroes into its control bits and any instruction data you're carrying).
  - hint: special case for UJ and U types that follow a lw: they don't use register sources!
- Uncomment and run our tests (test_lw_stall()).
- Now test sw: read and assemble test_datahaz_sw.asm.
- Uncomment and run our tests (test_sw_forwarding()).
- Fix any errors so that those tests pass. (You may have to update your forwarding unit.)
- On the worksheet, answer the question about forwarding from lw to sw.
- Now is a great time to commit your changes to git. Include any assembled versions of the asm files.
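The load-use check from this section might look like the sketch below. All names are hypothetical. In particular, if_id_uses_rs1 and if_id_uses_rs2 are assumed to come from your decoder and to be 0 for formats that have no such source register (for example U and UJ types), which covers the special case in the hint above.

```verilog
// Hypothetical hazard unit for the lw stall: all port names are assumptions.
module hazard_unit (
    input  wire       id_ex_mem_read,   // 1 when the instruction in EX is a load
    input  wire [4:0] id_ex_rd,         // destination register of that load
    input  wire [4:0] if_id_rs1,        // source fields of the instruction in ID
    input  wire [4:0] if_id_rs2,
    input  wire       if_id_uses_rs1,   // 0 when the format has no rs1 (e.g. U, UJ)
    input  wire       if_id_uses_rs2,   // 0 when the format has no rs2
    output wire       stall
);
    // A source register only matters if the instruction really reads it.
    wire rs1_hit = if_id_uses_rs1 && (if_id_rs1 == id_ex_rd);
    wire rs2_hit = if_id_uses_rs2 && (if_id_rs2 == id_ex_rd);

    // Stall for one cycle when a load in EX feeds the instruction in ID.
    // Loads into x0 never create a dependency.
    assign stall = id_ex_mem_read && (id_ex_rd != 5'd0) && (rs1_hit || rs2_hit);
endmodule
```

In a design like this, stall would typically hold the PC and IF_ID (write enables low) and flush ID_EX for that one cycle, after which forwarding from WB delivers the loaded value.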
5 Flushing the pipe
- On the worksheet, draw a pipeline diagram for the instructions given and indicate any data forwarding or stalls or flushes.
- Examine the test asm file test_ctlhaz_beq.asm.
- NOTE: this test assumes the branch writes the PC in the memory cycle (see textbook). This means everything following it must be flushed.
- If your processor takes the branch earlier (in ID), you will need to edit the test asm file and test bench.
- Figure out how to insert bubbles as instructions leave the EX, ID, and IF cycles.
  - hint: use the same trick used to insert a stall to flush the EX_MEM, ID_EX, and IF_ID stage registers while the PC is being written to the branch target by the instruction in MEM.
- Run our tests (test_beq_flush()) and fix any bugs in your processor.
- We do not provide you with tests for jal/jalr, which also need to flush the pipeline when they jump.
- Write new tests for jal and jalr and run them.
- Add the asm and assembled code for these tests to git and push your commit!
- Be sure to add tasks to the tb_Pipe_hazards.v test bench to run your tests.
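A stage register that supports both stalling and flushing could be sketched as follows. The module name, ports, and the choice to give flush priority over the write enable are assumptions for illustration. The sketch also assumes that all-zero control bits mean "write nothing" in your design, which is what makes a zeroed register behave like a bubble.

```verilog
// Hypothetical pipeline stage register with stall (en) and flush support.
module stage_reg #(
    parameter WIDTH = 32
) (
    input  wire             clk,
    input  wire             reset,
    input  wire             en,     // 0 holds the current contents (stall)
    input  wire             flush,  // 1 replaces the contents with zeros (bubble)
    input  wire [WIDTH-1:0] d,
    output reg  [WIDTH-1:0] q
);
    always @(posedge clk) begin
        if (reset || flush)
            q <= {WIDTH{1'b0}};     // zeroed control bits act as a bubble
        else if (en)
            q <= d;                 // normal operation: pass the value along
        // otherwise hold q (stall)
    end
endmodule
```

With a register like this, a taken branch in MEM would assert flush on IF_ID, ID_EX, and EX_MEM in the same cycle that the PC is loaded with the branch target.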
6 Write and run a bigger test
Examine the following code:
// Array A's memory location is in x5
int[] A = {1, 2, 3, 4, 5};
int idx = 0;
while(idx < 5) {
A[idx] = A[idx] + 1;
idx = idx + 1;
}
- Write the code for this on the worksheet (and put it in an .asm file in the test_asm folder that you add, commit, and push to git).
  - To initialize the array, it is ok to pick an address in memory and put the integers in your assembled .txt file there. (You don't need to write RISC-V instructions to do that.)
  - To initialize x5 to have the address of A, load the address as an immediate (remember lui and addi? Or maybe you have an assembler that supports pseudoinstructions like li?) in your code.
  - idx can be any register of your choice and does not need to be stored in memory.
- Open tb_Processor_Program.v in VS Code and observe how it loads a .txt file and runs the program in that file.
- Make a copy of the testProgramA() task in the test bench and modify the copy to run the code you wrote above.
  - HINT: you can use CHECK_MEM() to check contents of memory in your test bench. Do this to see what the array values are after the program runs.
  - HINT: testProgramA takes an argument and an expected result; you can remove those from your copy for this test.
- For HW 10, you wrote a program that includes relPrime and gcd. Assemble your code for those procedures into something that your processor can run. Put that code in the test_asm folder in your git repo.
  - Add, commit, and push your assembly (.asm file) and the assembled code (.txt file).
- Make another copy of the testProgramA() task in the test bench and edit the copy to run your relPrime program.
  - Use the task's argument (n) as the initial argument for relPrime, and the expected argument for the expected output.
- On the worksheet, explain how you plan to test that relPrime works; specifically, how will you pass the input argument to your program from the test bench, and how will your test bench know when the program has finished running (so it can check the result)?
  - There are many ways to do this; think about the Input/Output lecture from class for a few ideas, or think about how you could tell that the program is done by inspecting a register or the PC.
- Test your relPrime program on your processor with many inputs, including at least these three:
relPrime(6) = 5
relPrime(5040) = 11
relPrime(30030) = 17
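One of the "inspect the PC" ideas can be sketched as a test bench task that runs until the PC reaches a known halt address or a cycle limit is hit. Everything named here is an assumption: HALT_PC is a made-up address where your program would spin in a jump-to-self loop, and the pc register is a stand-in counter so that the sketch simulates without a processor. In your own test bench you would watch your processor's real PC instead.

```verilog
`timescale 1ns/1ps
// Hypothetical test bench sketch: detect program completion by watching the PC.
module tb_halt_sketch;
    localparam [31:0] HALT_PC    = 32'h0000_0040; // assumed address of the final spin loop
    localparam integer MAX_CYCLES = 10000;        // safety limit so a bug cannot hang the run

    reg clk = 0;
    always #5 clk = ~clk;

    // Stand-in for the processor's PC so this sketch runs without a processor.
    reg [31:0] pc = 32'd0;
    always @(posedge clk)
        if (pc != HALT_PC) pc <= pc + 32'd4;

    integer cycles;

    // Run until the PC reaches HALT_PC or the cycle limit expires.
    task run_until_halt;
        begin
            cycles = 0;
            while (pc != HALT_PC && cycles < MAX_CYCLES) begin
                @(negedge clk);          // sample away from the updating edge
                cycles = cycles + 1;
            end
            if (pc == HALT_PC)
                $display("program halted after %0d cycles", cycles);
            else
                $display("TIMEOUT after %0d cycles", cycles);
        end
    endtask

    initial begin
        run_until_halt;
        // At this point a real test would check the result register or memory.
        $finish;
    end
endmodule
```

The cycle count collected this way is also a convenient runtime measure when comparing two versions of a program.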
7 Design a new instruction
Your last task is to design a new instruction and implement it in your pipeline. You need to provide clear documentation for how it will work, and justify its inclusion in the instruction set.
- As you plan your design, you should consider inventing an instruction that makes relPrime run faster (this generally would combine multiple instructions into one new instruction).
- Document the design (in the practical worksheet +10 pts) and explain how you plan to add it to the pipeline.
  - maybe add a stage to support extra work
  - or stall the pipeline
  - or add more hardware to existing stages
- Explain how you expect the new instruction to impact the performance of your processor.
- Implement your design.
- Run relPrime with your new instruction (you'll have to rewrite relPrime; make sure you keep both versions in your repository).
- Compare the two runtimes (number of cycles for each run).
8 Bonus - add MMIO
We discussed I/O in class; one way of implementing I/O is Memory Mapped I/O (MMIO). For extra points on this practical you can implement MMIO. You will need to write a test bench to show that it works. If you do this, you need to do the following:
- Add a datapath drawing to the worksheet which shows the modifications for MMIO.
- Put a Test Plan (following the format from previous practicals) together to show that I/O works.
- Include a clear screenshot of a waveform in your worksheet that shows that the IO succeeded. You should annotate this waveform to indicate key events (e.g. point an arrow at a signal when an input number gets into a register.)
Full credit will only be awarded if you communicate how this works sufficiently in your worksheet. The graders will not look at your code for this problem.
This is a challenge problem: there is less support for it, and you are expected to take ownership if you want to complete it.
Turn It In
Grading Rubric
General Requirements for all Practicals:
- The solution fits the need
- Aspects of performance are discussed
- The solution is tested for correctness
- The submission shows iteration and documentation
Fill out the Practical Worksheet
In the worksheet, explain how you satisfy each of these items. Some guidelines:
Practical 9 Rubric items | Possible Points |
---|---|
Practical Worksheet | 60 |
Write-Then-Read RegFile | 5 |
Forwarding unit | 10 |
lw stall (Hazard Unit) | 10 |
branch/jump flushing | 5 |
New Instruction (impl) | 10 |
Extra points (MMIO) | 10 |
Total out of | 100 |
- Submit your completed Practical Worksheet to gradescope.
- Practical code will be submitted to your D git repository as new files and committed modifications to the repo we provided you. You must include your name and your teammates' names in a comment at the top of all files you submit.