Practical 5 Single Cycle Processor
Objectives
This section is not a list of tasks for you to do. It is a list of skills you will have or things you will know after you complete the practical.
Following completion of this practical you should be able to:
- Instantiate and combine modules in verilog.
- Read and interpret simulated waveforms.
- Implement and test a limited instruction set single-cycle processor (R, I, and S types) in ModelSim.
- Use waveform diagrams and verilog test benches to debug a processor implementation.
- Discuss clocking strategies for a set of sequential logic that must happen in a specific order.
Guidelines
- Because you will be iteratively adding functionality to one processor module, we strongly recommend that you periodically add and commit your progress to git as a backup.
Your Tasks
Time estimate: This practical took students in previous terms about 5-8 hours per team member for a team of 3.
Follow this sequence of instructions to complete the practical.
0 Obtain your `practical-sc` git repo
First, get your team name from your professor (it is probably on Moodle or CATME). When you follow the link to get your repo, you need to enter your team name exactly as your professor provided it. Only the first person in your group to set up the repo needs to do this; later people can select the correct team from the list of teams. Get your repo at this link. Don't click that link without reading the sentences above.
1 Implement R-type Instructions
You will begin by implementing a processor that only implements R-Type RISC-V instructions.
- On the practical worksheet, there's a datapath drawing.
  - Using the RTL from class, trace the wires used for the `add` instruction.
  - Label each traced wire with a name. For example, consider the wire coming out of the right side of Instruction Memory; you could label this wire `inst` to be consistent with the RTL.
  - As you trace through blocks of logic (register file, muxes, etc.), circle the names of any control bits necessary for the `add` instruction.
We've provided you with the start of a verilog control unit, `Control.v`, and a test bench, `tb_Control.v`. It doesn't currently do much, but has all the inputs and outputs you'll need.
- Implement Control
  - Open up your `practical-sc` repo in VS Code (or your favorite text editor).
  - Edit `Control.v`.
  - In the `always @(opcode)` block of the control unit, look for the case block for `R_OPCODE`. Between the `begin` and `end`, add value assignments for all the control signals you will need to set to make `add` happen.
    - You can see Figure 4.26 in your textbook (should be page 281) to verify which signals and which values matter for "R-format" instructions.
    - HINT: `ALUOp` is a two-bit bus, unlike in the textbook table, which treats each bit separately. Do not set each bit separately; instead, assign a two-bit binary value.
    - Note: you should also include every "write" control signal and explicitly disable it (set it to zero) for things you don't want to be written (for example, `memWrite`).
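The R-type case described above might look roughly like the following. This is a minimal standalone sketch, not the provided `Control.v`: the module name `ControlSketch` and the port names (taken from the textbook's signal names) are assumptions, so match them to whatever the provided file actually declares.

```verilog
// Hypothetical sketch; module and port names are assumptions,
// not copied from the provided Control.v.
module ControlSketch(
    input  wire [6:0] opcode,
    output reg        ALUSrc,
    output reg        MemtoReg,
    output reg        RegWrite,
    output reg        MemRead,
    output reg        MemWrite,
    output reg        Branch,
    output reg  [1:0] ALUOp
);
    localparam R_OPCODE = 7'b0110011;   // RISC-V R-type opcode

    always @(opcode) begin
        case (opcode)
            R_OPCODE: begin
                ALUSrc   = 1'b0;    // second ALU input comes from the register file
                MemtoReg = 1'b0;    // write back the ALU result
                RegWrite = 1'b1;    // R-types write a destination register
                MemRead  = 1'b0;
                MemWrite = 1'b0;    // explicitly disabled
                Branch   = 1'b0;
                ALUOp    = 2'b10;   // one two-bit value, not two separate bits
            end
            default: begin
                // Unknown opcode: disable everything that could change state.
                ALUSrc   = 1'b0;
                MemtoReg = 1'b0;
                RegWrite = 1'b0;
                MemRead  = 1'b0;
                MemWrite = 1'b0;
                Branch   = 1'b0;
                ALUOp    = 2'b00;
            end
        endcase
    end
endmodule
```

Giving every output a value in every case, including `default`, keeps the simulator from holding a stale value over from an earlier instruction.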
- Test Control
  Once you've coded control for R-types, you need to make sure it emits the right signals!
  - In VS Code, open the `tb_Control.v` file and review the test we provide for R-type instructions.
  - Open ModelSim, and create a new `Practical5` project file in your `practical-sc` git repo.
  - Add all the verilog files in the repo to your ModelSim project.
  - Compile all the verilog files, fixing any errors in `Control.v` and `tb_Control.v`. Don't worry about errors in the other modules for now.
  - Start simulation for the `tb_Control` module and `run -all`. The tests should pass; if they don't, check your work.
  - You could add another test, but since all R-format instructions have the same opcode, there's no need.
When you get control working, consider doing a quick git pull, add, commit, and push to save your work.
- Implement the Datapath for R-types
  In this part you will use your control unit and design a datapath around it to execute R-type instructions from memory.
  - Examine the `Processor.v` file. We're providing you with a processor module that has a few component instances in it, but they're not hooked up. Compare this set of instances with the components you traced on the worksheet.
  - Add new instances of components you will need to make your processor execute only R-types. There's no need to connect them yet, just make instances. Here are some hints and suggestions:
    - We are providing you with working implementations of `ALU.v` and `ALUCtl.v`. They should already be in your `practical-sc` repository.
    - We've also included copies of the `DP_Memory.v` and `Register.v` from Practical 4.
    - There are lots of suggestions and tips in the comments in the files provided; make use of these.
    - Instruction and data memory will be the same component. We're using both halves of the `DP_Memory` module. See the comments in `Processor.v` for details.
    - Don't make multiplexers. You can make them with raw verilog later.
    - There's no register file in your git repo so you will have to make your own!
      - SUGGESTION: Since you created and tested this in the previous practical, one member of your team should copy their implementation into your `practical-sc` repo and use it for this practical.
      - Don't forget to add this file to your ModelSim project after you've made it.
  - Create wires to connect the components. Create an instance of each wire you labeled on the practical worksheet's datapath; use the same name you wrote on the worksheet.
    - Some of your wires will need to be 32 bits: `wire [31:0] myWideWire;`
    - Put the wire declarations inside the `Processor` module before you declare the other major components. This will ensure they're ready to use by any component that needs them.
  - Attach the wires to your components' input or output pins.
    - This is as simple as writing the wire name between the parentheses next to the input or output where you want to attach it. In this example, the output of instance `A` is connected to the input of instance `B`:

      ```verilog
      wire [4:0] myWire;

      Thingy A(
          .InputPin1(),
          .OutputPin(myWire)    // <-- attach one end here
      );

      Thingy B(
          .InputPin1(myWire),   // <-- attach the other end here
          .OutputPin()
      );
      ```
    - STRONG SUGGESTION: Because you are not implementing immediates, memory, or branch instructions yet, ignore three muxes for now: the one feeding the ALU (from ImmGen), the one for branches (from the adder), and the one for data memory (output of memory). Assume the wire goes straight through the mux. For example, you might have a wire `B` that goes directly from the register file into the second input on the ALU. You can add these muxes later.
    - A note about Memory: When you hook up memory you will need to adjust the address. RISC-V uses byte addressing, but the provided memory modules use word addressing. Because our words are 4 bytes long, every byte address is 4 times too big. For example, the second instruction in memory will be at byte address `0x0004` but word address `0x0001`. We can simply shift the byte address right by 2 to divide by 4 and convert it to a word address. Additionally, the provided memory only has 10-bit (word) addresses (meaning 2^12 byte addresses) because of the limited amount of memory on the FPGA we are simulating. Feed in the least significant bits of your address (after right shifting) to account for this (we're essentially cutting off the top of memory when we do this). As you debug, keep in mind that if you look at the values in memory, the addresses will be 4 times smaller than the addresses you would "expect".
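The byte-to-word address conversion in the memory note can be tried in isolation. The sketch below is a hypothetical stand-alone test module (`ByteToWordAddrSketch`, `byteAddr`, and `wordAddr` are made-up names, not signals from `Processor.v`). Selecting bits `[11:2]` does both jobs at once: it drops the two low bits (the shift right by 2) and keeps only the 10 bits the memory accepts.

```verilog
// Hypothetical sketch; names are assumptions, not taken from Processor.v.
module ByteToWordAddrSketch;
    reg  [31:0] byteAddr;

    // Bits [11:2] = (byteAddr >> 2), truncated to a 10-bit word address.
    wire [9:0]  wordAddr = byteAddr[11:2];

    initial begin
        byteAddr = 32'h0000_0004; #1;
        $display("byte address %0d -> word address %0d", byteAddr, wordAddr);
        byteAddr = 32'h0000_0010; #1;
        $display("byte address %0d -> word address %0d", byteAddr, wordAddr);
        $finish;
    end
endmodule
```

Byte address 4 maps to word address 1 and byte address 16 maps to word address 4, which is the "4 times smaller" effect you will see when inspecting memory contents.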
  - Connect the `reset` input on `Processor` to all components that have a reset (most likely the PC and RegFile). When you "reset" the processor, you want to clear out all the registers and make sure the PC goes back to the beginning of your code.
  - Be sure your `Processor` compiles in ModelSim before moving on. Make sure you've added all the files used in your Processor or the tests to the ModelSim project.
- Test with a few R-types
  - Examine `memory-R.txt`. This is the first set of instructions you will test on your processor. It has an `add` and a `sub` that have been assembled and put into memory in this order.
  - In the `tb_Processor_R.v` file, you can see a start at a test bench for your processor. It loads `memory-R.txt` into memory, then resets your processor, then allows the clock to cycle and checks the changes to the register file after each cycle.
    - Modify the test to properly inspect the contents of your register file. Follow the instructions in the code to tell the test bench how to look at your registers' values.
  - In ModelSim, run the `tb_Processor_R` tests!
    - If they don't pass, consider building a waveform with control and all your labeled wires to inspect what it's doing.
    - If you build a waveform, be sure to save it! We recommend calling it something similar to the test bench where it is useful (`tb_Processor_R_waves.do` for example).
  - Once the two instructions are succeeding, assemble a few more R-type instructions and add them to the memory file, then add tests to the verilog test bench.
    - HINT: you can use your assembler from practicals 1/2, or RARS, to quickly assemble instructions.
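If you want to double-check an assembled R-type word by hand, the field packing can be sketched directly in verilog. This is an illustrative helper only: `EncodeRTypeSketch` and `encodeR` are hypothetical names, and it assumes your memory file stores each instruction as one 32-bit hex word, which you should confirm against the provided `memory-R.txt`.

```verilog
// Hypothetical sketch; not part of the provided repo.
module EncodeRTypeSketch;
    // Pack R-type fields in RISC-V order:
    // funct7 | rs2 | rs1 | funct3 | rd | opcode
    function [31:0] encodeR;
        input [6:0] funct7;
        input [4:0] rs2;
        input [4:0] rs1;
        input [2:0] funct3;
        input [4:0] rd;
        begin
            encodeR = {funct7, rs2, rs1, funct3, rd, 7'b0110011};
        end
    endfunction

    initial begin
        // add x3, x1, x2  -> prints 002081b3
        $display("add x3, x1, x2 = %h", encodeR(7'b0000000, 5'd2, 5'd1, 3'b000, 5'd3));
        // sub x3, x1, x2  -> prints 402081b3
        $display("sub x3, x1, x2 = %h", encodeR(7'b0100000, 5'd2, 5'd1, 3'b000, 5'd3));
        $finish;
    end
endmodule
```

Comparing these words against the output of your assembler or RARS is a quick way to catch swapped `rs1`/`rs2` fields before you start debugging the datapath.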
- Once you've got R-types working, save your progress in git!
  - Be sure to add, commit, and push only the verilog files, changed memory file, and any waveform `.do` file you edited or created.
  - SUGGESTION: Have each member of your group create their own `Practical5.mpf` project file in ModelSim. Don't commit these to git. This will save you some time.
  - Take a screenshot of your waveform and annotate it, labelling which portions of the waveform are running which tests. You may need to add an additional temporary wire to your test bench to read the `result` output from the ALU. Be sure to submit this annotated waveform for your worksheet. Below is a sample of an annotated waveform:
2 Implement Basic I-type Instructions (skip `lw` and `jalr` for now)
- On the practical worksheet, and using the RTL from class:
  1. Trace the wires used for the `addi` instruction.
  2. Label each traced wire with a name.
  3. As you trace through blocks of logic (register file, muxes, etc.), circle the names of any control bits necessary for the instruction.
- Update `Control.v` and `tb_Control.v` for I-type instructions (except `lw` and `jalr`; don't do those yet).
  - The table in your textbook doesn't have the set of control values for these, but they are very similar to R-format.
- Test Control
  - In the control test bench, add a test or two for I-format instructions.
  - Run your tests and verify they pass.
- Update the Datapath for I-types
  - Add to your `Processor` verilog module any components you traced for I-types on the practical worksheet datapath.
  - Create instances for any wires you traced.
  - Attach the wires to input/output ports of your components.
  - HINT: there's a mux controlling the second input of the ALU. Now you need to implement that mux! There are many ways to do this and we'll show you three of them.

    First, you can "conditionally connect" a wire to the input. For example, this code connects wire A to the input when "ACONTROL" is 1, and otherwise it connects B:

    ```verilog
    wire [4:0] A;
    wire [4:0] B;

    Thingy UNIT(
        .InputPin1( ACONTROL ? A : B ),
        .OutputPin()
    );
    ```

    Another way is to create a third wire from the output of the mux, let's call it "choice", then conditionally assign that wire:

    ```verilog
    wire [4:0] A;
    wire [4:0] B;
    wire [4:0] choice;

    assign choice = ACONTROL ? A : B;

    Thingy UNIT(
        .InputPin1( choice ),
        .OutputPin()
    );
    ```

    A final way is to use an `always` block to recompute `choice` when the inputs change. Note that `choice` is a `reg` type here:

    ```verilog
    wire [4:0] A;
    wire [4:0] B;
    reg  [4:0] choice;

    always @(A,B,ACONTROL) begin
        if (ACONTROL === 1)
            choice <= A;
        else
            choice <= B;
    end

    Thingy UNIT(
        .InputPin1( choice ),
        .OutputPin()
    );
    ```

    These all effectively do the same thing.
  - Don't forget to update any components you hooked up for R-types that may need new connections (for example, the `Read 2` port on the register file).
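To see for yourself that the `assign` style and the `always` style of mux agree, you can simulate both side by side. The module below is a hypothetical stand-alone sketch (`MuxStylesDemo`, `viaAssign`, and `viaAlways` are made-up names) that drives both muxes from the same inputs and reports any mismatch.

```verilog
// Hypothetical sketch; not part of the provided repo.
module MuxStylesDemo;
    reg  [4:0] A, B;
    reg        ACONTROL;

    wire [4:0] viaAssign;   // continuous-assignment mux
    reg  [4:0] viaAlways;   // always-block mux

    assign viaAssign = ACONTROL ? A : B;

    // Blocking assignment, the usual choice for combinational logic.
    always @(A, B, ACONTROL) begin
        if (ACONTROL === 1)
            viaAlways = A;
        else
            viaAlways = B;
    end

    // Compare the two outputs once the inputs have settled.
    task check;
        begin
            #1;
            if (viaAssign !== viaAlways)
                $display("MISMATCH: ACONTROL=%b assign=%d always=%d",
                         ACONTROL, viaAssign, viaAlways);
            else
                $display("match:    ACONTROL=%b output=%d", ACONTROL, viaAssign);
        end
    endtask

    initial begin
        A = 5'd3;
        B = 5'd17;
        ACONTROL = 1'b0; check;   // both select B
        ACONTROL = 1'b1; check;   // both select A
        $finish;
    end
endmodule
```

One difference worth knowing while debugging: if the select signal is unknown (`x`), the `=== 1` test is false and the `always` version falls through to `B`, while the ternary version produces `x` on every bit where `A` and `B` differ. A select line stuck at `x` in your waveform can therefore look different depending on which style you chose.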
- Test your I-type processor
  - Make a copy of the `tb_Processor_R.v` file called `tb_Processor_I.v`.
  - Create a `memory_I.txt` file with some assembled I-type instructions (much like the `memory_R.txt` file).
    - Add at least five tests for I-types.
  - Inside the new `tb_Processor_I.v` file, replace the R-type tests with ones for the instructions you've put in the new memory file. You may need to add more tests.
    - IMPORTANT: be sure your new `tb_Processor_I` test bench loads the new memory file!
    - These will be very similar to the R-type tests!
  - Run your tests!
    - SUGGESTION: make another waveform for this new test bench and save the config to `tb_Processor_I_waves.do` or similar.
- Be sure the R-type tests still work with your updated `Processor.v`.
  - Re-run your R-type test bench (`tb_Processor_R`) after you get the I-type tests working.
- Once you've got I-types working, save your progress in git!
  - Be sure to add, commit, and push only the verilog files, changed memory file, and any waveform `.do` file you edited or created.
3 Implement Memory Instructions
Now you will repeat the same steps for `lw` and `sw`. This should be faster than the first two parts.
- Trace `lw` and `sw` on the worksheet datapath.
- There is now a critical series of more than two clocked things that must happen in sequence during one clock cycle:
  - (a) Read instruction from memory
  - (b) Load data from memory
  - (c) Put in register file

  The processor cannot put the value into the register file until it has been loaded from memory. And it cannot do any of this until it has read the instruction (a). The clock only has two edges, rising and falling, so we need a strategy to handle this!

  If your register file has async reads, this makes it much easier; we can write the data into the register file while the next instruction is getting read from memory. Both (a) and (c) can happen on the rising edge, and the memory read (b) can happen between the other two.
- On the practical worksheet, complete the timing diagram. Assume two load word instructions execute one after the other, and each must do all three steps above, requiring execution of these steps:
  - 1a: first load instruction fetched from memory
  - 1b: first load instruction gets data from memory
  - 1c: first load instruction writes data to reg file
  - 1d: first load instruction updates PC (to be PC+4)
  - 2a: second load instruction fetched from memory
  - 2b: second load instruction gets data from memory
  - 2c: second load instruction writes data to reg file
  - 2d: second load instruction updates PC (to be PC+4)

  Label the clock signal edges in the worksheet with the instruction steps above each clock edge where they should happen. Multiple steps from one instruction or steps from both instructions may need to happen simultaneously. 1a is given for you.
- Update Control for the new instructions. Note that `lw` and `sw` each have a unique opcode, so you'll want to create a control case for each.
- Update the control tests to also test `lw` and `sw`.
- Update your datapath for `lw` and `sw`.
  - IMPORTANT: Instead of adding a new memory component for data memory, use the "B" half of the `DP_Memory` block that is already in your code.
    - The "B" half should operate on the falling edge of the clock so it happens after the instruction is loaded (at the rising edge). But you don't want to change the memory module, so instead just invert the clock (use `~CLK`) when you connect it to the B-half port `clk_b` input. This will make the negative edge "look like" a positive edge.
  - SUGGESTION: You'll need to put a mux between the output of the ALU and the input of the register file. For this mux, DO NOT use the ternary operator (`q ? a : b`). You will want to grow the mux later, so it's best to do it with an always block:

    ```verilog
    // this wire connects to the output of the mux and the write data port on the reg file.
    reg [31:0] aluOutputOrMemData;

    always @(A,B,MemtoReg) begin
        if (MemtoReg === 1)
            aluOutputOrMemData <= A;
        else
            aluOutputOrMemData <= B;
    end
    ```
  - It is OK to completely ignore the `MemRead` control signal. Our memory is always reading.
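The inverted-clock trick for the "B" half can be seen in a small simulation. The sketch below does not use the provided `DP_Memory`; `PosEdgeReg` is a hypothetical stand-in for any component that only reacts to a rising edge, instantiated once with `CLK` and once with `~CLK`.

```verilog
// Hypothetical stand-in for a rising-edge component; NOT the provided DP_Memory.
module PosEdgeReg(
    input  wire        clk,
    input  wire [31:0] d,
    output reg  [31:0] q
);
    always @(posedge clk)
        q <= d;
endmodule

module InvertedClockDemo;
    reg         CLK = 0;
    reg  [31:0] d   = 32'd0;
    wire [31:0] qRise, qFall;

    PosEdgeReg onRise(.clk(CLK),  .d(d), .q(qRise));   // updates on rising CLK
    PosEdgeReg onFall(.clk(~CLK), .d(d), .q(qFall));   // updates on falling CLK

    always #5 CLK = ~CLK;   // rising edges at t=5, 15; falling edges at t=10, 20

    initial begin
        $monitor("t=%0t CLK=%b qRise=%0d qFall=%0d", $time, CLK, qRise, qFall);
        #2  d = 32'd42;     // new input arrives before the first rising edge
        #20 $finish;
    end
endmodule
```

In the printed trace, `qRise` picks up 42 at the rising edge (t=5) and `qFall` picks it up half a cycle later at the falling edge (t=10), even though both instances contain identical `posedge` logic. That half-cycle offset is what lets the data memory access happen after the instruction fetch within one clock cycle.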
- Test your updated processor
  - Make a copy of the `tb_Processor_I.v` file called `tb_Processor_S.v`.
  - Create a `memory_S.txt` file with some assembled load and store instructions (much like the `memory_I.txt` file, but with only loads and stores in it).
  - Inside the new test bench file, replace the I-type tests to test your loads and stores from the new memory file.
    - IMPORTANT: be sure your new `tb_Processor_S` test bench loads the new memory file!
  - Run your tests!
    - SUGGESTION: make another waveform for this new test bench and save the config to `tb_Processor_S_waves.do` or similar.
- Be sure to run old tests and new tests!
- Add, Commit, Push
4 Implement Branches
Now you will repeat the same steps for `beq` and then add the three other branch flavors.
- Trace `beq` on the worksheet datapath.
  - Update Control and Tests for `beq`.
  - Update Processor (datapath + control).
    - HINT: for the mux, implement it like the one for memory. Use an always block to sometimes assign the input to the PC to be PC+4 and sometimes to be the computed branch target.
  - Create a `memory_B.txt` file with some assembled `beq` instructions (you can also use other instructions that you've tested).
    - You want to test both a `beq` that doesn't get taken and one that does, so you may need to skip over some instructions.
    - HINT: `beq x0, x0, LABEL` will always be taken!
    - If you put a nonzero value into a register and compare it to zero, the branch won't be taken.
  - Make a test bench (copy one of your others) called `tb_Processor_B.v` to test branches.
    - Update this to use your `memory_B.txt` file and write tests to make sure the PC goes to the right place.
    - HINT: your test bench can inspect the value of the PC by "digging into" the unit you are testing. Assuming your test bench has an instance of `Processor` called `UUT` and the processor has an instance of `Register` called `PC`, you can do this:

      ```verilog
      Processor UUT(.CLK(CLK), .reset(reset));

      // ... your tasks go here

      initial begin
          //... setup code here...

          // Look at the PC's output to see if it is correct
          VU.ASSERT_INT_EQUAL(UUT.PC.q, 32'h00004444);
      end
      ```

      You can inspect all your `wire` instances inside the `Processor` module in a similar fashion.
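Hierarchical references like `UUT.PC.q` can be tried out on a toy design before you use them on the real processor. Everything in the sketch below is a hypothetical stand-in (`RegisterSketch`, `ProcessorSketch`, and the wire names are made up, and plain `$display` replaces the course's assertion tasks); it only shows the mechanics of reaching into nested instances and mirroring an internal wire into the test bench.

```verilog
// Hypothetical stand-ins; NOT the provided Register or Processor modules.
module RegisterSketch(
    input  wire        CLK,
    input  wire        reset,
    input  wire [31:0] d,
    output reg  [31:0] q
);
    always @(posedge CLK) begin
        if (reset) q <= 32'd0;
        else       q <= d;
    end
endmodule

module ProcessorSketch(input wire CLK, input wire reset);
    wire [31:0] pcOut;
    wire [31:0] pcPlus4 = pcOut + 32'd4;   // internal wire, not a port

    RegisterSketch PC(.CLK(CLK), .reset(reset), .d(pcPlus4), .q(pcOut));
endmodule

module tb_HierarchySketch;
    reg CLK   = 0;
    reg reset = 1;

    ProcessorSketch UUT(.CLK(CLK), .reset(reset));

    // Temporary test-bench wire mirroring an internal signal,
    // handy for adding to a waveform.
    wire [31:0] watchedPcPlus4 = UUT.pcPlus4;

    always #5 CLK = ~CLK;   // rising edges at t=5, 15, 25, ...

    initial begin
        #12 reset = 0;      // the rising edge at t=5 happened under reset
        #10;                // the rising edge at t=15 loads PC+4; now t=22
        if (UUT.PC.q !== 32'd4)
            $display("FAIL: PC = %h, next = %h", UUT.PC.q, watchedPcPlus4);
        else
            $display("PASS: PC = %h, next = %h", UUT.PC.q, watchedPcPlus4);
        $finish;
    end
endmodule
```

The same pattern (a `wire` in the test bench assigned from a dotted path) is one way to expose a signal such as the ALU result for an annotated waveform without adding ports to your processor.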
  - Run your tests!
    - You are likely to find that your tests don't pass initially. Take a careful look at when and how PC changes. Investigate your waveform and look at the timing diagram you made for `lw`/`sw` on the worksheet. You will likely need to change the timing for branches to work. After you make any changes, make sure the old tests you wrote still work (don't assume they do, actually run them). If you want another hint, read this after trying to work through it yourself for a bit:

      Hint: You need to move only one component to work on the negative clock edge.
  - Be sure to run old tests to make sure they still work.
  - Add, Commit, Push your new files and updates. Now is a great time to save your changes.
- Next, add `bne`.

  BNE is nearly identical to BEQ, but you want the opposite of the zero detector. There's no change to control for `bne` because it has the same opcode as `beq`.
  - Update the processor to support BNE.
    - The only modifications you should make will be to your "branch mux". Change the always block to also sense changes in the `funct3` field of the instruction. Now you can differentiate between `bne` and `beq`.
    - Change the conditional logic in your branch mux to first look at the `funct3`, then look at the `zero` detector from the ALU when deciding whether or not to take the branch.
  - Add some tests for `bne` to your `memory_B.txt` code and to `tb_Processor_B.v`.
- Add `bge` and `blt`
  - `bge` and `blt` are a little harder because the zero detector is not very useful for these instructions. Instead we care whether `A - B` is negative or not. When `A - B < 0`, A must be less than B. So, as long as the subtraction does not overflow, the most significant bit (the sign bit) of `A - B` will tell us if `A < B` or not.
  - In the practical worksheet, complete the truth table to help guide you in constructing some verilog that will correctly choose the branch target when a branch of various types should be taken.
  - Update `Processor.v` to support the last two types of branches.
    - Update the "branch mux" logic to support the two new funct3 values (`bge` and `blt`).
    - HINT: You need to add something new to the `always` block's sensitivity list.
    - HINT: use a case statement to make a decision differently based on the type of comparison you want to do.
    - IMPORTANT: Be sure to have a `default` case in the case statement in case an unexpected funct3 value shows up.
  - Update your `tb_Processor_B.v` test bench and the memory text file to have a `bge` and `blt` test.
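Putting the branch hints together, a "branch mux" that switches on `funct3` might be shaped like the sketch below. Treat it as one possible arrangement under stated assumptions, not the expected solution: the module wrapper and every signal name (`negative`, `pcPlus4`, `branchTarget`, `nextPC`) are hypothetical, and `negative` is assumed to be the most significant bit of the ALU result `A - B`. The `funct3` values are the standard RISC-V encodings.

```verilog
// Hypothetical sketch; signal names are assumptions, not taken from the provided files.
module BranchMuxSketch(
    input  wire        Branch,        // control: this instruction is a branch
    input  wire [2:0]  funct3,        // inst[14:12], selects the comparison
    input  wire        zero,          // ALU zero detector for A - B
    input  wire        negative,      // assumed: MSB of the ALU result A - B
    input  wire [31:0] pcPlus4,
    input  wire [31:0] branchTarget,
    output reg  [31:0] nextPC
);
    reg take;   // intermediate value, computed fresh on every evaluation

    always @(Branch, funct3, zero, negative, pcPlus4, branchTarget) begin
        case (funct3)
            3'b000:  take = zero;        // beq: A == B
            3'b001:  take = ~zero;       // bne: A != B
            3'b100:  take = negative;    // blt: A - B < 0
            3'b101:  take = ~negative;   // bge: A - B >= 0
            default: take = 1'b0;        // unexpected funct3: never branch
        endcase

        if (Branch === 1'b1 && take === 1'b1)
            nextPC = branchTarget;
        else
            nextPC = pcPlus4;
    end
endmodule
```

Note that using only the sign bit gives the wrong answer when `A - B` overflows (for example, a large positive `A` minus a large negative `B`), so pick test values that stay well inside the 32-bit signed range unless you also handle overflow.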
- Add, Commit, Push any changes. DO NOT commit everything, only the important files (do not commit the 'work' directory).
Turn It In
Grading Rubric
General Requirements for all Practicals
- The solution fits the need
- Aspects of performance are discussed
- The solution is tested for correctness
- The submission shows iteration and documentation
This practical hits all of these very directly, so you should think about these items before you start coding.
Fill out the Practical Worksheet
In the practical worksheet, explain how you satisfy each of these items. Some guidelines:
- None of these answers should be more than 100 words.

| Practical 5 Rubric items | Possible Points |
|---|---|
| Practical Worksheet | 20 |
| R-Type and tests | 15 |
| I-Type and tests | 20 |
| Memory insts and tests | 20 |
| Branch insts and tests | 20 |
| Total out of | 100 |
- Submit your completed Practical Worksheet to Gradescope.
- Practical code will be submitted to your `C` git repository as new files and committed modifications to the repo we provided you. You must include your name and your teammates' names in a comment at the top of all files you submit.