Practical 2: RISC-V Assembler II
1 Objectives
Following completion of this practical you should be able to:
- Assemble RISC-V SB, U, and UJ types.
- Understand how the design of instruction types influences and limits their behavior, especially when it comes to immediates.
- Explain the relationship between pseudoinstructions and core instructions.
- Discuss the benefits and drawbacks of different types of addressing.
Your Tasks
Take a look at the practical worksheet before you start writing code for this practical. The methods you need to implement are flagged with a TODO: Practical 2 throughout the files.
1 Support new instruction types
You need to extend your assembler from Practical 1 to support the missing instruction types listed below. The hints and tips on the Practical 1 page are worth reviewing again before you start this practical. All the tests for this practical are in advanced_assembler_test.py.
Implement helper methods
First, start by implementing these helper methods:
Assemble
index_to_address*
parse_labels
label_to_offset
It may be helpful to use the has_label and split_out_label helper functions provided.
*For index_to_address you need to think about the starting address of a program in memory. Note that in RISC-V the PC does not start at 0x0000 0000; it has a different starting address. I wonder if there is some sort of reference document that might tell you that address? If there was, I bet it would be green.
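The relationship between an instruction's position and its address can be sketched as below. This is only an illustration: the function name and the text_base parameter are hypothetical, the provided helper may have a different signature, and the base address is deliberately left as an argument for you to look up.

```python
WORD_BYTES = 4  # each core RV32I instruction occupies 4 bytes


def index_to_address_sketch(index, text_base):
    """Return the byte address of the instruction at position `index`.

    `text_base` (hypothetical parameter) is the address of the first
    instruction in the program; find the real value in the reference card.
    """
    return text_base + WORD_BYTES * index


# 0x1000 is a made-up placeholder base, NOT the real starting address.
print(hex(index_to_address_sketch(3, 0x1000)))
```

Keeping the base address in one named place makes it easy to correct later if you find you picked the wrong value.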
For Assemble you simply need to add to it as you go to support the new instruction types. The other helpers here will be useful as you write the other methods. You should fully implement any other unimplemented helpers. Think of this as a warm-up: there are no explicit test cases for these, so you'll need to figure out how to verify they are working as you go. (This is a good time to review the general requirements for practicals and consider how you will meet these.) You may need to come back and change the behavior of these helpers as you move further into the practical.
You should push at least one commit to your repo that contains the implementation of all of these helpers.
Add new instruction types
You need to implement each of these methods:
Assemble_U_Type
Assemble_SB_Type
Assemble_UJ_Type
After you finish each of these methods (or pause working) you must commit and push your repo with a meaningful commit message. As you debug and fix errors you should consider doing more commits.
For the latter two I recommend you first get them working when a number is passed in as the branch target (e.g. bne t0 t1 40), then add support for labels (e.g. beq t0 t1 LABEL). All numbers used as targets for branches or jumps should be the PC-relative offset, not the immediate itself. You need to adjust the number before you translate it into an immediate. For reference, the bne instruction above should be interpreted as branching to PC = PC + 40.
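To make the offset-versus-immediate distinction concrete, here is a minimal sketch of how a byte offset maps onto the immediate fields of the standard RISC-V SB and UJ formats, where bit 0 of the offset is never stored. The function names and the dictionary layout are assumptions made for illustration; they are not part of the provided assembler.

```python
def split_branch_offset(offset):
    """Split a PC-relative byte offset into the SB-type immediate fields."""
    if offset % 2 != 0:
        raise ValueError("branch offsets must be even")
    if not -4096 <= offset <= 4094:
        raise ValueError("offset does not fit in a 13-bit signed value")
    u = offset & 0x1FFF  # 13-bit two's complement view of the offset
    return {
        "imm[12]": format((u >> 12) & 0x1, "01b"),
        "imm[10:5]": format((u >> 5) & 0x3F, "06b"),
        "imm[4:1]": format((u >> 1) & 0xF, "04b"),
        "imm[11]": format((u >> 11) & 0x1, "01b"),
    }


def split_jump_offset(offset):
    """Split a PC-relative byte offset into the UJ-type immediate fields."""
    if offset % 2 != 0:
        raise ValueError("jump offsets must be even")
    if not -(1 << 20) <= offset <= (1 << 20) - 2:
        raise ValueError("offset does not fit in a 21-bit signed value")
    u = offset & 0x1FFFFF  # 21-bit two's complement view of the offset
    return {
        "imm[20]": format((u >> 20) & 0x1, "01b"),
        "imm[10:1]": format((u >> 1) & 0x3FF, "010b"),
        "imm[11]": format((u >> 11) & 0x1, "01b"),
        "imm[19:12]": format((u >> 12) & 0xFF, "08b"),
    }


# Fields for the offset in `bne t0 t1 40`.
print(split_branch_offset(40))
```

The fields are returned in the order they appear in the instruction word, from the most significant bit down, which is why imm[11] comes last for branches.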
Your assembler only needs to support decimal immediates; assume all numbers passed as operands to an instruction are in decimal. As you work through the test cases you may want to consider the binary or hex representation of the numbers used in the tests.
Consider using the helper method is_int() and the helpers you wrote above.
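One way to think about label support is as two steps: record which instruction each label marks, then convert the distance between two instructions into a byte offset. The sketch below assumes the program is a list of strings with comments and blank lines already removed. Both function names and their signatures are hypothetical, and the sketch does not use has_label or split_out_label because their exact behavior is not described here.

```python
def parse_labels_sketch(lines):
    """Map each label name to the index of the instruction it marks."""
    labels = {}
    index = 0  # index of the next instruction (label-only lines do not count)
    for line in lines:
        name, sep, rest = line.partition(":")
        if sep:
            labels[name.strip()] = index
            if not rest.strip():
                continue  # a label on its own line consumes no instruction
        index += 1
    return labels


def label_to_offset_sketch(labels, label, current_index):
    """Return the PC-relative byte offset from `current_index` to `label`."""
    return (labels[label] - current_index) * 4


program = [
    "LOOP: addi t0, t0, 1",
    "bne t0, t1, LOOP",
    "END:",
    "add t0, t0, t0",
]
labels = parse_labels_sketch(program)
print(labels, label_to_offset_sketch(labels, "LOOP", 1))
```

A backward branch gives a negative offset, so this is a good place to check that your immediate conversion handles negative numbers.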
2 Support pseudoinstructions
Your assembler needs to support a few pseudoinstructions. The behavior of individual pseudoinstructions is defined in pseudoinstruction_handler.py. Note that you can see the list of methods and their docs for this file by opening docs/pseudoinstruction_handler.html in your repo.
Implement individual pseudoinstruction methods
In pseudoinstruction_handler.py you will see these methods, which you need to implement:
double
diffsums
push*
li
beqz
jalif
You should push at least one commit to your repo that contains the implementation of all of these pseudoinstructions.
The behavior of each of these pseudoinstructions is defined in the code comments. For each of these methods two arguments are given, the method might be called like this:
double("double t5, s0", 7)
The first argument is the actual use of the pseudoinstruction, the second argument is the line/instruction number in the assembled program where this pseudoinstruction starts. For most of these methods this is just for error output, but some of them will need to use this argument in other ways.
This function should return a list of new core instructions that will have the same behavior as the pseudoinstruction.
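As an illustration of that calling convention, here is a minimal sketch of a handler for mv, a standard RISC-V pseudoinstruction that is not one of the six required here. The function name is hypothetical and the error handling is an assumption; check the code comments for what the real handlers are expected to do on bad input.

```python
def mv_sketch(line, inst_num):
    """Expand `mv rd, rs` into an equivalent list of core instructions.

    `line` is the pseudoinstruction as written; `inst_num` is the instruction
    number where it starts, used here only for the error message.
    """
    parts = line.replace(",", " ").split()
    if len(parts) != 3 or parts[0] != "mv":
        raise ValueError(f"instruction {inst_num}: malformed mv: {line!r}")
    _, rd, rs = parts
    # Copying a register is the same as adding zero to it.
    return [f"addi {rd}, {rs}, 0"]


print(mv_sketch("mv t5, s0", 7))
```

Note that the handler returns a list even when the expansion is a single instruction, so the caller can treat every handler the same way.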
Recall that pseudoinstructions should not change other registers beyond those implied by the instruction definition, with the exception of at (aka x31), which can be modified freely. Also, recall that any register could be used as any register operand in a pseudoinstruction. Additionally, think carefully about the size of immediates supported by each pseudoinstruction.
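On immediate sizes: an I-type immediate holds only 12 signed bits, so a wider constant is usually built from an upper 20-bit part (as loaded by lui) plus a signed 12-bit part (as added by addi). The sketch below shows that standard split, including the correction needed when the low part is sign-extended as negative. Whether any pseudoinstruction in this practical must accept values this wide is defined in the code comments, not here, and the function name is hypothetical.

```python
def split_for_lui_addi(value):
    """Split a 32-bit value into (upper20, lower12).

    The parts satisfy (upper20 << 12) + lower12 == value modulo 2**32,
    with lower12 in the signed 12-bit range [-2048, 2047].
    """
    if not -(1 << 31) <= value < (1 << 32):
        raise ValueError("value does not fit in 32 bits")
    lower = value & 0xFFF
    if lower >= 0x800:
        lower -= 0x1000  # addi sign-extends, so treat the low part as negative
    upper = ((value - lower) >> 12) & 0xFFFFF
    return upper, lower


upper, lower = split_for_lui_addi(0x12345FFF)
print(hex(upper), lower)
```

The second example is the interesting case: the low 12 bits are all ones, which addi would read as -1, so the upper part has to be one larger than the top 20 bits of the original value.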
The test cases for the pseudoinstructions do not directly test if your implementation of the instruction works. Instead they test general rules about the pseudoinstructions. It will be your job to explain how you know these pseudoinstructions behave correctly in the practical worksheet. You will need to run the code produced by your pseudoinstructions in a RISC-V simulator. You could use this online one, or jump to Practical 3 and install the one we use there.
You may want to consider using the helpers: replace_all(), assembler.reverse(), assembler.is_int(), assembler.dec_to_bin(), assembler.index_to_address(), assembler.label_to_offset().
*Implement push last. This pseudoinstruction requires knowledge from the "procedures" lectures, which we haven't reached yet. But you can read ahead if you want to get it started now!
Once you've implemented all of the pseudoinstruction handlers, tests in the TestPseudos unit test category should pass, except for the one called test_pseudoinstructions_pass.
Implement the pseudoinstruction pass of the assembler
Look at the main assemble_asm() method. This function does all the steps of the assembler. Notice that after removing comments, the next step is the processing of pseudoinstructions. For now, the assembler assumes there are no pseudoinstructions; go look at the definition of pseudoinstruction_pass().
You need to implement this function. The big picture is this: raw code comes in that may contain pseudoinstructions, and this method should return a list of core instructions (and labels) where the pseudoinstructions have been replaced. You need to look at each line of code and determine if it is a pseudoinstruction. If it is, call the correct pseudoinstruction-replacement method (which you wrote above); otherwise simply leave the line unchanged. The pseudoinstruction methods are passed in the second argument to pseudoinstruction_pass(), so to apply the double method I could do this:
new_code = pseudos_dictionary["double"](my_line, inst_num)
new_code will be a list of the new instructions that I can add to my growing program.
You should look at the other pass methods that are implemented for you if you need help starting. Keep in mind that one pseudoinstruction can become more than one core instruction; this will affect the line number/address of each instruction following a pseudoinstruction in the original code.
You should consider the different "cases" you may hit as this method goes over each line of code in a file:
- case 1: a core instruction
- case 2: a label (e.g. LABEL:)
- case 3: a label and another instruction (e.g. LABEL: add t0, t0, t0)
- case 4: a pseudoinstruction
- case 5: an unknown instruction
Work on these one case at a time. Make sure that your pseudoinstruction pass returns the correct number of instructions and that labels point to the right instructions.
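The five cases above can be sketched as a single loop. This is only one possible shape under stated assumptions: the function name and the core_ops parameter are hypothetical, raising an error for unknown instructions is a guess at the desired behavior, and the real pseudoinstruction_pass() may be expected to format labels differently.

```python
def pseudo_pass_sketch(lines, pseudos, core_ops):
    """Replace pseudoinstructions in `lines` with core instructions.

    `pseudos` maps a mnemonic to a handler taking (line, inst_num);
    `core_ops` (hypothetical) is the set of known core mnemonics.
    """
    out = []
    inst_num = 0  # index of the next core instruction to be emitted
    for line in lines:
        label, sep, rest = line.partition(":")
        prefix = label.strip() + ":" if sep else ""
        body = rest.strip() if sep else line.strip()
        if not body:
            if prefix:
                out.append(prefix)  # case 2: label on its own line
            continue
        op = body.split()[0]
        if op in pseudos:  # case 4: expand into one or more instructions
            expansion = pseudos[op](body, inst_num)
        elif op in core_ops:  # case 1: leave core instructions unchanged
            expansion = [body]
        else:  # case 5: assumed behavior is to stop with an error
            raise ValueError(f"instruction {inst_num}: unknown op {op!r}")
        if prefix and expansion:  # case 3: keep the label on the first line
            expansion = [prefix + " " + expansion[0]] + list(expansion[1:])
        out.extend(expansion)
        inst_num += len(expansion)
    return out


# `twice` is a made-up pseudoinstruction used only for this demonstration.
demo = {"twice": lambda line, n: ["add t1, t1, t1", "add t1, t1, t1"]}
print(pseudo_pass_sketch(["END: twice t1", "sub t0, t0, t1"], demo, {"add", "sub"}))
```

The counter advances by the length of each expansion, which is what keeps instruction numbers correct for everything that follows a multi-instruction pseudoinstruction.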
The tests for this method once again do not test the exact instructions you return, since implementations of pseudoinstructions can be variable. Instead, it tests general patterns about what the code should look like. You will need to explain how you know your code is correct on the practical worksheet.
By the time you commit this work to your repo you should have at least 6 commits with meaningful messages, if not more.
- Once you've implemented pseudoinstruction_pass (and have working pseudoinstruction handlers), all tests in the TestPseudosFileAssembly and TestPseudos unit test categories should pass.
Grading Rubric
All the practicals for CSSE232 have these general requirements:
General Requirements for all Practicals
- The solution fits the need
- Aspects of performance are discussed
- The solution is tested for correctness
- The submission shows iteration and documentation
Some practicals will hit some of these requirements more than others. But you should always be thinking about them.
Fill out the Practical Worksheet
In the worksheet, explain how you satisfy each of these items. Some guidelines:
- None of these answers should be more than 100 words. (Unless otherwise indicated on the worksheet.)
- You will upload this sheet to gradescope. Make sure you indicate your partner when you upload.
| Practical 2 Rubric items | Possible Points |
| --- | --- |
| Practical Worksheet | 20 |
| Implements U-Types | 10 |
| Implements SB-Types | 15 |
| Implements UJ-Types | 15 |
| Implements 6 pseudoinstructions | 15 |
| Implements pseudoinstruction_pass | 5 |
| Autograder test cases | 20 |
| Total | 100 |
- Submit your completed worksheet to gradescope, only 1 per team (make sure all team members' names are included). In gradescope you are able to add your team members' names to the submission; make sure you do so. You can find the gradescope link on the course moodle page.