Practical 2: RISC-V Assembler II

1 Objectives

Following completion of this practical you should be able to:

Your Tasks

Time Estimate: This practical is estimated to take about 3-5 hours per student (so 6-10 hours total) on average.

Take a look at the practical worksheet before you start writing code for this practical. The methods you need to implement are flagged with a TODO: Practical 2 throughout the files.

1 Support new instruction types

You need to extend your assembler from Practical 1 to support the missing instruction types listed below. The hints and tips on the Practical 1 page are worth reviewing again before you start this practical. All the tests for this practical are in advanced_assembler_test.py.

Implement helper methods

First, start by implementing these helper methods:

  1. Assemble
  2. index_to_address*
  3. parse_labels
  4. label_to_offset

It may be helpful to use the has_label and split_out_label helper functions provided.

*For index_to_address you need to think about the starting address of a program in memory. Note that in RISC-V the PC does not start at 0x0000 0000; it has a different starting address. I wonder if there is some sort of reference document that might tell you that address? If there was, I bet it would be green.
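As a rough illustration of how an instruction index, a byte address, and a label offset relate, here is a minimal sketch. The function names, their signatures, the shape of the labels dictionary, and above all PLACEHOLDER_BASE are assumptions made for this example; they are not taken from assembler.py, and the placeholder is deliberately not the real starting address. The one fact relied on is that each RV32I instruction occupies 4 bytes.

```python
INSTRUCTION_BYTES = 4  # each RV32I instruction is 4 bytes wide


def index_to_address_sketch(index, base_address):
    """Map a 0-based instruction index to its byte address."""
    return base_address + index * INSTRUCTION_BYTES


def label_to_offset_sketch(labels, label, current_index, base_address):
    """Return the PC-relative byte offset from the current instruction to a label.

    `labels` is assumed to map a label name to the index of the
    instruction that the label points at.
    """
    target = index_to_address_sketch(labels[label], base_address)
    here = index_to_address_sketch(current_index, base_address)
    return target - here


# Placeholder only: this is NOT the real start address, look that up yourself.
PLACEHOLDER_BASE = 0x1000
labels = {"LOOP": 2, "DONE": 7}

print(label_to_offset_sketch(labels, "DONE", 3, PLACEHOLDER_BASE))  # 16
print(label_to_offset_sketch(labels, "LOOP", 3, PLACEHOLDER_BASE))  # -4
```

Notice that the base address cancels out of the offset, so a wrong base shows up in absolute addresses but not in PC-relative branch offsets. That is worth keeping in mind when deciding how to verify each helper.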

For Assemble you simply need to add to it as you go to support the new instruction types. The other helpers here will be useful as you write the other methods. You should fully implement any other unimplemented helpers. Think of this as a warm-up: there are no explicit test cases for these helpers, so you'll need to figure out how to verify they are working as you go. (This is a good time to review the general requirements for practicals and consider how you will meet them.) You may need to come back and change the behavior of these helpers as you move further into the practical.

You should push at least one commit to your repo that contains the implementation of all of these helpers.

Add new instruction types

You need to implement each of these methods:

  1. Assemble_U_Type
  2. Assemble_SB_Type
  3. Assemble_UJ_Type

After you finish each of these methods (or pause working) you must commit and push your repo with a meaningful commit message. As you debug and fix errors you should consider doing more commits.

For the latter two, I recommend you first get them working when a number is passed in as the branch target (e.g. bne t0 t1 40); once that works, add support for labels (e.g. beq t0 t1 LABEL). All numbers used as targets for branches or jumps should be treated as the PC-relative offset, not the immediate itself, so you need to adjust the number before you translate it into an immediate. For reference, the bne instruction above should be interpreted as branching to PC = PC + 40.
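To make the difference between an offset and an encoded immediate concrete, here is a minimal sketch that splits a PC-relative byte offset into the two immediate fields of a branch instruction. It assumes the standard RISC-V branch layout (imm[12|10:5] in the upper field and imm[4:1|11] in the lower field, with bit 0 of the offset never stored). The function name and return format are hypothetical and are not the interface of Assemble_SB_Type.

```python
def sb_immediate_fields(offset):
    """Split a PC-relative byte offset into the two SB-type immediate fields.

    Returns (upper, lower) as bit strings:
      upper = imm[12] followed by imm[10:5]  (7 bits)
      lower = imm[4:1] followed by imm[11]   (5 bits)
    """
    # Branch offsets must be even and fit in 13 signed bits.
    if offset % 2 != 0 or not -4096 <= offset <= 4094:
        raise ValueError(f"offset {offset} does not fit an SB-type immediate")

    bits = offset & 0x1FFF  # 13-bit two's complement view of the offset
    bit12 = (bits >> 12) & 0x1
    bit11 = (bits >> 11) & 0x1
    bits10_5 = (bits >> 5) & 0x3F
    bits4_1 = (bits >> 1) & 0xF

    upper = f"{bit12:01b}{bits10_5:06b}"
    lower = f"{bits4_1:04b}{bit11:01b}"
    return upper, lower


print(sb_immediate_fields(40))  # ('0000001', '01000')
print(sb_immediate_fields(-8))  # ('1111111', '11001')
```

Masking with 0x1FFF handles negative (backward) offsets without a separate code path. The jump format used by UJ-type instructions scrambles its bits differently and covers a wider range, so it needs its own slicing.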

Your assembler only needs to support decimal immediates, assume all numbers passed as operands to an instruction are in decimal. As you work through the test cases you may want to consider the binary or hex representation of the numbers used in the tests.

Consider using the helper method is_int() and the helpers you wrote above.

2 Support pseudoinstructions

Your assembler needs to support a few pseudoinstructions. The behavior of individual pseudoinstructions is defined in pseudoinstruction_handler.py. Note that you can see the list of methods and their docs for this file by opening docs/pseudoinstruction_handler.html in your repo.

Implement individual pseudoinstruction methods

In pseudoinstruction_handler.py you will see these methods, which you need to implement:

  1. double
  2. diffsums
  3. push*
  4. li
  5. beqz
  6. jalif

You should push at least one commit to your repo that contains the implementation of all of these pseudoinstructions.

The behavior of each of these pseudoinstructions is defined in the code comments. For each of these methods two arguments are given, the method might be called like this:

double("double t5, s0", 7)

The first argument is the actual use of the pseudoinstruction, the second argument is the line/instruction number in the assembled program where this pseudoinstruction starts. For most of these methods this is just for error output, but some of them will need to use this argument in other ways.

This function should return a list of new core instructions that will have the same behavior as the pseudoinstruction.
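To show the shape of such a method without giving away any of the assigned ones, here is a minimal sketch for mv, a standard RISC-V pseudoinstruction that is not on the list above. The function name, the error message format, and the exact text format of the returned core instruction (comma-separated operands) are assumptions for this example; check the code comments in pseudoinstruction_handler.py for the conventions your assembler actually expects.

```python
def mv_sketch(line, line_num):
    """Expand 'mv rd, rs' into the equivalent core instruction.

    `line` is the pseudoinstruction as written; `line_num` is used here
    only to make the error message more helpful.
    """
    parts = line.replace(",", " ").split()
    if len(parts) != 3 or parts[0] != "mv":
        raise ValueError(f"line {line_num}: malformed mv: {line!r}")
    _, rd, rs = parts
    # Copying a register is the same as adding zero to it.
    return [f"addi {rd}, {rs}, 0"]


print(mv_sketch("mv t5, s0", 7))  # ['addi t5, s0, 0']
```

Even when the expansion is a single instruction, the return value is still a list, so the caller can treat every pseudoinstruction the same way.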

Recall that pseudoinstructions should not change other registers beyond those implied by the instruction definition, with the exception of at (aka x31) which can be modified freely. Also, recall that any register could be used as any register operand in a pseudoinstruction. Additionally, think carefully about the size of immediates supported by each pseudoinstruction.
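On the point about immediate sizes: a value that does not fit in a 12-bit signed immediate is commonly built from a 20-bit upper part (as used by U-type instructions such as lui) plus a 12-bit signed lower part (as used by addi). The sketch below shows only that arithmetic. The function name is hypothetical, and whether any particular pseudoinstruction in this practical must support the full 32-bit range is something to check in the code comments rather than assume from this example.

```python
def split_immediate(value):
    """Split a 32-bit signed value into (upper20, lower12).

    upper20 is an unsigned 20-bit field; lower12 is a signed 12-bit value.
    Under 32-bit wraparound arithmetic, (upper20 << 12) + lower12 == value.
    """
    if not -(2**31) <= value < 2**31:
        raise ValueError(f"{value} does not fit in 32 bits")

    # The lower part is sign-extended when it is added, so round the upper
    # part up by one whenever the lower 12 bits would read as negative.
    upper = (value + 0x800) >> 12
    lower = value - (upper << 12)
    return upper & 0xFFFFF, lower


print(split_immediate(40))    # (0, 40)
print(split_immediate(2048))  # (1, -2048)
```

The second call shows the trap: 2048 is just outside the 12-bit signed range (-2048 to 2047), so the upper part becomes 1 and the lower part becomes -2048, rather than 0 and 2048.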

The test cases for the pseudoinstructions do not directly test if your implementation of the instruction works. Instead they test general rules about the pseudoinstructions. It will be your job to explain how you know these pseudoinstructions behave correctly in the practical worksheet. You will need to run the code produced by your pseudoinstructions in a RISC-V simulator. You could use this online one, or jump to Practical 3 and install the one we use there.

You may want to consider using the helpers: replace_all(), assembler.reverse(), assembler.is_int(), assembler.dec_to_bin(), assembler.index_to_address(), assembler.label_to_offset()

*Implement push last, this pseudoinstruction requires knowledge from the "procedures" lectures that haven't come yet. But you can read ahead if you want to get it started now!

Once you've implemented all of the pseudoinstruction handlers, tests in the TestPseudos unit test category should pass, except for the one called test_pseudoinstructions_pass.

Implement the pseudoinstruction pass of the assembler

Look at the main assemble_asm() method. This function performs all the steps of the assembler. Notice that after removing comments, the next step is processing pseudoinstructions. For now, the assembler assumes there are no pseudoinstructions; go look at the definition of pseudoinstruction_pass().

You need to implement this function. The big picture is this: raw code comes in that may contain pseudoinstructions, and this method should return a list of core instructions (and labels) in which the pseudoinstructions have been replaced. You need to look at each line of code and determine whether it is a pseudoinstruction. If it is, call the correct pseudoinstruction-replacement method (which you wrote above); otherwise, simply leave the line unchanged. The pseudoinstruction methods are passed in as the second argument to pseudoinstruction_pass(), so to apply the double method I could do this:

new_code = pseudos_dictionary["double"](my_line, inst_num)

new_code will be a list of the new instructions that I can add to my growing program.

You should look at the other pass methods that are implemented for you if you need help getting started. Keep in mind that one pseudoinstruction can become more than one core instruction; this will affect the line number/address of each instruction following a pseudoinstruction in the original code.
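The bookkeeping described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the expected implementation: it assumes labels are written as NAME: at the start of a line, that comments have already been removed, and that a label sharing a line with a pseudoinstruction should stay attached to the first instruction of the expansion. The function name and the nop2 pseudoinstruction are made up for the example, and the real pass may use the provided has_label and split_out_label helpers instead of the string handling shown here.

```python
def pseudoinstruction_pass_sketch(lines, pseudos):
    """Replace pseudoinstructions with core instructions, keeping labels."""
    output = []
    inst_num = 0  # index of the next core instruction to be emitted

    for line in lines:
        # Split an optional leading "NAME:" label from the instruction text.
        label, colon, rest = line.rpartition(":")
        prefix = label + colon  # empty string when the line has no label
        body = rest.strip()

        if not body:
            # Label-only (or empty) line: keep it, but it emits no instruction.
            output.append(line)
            continue

        mnemonic = body.split()[0]
        if mnemonic in pseudos:
            expansion = pseudos[mnemonic](body, inst_num)
        else:
            expansion = [body]

        if prefix:
            # Keep the label on the first instruction of the expansion.
            expansion = [f"{prefix} {expansion[0]}"] + expansion[1:]

        output.extend(expansion)
        inst_num += len(expansion)  # later lines shift by the expansion size

    return output


def nop2(line, inst_num):
    """Made-up pseudoinstruction that expands to two no-ops."""
    return ["addi x0, x0, 0", "addi x0, x0, 0"]


code = ["START: nop2", "addi t0, t0, 1", "END:", "nop2"]
for out_line in pseudoinstruction_pass_sketch(code, {"nop2": nop2}):
    print(out_line)
```

Here the second nop2 is handed instruction number 3, not 2, because the first one expanded into two core instructions. That shift is exactly what matters for any pseudoinstruction that uses its second argument to compute an address or offset.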

You should consider the different "cases" you may hit as this method goes over each line of code in a file:

Work on these one case at a time, and make sure that your pseudoinstruction pass returns the correct number of instructions, with labels pointing to the right instructions.

The tests for this method once again do not test the exact instructions you return, since implementations of pseudoinstructions can vary. Instead, they test general patterns about what the code should look like. You will need to explain how you know your code is correct on the practical worksheet.

By the time you commit this work to your repo you should have at least 6 commits with meaningful messages, if not more.

Grading Rubric

All the practicals for CSSE232 have these general requirements:

General Requirements for all Practicals

  1. The solution fits the need
  2. Aspects of performance are discussed
  3. The solution is tested for correctness
  4. The submission shows iteration and documentation

Some practicals will hit some of these requirements more than others. But you should always be thinking about them.

Fill out the Practical Worksheet

In the worksheet, explain how you satisfy each of these items. Some guidelines:

  1. Submit your completed worksheet to Gradescope, only 1 per team (make sure all team members' names are included). In Gradescope you are able to add your team members' names to the submission; make sure you do so. You can find the Gradescope link on the course Moodle page.
  2. Record the commit ID of your final commit to GitHub on the worksheet.
  3. Upload the code to Gradescope. You need to upload both assembler.py and pseudoinstruction_handler.py. Make sure that the autograder has run as expected.