Name: Date:

HW14 solution

You are the lead designer of a processor and wish to evaluate design options and pick the most efficient. For all problems show your work and include units in your calculations for full credit.

  1. (3 points) Base design. You first analyze the current design, Mbase, which has a clock rate of 2.5 GHz. (Note that Hz = cycles/second; track your units carefully.) Testing with a benchmark program yields the measurements below. What is the execution time of this program?

    | Instruction Class | CPI | Count |
    |-------------------|-----|-------|
    | R                 | 1   | 40    |
    | I                 | 1   | 25    |
    | Mem               | 1   | 25    |
    | SB/J              | 1   | 10    |

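    $$
    \begin{aligned}
    ET &= \text{instructions} \times \frac{\text{sec}}{\text{cycle}} \times \frac{\text{cycles}}{\text{instruction}} \\
    \text{instructions} &= 40 + 25 + 25 + 10 = 100 \\
    CPI &= 1 \\
    \text{cycle time} &= \frac{1}{2.5 \times 10^{9}} = 0.4 \times 10^{-9}\ \text{sec/cycle} \\
    ET &= 100\ \text{inst} \times 0.4 \times 10^{-9}\ \text{sec/inst} = 40 \times 10^{-9}\ \text{sec} = 40\ \text{ns}
    \end{aligned}
    $$
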
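As a quick check on the arithmetic, the same calculation can be sketched in Python. The dictionary layout and the function name `execution_time_seconds` are illustrative choices, not part of the assignment; the counts, CPI values, and clock rate come from the problem statement.

```python
# Instruction counts per class from the benchmark table (every class has CPI = 1).
counts = {"R": 40, "I": 25, "Mem": 25, "SB/J": 10}
cpi = {"R": 1, "I": 1, "Mem": 1, "SB/J": 1}

clock_rate_hz = 2.5e9  # 2.5 GHz = 2.5e9 cycles/second


def execution_time_seconds(counts, cpi, clock_rate_hz):
    """ET = total cycles / (cycles per second)."""
    total_cycles = sum(counts[c] * cpi[c] for c in counts)
    return total_cycles / clock_rate_hz


et = execution_time_seconds(counts, cpi, clock_rate_hz)
print(f"{et * 1e9:.1f} ns")  # prints "40.0 ns"
```

Dividing total cycles by the clock rate is the same as multiplying by the cycle time, so the units work out to (cycles) / (cycles/second) = seconds.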
  2. Your team is going to add the lw_inc instruction (from HW11) to your processor.

    a. (2 points) What effect is this likely to have on the cycle time? Justify your answer by explaining how the critical path changes with the addition of this instruction.

    This will likely not change the cycle time. The hardware does change: we need a new adder for the increment, but it can run in parallel with the existing ALU. The register file will need two write ports, but the two writes can happen in parallel, so the critical path does not get longer.

    b. (3 points) Estimate the new cycle time. Assume that each piece of hardware takes these amounts of time to execute: Memory - 2ns, ALU - 2ns, registers - 1ns, others - 0ns.

    The critical path is unchanged from that of our regular lw.

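    $$
    \begin{aligned}
    \text{cycle time} &= \text{read Inst Mem} + \text{read Registers} + \text{ALU} + \text{read Data Mem} + \text{write Registers} \\
    &= 2\ \text{ns} + 1\ \text{ns} + 2\ \text{ns} + 2\ \text{ns} + 1\ \text{ns} = 8\ \text{ns}
    \end{aligned}
    $$
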
  3. Your team is going to add the add4_mem instruction, which does the following: rd = Mem[rs1+imm] + 4

    a. (2 points) What effect is this likely to have on the cycle time? Justify your answer by explaining how the critical path changes with the addition of this instruction.

    This will slow down the processor (lead to a longer cycle time) because we add an ALU step after the memory step. This will require an additional ALU.

    b. (3 points) Estimate the new cycle time. Assume that each piece of hardware takes these amounts of time to execute: Memory - 2ns, ALU - 2ns, registers - 1ns, others - 0ns.

    The critical path is longer than that of our regular lw because of the second ALU step.

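    $$
    \begin{aligned}
    \text{cycle time} &= \text{read Inst Mem} + \text{read Registers} + \text{ALU} + \text{read Data Mem} + \text{ALU2} + \text{write Registers} \\
    &= 2\ \text{ns} + 1\ \text{ns} + 2\ \text{ns} + 2\ \text{ns} + 2\ \text{ns} + 1\ \text{ns} = 10\ \text{ns}
    \end{aligned}
    $$
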
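The two critical-path estimates can be reproduced with a small Python sketch. It assumes the steps on the critical path run strictly in series, so the cycle time is just the sum of the unit latencies. The stage lists and the names `LATENCY_NS`, `LW_PATH`, `ADD4_MEM_PATH`, and `path_delay_ns` are hypothetical, chosen here for illustration.

```python
# Latencies in ns from the problem statement; all other hardware counts as 0 ns.
LATENCY_NS = {"mem": 2, "alu": 2, "reg": 1}

# Hypothetical encoding: each entry names the hardware unit used by one step.
# lw and lw_inc: inst mem, reg read, ALU, data mem, reg write.
LW_PATH = ["mem", "reg", "alu", "mem", "reg"]
# add4_mem: same as lw, plus a second ALU step after the data memory read.
ADD4_MEM_PATH = ["mem", "reg", "alu", "mem", "alu", "reg"]


def path_delay_ns(path):
    """Sum the latency of every unit along a serial critical path."""
    return sum(LATENCY_NS[unit] for unit in path)


lw_ns = path_delay_ns(LW_PATH)
add4_mem_ns = path_delay_ns(ADD4_MEM_PATH)
print(lw_ns, add4_mem_ns)  # prints "8 10"
```

The extra adder for lw_inc does not appear in `LW_PATH` because it works in parallel with the existing ALU and so adds nothing to the serial delay.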
  4. Your team creates two new compilers, each of which uses one of the new instructions described above. C1 incorporates the lw_inc instruction, while C2 incorporates the add4_mem instruction.

    a. (6 points) Estimate the execution time for the compiled program on each of the modified processor designs you discuss above.

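    | Instruction Class | CPI | C1 Count | C2 Count |
    |-------------------|-----|----------|----------|
    | R                 | 1   | 30       | 35       |
    | I                 | 1   | 25       | 20       |
    | Mem               | 1   | 25       | 20       |
    | SB/J              | 1   | 10       | 10       |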

    C1 (lw_inc)

    $$ ET = 90\ \text{inst} \times 8\ \text{ns/inst} = 720\ \text{ns} = 0.72\ \mu\text{s} $$

    C2 (add4_mem)

    $$ ET = 85\ \text{inst} \times 10\ \text{ns/inst} = 850\ \text{ns} = 0.85\ \mu\text{s} $$
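The two execution times can be checked side by side with a short Python sketch. The function name `execution_time_ns` and the list layout are illustrative assumptions; the counts come from the table above and the cycle times (8 ns and 10 ns) from problems 2 and 3.

```python
def execution_time_ns(counts, cycle_time_ns, cpi=1):
    """ET = instruction count * CPI * cycle time, in nanoseconds."""
    return sum(counts) * cpi * cycle_time_ns


# Counts per class in the order R, I, Mem, SB/J.
c1_counts = [30, 25, 25, 10]  # compiler C1, uses lw_inc
c2_counts = [35, 20, 20, 10]  # compiler C2, uses add4_mem

et_c1 = execution_time_ns(c1_counts, cycle_time_ns=8)
et_c2 = execution_time_ns(c2_counts, cycle_time_ns=10)
print(et_c1, et_c2)  # prints "720 850"
```

C2 executes fewer instructions (85 versus 90), but its longer cycle time outweighs that saving.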

    b. (3 points) You realize that your team can only implement one of these two new instructions. Which one should your team implement? Or should they stick with the base RISC-V design? Justify your answer.

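    Implement lw_inc. It adds no cost to the cycle time (8 ns, the same as the regular lw) but improves performance because the compiled program is shorter (90 instructions instead of 100). add4_mem shortens the program further, to 85 instructions, but its 10 ns cycle time makes the overall execution time worse (0.85 us versus 0.72 us).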