CSE120 Spring 2022 Mid-term
Q4. Assume you have three arrays, A, B, and C, holding elements of size 8 bytes each. Assume the base address of A is stored in $t0, that of B in $t1, and that of C in $t2. Convert the following C code to RV64I assembly code: A[i] = B[C[i*2]];
Assume the i variable is stored in $t3.
Your lines of assembly code should not exceed 20!
Make sure to comment your code. Precede your comments with "#" for better readability.
You don’t need to follow any register convention while writing the code.
Ans:
slli $t3, $t3, 3      # offset for i = (i*8)
slli $t4, $t3, 1      # offset for C[i*2] = (i*2)*8
add  $t2, $t2, $t4    # base+offset for C
ld   $t5, 0($t2)      # load from memory C[i*2]
slli $t5, $t5, 3      # offset for B
add  $t1, $t1, $t5    # base+offset for B
ld   $t6, 0($t1)      # load from memory B[C[i*2]]
add  $t0, $t0, $t3    # base+offset for A
sd   $t6, 0($t0)      # store from B[C[i*2]] into A[i]
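The address arithmetic above can be checked with a small simulation. The following is only a sketch: it assumes a little-endian byte-addressed memory modeled as a Python bytearray, and the helper name run_q4, the base addresses, and the array contents are all hypothetical values chosen for illustration.

```python
import struct

def run_q4(memory, a_base, b_base, c_base, i):
    """Mirror the assembly sequence step by step on a bytearray memory."""
    def ld(addr):
        # 8-byte signed load, like the RV64I ld instruction
        return struct.unpack_from("<q", memory, addr)[0]

    def sd(addr, value):
        # 8-byte store, like the RV64I sd instruction
        struct.pack_into("<q", memory, addr, value)

    t3 = i << 3            # slli: byte offset of A[i] = i*8
    t4 = t3 << 1           # slli: byte offset of C[i*2] = (i*2)*8
    t5 = ld(c_base + t4)   # add + ld: C[i*2]
    t5 = t5 << 3           # slli: byte offset of B[C[i*2]]
    t6 = ld(b_base + t5)   # add + ld: B[C[i*2]]
    sd(a_base + t3, t6)    # add + sd: A[i] = B[C[i*2]]

# Hypothetical layout: three 4-element arrays packed back to back.
A_BASE, B_BASE, C_BASE = 0, 32, 64
memory = bytearray(96)
B = [10, 20, 30, 40]
C = [3, 0, 1, 2]
for k in range(4):
    struct.pack_into("<q", memory, B_BASE + 8 * k, B[k])
    struct.pack_into("<q", memory, C_BASE + 8 * k, C[k])

run_q4(memory, A_BASE, B_BASE, C_BASE, i=1)
result = struct.unpack_from("<q", memory, A_BASE + 8 * 1)[0]
print(result)  # B[C[2]] = B[1] = 20
```

Unlike the assembly, the sketch keeps the base addresses in separate variables instead of overwriting them, which makes each intermediate offset easier to inspect.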
Q7. Assume that to spell check a large file, 100,000 instructions are needed. The instructions in the program are broken down into 4 different classes, and each class requires its own number of clock cycles to execute.
Specific information is given in the table below.
Instruction Class | Clock Cycles per Instruction | Number of Instructions
Branch            | 3                            | 40,000
Store             | 4                            | 20,000
Load              | 5                            | 30,000
ALU / R-type      | 4                            | 10,000
Part A (10 POINTS)
If the total execution time for this program is found to be 2 seconds, what is the clock rate (expressed in kHz) of the computer on which it was run?
Ans:
CPI*IC = CPI*IC Branch + CPI*IC Store + CPI*IC Load + CPI*IC R
= (3*40000) + (4*20000) + (5*30000) + (4*10000)
= 120000 + 80000 + 150000 + 40000
= 390000 cycles
Execution time = CPI*IC/clock rate
=> clock rate = CPI*IC/Execution time
= 390000 cycles/2 seconds
= 195000 Hz = 195 kHz
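The Part A arithmetic can be reproduced with a few lines of Python. This is a minimal sketch using the values from the table; the variable names are illustrative only.

```python
# (clock cycles per instruction, number of instructions) for each class
classes = {
    "Branch": (3, 40_000),
    "Store": (4, 20_000),
    "Load": (5, 30_000),
    "ALU/R-type": (4, 10_000),
}

# Total cycles is the sum of CPI*IC over all instruction classes.
total_cycles = sum(cpi * count for cpi, count in classes.values())

execution_time_s = 2
clock_rate_hz = total_cycles / execution_time_s
clock_rate_khz = clock_rate_hz / 1000

print(total_cycles, clock_rate_khz)
```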
Part B (5 POINTS)
Now, assume that as part of the 100,000 instruction spell check, 20% of all the original number of Load instructions are immediately followed by an ALU/R-type instruction that uses the data that was just loaded. To speed up the original spell check program, we are contemplating adding a new type of instruction to our architecture: an ALU instruction where one of the source operands is a value from memory. Ex: add rd, rs1, mem[address]
- This new instruction will replace the previous 2 instruction sequence (Load followed by ALU/R type).
- It will take 7 clock cycles.
Will this change offer any speedup over the original design? If so, by how much?
You may assume that the clock rate does not change and your answer to this question does not depend on your answer to Part A.
Ans:
No. of Load instructions replaced by new instruction = 20% of 30000 = 6000
No. of remaining original Load instructions = 30000 - 6000 = 24000
No. of remaining ALU/R-type instructions = 10000 - 6000 = 4000
CPI*IC = CPI*IC Branch + CPI*IC Store + CPI*IC Load + CPI*IC R + CPI*IC New
= (3*40000) + (4*20000) + (5*24000) + (4*4000) + (7*6000)
= 120000 + 80000 + 120000 + 16000 + 42000
= 378000
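Execution time = CPI*IC/clock rate
= 378000/195000
= 378/195
= 1.94 s (approx)
Speedup = Execution time_A/Execution time_B = 2/1.94 = 1.03 (approx)
So yes, the new instruction gives a speedup of about 1.03 (roughly 3% faster). Because the clock rate is unchanged, the same result follows directly from the cycle counts: 390000/378000 = 1.03 (approx), without using the answer to Part A.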
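The Part B comparison can also be sketched in Python. Since the clock rate is the same for both designs, the speedup reduces to a ratio of total cycle counts. The helper name cycles_for and the dictionary labels are hypothetical, chosen for illustration.

```python
def cycles_for(mix):
    """Sum CPI*IC over a dict of class -> (cycles per instruction, count)."""
    return sum(cpi * count for cpi, count in mix.values())

original = {
    "Branch": (3, 40_000),
    "Store": (4, 20_000),
    "Load": (5, 30_000),
    "ALU/R-type": (4, 10_000),
}

# 20% of the original loads are fused with the ALU instruction that follows.
fused = 30_000 * 20 // 100

modified = {
    "Branch": (3, 40_000),
    "Store": (4, 20_000),
    "Load": (5, 30_000 - fused),
    "ALU/R-type": (4, 10_000 - fused),
    "Load+ALU (new)": (7, fused),
}

# Same clock rate in both cases, so speedup is old cycles over new cycles.
speedup = cycles_for(original) / cycles_for(modified)
print(cycles_for(modified), round(speedup, 2))
```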
2022-05-30