CS 230 Winter 2023 Assignment 5
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
CS 230
Winter 2023
Assignment 5
Coverage: Subroutines in MIPS, CPU Pipeline and Performance Measures
Due Date: Monday, March 27, 11:59 p.m. *** Note the new due date.****
All submissions are to be completed through MarkUs
(https://markus.student.cs.uwaterloo.ca/markus_cs230_w/en/assignments)
Read the policy described in Learn regarding late submissions.
Check MarkUs to see how many 12-hour grace periods you have remaining.
Double check your submissions in MarkUs after you have uploaded them. It is your responsibility to ensure that the uploaded version is readable in MarkUs.
Solutions for assignments will not be posted.
Questions are approximately equally weighted.
1. For this question, you will be submitting two separate files. However, you may choose to develop the solution in a single file, and divide the code into separate files before submitting the final versions.
You should test the final versions of your solutions by concatenating the files before assembling them. Here is how you can do that in Linux:
cat a5q1b .asm a5q1a .asm > a5q1 .asm
/u/cs230/pub/binasm < a5q1 .asm > a5q1 .mips
The first command concatenates the two files from part b) and part a) into a single file (note the order of the files). The second command assembles the joint files.
When we test your solution, we will providing our own versions of each part.
In other words, when we test your part a), we will use our own version of part b) and vice versa. This means your solution for each part should not depend on any details of the other part.
It is very important that the solutions for this question follow the register conventions described on Slide 74 of Module 2. The main purpose of this assignment is to test the creation and use of subprograms. The majority of the correctness marks will be connected to properly using the registers according to the conventions.
Your programs must assemble and run properly with the given MIPS assembler binasm in the student.cs environment.
The marks for this question will be based on correctness. If your program does not assemble properly, then the correctness marks will be 0. Test your submission thoroughly before submitting.
Your solution should include comments at the beginning that include your name, your Quest ID, and a brief description of the program. You are not required to comment every line of your program. However, you should include inline comments that highlight the logical steps of your solution.
(a) Recall the filter function from CS115/135. This is an abstract list function that
consumes a function and a list and produces a list containing only the elements that make predicate function produce true. For example
(filter odd? ’(1 2 3)) produces the list ’(1 3).
For this part, create a subroutine that is labelled filter that has three arguments (in this order): the address of a function that itself has one parameter, the address of the first element of an array, and the size of the array. The function passed as an argument is the predicate function. The predicate function will return either 1, which represents the value true, or 0, which represents the value false. The filter subroutine will create a copy of the elements of the original array that cause the predicate function to produce true. The copy should appear in the memory addresses immediately following the original array. The subroutine should return the address of the first element of the newly created array, and the size of the newly created array (in that order). For example, suppose the filter subroutine is given the address of a subroutine that will check to see if a value is an odd number as the first argument, the address of an array with values [1’2’3] as the second argument, and the size 3 as the third argument. When the filter subroutine is finished, the values [1’3] would appear in memory locations immediately following [1’2’3].
Note that this subroutine will be calling another subroutine. This means that it is both a callee and a caller. The subroutine it calls has the right to change any registers that are designated as unsaved temporary. So, if this is an issue, you should save the important register values of filter on the stack before calling another subroutine.
Submit the file a5q1a.asm.
(b) Write a program that uses the filter subroutine from part a) to create a new
array that is a subset of the original. The program will use the array front end, where the address of the first element is in register $1 and the size of the array is in register $2. The new array will contain the values of the original array that are non-negative, multiples of 4. in the same relative order as the original array. Before the main program ends, but after the filter subroutine has been called, the address of the new array should be saved in register $1 and the size of the new array should be saved in register $2.
The program must include a subroutine called wordaligned. This subroutine has a single argument that is an integer and returns 1 if the number is a non-negative, multiple of 4, and returns 0 otherwise. For example, if the subroutine is given the integer 12, 1000, or 0 it will return the value 1. If the subroutine is given the integer -4 or 15, it will return the value 0. The code for the subroutine should appear at the end of the file, after the instruction jr $31 of the main program.
The basic structure of this part b) file is:
|
Note that this program should not work directly with the arrays. There will be a severe loss of correctness marks for solutions of part b) that directly access the elements of the arrays using lw and sw.
You may want to test your solution using the arraydump front end.
Submit the file a5q1b.asm.
2. For Questions 2 and 3 you will be working with the following MIPS assembly program.
1 beq $2 , $0 , e n d l o o p
2 add $3 , $0 , $1
3 add $4 , $2 , $0
4 l o o p : a d d i $4 , $4 , · 1
5 lw $13 , 0 ( $3 )
6 s l t $14 , $13 , $0
7 bne $14 , $0 , e n d i f
8 sub $15 , $13 , $4
9 sw $15 , 0 ( $3 )
10 e n d i f : a d d i $3 , $3 , 4
11 bne $4 , $0 , l o o p
12 e n d l o o p : j r $31
Line numbers have been added to the beginning of each instruction so you can use them as references in subsequent questions.
(a) Write the machine code version of the program. The machine code instructions
should be written as hexadecimal values. The file should contain each 8-digit hexadecimal number preceded by 0x (for a total of 10 characters) on a separate line. The alphadigits in the hexadecimal values should all be uppercase. See the solution for Question 3 of Tutorial 08 for the expected format.
Submit the file a5q2a.txt.
(b) Assume you have a processor running at 8GHz with the cycles per instruction set
architecture defined as follows:
. Memory accesses: 8 cycles
. Branch/Jump accesses: 2 cycles
. Everything else: 4 cycles
Assume that the program is run using the array front end and the user enters 5 for the array size, and the numbers: [-5, 0, 2, -8, 3] for the array contents. What is the CPI of the processor for the program given the input described? How much time would the processor take to complete the program instructions? Your answer should ignore the user time it takes to enter the data. You may assume there are no extra cycles due to pipelining or hazards for this question. Show your work.
Submit the file a5q2b.pdf.
3. Consider the code from the program listed at the beginning of Question 2. For this question assume that pipelining has been implemented. Assume that each stage takes
1 cycle to complete.
(a) Identify the pipeline hazards present in this code, i.e. situations that would generate
stall bubbles. In the case of data hazards identify the two lines of code involved.
(b) Assume that this program is run using the array frontend, and that register $1
contains the address of the first element of the array, and $2 contains the size of the array. Assume that the size of the array is 2, and that the numbers [4, -7] have been placed in the array. Assuming that no pipeline optimizations have been implemented, i.e. no forwarding or branch prediction, determine the number of cycles it takes to complete the execution of this program.
Justify your answer using a pipeline diagram formatted like the diagram below:
It may be helpful to use a spreadsheet to draw the diagram.
(c) Suppose forwarding and static branch prediction have been implemented. Given the same array, when compared to the answer in part (b):
● how many cycles would be saved due to forwarding?
● how many cycles would be saved due to static branch prediction?
Consider each case (forwarding and branch prediction) separately. In each case, identify which lines of stall bubbles from your diagram in part (b) will be eliminated.
(d) Suppose you ran the same code with an array [-5, 0, 2, -8, 3]. Would there be any difference between the number stall bubbles saved using static branch prediction compared to the number of stall bubbles saved using dynamic branch prediction for the branch instruction at Line 7?
Briefly justify your answer.
Submit the file a5q3.pdf.
2023-03-28