
CDA 4102/CDA 5155: Fall 2023

You will create a simple RISC-V simulator which will perform the following two tasks. Please develop your project in one (C, C++, Java or Python) source file to avoid the stress of combining multiple files before submission and making sure it still works correctly.

•   Load a specified RISC-V text file and generate the assembly code equivalent to the input file (disassembler). Please see the sample input file and disassembly output in the project assignment.

•   Generate the instruction-by-instruction simulation of the RISC-V code (simulator). It should also produce/print the contents of registers and data memories after execution of each instruction. Please see the sample simulation output file in the project assignment.

You do not have to implement any exception or interrupt handling for this project. We will use only valid testcases that will not create any exceptions. Please go through this document first, and then view the sample input/output files in the project assignment, before you start implementing the project.

Instructions

For reference, please use the RISC-V Instruction Set Architecture (riscv-ISA.pdf on the course website) to see the format for each instruction, and pay attention to the following changes. For example, we introduced a break instruction, modified the opcode format, etc. In other words, you should follow the details from riscv-ISA.pdf exactly, except for the changes outlined in this document.

Your disassembler & simulator need to support the four categories of instructions shown in Figure 1.

Category      Instructions
Category-1    beq, bne, blt, sw
Category-2    add, sub, and, or
Category-3    addi, andi, ori, sll, sra, lw
Category-4    jal, break

Figure 1: Four categories of instructions

The format of Category-1 instructions is described in Figure 2. If the instruction belongs to Category-1, the rightmost two bits (least significant bits) are always “00”, preceded by a 5-bit opcode. Note that instead of the 7-bit opcode used in RISC-V, we use a 5-bit opcode as described in Figure 3. The remaining 25 bits of the instruction binary are exactly the same as in the original RISC-V instruction set for that specific instruction.

Same as RISC-V instruction (25 bits) | Opcode (5 bits) | 00

Figure 2: Format of Instructions in Category-1

Please pay attention to the exact description of the instruction formats and their interpretation in the RISC-V instruction set manual (riscv-ISA.pdf). For example, in the case of the jal instruction, the 21-bit offset is shifted left by one bit (padded with 0 at the LSB side), sign extended to form 32 bits, and then added to the address of the jal instruction to form the target address. Similarly, for the beq, bne and blt instructions, the 12-bit offset is formed by concatenating bits [31:25] with bits [11:7] (see the S-type format in riscv-ISA.pdf); the 12-bit offset is then shifted left by one bit, sign extended to form 32 bits, and added to the address of the current instruction to form the target address. Please note that we do not consider a delay slot for this project. In other words, an instruction following the branch instruction should be treated as a regular instruction (see sample_simulation.txt).
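The branch-target rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names sign_extend and branch_target are hypothetical, and the instruction is assumed to be available as a 32-bit unsigned integer. The jal case is not shown, since its field layout should be taken from riscv-ISA.pdf.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(word: int, address: int) -> int:
    """Target of a beq/bne/blt instruction word located at `address`."""
    upper = (word >> 25) & 0x7F      # bits [31:25]
    lower = (word >> 7) & 0x1F       # bits [11:7]
    offset12 = (upper << 5) | lower  # concatenated 12-bit offset
    # Shift left by one bit (13 bits in total), then sign extend.
    return address + sign_extend(offset12 << 1, 13)
```

For instance, a 12-bit offset of 4 in a branch at address 256 gives the target 264, and an offset field of all ones except the last bit (that is, -2) in a branch at address 264 gives the target 260.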

Instruction    Opcode
beq            00000
bne            00001
blt            00010
sw             00011

Figure 3: Opcode for Category-1 instructions

If the instruction belongs to Category-2 which has the form “dest ← src1 op src2”, the rightmost two bits (least significant bits) are always “01” as shown in Figure 4. Then the following 5 bits serve as opcode as listed in Figure 5.

Same as RISC-V instruction (25 bits) | Opcode (5 bits) | 01

Figure 4: Format of Category-2 instructions where both sources are registers

Instruction    Opcode
add            00000
sub            00001
and            00010
or             00011

Figure 5: Opcode for Category-2 instructions

If the instruction belongs to Category-3, which has the form “dest ← src1 op immediate_value”, the rightmost two bits (least significant bits) are always “10”. The preceding 5 bits serve as the opcode, as listed in Figure 6. The instruction format is shown in Figure 7.

Instruction    Opcode
addi           00000
andi           00001
ori            00010
sll            00011
sra            00100
lw             00101

Figure 6: Opcode for Category-3 instructions

Same as RISC-V instruction (25 bits) | Opcode (5 bits) | 10

Figure 7: Format of Category-3 instructions with source2 as immediate value

If the instruction belongs to Category-4, the rightmost two bits (least significant bits) are always “11”. The preceding 5 bits serve as the opcode, as listed in Figure 8. The instruction format is shown in Figure 9.

Instruction    Opcode
jal            00000
break          11111

Figure 8: Opcode for Category-4 instructions

Same as RISC-V instruction (25 bits) | Opcode (5 bits) | 11

Figure 9: Format of Category-4 instructions
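Since all four categories share the same layout (25 bits, then a 5-bit opcode, then 2 category bits), identifying an instruction can be sketched as a table lookup. The names MNEMONICS and decode_mnemonic are hypothetical, and the input is assumed to be a 32-character string of 0's and 1's.

```python
# (category bits, opcode) -> mnemonic, transcribed from Figures 3, 5, 6 and 8.
MNEMONICS = {
    (0b00, 0b00000): "beq", (0b00, 0b00001): "bne",
    (0b00, 0b00010): "blt", (0b00, 0b00011): "sw",
    (0b01, 0b00000): "add", (0b01, 0b00001): "sub",
    (0b01, 0b00010): "and", (0b01, 0b00011): "or",
    (0b10, 0b00000): "addi", (0b10, 0b00001): "andi",
    (0b10, 0b00010): "ori", (0b10, 0b00011): "sll",
    (0b10, 0b00100): "sra", (0b10, 0b00101): "lw",
    (0b11, 0b00000): "jal", (0b11, 0b11111): "break",
}


def decode_mnemonic(bits: str) -> str:
    """Return the mnemonic for a 32-character binary instruction string."""
    word = int(bits, 2)
    category = word & 0b11          # rightmost two bits
    opcode = (word >> 2) & 0b11111  # the 5 bits just before them
    return MNEMONICS[(category, opcode)]
```

An unknown combination raises KeyError here, which is acceptable for a sketch because only valid test cases are used.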

All signed numbers should be interpreted using 2’s complement arithmetic. Note that the signed numbers can be in registers, data memories or inside an instruction (e.g., the immediate field is signed for addi). Most importantly, each location (register or data memory) can be treated differently based on the context. For example, an arithmetic instruction (e.g., add) will treat the content of a register as a signed number (in 2’s complement arithmetic), whereas a logical operation (e.g., and) will treat the same register content as an unsigned number (sequence of bits). Please go through riscv-ISA.pdf to understand how each instruction treats its operands (signed or unsigned). Assume that all unassigned register and data memory locations are 0.
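As an illustration of the signed versus bit-pattern view, here is a minimal Python sketch for the Category-2 operations. The names to_signed32 and alu are hypothetical, and wrapping results to 32 bits on overflow is an assumption based on standard RISC-V behavior rather than something stated in this document.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reduce value to 32 bits and read it as a two's complement number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def alu(op: str, a: int, b: int) -> int:
    """Apply a Category-2 operation; the result is stored as a signed 32-bit value."""
    if op == "add":
        result = a + b
    elif op == "sub":
        result = a - b
    elif op == "and":
        result = a & b  # bitwise: operands act as plain bit patterns
    elif op == "or":
        result = a | b
    else:
        raise ValueError(f"unsupported operation: {op}")
    return to_signed32(result)  # assumption: wrap around on overflow
```

A data word read from the input file can be converted the same way, for example to_signed32(int(bits, 2)) for a 32-character string bits.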

Sample Input/output Files

Your program will be given a text input file (see sample.txt). This file will contain a sequence of 32-bit instruction words starting at address "256". The final instruction in the sequence of instructions is always break. There will be only one break instruction. Following the break instruction (immediately after break), there is a sequence of 32-bit 2's complement signed integers for the program data up to the end of the file. The newline character can be either “\n” (linux) or “\r\n” (windows). Your code should work for both cases. Please download the sample input/output files using “Save As” instead of using copy/paste of the content.
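One possible way to split the input into instructions and data is sketched below in Python. The name parse_input is hypothetical, and the 4-byte address step per word is an assumption (check it against the sample files). str.splitlines handles both “\n” and “\r\n” line endings.

```python
START_ADDRESS = 256
WORD_SIZE = 4                 # assumption: addresses advance 4 bytes per word
BREAK_LOW_BITS = "1111111"    # opcode 11111 followed by category bits 11


def parse_input(text: str):
    """Return (instructions, data) dictionaries keyed by address."""
    instructions = {}  # address -> 32-character bit string
    data = {}          # address -> signed integer
    address = START_ADDRESS
    seen_break = False
    for line in text.splitlines():  # accepts "\n" and "\r\n"
        bits = line.strip()
        if not bits:
            continue
        if seen_break:
            value = int(bits, 2)
            # A leading 1 marks a negative 2's complement value.
            data[address] = value - (1 << 32) if bits[0] == "1" else value
        else:
            instructions[address] = bits
            seen_break = bits.endswith(BREAK_LOW_BITS)
        address += WORD_SIZE
    return instructions, data
```

Everything up to and including the break word is kept as an instruction string; every later word is converted to a signed integer.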

Your RISC-V simulator (with executable name as Vsim) should accept an input file (inputfilename.txt) in the following command format and produce two output files in the same directory: disassembly.txt (contains disassembled output) and simulation.txt (contains the simulation trace). Please hardcode the names of the output files. Please do not hardcode the input filename. It will be specified when running your program. For example, it can be sample.txt or test.txt.

Vsim inputfilename.txt
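For a Python submission, the command-line handling might look like the following skeleton. The function name run and the placeholder strings are hypothetical, and writing the outputs to the current working directory is an assumption about what "the same directory" means.

```python
import sys

# Output names are fixed by the assignment; the input name comes from argv.
DISASSEMBLY_FILE = "disassembly.txt"
SIMULATION_FILE = "simulation.txt"


def run(argv):
    """Read the input file named in argv[1] and write both output files."""
    if len(argv) != 2:
        print("usage: Vsim inputfilename.txt", file=sys.stderr)
        return 1
    with open(argv[1]) as source:
        words = source.read().split()  # one 32-bit word per line
    disassembly = ""  # placeholder: text the disassembler builds from words
    simulation = ""   # placeholder: text the simulator builds from words
    with open(DISASSEMBLY_FILE, "w") as out:
        out.write(disassembly)
    with open(SIMULATION_FILE, "w") as out:
        out.write(simulation)
    return 0
```

A script would typically end by calling sys.exit(run(sys.argv)).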

Correct handling of the sample input file (with possibly different data values) will be used to determine 60% of the credit. The remaining 40% will be determined from other valid test cases that you will not have access to prior to grading. It is recommended that you construct your own sample input files with which to further test your disassembler/simulator. It is okay to share your new testcases with other students in the class as long as it does not lead to similarity in the project source code.

The disassembler output file should contain 3 columns of data with each column separated by one tab character (‘\t’ or char(9)). See the sample disassembly file in the project1 assignment.

1.   The text (e.g., 0’s and 1’s) string representing the 32-bit data word at that location.

2.   The address (in decimal) of that location.

3.   The disassembled instruction.

Note, if you are displaying an instruction, the third column should contain every part of the instruction, with each argument separated by a comma and then a space (“, ”).
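The three-column layout can be sketched in Python as below. The helper names are hypothetical, and the single space between the mnemonic and its first argument is an assumption that should be checked against sample_disassembly.txt.

```python
def format_instruction(mnemonic: str, args) -> str:
    """Join the arguments with a comma and a space after the mnemonic."""
    if not args:
        return mnemonic
    return mnemonic + " " + ", ".join(args)  # assumption: one space after mnemonic


def format_disassembly_line(bits: str, address: int, text: str) -> str:
    """Three tab-separated columns: bit string, decimal address, instruction."""
    return f"{bits}\t{address}\t{text}"
```

For a data word, the third column would hold the decimal value of the word instead of an instruction string.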

The simulation output file should have the following format.

20 hyphens and a new line

Cycle  < cycleNumber >:< tab >< instr_Address >< tab >< instr_string >

< blank_line >

Registers

x00: < tab >< int(x0) >< tab >< int(x1) >...< tab >< int(x7) >

x08: < tab >< int(x8) >< tab >< int(x9) >...< tab >< int(x15) >

x16: < tab >< int(x16) >< tab >< int(x17) >...< tab >< int(x23) >

x24: < tab >< int(x24) >< tab >< int(x25) >...< tab >< int(x31) >

< blank_line >

Data

< firstDataAddress >: < tab >< display 8 data words as integers with tabs in between >

..... < continue until the last data word >

Display all integer values in decimal. Immediate values should be preceded by a “#” symbol. Note that some instructions take signed immediate values while others take unsigned immediate values. You will have to make sure you properly display a signed or unsigned value depending on the context.
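The per-cycle trace format described above can be sketched in Python as follows. The name format_cycle is hypothetical, and the data row addresses assume 4-byte words (so each row of 8 words advances the address by 32); verify the exact spacing against sample_simulation.txt.

```python
def format_cycle(cycle, address, instr, regs, data_start, data):
    """Build the trace text for one executed instruction.

    regs is a list of 32 signed integers; data is a list of signed integers
    stored from address data_start onward.
    """
    lines = ["-" * 20, f"Cycle {cycle}:\t{address}\t{instr}", "", "Registers"]
    for base in range(0, 32, 8):
        values = "\t".join(str(v) for v in regs[base:base + 8])
        lines.append(f"x{base:02d}:\t{values}")
    lines += ["", "Data"]
    for i in range(0, len(data), 8):
        values = "\t".join(str(v) for v in data[i:i + 8])
        # Assumption: 4 bytes per data word.
        lines.append(f"{data_start + 4 * i}:\t{values}")
    return "\n".join(lines) + "\n"
```

The simulator would append one such block to simulation.txt after executing each instruction.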

Because we will be using “diff -w -B” to check your output against the expected outputs, please follow the output formatting exactly. Mismatches will be treated as wrong output and will lead to a score penalty.

The project assignment contains the following sample programs/files to test your disassembler/simulator.

•    sample.txt : This is the input to your program.

•    sample_disassembly.txt : This is what your program should produce as disassembled output.

•    sample_simulation.txt : This is what your program should output as simulation trace.