
EE450 Socket Programming Project

Part3

Fall 2023

Due Date:

Wednesday, December 6, 2023 11:59PM

(Hard Deadline, Strictly Enforced)

OBJECTIVE

The objective of this project is to familiarize you with UNIX socket programming. It is an individual assignment, and no collaboration is allowed. Any cheating will result in an automatic F in the course (not just in the assignment). If you have any doubts or questions, please email the TA or come by the TA’s office hours. You can ask TAs any question about the content of the project, but TAs have the right to reject requests for debugging.

PROBLEM STATEMENT

In this project, you will implement a Student Performance Analysis system, a software application that allows students to check their GPA, percentage ranks, and other relevant data. This information can be used by students to identify areas where they need to focus their efforts and improve their performance. It can also be used by administrators to identify areas where teaching teams need to improve or provide additional resources to help students succeed.

 

Figure 1: Overview of the Student Performance Analysis system.

This system generates academic statistics based on user queries. There are two types of clients in the system: students (Student 1 and Student 2) and an administrator (Admin). A student can query a student’s academic performance, while the admin can also query the academic statistics of a certain department. The students and the admin send requests to a main server and receive the results in its replies. Since there are many departments, the university uses a distributed system design in which the main server is further connected to many (in our case, three) backend servers. Each backend server stores the academic records for a list of departments. For example, backend server A may store the student data of the ECE and CS departments, and backend server B may store the student data of the Art department. Therefore, once the main server receives a user query, it decodes the query and sends a request to the corresponding backend server. The backend server searches through its local data, computes the academic statistics, and replies to the main server. Finally, the main server replies to the client to conclude the process.

The detailed operations to be performed by all the parties are described with the help of Figure 1. There are in total 7 communication endpoints:

●   Student 1 and Student 2: two different student clients, possibly in different departments

●   Admin: a client that can query student and department academic statistics

●   Main server: responsible for interacting with the clients and the backend servers

●   Backend server A/B/C: responsible for loading data (from dataA/B/C.csv) and generating the statistics of the students’ academic performance

The full process can be roughly divided into four phases (see also the “DETAILED EXPLANATION” section). The communication and computation steps are as follows:

Bootup

1.   [Computation]: Backend servers A/B/C read the files dataA.csv, dataB.csv and dataC.csv respectively and store the information in data structures.

○   Assume a “static” system where the contents of dataA/B/C.csv do not change throughout the entire process.

○   Backend servers only need to read the text files once. When Backend servers are handling user queries, they will refer to the data structures, not the text files.

○   For simplicity, there is no overlap of departments or students among data files.

2.   [Communication]: After step 1, Main server will ask each of the Backend servers which departments they are responsible for. Backend servers reply with a list of departments to the main server.

3.   [Computation]: Main server will construct a data structure to book-keep such information from step 2. When the client queries come, the main server can send a request to the correct Backend server.
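The bookkeeping in step 3 can be as simple as a dictionary from department name to backend label. The following is a minimal sketch, not a required design; the function names and the department "Math" on backend C are hypothetical and used only for illustration.

```python
def build_department_map(replies):
    """Map each department name to the backend server that owns it.

    `replies` maps a backend label (e.g. "A") to the list of department
    names that backend reported during bootup.
    """
    owner = {}
    for server, departments in replies.items():
        for dept in departments:
            owner[dept] = server
    return owner


def route(owner, department):
    """Return the backend label for a department, or None if unknown."""
    return owner.get(department)


# Hypothetical department lists, as if received from the three backends.
replies = {"A": ["ECE", "CS"], "B": ["Art"], "C": ["Math"]}
owner = build_department_map(replies)
print(route(owner, "CS"))       # A
print(route(owner, "History"))  # None
```

Because there is no overlap of departments among the data files, each department maps to exactly one backend, and a failed lookup corresponds to the "department not found" case.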

Query

1.   [Communication]: Each client (students and admin) sends a query to the Main server.

○   A student client sends department name and student ID to the main server.

○   The admin client can perform two types of queries:

(a) sending department name and student ID to query a student academic record

(b) sending only a department name to query department academic statistics

○   A client can terminate itself only after it receives a reply from the server (in the Reply phase).

○   Main server may be connected to all three clients at the same time.

2.   [Computation]:  Once the Main  server receives the queries, it decodes the queries and decides which backend server(s) should handle the queries.
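One possible way to represent the two query types on the wire is sketched below. The "DEPT" / "DEPT,ID" text format and both function names are assumptions made for illustration; the handout does not prescribe a message format. A comma is a safe separator here because department names contain only letters.

```python
def encode_query(department, student_id=None):
    """Encode a query as 'DEPT' or 'DEPT,ID' (an assumed wire format)."""
    if student_id is None:
        return department.encode()
    return f"{department},{student_id}".encode()


def decode_query(payload):
    """Return (department, student_id); student_id is None for query (b)."""
    text = payload.decode()
    department, sep, student = text.partition(",")
    return department, (int(student) if sep else None)


print(decode_query(encode_query("AI", 42989)))  # ('AI', 42989)
print(decode_query(encode_query("AI")))         # ('AI', None)
```

Checking `student_id is None` rather than truthiness matters because 0 is a valid student ID.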


Analysis

1.   [Communication]: Main server sends a message to the corresponding Backend server so that the Backend server can perform computation.

2.   [Computation]: For query (a), once the department name and student ID are received, Backend server generates the academic record for this student.

3.   [Computation]:  For  query (b), once the department name is received, Backend server generates academic statistics for the department.

4.   [Communication]: Backend servers, after generating the results, will reply to Main server.

Reply

1.   [Computation]: Main server decodes the messages from Backend servers and then decides which academic statistics result corresponds to which client query.

2.   [Communication]: Main server prepares a reply message and sends it to the appropriate Client.

3.   [Communication]: Clients receive the results from Main server and display them. Clients should remain active for further queries until the program is manually killed.

DETAILED EXPLANATION

Phase 1 (20 points) -- Bootup

All server programs (Main server and Backend servers A/B/C) boot up in this phase. While booting up, the servers must display a bootup message on the terminal. The format of the bootup message for each server is given in the onscreen message tables at the end of the document. As the bootup message indicates, each server must listen on the appropriate port for incoming packets/connections.

Backend servers should boot up first. A backend server needs to read the CSV file and store the department names and student academic records in appropriate data structures. There are many ways to store them, such as dictionary, array, vector, etc. You need to decide which format to use based on the requirement of the problem. You can use any format if you can generate the correct results.

Data File Formats

CSV stands for “comma-separated values” and is a simple file format used to store tabular data, such as a spreadsheet or database. CSV files are often used when data needs to be compatible with many different programs. They are plaintext files that store data by delimiting data entries with commas. You can open CSV files in text editors, spreadsheet programs like MS Excel, or other specialized applications, as shown in Figure 2.

The format of dataA/B/C.csv is defined as follows. For simplicity, assume the university uses 100-point scale grades for all courses, and all courses are 1 unit. The first row is a header showing the column names [DPT, ID, 0, 1, …], where DPT means department name, ID means student ID, and the following numerical values 0, 1, …, 11 are the indices of the 12 courses. Below the first row, each entry represents the academic record (course scores) for a specific student. For example, row 2 means the student in department “AI” with student ID “42989” has taken courses 0-11 with scores: 75, 30, 3, none, none, 61, 85, 35, 68, 15, 65, 14.

 

Figure 2: Example dataA.csv open in MS Excel

Assumptions on the data file:

1.   There can be empty (none) items in the course columns, which means the student does not take the course, e.g., Figure 2 item F2 and G3. In this case, there is no content between two commas in the plaintext.

2.   A student takes at least one course; there is no entry with all empty scores.

3.   There are at most 100 students and at least 1 student per department.

4.   There are at most 50 departments and at least 1 department per file.

5.   The student IDs are unique, there is no repeated student ID.

6.   Department names are letters. The length of a department name can vary from 1 letter to at most 20 letters. It may contain only capital and lowercase letters but does not contain any white spaces or other characters. Department names “Abc” and “abc” are different.

7.   Student IDs are non-negative integer numbers. The maximum possible student ID is (2^31 - 1). The minimum possible student ID is 0.

○   This ensures that you can always use int32 to store the student ID.

8.   There are no additional empty lines at the beginning or the end of the file. That is, dataA/B/C.csv do not contain any empty lines.

9.   For simplicity, there is no overlap of departments between dataA.csv and dataB.csv.

10. The student IDs in the text are not sorted.

11. Data files dataA/B/C.csv will not be empty.
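Under the assumptions above, loading a data file amounts to skipping the header and dropping empty course fields. The sketch below is one way to do this; `load_records` is a hypothetical name, and the nested-dictionary layout is only one of many acceptable data structures. The sample row is the row 2 example described earlier.

```python
import csv
import io


def load_records(lines):
    """Parse CSV text lines into {department: {student_id: [scores]}}.

    Empty course fields (courses not taken) are dropped, so each list
    holds only the scores that count toward the GPA.
    """
    records = {}
    reader = csv.reader(lines)
    next(reader)  # skip the header row: DPT, ID, 0, 1, ...
    for row in reader:
        department, student_id = row[0], int(row[1])
        scores = [float(field) for field in row[2:] if field.strip()]
        records.setdefault(department, {})[student_id] = scores
    return records


sample = io.StringIO(
    "DPT,ID,0,1,2,3,4,5,6,7,8,9,10,11\n"
    "AI,42989,75,30,3,,,61,85,35,68,15,65,14\n"
)
records = load_records(sample)
print(len(records["AI"][42989]))  # 10 scores; the two empty fields are skipped
```

With a real file, the same function can be called as `load_records(open("dataA.csv", newline=""))`.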


Example files dataA.csv, dataB.csv and dataC.csv are provided for you as a reference. Other data files will be used for grading.

Main server then boots up after Backend servers are running and have finished processing the files dataA/B/C.csv. Main server will request department lists from the Backend servers so that it knows which Backend server is responsible for which departments. The communication between Main server and Backend servers uses UDP.
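The department-list exchange over UDP might look like the sketch below. The `LIST` request text, the comma-joined reply, the helper names, and the use of ephemeral ports (port 0) are all assumptions for illustration; the real ports come from the “PORT NUMBER ALLOCATION” section. Both sides run in one process here only to keep the example self-contained.

```python
import socket


def make_udp_socket(port=0):
    """Create a UDP socket bound to localhost (port 0 = any free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", port))
    sock.settimeout(5.0)  # avoid blocking forever if a datagram is lost
    return sock


def request_departments(main_sock, backend_addr):
    """Main-server side: ask one backend for its department list."""
    main_sock.sendto(b"LIST", backend_addr)


def reply_departments(backend_sock, departments):
    """Backend side: wait for one request, answer with the department list."""
    _request, main_addr = backend_sock.recvfrom(1024)
    backend_sock.sendto(",".join(departments).encode(), main_addr)


def receive_departments(main_sock):
    """Main-server side: read one reply and split it into names."""
    payload, _addr = main_sock.recvfrom(4096)
    return payload.decode().split(",")


backend, main = make_udp_socket(), make_udp_socket()
request_departments(main, backend.getsockname())
reply_departments(backend, ["ECE", "CS"])
print(receive_departments(main))  # ['ECE', 'CS']
backend.close()
main.close()
```

A 4096-byte receive buffer is enough for this reply format, since a file holds at most 50 department names of at most 20 letters each.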

Once the server programs have booted up, the three client programs run. Each client displays a boot up message as indicated in the onscreen messages table. Note that the client code takes no input argument from the command line.

The format for running the student client code is as below. After running it, it should display messages to ask the user to enter a query department name and student ID:

./student

Department name:

student ID:

The format for running the Admin client code is as below. After running it, it should display messages to ask the admin to enter a department name and a query student ID. The Admin client can perform two types of queries:

•    Student academic record query: input department name and student ID

•   Department academic statistics query: skip the student ID input using “Enter” key

./student

Department name:

student ID (“Enter” to skip):


After booting up, clients establish TCP connections with Main server. After successfully establishing the connection, clients send the input student ID or department name to Main server. Once this is sent, clients should print a message in a specific format.

Each of the backend servers and the main server has its own unique port number, specified in the “PORT NUMBER ALLOCATION” section, with the source and destination IP address as localhost/127.0.0.1.

Student clients, Admin client, Main server, and Backend servers A/B/C are required to print on-screen messages after executing each action, as described in the “ON SCREEN MESSAGES” section. These messages will help with grading if the process did not execute successfully. Missing some of the on-screen messages might result in the misinterpretation that your process failed to complete. Please follow the exact format when printing the on-screen messages.

Phase 2 (30 points) -- Query

In the previous phase, student clients and admin client receive the query parameters from input and send them to Main server over a TCP socket connection. In phase 2, Main server will have to receive requests from all the clients. If the department name or student ID is not found, the main server will print out a message (see the “On Screen Messages” section) and return to standby.

For a server to receive requests from several clients at the same time, the function fork() should be used to create a new process. The new process, called the child process, runs concurrently with the process that makes the fork() call (the parent process). This is the same as in Project Part 1.

For a TCP server, when an application is listening for stream-oriented connections from other hosts, it is notified of such events and must initialize the connection using accept(). After the connection with the client is successfully established, the accept() function returns a non-negative descriptor for a socket called the child socket. The server can then fork off a process using the fork() function to handle the connection on the new socket and go back to waiting on the original socket. Note that the socket that was originally created, that is, the parent socket, is used only to listen for client requests; it is not used for communication between client and Main server. Child sockets that are created for a parent socket have the identical well-known port number and IP address at the server side, but each child socket is created for a specific client. By using child sockets with the help of fork(), the server can handle multiple clients without closing any of the connections.
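The accept-then-fork pattern described above can be sketched as follows. This is an illustration under assumptions, not the required implementation: the function names, the placeholder `received:` reply, and the `max_connections` parameter (added only so the loop can terminate in a demo) are hypothetical. It relies on `os.fork()`, so it runs on UNIX-like systems only.

```python
import os
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create the parent (listening) TCP socket; port 0 picks a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock


def handle_client(conn):
    """Child-side work: read one query and send back a placeholder reply."""
    with conn:
        query = conn.recv(1024)
        conn.sendall(b"received: " + query)


def serve(listener, max_connections=None):
    """Accept connections and fork one child process per client."""
    children, served = [], 0
    while max_connections is None or served < max_connections:
        conn, _addr = listener.accept()  # child socket for this client
        pid = os.fork()
        if pid == 0:  # child: talk to the client, never return to the loop
            try:
                listener.close()
                handle_client(conn)
            finally:
                os._exit(0)
        conn.close()  # parent: the child owns this connection now
        served += 1
        children.append(pid)
        # Reap children that have already exited, without blocking.
        children = [p for p in children if os.waitpid(p, os.WNOHANG)[0] == 0]
    for p in children:  # wait for the remaining children before returning
        os.waitpid(p, 0)


if __name__ == "__main__":
    listener = make_listener()
    client = socket.create_connection(listener.getsockname(), timeout=5)
    client.sendall(b"AI,42989")
    serve(listener, max_connections=1)
    print(client.recv(1024).decode())
    client.close()
    listener.close()
```

The parent closes its copy of the child socket right after forking so that only the child holds the connection, while the listening socket stays open for the next client.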

Phase 3 (40 points) -- Analysis

In this phase, each Backend server should have received a request from Main server. The request should contain a department name or a student ID. A backend server will generate academic statistics per request based on the academic records of the student and records of all students in the department.

Expected Results

(a) Student academic record query. For the student clients and for the admin client when providing student ID, you should calculate, reply and show on-screen the Student GPA and Percentage Rank in the Department:

•    Student GPA: Grade-Point Average. Since we assume all courses have the same number of units, the GPA for a student is the average grade of all the courses the student has taken. Note that empty items should not be counted.

•   Percentage Rank in Department: The percentage of students in the department (self-included) whose GPA is lower than or equal to the GPA of the requested student. For example, if the student is ranked 4th among 10 students, the student’s GPA is higher than or equal to that of 7 students, so the percentage rank is 7/10=70%. If there is only one student in the department, then the rank is 1/1=100%.
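The two quantities for query (a) can be computed as in the sketch below. The function names and the second student (ID 1) are hypothetical; the scores for student 42989 are the non-empty entries of the row 2 example from the data file description.

```python
def student_gpa(scores):
    """Average of the scores of the courses actually taken."""
    return sum(scores) / len(scores)


def percentage_rank(department, student_id):
    """Percent of students (self-included) whose GPA is <= this student's.

    `department` maps student ID to that student's list of scores.
    """
    gpas = {sid: student_gpa(s) for sid, s in department.items()}
    target = gpas[student_id]
    at_or_below = sum(1 for g in gpas.values() if g <= target)
    return 100.0 * at_or_below / len(gpas)


department = {
    42989: [75, 30, 3, 61, 85, 35, 68, 15, 65, 14],
    1: [90, 80],  # hypothetical second student
}
print(f"{student_gpa(department[42989]):.1f}")        # 45.1
print(f"{percentage_rank(department, 42989):.1f}%")   # 50.0%
```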

(b) Department academic statistics. For the admin client not providing student ID, you should calculate, reply and show on-screen the Department GPA Mean, Department GPA Variance, Department Max GPA, Department Min GPA. In detail:

•   Department GPA Mean: Sum(GPA of all students in this department)/(number of students in this department).

•   Department GPA Variance: Assuming that a department has N students and the mean GPA is μ, the variance is defined as $\frac{1}{N}\sum (x - \mu)^2$, where x is the GPA of one student in the department and the sum runs over all students in the department.

•   Department Max GPA, Department Min GPA: Sort all the students’ GPAs in the department and get the max and min values.
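The department statistics for query (b) follow directly from the definitions above. In this sketch, `department_statistics` is a hypothetical name and the three GPA values are made up for illustration. Note that the variance divides by N (population variance), not N - 1.

```python
def department_statistics(gpas):
    """Return (mean, variance, max, min) for a non-empty list of GPAs."""
    n = len(gpas)
    mean = sum(gpas) / n
    variance = sum((x - mean) ** 2 for x in gpas) / n  # divide by N
    return mean, variance, max(gpas), min(gpas)


mean, variance, highest, lowest = department_statistics([40.0, 50.0, 60.0])
print(f"{mean:.1f} {variance:.1f} {highest:.1f} {lowest:.1f}")  # 50.0 66.7 60.0 40.0
```

Taking `max` and `min` directly gives the same values as sorting the GPAs and reading off both ends.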

Phase 4 (10 points) -- Reply

At the end of Phase 3, the responsible Backend server should have the corresponding results ready. The result should be sent back to the Main server using UDP. When the Main server receives the result, it needs to forward the result to the corresponding Client using TCP. The clients will print out the academic performance statistics and then print out the messages for a new request as follows:

(a) Student academic record query.

...

The academic record for Student 42989 in Department AI is:

Student GPA: 45.1

Percentage Rank: 50.0%

 

-----Start a new request-----

Enter department name:

Enter student ID:


(b) Department academic statistics.

...

The academic statistics for Department AI are:

Department GPA Mean: 47.9

Department GPA Variance: 62.7

Department Max GPA: 61.4

Department Min GPA: 38.9

 

-----Start a new request-----

Enter department name:

Enter student ID:

See the ON SCREEN MESSAGES table for an example output table.