EE450 Socket Programming Project Part3 Fall 2023
Due Date:
Wednesday, December 6, 2023 11:59PM
(Hard Deadline, Strictly Enforced)
OBJECTIVE
The objective of this project is to familiarize you with UNIX socket programming. It is an individual assignment, and no collaboration is allowed. Any cheating will result in an automatic F in the course (not just in the assignment). If you have any doubts or questions, please email the TAs or come by a TA’s office hours. You can ask the TAs any question about the content of the project, but they have the right to decline requests for debugging help.
PROBLEM STATEMENT
In this project, you will implement a Student Performance Analysis system, a software application that allows students to check their GPA, percentage ranks, and other relevant data. This information can be used by students to identify areas where they need to focus their efforts and improve their performance. It can also be used by administrators to identify areas where teaching teams need to improve or provide additional resources to help students succeed.
Figure 1: Overview of the Student Performance Analysis system.
This system generates academic statistics based on user queries. There are two types of clients in the system: students (Student 1 and Student 2) and an administrator (Admin). A student can query a student’s academic performance, while the admin can also query the academic statistics of a certain department. The students and the admin send requests to a main server and receive results from it. Since there are many departments, the university uses a distributed system design in which the main server is connected to several (in our case, three) backend servers. Each backend server stores the academic records for a list of departments. For example, backend server A may store the student data of the ECE and CS departments, and backend server B may store the student data of the Art department. Therefore, once the main server receives a user query, it decodes the query and sends a request to the corresponding backend server. The backend server searches through its local data, analyzes the academic statistics, and replies to the main server. Finally, the main server replies to the client to conclude the process.
The detailed operations to be performed by all the parties are described with the help of Figure 1. There are in total 7 communication endpoints:
● Student 1 and Student 2: two different student clients, possibly in different departments
● Admin: a client that can query student and department academic statistics
● Main server: responsible for interacting with the clients and the backend servers
● Backend server A/B/C: responsible for loading data (from dataA/B/C.csv) and generating the statistics of the students’ academic performance
The full process can be roughly divided into four phases (see also the “DETAILED EXPLANATION” section). The communication and computation steps are as follows:
Bootup
1. [Computation]: Backend servers A/B/C read the files dataA.csv, dataB.csv and dataC.csv, respectively, and store the information in data structures.
○ Assume a “static” system where contents in dataA/B/C.csv do not change throughout the entire process.
○ Backend servers only need to read the text files once. When Backend servers are handling user queries, they will refer to the data structures, not the text files.
○ For simplicity, there is no overlap of departments or students among data files.
2. [Communication]: After step 1, Main server will ask each of the Backend servers which departments it is responsible for. Each Backend server replies to the Main server with its list of departments.
3. [Computation]: Main server will construct a data structure to book-keep such information from step 2. When the client queries come, the main server can send a request to the correct Backend server.
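The book-keeping in step 3 can be as simple as a dictionary from department name to backend server. The following is a minimal sketch in Python, assuming the replies from step 2 have already been decoded into lists of names. The function name `build_department_map`, the server labels, and the department "Math" are illustrative and not part of the specification.

```python
def build_department_map(replies):
    """Map each department name to the backend server that owns it.

    `replies` maps a backend label ("A", "B", "C") to the list of
    department names that server reported in bootup step 2.
    """
    dept_to_server = {}
    for server, departments in replies.items():
        for dept in departments:
            # The handout guarantees no department appears in two data files.
            dept_to_server[dept] = server
    return dept_to_server


dept_map = build_department_map({"A": ["ECE", "CS"], "B": ["Art"], "C": ["Math"]})
print(dept_map.get("CS"))       # A
print(dept_map.get("Physics"))  # None, i.e. department not found
```

A lookup that returns `None` corresponds to the "department not found" case that the Main server has to report in the Query phase.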
Query
1. [Communication]: Each client (students and admin) sends a query to the Main server.
○ A student client sends department name and student ID to the main server.
○ The admin client can perform two types of queries:
(a) sending department name and student ID to query a student academic record
(b) sending only a department name to query department academic statistics
○ A client can terminate itself only after it receives a reply from the server (in the Reply phase).
○ Main server may be connected to all three clients at the same time.
2. [Computation]: Once the Main server receives the queries, it decodes the queries and decides which backend server(s) should handle the queries.
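The handout does not prescribe a message format for queries, so the sketch below assumes a hypothetical one: the department name and the student ID joined by a comma, with the ID left empty for a department statistics query. Because department names contain only letters and student IDs only digits, a comma cannot appear inside either field. The names `encode_query` and `decode_query` are illustrative.

```python
def encode_query(department, student_id=None):
    """Client side: pack a query into bytes (hypothetical wire format)."""
    sid = "" if student_id is None else str(student_id)
    return f"{department},{sid}".encode("ascii")


def decode_query(message):
    """Main server side: unpack bytes into (department, student_id or None)."""
    department, _sep, sid = message.decode("ascii").partition(",")
    return department, (int(sid) if sid else None)


print(decode_query(encode_query("AI", 42989)))  # ('AI', 42989)
print(decode_query(encode_query("AI")))         # ('AI', None)
```

Testing `student_id is None` rather than truthiness matters here, since 0 is a valid student ID.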
Analysis
1. [Communication]: Main server sends a message to the corresponding Backend server so that the Backend server can perform computation.
2. [Computation]: For query (a), once the department name and student ID are received, Backend server generates academic record for this student.
3. [Computation]: For query (b), once the department name is received, Backend server generates academic statistics for the department.
4. [Communication]: Backend servers, after generating the results, will reply to Main server.
Reply
1. [Computation]: Main server decodes the messages from Backend servers and then decides which academic statistic result corresponds to which client query.
2. [Communication]: Main server prepares a reply message and sends it to the appropriate Client.
3. [Communication]: Clients receive the results from Main server and display them. Clients should stay active for further queries until the program is manually killed.
DETAILED EXPLANATION
Phase 1 (20 points) -- Bootup
All server programs (Main server and Backend servers A/B/C) boot up in this phase. While booting up, the servers must display a bootup message on the terminal. The format of the bootup message for each server is given in the onscreen message tables at the end of the document. As the bootup message indicates, each server must listen on the appropriate port for incoming packets/connections.
Backend servers should boot up first. A backend server needs to read the CSV file and store the department names and student academic records in appropriate data structures. There are many ways to store them, such as dictionary, array, vector, etc. You need to decide which format to use based on the requirement of the problem. You can use any format if you can generate the correct results.
Data File Formats
CSV stands for “comma-separated values” and is a simple file format used to store tabular data, such as a spreadsheet or database. CSV files are often used when data needs to be compatible with many different programs. They are plaintext files that store data by delimiting data entries with commas. You can open CSV files in text editors, spreadsheet programs like MS Excel, or other specialized applications, as is shown in Figure 2.
The format of dataA/B/C.csv is defined as follows. For simplicity, assume the university uses 100-point scale grades for all courses, and all courses are 1 unit. The first row is a header showing the column names [DPT, ID, 0, 1, …], where DPT means department name, ID means student ID, and the following numerical values 0, 1, …, 11 are the indices of the 12 courses. Below the first row, each entry represents the academic record (course scores) for a specific student. For example, row 2 means the student in department “AI” with student ID “42989” has taken courses 0-11 with scores: 75, 30, 3, none, none, 61, 85, 35, 68, 15, 65, 14.
Figure 2: Example dataA.csv open in MS Excel
Assumptions on the data file:
1. There can be empty (none) items in the course columns, which means the student has not taken the course, e.g., Figure 2 items F2 and G3. In this case, there is no content between two commas in the plaintext.
2. A student takes at least one course; there is no entry with all empty scores.
3. There are at most 100 students and at least 1 student per department.
4. There are at most 50 departments and at least 1 department per file.
5. The student IDs are unique; there is no repeated student ID.
6. Department names are letters. The length of a department name can vary from 1 letter to at most 20 letters. It may contain only capital and lowercase letters but does not contain any white spaces or other characters. Department names “Abc” and “abc” are different.
7. Student IDs are non-negative integers. The maximum possible student ID is (2^31 - 1). The minimum possible student ID is 0.
○ This ensures that you can always use int32 to store the student ID.
8. There are no additional empty lines at the beginning or the end of the file. That is, dataA/B/C.csv do not contain any empty lines.
9. For simplicity, there is no overlap of departments among dataA.csv, dataB.csv and dataC.csv.
10. The student IDs in the text are not sorted.
11. Data files dataA/B/C.csv will not be empty.
Example files dataA.csv, dataB.csv and dataC.csv are provided for you as a reference. Other data files will be used for grading.
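The file format and assumptions above can be turned into a small loader. This is a sketch, not a required design: it assumes a nested dictionary (department, then student ID, then a list of scores) and stores empty cells as `None`. Scores are parsed as floats in case a grading file contains non-integer values, which the handout does not rule out. The name `load_records` is illustrative.

```python
import csv
import io


def load_records(csv_file):
    """Return {department: {student_id: [score or None, ...]}}."""
    records = {}
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row: DPT, ID, 0, 1, ...
    for row in reader:
        dept, student_id = row[0], int(row[1])
        # An empty cell (nothing between two commas) means "course not taken".
        scores = [float(cell) if cell.strip() else None for cell in row[2:]]
        records.setdefault(dept, {})[student_id] = scores
    return records


# In-memory sample built from the row 2 example described above.
sample = io.StringIO(
    "DPT,ID,0,1,2,3,4,5,6,7,8,9,10,11\n"
    "AI,42989,75,30,3,,,61,85,35,68,15,65,14\n"
)
records = load_records(sample)
print(records["AI"][42989][:5])  # [75.0, 30.0, 3.0, None, None]
```

For a real file, pass `open("dataA.csv", newline="")` instead of the in-memory sample. The list of departments a backend server reports to the Main server is then just `list(records)`.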
Main server then boots up after Backend servers are running and have finished processing the files dataA/B/C.csv. Main server will request department lists from the Backend servers so that Main server knows which Backend server is responsible for which departments. The communication between Main server and Backend servers uses UDP.
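The department-list exchange over UDP might look like the sketch below. Everything about the message content is an assumption: the request tag `LIST`, the comma-joined reply, and the function names are made up for illustration, and the demo uses OS-assigned ports rather than the ones in the “PORT NUMBER ALLOCATION” section. With at most 50 departments of at most 20 letters each, the reply fits comfortably in one datagram.

```python
import socket


def send_department_request(main_sock, backend_addr):
    """Main server: ask one backend which departments it holds."""
    main_sock.sendto(b"LIST", backend_addr)  # "LIST" is a made-up request tag


def answer_department_request(backend_sock, departments):
    """Backend: wait for one request and reply with a comma-joined list."""
    _request, main_addr = backend_sock.recvfrom(1024)
    backend_sock.sendto(",".join(departments).encode("ascii"), main_addr)


def receive_department_list(main_sock):
    """Main server: read one reply and split it back into names."""
    reply, _backend_addr = main_sock.recvfrom(4096)
    return reply.decode("ascii").split(",")


# Single-process demo on localhost; port 0 lets the OS pick free ports.
backend_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
backend_sock.bind(("127.0.0.1", 0))
main_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
main_sock.bind(("127.0.0.1", 0))

send_department_request(main_sock, backend_sock.getsockname())
answer_department_request(backend_sock, ["ECE", "CS"])
print(receive_department_list(main_sock))  # ['ECE', 'CS']

backend_sock.close()
main_sock.close()
```

In the actual project the two sides run as separate programs, so the backend would call its receive function in a loop while the Main server sends one request per backend.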
Once the server programs have booted up, the three client programs run. Each client displays a boot up message as indicated in the onscreen messages table. Note that the client code takes no input argument from the command line.
The format for running the student client code is as below. After running it, it should display messages to ask the user to enter a query department name and student ID:
./student
…
Department name:
student ID:
The format for running the Admin client code is as below. After running it, it should display messages to ask the admin to enter a department name and a query student ID. The Admin client can perform two types of queries:
• Student academic record query: input department name and student ID
• Department academic statistics query: skip the student ID input using “Enter” key
./student
…
Department name:
student ID (“Enter” to skip):
After booting up, clients establish TCP connections with Main server. After successfully establishing the connection, clients send the entered department name and, if provided, the student ID to Main server. Once this is sent, clients should print a message in a specific format.
Each of the backend servers and the main server has its own unique port number, specified in the “PORT NUMBER ALLOCATION” section, with the source and destination IP address as localhost/127.0.0.1.
Student clients, Admin client, Main server, Backend server A/B/C are required to print out on screen messages after executing each action as described in the “ON SCREEN MESSAGES” section. These messages will help with grading if the process did not execute successfully. Missing some of the on-screen messages might result in misinterpretation that your process failed to complete. Please follow the exact format when printing the on-screen messages.
Phase 2 (30 points) -- Query
In the previous phase, student clients and admin client receive the query parameters from input and send them to Main server over a TCP socket connection. In phase 2, Main server will have to receive requests from all the clients. If the department name or student ID is not found, the main server will print out a message (see the “On Screen Messages” section) and return to standby.
For a server to receive requests from several clients at the same time, the function fork() should be used. fork() creates a new process, called the child process, which runs concurrently with the process that made the fork() call (the parent process). This is the same as in Project Part 1.
For a TCP server, when an application is listening for stream-oriented connections from other hosts, it is notified of such events and must initialize the connection using accept(). After the connection with the client is successfully established, the accept() function returns a non-negative descriptor for a new socket, called the child socket. The server can then fork off a process using the fork() function to handle the connection on the new socket and go back to waiting on the original socket. Note that the socket that was originally created, that is, the parent socket, is used only to listen for client requests; it is not used for communication between a client and the Main server. Child sockets created from a parent socket have the same well-known port number and IP address at the server side, but each child socket is created for a specific client. By using child sockets with the help of fork(), the server can handle multiple clients without closing any of the connections.
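The accept-then-fork pattern described above can be sketched as follows. Python's `os.fork()` wraps the same fork() system call, so a C or C++ version has the same shape. The placeholder reply, the `max_clients` parameter (added only so the demo terminates), and all function names are assumptions for illustration.

```python
import os
import socket


def handle_client(conn):
    """Runs in the child process: serve one query on the child socket."""
    with conn:
        query = conn.recv(1024)
        # Placeholder reply; a real Main server would contact a backend here.
        conn.sendall(b"received: " + query)


def reap_children():
    """Collect exited children without blocking, so zombies do not pile up."""
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left
        if pid == 0:
            return  # children exist, but none has exited yet


def serve(parent_sock, max_clients=None):
    """Accept on the parent socket and fork one child per connection."""
    served = 0
    while max_clients is None or served < max_clients:
        conn, _addr = parent_sock.accept()  # child socket for this client
        if os.fork() == 0:
            try:
                parent_sock.close()         # the child never accepts
                handle_client(conn)
            finally:
                os._exit(0)                 # never fall back into parent code
        conn.close()                        # parent keeps only the listener
        served += 1
        reap_children()


# Demo: the connection waits in the listen backlog until serve() accepts it.
parent = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
parent.bind(("127.0.0.1", 0))               # port 0: OS-assigned, demo only
parent.listen(5)
client = socket.create_connection(parent.getsockname())
client.sendall(b"AI,42989")
serve(parent, max_clients=1)
print(client.recv(1024))                    # b'received: AI,42989'
client.close()
parent.close()
```

The parent closes its copy of the child socket right after forking; otherwise the connection would stay open in the parent even after the child finishes with it.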
Phase 3 (40 points) -- Analysis
In this phase, each Backend server should have received a request from Main server. The request should contain a department name and, for a student academic record query, a student ID. A backend server will generate academic statistics per request based on the academic records of the student and the records of all students in the department.
Expected Results
(a) Student academic record query. For the student clients and for the admin client when providing student ID, you should calculate, reply and show on-screen the Student GPA and Percentage Rank in the Department:
• Student GPA: Grade-Point Average. Since we assume all courses have the same unit, the GPA for a student is the average grade of all the courses the student has taken. Note that empty items should not be counted.
• Percentage Rank in Department: The percentage of students in the department (self-included) whose GPA is lower than or equal to the GPA of the requested student. For example, if the student is ranked 4th among 10 students, meaning the student’s GPA is higher than or equal to that of 7 students, the percentage rank should be 7/10=70%. If there is only one student in the department, then the rank is 1/1=100%.
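The two definitions for query (a) translate into a few lines of code. This sketch assumes a student's scores are held in a list with `None` for courses not taken; the function names are illustrative.

```python
def gpa(scores):
    """Average of the courses actually taken; None entries are skipped."""
    taken = [s for s in scores if s is not None]
    return sum(taken) / len(taken)  # every student has at least one course


def percentage_rank(student_gpa, department_gpas):
    """Percent of students (self included) with GPA <= student_gpa."""
    at_or_below = sum(1 for g in department_gpas if g <= student_gpa)
    return 100.0 * at_or_below / len(department_gpas)


# Student 42989 from the data file example.
scores = [75, 30, 3, None, None, 61, 85, 35, 68, 15, 65, 14]
print(round(gpa(scores), 1))  # 45.1

# Ranked 4th of 10: seven students (self included) are at or below 70.
gpas = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(percentage_rank(70, gpas))  # 70.0
```

`department_gpas` must include the queried student's own GPA, which is what makes a single-student department come out as 100%.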
(b) Department academic statistics. For the admin client not providing student ID, you should calculate, reply and show on-screen the Department GPA Mean, Department GPA Variance, Department Max GPA, Department Min GPA. In detail:
• Department GPA Mean: Sum(GPA of all students in this department)/(number of students in this department).
• Department GPA Variance: Assuming the number of students in a department is $N$ and the mean GPA is $\mu$, the variance is defined as $\sum (x - \mu)^2 / N$, where $x$ is the GPA of one student in the department and the sum runs over all students in the department.
• Department Max GPA, Department Min GPA: Sort all the students’ GPAs in the department and get the max and min values.
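The four department statistics for query (b) can be computed in one pass over the list of student GPAs. This is a sketch with an illustrative function name and return type. Note that the variance divides by N (population variance), not N-1, and that `max()` and `min()` give the same values as sorting and taking the ends.

```python
def department_statistics(gpas):
    """Return mean, population variance, max and min of the student GPAs."""
    n = len(gpas)  # at least 1 student per department
    mean = sum(gpas) / n
    variance = sum((g - mean) ** 2 for g in gpas) / n  # divide by N, not N-1
    return {"mean": mean, "variance": variance, "max": max(gpas), "min": min(gpas)}


stats = department_statistics([40.0, 50.0, 60.0])
print(stats["mean"], stats["max"], stats["min"])  # 50.0 60.0 40.0
print(round(stats["variance"], 1))                # 66.7
```

Python's `statistics.pvariance` computes the same population variance if a library call is preferred.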
Phase 4 (10 points) -- Reply
At the end of Phase 3, the responsible Backend server should have the corresponding results ready. The result should be sent back to the Main server using UDP. When the Main server receives the result, it needs to forward the result to the corresponding Client using TCP. The clients will print out academic performance statistics and then print out the messages for a new request as follows:
(a) Student academic record query.
...
The academic record for Student 42989 in Department AI is:
Student GPA: 45.1
Percentage Rank: 50.0%
-----Start a new request-----
Enter department name:
Enter student ID:
(b) Department academic statistics.
...
The academic statistics for Department AI are:
Department GPA Mean: 47.9
Department GPA Variance: 62.7
Department Max GPA: 61.4
Department Min GPA: 38.9
-----Start a new request-----
Enter department name:
Enter student ID:
See the ON SCREEN MESSAGES table for an example output table.
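As a rough illustration of producing the client output shown in the examples above, the sketch below formats both result types with one decimal place. The one-decimal rounding and the exact wording are inferred from the examples only; the ON SCREEN MESSAGES table remains the authority on the required format. The function names are illustrative.

```python
def format_student_reply(department, student_id, student_gpa, rank_percent):
    """Text for query (a), modeled on the example output."""
    return (
        f"The academic record for Student {student_id} "
        f"in Department {department} is:\n"
        f"Student GPA: {student_gpa:.1f}\n"
        f"Percentage Rank: {rank_percent:.1f}%"
    )


def format_department_reply(department, mean, variance, max_gpa, min_gpa):
    """Text for query (b), modeled on the example output."""
    return (
        f"The academic statistics for Department {department} are:\n"
        f"Department GPA Mean: {mean:.1f}\n"
        f"Department GPA Variance: {variance:.1f}\n"
        f"Department Max GPA: {max_gpa:.1f}\n"
        f"Department Min GPA: {min_gpa:.1f}"
    )


print(format_student_reply("AI", 42989, 45.1, 50.0))
print(format_department_reply("AI", 47.9, 62.7, 61.4, 38.9))
```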
2023-12-01