
CITS 2401

Computer Analysis and Visualisation

Project 2

Visualising Data Traffic in Smart Cars

Worth: 15% of the unit

Submission: (1) your code, (2) your data analysis and visualisation report, and (3) your peer marking results. Deadlines:

•   Submissions (1) and (2): 19 May 2023 5pm

•   Submission (3): 26 May 2023 5pm

Late submissions: Submissions (1) and (2) attract a 5% raw penalty per day for up to 7 days (i.e., until 26 May 2023 5pm). After that, the mark will be 0 (zero). Any plagiarised work will also be marked zero.

Submission (3) is a failed component of the project (i.e., if you fail submission (3) then you fail the project).

1. Outline

In this project, we will continue from Project 1, where we implemented an intrusion detection system (IDS). Instead of implementing the features (which we completed in Project 1), we will now focus on data analysis and visualisation skills to better present what our datasets contain. For this project, you will be given several datasets that contain smart car sensor readings and network traffic that are already labelled normal or attack. Your task is to perform the following steps (more details in the tasks section):

•    Data analysis

•    Data visualisation

•    Write data analysis and visualisation report

•    (bonus) use machine learning to implement IDS

•    Peer marking

To complete this project, you will need to refer to the lectures up to week 10 Visualisation, along with their relevant labs (i.e., lab 10).

Note 1: This is an individual project, so please refrain from sharing your code or files with others. However, you can have high-level discussions about the syntax of the formula or the use of modules with other examples. Please note that if it is discovered that you have submitted work that is not your own, you may face penalties. It is also important to keep in mind that ChatGPT and other similar tools are limited in their ability to generate outputs, and it is easy to detect if you use their outputs without understanding the underlying principles. The main goal of this project is to demonstrate your understanding of programming principles and how they can be applied in practical contexts.

Note 2: you do not necessarily have to complete project 1 to do this project, as it is more about data analysis and visualisation of the datasets you are given. But if you have completed project 1, you are certainly welcome to include IDS related data analysis and visualisation as well.

2. Tasks

Task 1 Data Analysis using NumPy

You must demonstrate at least 5 NumPy related skills for data analysis. These may include NumPy functions and methods, matrix manipulations, vectorized computations, NumPy statistics, the NumPy where function, etc. The greater the variety of skills you demonstrate, the more fluent your data analysis will be considered.

For example, if you write two pieces of data analysis code that lead to the same output, they will NOT be considered different approaches to achieving the outcome. If you only slightly modify the code to get different outputs (e.g., changing the axes value), it will not give you full marks (i.e., it would be considered a variation of the original solution you presented).
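As an illustration of what distinct NumPy skills might look like, here is a minimal sketch. It is not a model answer: the arrays `packet_sizes` and `labels` are hypothetical stand-ins, since the columns of the provided datasets are not described in this document.

```python
import numpy as np

# Hypothetical data: the real datasets' columns are not specified here.
packet_sizes = np.array([60, 64, 1500, 62, 1480, 58, 1490, 61])  # bytes
labels = np.array([0, 0, 1, 0, 1, 0, 1, 0])  # 0 = normal, 1 = attack

# Skill 1: boolean masking to split the readings by label.
normal = packet_sizes[labels == 0]
attack = packet_sizes[labels == 1]

# Skill 2: NumPy statistics on each group.
normal_mean = normal.mean()
attack_mean = attack.mean()

# Skill 3: vectorised computation (z-scores for every reading, no loop).
z_scores = (packet_sizes - packet_sizes.mean()) / packet_sizes.std()

# Skill 4: np.where to flag readings above an assumed threshold.
flagged = np.where(packet_sizes > 1000, 1, 0)

# Skill 5: matrix manipulation (stack columns, then compare two of them).
table = np.column_stack((packet_sizes, labels, flagged))
agreement = np.mean(table[:, 1] == table[:, 2])

print(f"normal mean: {normal_mean}, attack mean: {attack_mean}")
print(f"fraction of flags matching labels: {agreement}")
```

Each step uses a different NumPy capability, which is the kind of variety the task asks for. Repeating one step with a different threshold would only count as a variation.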

Task 2 Data Visualisation using matplotlib

You must demonstrate at least 5 matplotlib related skills for data visualisation. These can be line/bar charts, histograms, scatterplots, heatmaps, etc. The presentation of the visualisations (e.g., customising labels, points etc.) will determine your fluency in data visualisation skills. You should also note that the data you visualise should be meaningful (i.e., you need to provide discussions about the data you have visualised).

For example, changing the line colour will NOT be considered as a different data visualisation skill. Changing the regression model from linear to cubic will only be considered as a minor change (i.e., it would be considered as a variation to the original solution you presented) and will not give you full marks.
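To show the kind of presentation details that count towards fluency (axis labels, title, legend, shared bins), here is a minimal sketch of one histogram. The data is synthetic and the quantity plotted (packet size) is an assumption, as the actual dataset columns are not given in this document.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data (not taken from the project datasets).
rng = np.random.default_rng(seed=0)
normal = rng.normal(loc=60, scale=5, size=200)
attack = rng.normal(loc=90, scale=10, size=50)

fig, ax = plt.subplots(figsize=(6, 4))

# Shared bin edges make the two distributions directly comparable.
bins = np.linspace(40, 130, 31)
ax.hist(normal, bins=bins, alpha=0.6, label="normal")
ax.hist(attack, bins=bins, alpha=0.6, label="attack")

# Labels, title and legend make the figure readable on its own.
ax.set_xlabel("Packet size (bytes)")
ax.set_ylabel("Number of packets")
ax.set_title("Packet size distribution by label (synthetic data)")
ax.legend()

fig.tight_layout()
fig.savefig("packet_size_hist.png", dpi=150)
```

In a report, a figure like this would be accompanied by a short discussion of what the two distributions suggest about the data.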

Task 3 Write a summary report

Write a summary report for your data analysis and visualisation. The report presents your findings and recommendations using the above data analysis and visualisation techniques and articulates your understanding of the datasets. You should clearly explain your motive (i.e., why did you do this data analysis and/or visualisation?), methodology (how), and results (what), and discuss the capabilities of your IDS based on your analysis.

There is no specific format or page/word limit, but a good (HD-grade) report is not necessarily the longest, nor the one in the most professional format (i.e., quality over quantity, and you need a good balance between the two).

At the very end of the report, you must also add an appendix to include your source code. Note: Ensure to include your student ID in the report! NO NAME NO MARKS!

Task 4 (Bonus) Apply machine learning techniques

Research and build an intrusion detection model using machine learning (ML). You can implement an ML-based IDS, such as logistic regression, decision tree, or neural network, to classify the data into normal or attack categories. You should also split the data into training and testing sets, train your model, and evaluate its performance using metrics such as accuracy, precision, recall, and F1-score. Add a section in your report about using ML techniques for IDS, including visualisations.
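The split, train, and evaluate workflow can be sketched with NumPy alone. This is an illustration under stated assumptions, not a recommended model: the data is synthetic, and a simple nearest-centroid rule stands in for the logistic regression, decision tree, or neural network mentioned above. The metric formulas are the standard definitions based on true/false positives and negatives.

```python
import numpy as np

# Synthetic one-feature data (not taken from the project datasets).
rng = np.random.default_rng(seed=1)
X = np.concatenate((rng.normal(60, 5, 80), rng.normal(100, 5, 20)))
y = np.concatenate((np.zeros(80, dtype=int), np.ones(20, dtype=int)))

# Shuffle the indices, then hold out 20% of the samples for testing.
order = rng.permutation(len(X))
split = int(0.8 * len(X))
train_idx, test_idx = order[:split], order[split:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

# "Training": a nearest-centroid rule, i.e. the decision threshold is
# the midpoint between the two class means of the training set.
threshold = (X_train[y_train == 0].mean() + X_train[y_train == 1].mean()) / 2
y_pred = (X_test > threshold).astype(int)

# Confusion-matrix counts, treating attack (1) as the positive class.
tp = int(np.sum((y_pred == 1) & (y_test == 1)))
fp = int(np.sum((y_pred == 1) & (y_test == 0)))
tn = int(np.sum((y_pred == 0) & (y_test == 0)))
fn = int(np.sum((y_pred == 0) & (y_test == 1)))

# Standard metrics, guarding against division by zero.
accuracy = (tp + tn) / len(y_test)
precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
f1 = (2 * precision * recall / (precision + recall)
      if (precision + recall) > 0 else 0.0)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Evaluating only on the held-out test set is what makes the reported metrics an estimate of performance on unseen traffic.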

Final remarks: make sure you have the module docstring for your project code, indicating your name, your student ID number along with the description of the project.
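A module docstring of the kind described might look like the sketch below. The name is a placeholder, the ID is the example ID used in the submission instructions, and the function `attack_ratio` is hypothetical, included only to show that functions carry their own docstrings as well.

```python
"""CITS2401 Project 2: Visualising Data Traffic in Smart Cars.

Name: Jane Citizen (placeholder)
Student ID: 12345678 (placeholder)

Description: Analyses labelled smart car sensor readings and network
traffic with NumPy and visualises the results with matplotlib.
"""

import numpy as np


def attack_ratio(labels):
    """Return the fraction of readings labelled as attack (1)."""
    return float(np.mean(np.asarray(labels) == 1))
```

The module docstring must be the first statement in the file, before any imports, for Python to treat it as the module's documentation.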

The above tasks (tasks 1 to 4) are all due 19 May 2023 5pm.

Task 5 Peer marking

Please note, only this task is due 26 May 2023 5pm.

Project 2 will involve peer-marking: you will be asked to mark 3 other reports assigned to you for the data analysis (20 marks), visualisation (20 marks) and coding style (10 marks), a total of 50 marks. This is a failed component task (i.e., you must complete this task to pass the project).

Rules

1.  You must mark the reports according to the rubrics shown in section 4 below for data analysis and visualisation components (marking out of 40), and the coding style (marking out of 10). A total of 50 marks can be given to a report.

2.  You may also award bonus marks if the report includes the bonus task (task 4), up to a maximum of 5 marks.

3.  Giving unjustifiably poor or high marks will be flagged, and task 5 may be deemed incomplete.

o  So mark genuinely, following the rubrics!

4.  You must mark ALL three allocated reports.

5.  Your mark remains zero (0) until you complete all three peer markings.

6.  Your mark remains zero (0) if you intentionally breach rule (3).

Please note, all peer marks submitted will be moderated and adjusted as necessary. This will ensure the peer marking is done correctly and that you are not penalised by those who breach rule (3), intentionally or unintentionally.

3. Submission

Submission items (1) and (2) Code and Report

Submit your whole code (tasks 1 to 3, and 4 if it exists) in the quiz answer box by the due date (19 May 2023 5pm, drop-dead due date 26 May 2023 5pm with 5% raw penalty per day), containing all functions, objects etc., and also attach the Python file containing all the code you wrote for this project. You should name the file [student id]_P2.py. For example, if your student ID is 12345678, then your file name is 12345678_P2.py. Similarly, submit your report in PDF format ONLY to the project 2 report submission on the quiz server. Failure to follow these instructions will be regarded as NO SUBMISSION (i.e., you will receive 0 for this project).

Submission item (3) Peer Marking

Your peer marking allocation will be released on 22 May 2023. You will have access to the reports through MS Teams -> CITS2401 Team -> Project discussion -> Files -> Project 2 Reports

(link: https://uniwa.sharepoint.com/:f:/r/teams/CITS2401SEM-12023/Shared%20Documents/Project%20discussion/Project%202%20Reports?csf=1&web=1&e=FTGco6).

From here, find the reports you are assigned to mark (you can also view other reports for your own learning).

Once you have finished marking, submit the marks on the quiz server under “Project 2 peer marking submissions”. Invalid submissions will be deemed INCOMPLETE.

4. Rubrics

Each criterion below is marked at one of three levels: Highly Satisfactory (D, HD), Satisfactory (P, CR), or Unsatisfactory (N).

Criteria: Data Analysis (20 marks)

Understand the use of Python for data analysis. Demonstrate the ability to write and execute Python code for data analysis.

•   Highly Satisfactory (D, HD): Demonstrated the ability to analyse data using Python fluently: Correct use of five or more NumPy related skills for data analysis as appropriate in articulate ways. Meaningful data analysis results.

•   Satisfactory (P, CR): Demonstrated the ability to analyse data using Python: Correct use of four or five NumPy related skills for data analysis. Mostly meaningful data analysis results.

•   Unsatisfactory (N): Failed to demonstrate the ability to use Python for data analysis: Less than two correct uses of NumPy related skills for data analysis. The data analysis results are not intuitive/hard to understand.

Criteria: Visualisation (20 marks)

Understand the use of Python visualisation tools. Demonstrate the ability to visualise data in Python.

•   Highly Satisfactory (D, HD): Demonstrated the ability to use Python for visualisation fluently: Correct use of matplotlib for visualisations with advanced visualisation outputs. Visually appealing format for various data analysis results.

•   Satisfactory (P, CR): Demonstrated the ability to use Python for visualisation: Correct use of matplotlib for visualisations. Appropriate visualisation format for various data analysis results.

•   Unsatisfactory (N): Failed to demonstrate the ability to use Python functions: Incorrect use of matplotlib for visualisations. Inappropriate visualisation format for different data.

Criteria: Coding Style (10 marks)

Code is written in accordance with the style guideline. Code is written legibly and is of high standard.

•   Highly Satisfactory (D, HD): Demonstrated the ability to present Python code comprehensively: The coding style conforms to the style guideline with attention to details.

•   Satisfactory (P, CR): Demonstrated the ability to present Python code: The coding style conforms to the style guideline.

•   Unsatisfactory (N): Failed to demonstrate the ability to present Python code: The coding style does not conform to the style guideline.

This project is worth a total of 50 marks.

Task 4 marking guide:

For this task, the student must have implemented at least one ML model for IDS. Then, some form of data analysis and visualisation related to the ML IDS solution is needed. Depending on the level of discussion and visualisation, you (the marker) can award the report up to 5 additional marks (at your discretion). The marker must ensure that the bonus marks are only given to reports that satisfy the above criteria; otherwise it will be deemed poor marking (which can lead to you not getting any marks for YOUR project).

You may consult any of the lab facilitators if you are not sure about this.