DTS001 Data Analytic for Entrepreneurship 2025-2026
Module code and Title: DTS001 Data Analytic for Entrepreneurship
Academic Year: 2025-2026
Assignment Title: Final Coursework
Submission Deadline: 31st Oct. 2025
DTS001 Data Analytic for Entrepreneurship
Resit Coursework
Submission deadline: 30th Oct.
Percentage in final mark: 100%
Learning outcomes assessed:
A: Preprocess, analyse and interpret data using a modern computer package
B: Summarize and visualize data using a modern computer package
C: Present findings to a business audience in a suitable format
Late policy: 5% of the total marks available for the assessment will be deducted from the assessment mark for each working day after the submission date, up to a maximum of five working days.
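The late-policy arithmetic can be sketched as follows. This is only an illustration of the rule as worded above: the function name `late_penalty` and its parameters are hypothetical, and because the text does not say what happens after the fifth working day, the sketch returns `None` for that case rather than guessing.

```python
def late_penalty(total_marks_available, working_days_late,
                 percent_per_day=5, max_days=5):
    """Marks deducted under the late policy as worded in this brief.

    Assumption: the deduction is percent_per_day of the total marks
    available, multiplied by the number of working days late.
    """
    if working_days_late < 0:
        raise ValueError("working_days_late must be non-negative")
    if working_days_late > max_days:
        # The brief does not state the outcome beyond five working days.
        return None
    return total_marks_available * percent_per_day * working_days_late / 100


# Example: this coursework is marked out of 100.
print(late_penalty(100, 2))  # two working days late
print(late_penalty(100, 5))  # five working days late
```

For a coursework marked out of 100, two working days late would cost 10 marks and five working days late would cost 25 marks under this reading.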
Risks:
Please read the coursework instructions and requirements carefully. Not following these instructions and requirements may result in loss of marks.
Plagiarism results in a mark of ZERO.
The formal procedure for submitting coursework at XJTLU is strictly followed. A submission link on Learning Mall will be provided in due course. The submission timestamp on Learning Mall will be used to check for late submission.
All students must download their file and check that it is viewable after submission. Documents may become corrupted during the uploading process (e.g. due to slow internet connections). However, students themselves are responsible for submitting a functional and correct file for assessments.
In this assignment, you are strictly prohibited from using ChatGPT or similar natural language processing tools to attempt to directly solve the assignment problems. If such use is detected, the corresponding parts will be awarded 0 marks.
All submissions must be written in English; any parts containing Chinese will not be considered for marking.
Overview
In this coursework, you are required to complete two tasks based on the given dataset and submit a compressed document (a .zip file) that includes two files:
1. Task 1: An Excel file (.xlsx) containing your visualization and modeling process and results for the given dataset.
2. Task 2: A report (.pdf) analyzing the visualization and modeling results.
The assignment must be submitted via Learning Mall Online to the correct drop box. Only electronic submission is accepted; hard-copy submissions are not.
Task 1 (50 marks)
You will work with the “Bike Sharing Demand data set”, which contains information on bike rental demand in the Capital Bikeshare program in Washington, D.C. The data are split into a training set and a test set. There are 11 predictors in the training set and 9 predictors in the test set, plus 1 target variable, the total bike rental demand (count), which is highlighted in bold red in the .xlsx file in both the training and test sets. Detailed descriptions of all numeric features are provided in Appendix.docx. The task specifications are as follows:
Target for visualization: You are asked to use Excel to create visualizations that complete the following tasks on the training set:
o Clean and preprocess the original training set (10 marks)
o Show the impact of Holiday on the total bike rental demand through appropriate tables and charts (5 marks)
o Show the impact of Weather on the total bike rental demand through appropriate tables and charts (5 marks)
o Show the impact of Temp on the total bike rental demand through appropriate tables and charts (5 marks)
o Show the impact of WindSpeed on the total bike rental demand through appropriate tables and charts (5 marks)
Target for model: You are asked to use Excel to construct a model that predicts the total bike rental demand (count, the column in bold red font in the .xlsx file) based on the features in the training set. Your model needs to complete the following tasks:
o Choose the appropriate independent variable for the appropriate model (8 marks)
o Maximize R-squared (R²) on the training set (6 marks)
o Minimize mean squared error (MSE) on the test set (6 marks)
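For reference, the two metrics named above can be written out explicitly. The following is a minimal sketch in Python, given only to make the definitions concrete; the coursework itself must be completed in Excel. The numbers are made up for illustration and are not taken from the Bike Sharing Demand data set, the single-predictor least-squares fit is just one possible model, and all function and variable names are hypothetical.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot (share of variance explained)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mse(actual, predicted):
    """Mean squared error: average of squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up illustration data (NOT from the coursework data set).
train_temp = [5.0, 10.0, 15.0, 20.0, 25.0]
train_count = [40.0, 90.0, 160.0, 190.0, 270.0]
test_temp = [8.0, 18.0, 28.0]
test_count = [70.0, 180.0, 290.0]

# Fit on the training set only, then score each set separately.
intercept, slope = fit_simple_linear(train_temp, train_count)
train_pred = [intercept + slope * t for t in train_temp]
test_pred = [intercept + slope * t for t in test_temp]

train_r2 = r_squared(train_count, train_pred)
test_mse = mse(test_count, test_pred)
print(f"training R^2 = {train_r2:.4f}, test MSE = {test_mse:.4f}")
```

The point of the split is that the model is fitted on the training set only: R² describes how well it fits the data it was built from, while MSE on the test set measures how well it predicts data it has not seen.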
The submitted Excel file should include:
o The original training set
o The training set after data preprocessing
o All visualized tables and charts
o Summary output of the final model on both training and test sets
Detailed Requirements:
o The formulas and functions used in data preprocessing need to be retained in your .xlsx file. You need to demonstrate through formulas how the data was transformed step by step.
o Charts and tables need to be generated by Excel and remain in an editable state in your .xlsx file. Screenshots will not be accepted.
Additional notes:
o Add-ins that have not been mentioned in lectures may be used, but you must cite the source and ensure that the add-in is publicly available.
o Newly constructed features may be used during model construction, but these features must be based on the original training set, and the process of constructing them needs to be retained.
Task 2 (50 marks)
In this task, you need to write a report based on your visualization and modeling results.
Target for report: You are asked to write a report (in PDF) that analyzes your visualization results and evaluates your model's prediction results on the test set. The report should consist of the following contents:
o Analysis of each visualization table and chart (16 marks)
o Conclude which feature has the greatest impact on the number of total bike rental demand in Washington, D.C. and provide corresponding evidence (6 marks)
o Evaluation of the R-squared (R²) of the model (5 marks)
o Evaluation of the MSE of the model (5 marks)
o Elaborate on the potential of your predictions in commercial applications (10 marks)
o Discuss the limitations of the model and potential directions for improvement (8 marks)
The formatting requirements in the report:
o Font: Times New Roman
o Page limit: 1 page
o Line Spacing: single space
o Spacing Before: 0pt
o Spacing After: 12pt
Notes:
o Newly created features can be included in the discussion
o You can evaluate your model by comparing different models
o Discussions on directions for improvement can include discussions on improving the dataset
o Marks may be deducted if your report exceeds 1 page
Marking Criteria
| Tasks | Total (100) | Components | Description | Maximum Credit | Mark |
|---|---|---|---|---|---|
| Task 1 | 50 | Data Preprocessing [10 marks] | Missing value handling | 3 | |
| | | | Outlier handling | 4 | |
| | | | Non-numeric data handling | 3 | |
| | | Data Visualization [20 marks] | Visualization for Holiday | 5 | |
| | | | Visualization for Weather | 5 | |
| | | | Visualization for Temp | 5 | |
| | | | Visualization for WindSpeed | 5 | |
| | | Model Construction [20 marks] | Model Choice | 8 | |
| | | | Prediction MSE | 6 | |
| | | | Predicted Result Table | 6 | |
| Task 2 | 50 | Visualization Analysis [22 marks] | Analysis of pivot table | 8 | |
| | | | Analysis of pivot chart | 8 | |
| | | | Analysis of the impact level on different features | 6 | |
| | | Model Evaluation [20 marks] | Evaluation of the fitness of the model | 5 | |
| | | | Evaluation of the predicted results of different predicted value ranges of the model | 5 | |
| | | | Potential of commercial applications | 10 | |
| | | Discussion [8 marks] | Limitations | 4 | |
| | | | Future improvement directions | 4 | |
| Late Submission? | o Yes o No | Days late | | | |
| Final Marks | | | | | |
2025-10-14