BUSN5101  Programming for Business

Python Programming

Group Assignment


Learning outcomes of the assignment:

•  Practice with real-world data sets and understand the structure and organisation of data

•  Learn to import large-scale raw data into programming environments, and to perform basic data wrangling (cleaning and formatting data for downstream processing)

•  Learn to apply programming skills to process raw data to solve business-related problems

•  Learn to visualize data to display the results effectively, and communicate the gained insights to clients and stakeholders


Two CSV files are provided. One contains information about airports in the US, and the other contains details of the causes of airline delays in February 2020.

Use Python programming language to complete the following activities:

Part A:

To complete this section, you are not allowed to import and use any Python module (in particular the csv module) to load data into the IDE and process data, except for part A.3 where you need to visualize data.

The first CSV file lists information about the airports, such as name, city, latitude (lat) and longitude (long) coordinates. We would like to create a text file containing only the airport names and their coordinates. We also need to find the airports with 32° < lat < 37° and -100° < long < -80°, and store them in a second text file. In addition, you must report the number of airports that satisfy these coordinate conditions.
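A minimal sketch of the parsing and filtering described above, using only built-in Python. The column names (`airport`, `lat`, `long`), the sample rows, the output file names and all function names are assumptions for illustration; the real file may be laid out differently, and fields that contain commas inside quotes would need extra handling.

```python
# Hypothetical sample rows; the real data would be read with
# open("airports.csv").readlines() (file name assumed).
SAMPLE_LINES = [
    "iata,airport,city,state,country,lat,long",
    "ATL,William B Hartsfield-Atlanta Intl,Atlanta,GA,USA,33.64,-84.43",
    "JFK,John F Kennedy Intl,New York,NY,USA,40.64,-73.78",
    "DFW,Dallas-Fort Worth International,Dallas-Fort Worth,TX,USA,32.90,-97.04",
]


def parse_airports(lines):
    """Turn CSV lines into (name, lat, long) tuples without the csv module.

    Assumes no field contains a comma and that the header row uses the
    (assumed) column names 'airport', 'lat' and 'long'.
    """
    header = lines[0].strip().split(",")
    name_i = header.index("airport")
    lat_i = header.index("lat")
    long_i = header.index("long")
    airports = []
    for line in lines[1:]:
        if not line.strip():
            continue  # skip blank lines
        fields = line.strip().split(",")
        airports.append((fields[name_i], float(fields[lat_i]), float(fields[long_i])))
    return airports


def in_region(airports, lat_min=32.0, lat_max=37.0, long_min=-100.0, long_max=-80.0):
    """Keep airports strictly inside the latitude and longitude bounds."""
    return [a for a in airports
            if lat_min < a[1] < lat_max and long_min < a[2] < long_max]


def write_airports(path, airports):
    """Write one tab-separated line per airport: name, lat, long."""
    with open(path, "w") as out:
        for name, lat, lon in airports:
            out.write(f"{name}\t{lat}\t{lon}\n")


airports = parse_airports(SAMPLE_LINES)
selected = in_region(airports)
write_airports("airport_coordinates.txt", airports)
write_airports("airports_in_region.txt", selected)
print(len(selected), "airports satisfy the coordinate conditions")
```

The bounds are treated as strict inequalities, matching the conditions as written; an airport exactly on a boundary would be excluded.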

1.     Develop a flowchart for completing the activity.

2.     Develop a Python script for loading the CSV file, processing the data and creating the text files with the requested information.

3.     Plot the coordinates of the airports using a scatter diagram with only the negative longitudinal coordinate values.
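One possible shape for step 3 is sketched here, assuming the airports are already held as `(name, lat, long)` tuples. The choice of matplotlib, the function names and the output file name are assumptions; the brief does not prescribe a plotting library.

```python
def negative_longitude_points(airports):
    """Return parallel lists of longitudes and latitudes where long < 0."""
    longs = [lon for _, lat, lon in airports if lon < 0]
    lats = [lat for _, lat, lon in airports if lon < 0]
    return longs, lats


def plot_airports(airports, path="airports_scatter.png"):
    """Save a scatter diagram of the airports with negative longitude."""
    # Imported inside the function because Part A only permits modules
    # for the visualisation step.
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    longs, lats = negative_longitude_points(airports)
    fig, ax = plt.subplots()
    ax.scatter(longs, lats, s=10)
    ax.set_xlabel("Longitude (degrees)")
    ax.set_ylabel("Latitude (degrees)")
    ax.set_title("Airports with negative longitude")
    fig.savefig(path)
    plt.close(fig)


# Hypothetical tuples; the last one has a positive longitude and is dropped.
sample = [("A", 33.6, -84.4), ("B", 40.6, -73.8), ("C", 13.5, 144.8)]
longs, lats = negative_longitude_points(sample)
```

Calling `plot_airports(sample)` would then write the diagram to `airports_scatter.png`, provided matplotlib is installed.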

Part B:

To complete this section, you are asked to use the Python Pandas module.

The second CSV file contains details of the airline delays. The delays are presented in two forms: counts of the delayed flights, and the duration of the delays in minutes. The file also contains the causes of the airline delays, which are listed in the table below:

Air Carrier Delay | Weather Delay | National Aviation System (NAS) Delay | Security Delay | Aircraft Arriving Late | Cancelled | Diverted

Using Pandas and other Python functions and tools, develop scripts to complete the following activities:

1.    Find the total number of flights (operations).

2.    Find the total number of delayed flights.

3.    Find the total delayed time in minutes.

4.    Find the airport with the largest number of delayed flights.

5.    Find the coordinates (from Part A) of the airport with the highest delayed time.

6.    Find the airport in Texas which has the largest number of delayed flights.

7.    Using a pie chart, display the percentages of on-time flights and of the items listed in the table above.
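The activities above could be approached along the following lines. This is a sketch under stated assumptions, not a model answer: every column name (`airport`, `state`, `arr_flights`, `arr_del15`, `arr_delay`, the `*_ct` count columns, `arr_cancelled`, `arr_diverted`), the sample values and the `coords` lookup are hypothetical stand-ins for the real file and for the Part A output. It also assumes that on-time flights are the total flights minus delayed, cancelled and diverted flights.

```python
import pandas as pd

# Hypothetical rows standing in for pd.read_csv(<delay file>);
# all column names are assumptions about the file layout.
delays = pd.DataFrame({
    "airport": ["ATL", "ATL", "DFW", "IAH"],
    "state": ["GA", "GA", "TX", "TX"],
    "arr_flights": [100, 50, 80, 40],     # flights (operations)
    "arr_del15": [15, 10, 20, 4],         # delayed flights
    "arr_delay": [600, 200, 900, 150],    # delay in minutes
    "carrier_ct": [6, 4, 8, 2],
    "weather_ct": [1, 1, 2, 0],
    "nas_ct": [3, 2, 4, 1],
    "security_ct": [0, 0, 1, 0],
    "late_aircraft_ct": [5, 3, 5, 1],
    "arr_cancelled": [2, 1, 3, 0],
    "arr_diverted": [1, 0, 1, 0],
})

# Activities 1 to 3: overall totals.
total_flights = delays["arr_flights"].sum()
total_delayed = delays["arr_del15"].sum()
total_delay_minutes = delays["arr_delay"].sum()

# Activities 4 and 5: aggregate per airport, then take the largest.
per_airport = delays.groupby("airport")[["arr_del15", "arr_delay"]].sum()
most_delayed_airport = per_airport["arr_del15"].idxmax()
longest_delay_airport = per_airport["arr_delay"].idxmax()

# Hypothetical (lat, long) lookup built from the Part A results.
coords = {"ATL": (33.64, -84.43), "DFW": (32.90, -97.04)}
longest_delay_coords = coords[longest_delay_airport]

# Activity 6: restrict to Texas before aggregating.
texas = delays[delays["state"] == "TX"]
most_delayed_texas = texas.groupby("airport")["arr_del15"].sum().idxmax()

# Activity 7: percentage shares for the pie chart.
cause_columns = ["carrier_ct", "weather_ct", "nas_ct", "security_ct",
                 "late_aircraft_ct", "arr_cancelled", "arr_diverted"]
counts = delays[cause_columns].sum()
counts["on_time"] = total_flights - counts.sum()  # assumed definition
shares = counts / counts.sum() * 100
# shares.plot.pie(autopct="%1.1f%%") would draw the chart (needs matplotlib).
```

Aggregating with `groupby` before calling `idxmax` matters when an airport appears in several rows, for example once per carrier; taking the maximum over raw rows would then compare carrier-level figures rather than airport totals.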

Brief Marking Criteria:

Item | Criteria | Consideration

Report | Correctness | Results and information presented in the report are correct and relevant.

Report | Organisation | The report has all the elements of a good report (such as an introduction, body and conclusion). The elements are structured and developed with a logical connection to form a narrative that serves the specific purpose of the report.

Report | Presentation | The format of the report is well suited to its purpose and is consistent. The figures, plots etc. are easy to understand. The report neatly presents the methodologies, results etc.

Codes | Clarity | Codes are structured in a logical way. Codes are consistent in the spacing of statements and blocks. Codes have explanations so that other users can understand them easily.

Codes | Elegance | The codes are as simple as possible, without unnecessary complexity.

Report Length (recommended):

Maximum number of pages: 13

Maximum number of words: 3500