Mad Paws - QBUS6600 Project Outline
Background
Mad Paws (https://www.madpaws.com.au/) is an online marketplace that provides pet services from over 40,000 pet carers across Australia. With Mad Paws, pet owners can search for and connect with nearby pet sitters who can take care of their pet when the owner is unable to. Mad Paws offers a variety of services: pet sitting (where either the sitter stays at the owner’s house with the pet or the pet stays at the sitter’s house), dog walking, dog training, and short house visits where the sitter drops by the owner’s house to feed the pet.
The booking process involves:
● Owners search for potential nearby sitters through the Mad Paws website or app. They input their location, the pet service they are looking for, the number and type of pets they have, and the requested date for the service.
● Mad Paws will then present a list of nearby sitters. Sitters list their own prices, provide their location and other useful information such as what type of pets they are willing to work with (large dogs, small dogs, cats etc.), their type of house, whether they have children and other pets etc.
● Owners then contact the sitter that they have chosen with a booking request for specific dates.
● If the sitter is available and approves the request, it will result in a created booking.
● Finally, the sitter provides the service on the requested date and completes the booking. The owner also has the option to cancel at any time before the booking is completed, resulting in an uncompleted booking. Alternatively, the sitter can decline the booking request or not respond, which results in an expired booking that also counts as uncompleted.
To understand this process better, you can visit the Mad Paws website (https://www.madpaws.com.au/) or download the app to search for pet sitters.
Mad Paws wishes to understand factors that drive booking completion between pet owners and pet sitters. By better understanding these relationships, Mad Paws can provide pet owners and their pets with a better experience by ranking and matching them with the best sitters that are most suitable for their needs.
Problem Description
You have been provided with a dataset from Mad Paws that summarises features relating to pet owners, pet sitters, and requested bookings (see ‘Data Description’ for more detail).
In your assignment, you will:
● As a business analyst, conduct a preliminary Exploratory Data Analysis (EDA) of the dataset. You are expected to reveal the properties, characteristics, patterns, and statistics hidden in the dataset. The results from your EDA may be used for the final goal of identifying the top attributes that are likely to predict whether a booking request will result in a completed booking. You may consider investigating other booking outcomes (e.g. expired bookings, declined bookings etc.), but the focus is to determine which factors lead a booking request between owner and sitter to be completed versus not completed.
Note that Mad Paws is already aware of some obvious factors (in the Sitter Stats Basic Table) that result in completed bookings, such as sitter completion rates or the number of bookings a sitter has previously completed with an owner. As such, they are particularly interested in drivers of completed bookings that may not seem so obvious. Therefore, while you are still encouraged to analyse these obvious statistics (as they may be useful features for increasing model performance), please make sure to analyse the less obvious factors as well. For example, one hypothesis you may consider is whether some owners match better with sitters that focus on specific breeds. Also take care to avoid data leakage when using the pre-computed statistics from the Sitter Stats Basic Table. You can use the timestamp corresponding to each statistic to avoid including any future data.
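The timestamp-based leakage check described above can be sketched as a point-in-time lookup: for each booking request, use only the most recent sitter statistic calculated strictly before the booking was created. This is a minimal illustration in plain Python, not a prescribed method. The tuple layout, the names `build_stat_index` and `stat_as_of`, and the sample values are all assumptions made for the example; the real tables will have their own column names.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import date

# Hypothetical rows: (sitter_id, stat_timestamp, completion_rate).
# These values are made up for illustration only.
sitter_stats = [
    (1, date(2023, 1, 1), 0.50),
    (1, date(2023, 2, 1), 0.80),
    (2, date(2023, 3, 1), 0.90),
]

def build_stat_index(stats):
    """Group statistics by sitter, each group sorted by timestamp."""
    index = defaultdict(list)
    for sitter_id, ts, value in sorted(stats, key=lambda r: (r[0], r[1])):
        index[sitter_id].append((ts, value))
    return index

def stat_as_of(index, sitter_id, booking_created):
    """Return the latest statistic computed strictly before
    booking_created, or None if no earlier statistic exists."""
    history = index.get(sitter_id, [])
    timestamps = [ts for ts, _ in history]
    # bisect_left gives the count of timestamps strictly earlier
    # than booking_created, so history[pos - 1] is the latest safe row.
    pos = bisect_left(timestamps, booking_created)
    return history[pos - 1][1] if pos > 0 else None

index = build_stat_index(sitter_stats)
rate = stat_as_of(index, 1, date(2023, 3, 1))  # uses the 2023-02-01 row
```

A request with no earlier statistic returns `None` rather than borrowing a future value, so such rows need an explicit missing-value strategy. On DataFrames, `pandas.merge_asof` with `direction="backward"` and a `by` key performs the same kind of join.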
● Synthesise your potential insights from the EDA and construct a model which can predict whether a booking is completed versus not completed (where a not completed booking could be a result of an expired or declined booking).
You will need to build a model using whatever machine learning approaches you consider appropriate. You should evaluate your model/s on a range of metrics; however, the F1 score (defined below) will be used to evaluate the performance of your final model on the test data. You should follow an industry-recognised approach to Data Science problems (e.g. CRISP-DM) and include a justification for your selected model. You will be required to show the methods you used to prioritise your potential insights and defend the models and results with supporting evidence. You will also be required to submit your booking completion predictions on the test data.
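As a reference point for the evaluation metric, the sketch below computes the standard binary F1 score, the harmonic mean of precision and recall. It assumes the positive class is a completed booking (response 1) and that the course uses the standard definition; the function name `f1_score_binary` and the sample labels are illustrative only.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Standard binary F1 = 2 * precision * recall / (precision + recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so F1 is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up labels: 1 = completed booking, 0 = uncompleted booking.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 1]
score = f1_score_binary(y_true, y_pred)  # tp=3, fp=1, fn=1
```

In practice, `sklearn.metrics.f1_score` gives the same value for binary labels. Because F1 ignores true negatives, it is sensitive to the decision threshold applied to predicted probabilities, so tuning that threshold on validation data is usually worthwhile.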
Important:
1) Please use the pre-split training and test sets. Your evaluation metrics on the test set are important.
2) Please consider which variables are not available at the time the booking request is made, and exclude those as predictor variables (because in real life, your model won’t have them available when making predictions!). You can read more about data leakage here: https://www.kaggle.com/code/alexisbcook/data-leakage
● Based on your analysis, deliver recommendations for Mad Paws that take advantage of the key behaviours and attributes that can help increase bookings.
Specifically, the deliverable for this group project will be an algorithm that is capable of predicting likelihood of booking completion between an owner and sitter. As a final task, you will be required to present your algorithm, your insights relating to characteristics of highly ranked sitters, as well as estimates relating to the increased bookings from implementing your algorithm and/or recommendations, supported by your analysis and/or any assumptions.
Please limit the number of recommended projects to 1-2. Also note that it is ideal for groups to recommend deployment of their model, however groups can also leverage model insights for recommendations, as long as the recommendations are closely linked to the insights and not overly general in nature (e.g. general app redevelopment or event).
Data Description
The Mad Paws dataset is split across 6 CSV files:
● Training Response: ~55.3k rows for the training data where each row contains a pet owner and sitter pair, the booking created date and the response (1 for completed booking and 0 for uncompleted booking).
● Test Response: ~17.2k rows for the testing data. The features provided are the same as the Training Response table.
● Sitter Stats Basic: ~76.9k rows containing basic statistics for a sample of pet sitters such as their completion rate, response rate, total reviews etc. The statistics also have a corresponding timestamp in each row to indicate when the statistic was calculated.
● Owners Pet Detail: ~6.3k rows containing details of the pet owner and their pet such as the owner’s location, number of pets they have, their pet breed, weight etc.
● Sitter Locations: ~22.9k rows containing the location (suburb and postcode) of each pet sitter.
● Booking Data: ~1.2 million rows where each row is a unique booking between a pet owner and sitter; provides information related to the time of booking, pet service requested, booking status and pet details.
2023-08-29