Mad Paws - QBUS6600 Project Outline

Background

Mad Paws (https://www.madpaws.com.au/) is an online marketplace that provides pet services from over 40,000 pet carers across Australia. With Mad Paws, pet owners can search for and connect with nearby pet sitters who can help take care of their pet when the owner is unable to. Mad Paws offers a variety of services: pet sitting, where either the sitter stays at the owner’s house with the pet or the pet stays at the sitter’s house; dog walking; dog training; and short house visits, where sitters drop by the owner’s house to feed the pet.

The booking process involves:

●   Searching for potential nearby sitters through the Mad Paws website or app. Owners input their location, the pet service they are looking for, the number and type of pets they have, and the requested date for the service.

●   Mad Paws will then present a list of nearby sitters. Sitters list their own prices, provide their location and other useful information such as what type of pets they are willing to work with (large dogs, small dogs, cats etc.), their type of house, whether they have children and other pets etc.

●   Owners then contact the sitter that they have chosen with a booking request for specific dates.

●    If the sitter is available and approves the request, it will result in a created booking.

●   The sitter will finally provide the service on the requested date and complete the booking. The owner also has the option to cancel at any time prior to the completion of the booking, resulting in an uncompleted booking. Alternatively, the sitter can decline the booking request or not respond, which results in a declined or expired booking respectively; both count as uncompleted bookings.
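The outcomes above reduce to the binary response used later in this outline (1 for a completed booking, 0 for an uncompleted booking). A minimal sketch of that mapping follows. The status strings and the function name `booking_response` are assumptions made for illustration; the actual values in the Booking Data file may differ and should be checked first.

```python
# Hypothetical status labels; the real values in the dataset may differ.
COMPLETED_STATUS = "completed"
UNCOMPLETED_STATUSES = {"cancelled", "declined", "expired"}


def booking_response(status: str) -> int:
    """Map a booking status to the binary response: 1 = completed, 0 = uncompleted."""
    normalised = status.strip().lower()
    if normalised == COMPLETED_STATUS:
        return 1
    if normalised in UNCOMPLETED_STATUSES:
        return 0
    # Fail loudly on unexpected statuses instead of silently mislabelling them.
    raise ValueError(f"Unknown booking status: {status!r}")


example_statuses = ["completed", "cancelled", "declined", "expired"]
example_labels = [booking_response(s) for s in example_statuses]
```

Raising on unknown statuses is a deliberate choice here: any status not covered by the mapping is surfaced during EDA instead of being folded into one of the two classes.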

To understand this process better, you can visit the Mad Paws website (https://www.madpaws.com.au/) or download the app to search for pet sitters.

Mad Paws wishes to understand the factors that drive booking completion between pet owners and pet sitters. By better understanding these relationships, Mad Paws can provide pet owners and their pets with a better experience by ranking sitters and matching owners with the sitters most suitable for their needs.

Problem Description

You have been provided with a dataset from Mad Paws that summarises features relating to pet owners, pet sitters, and requested bookings (see ‘Data Description’ for more detail).

In your assignment, you will:

●   As a business analyst, conduct a preliminary Exploratory Data Analysis (EDA) of the dataset. You are expected to find or reveal all possible properties, characteristics, patterns, and statistics hidden in the dataset. The results from your EDA may be used for the final goal of identifying the top attributes that are likely to predict whether a booking request will result in a completed booking. You may consider investigating other booking outcomes (e.g. expired bookings, declined bookings etc.), but the focus is to determine which factors lead a booking request between an owner and a sitter to be completed versus not completed.

Note that Mad Paws is already aware of some obvious factors (in the Sitter Stats Basic table) that result in completed bookings, such as sitter completion rates or the number of bookings a sitter has previously completed with an owner. As such, they are particularly interested in drivers of completed bookings that may not seem so obvious. Therefore, while you are still encouraged to analyse these obvious statistics (as they may be useful features for increasing model performance), please make sure to also analyse the less obvious factors. For example, one hypothesis you may consider is whether some owners match better with sitters that focus on specific breeds. Also take care to avoid data leakage when using the pre-computed statistics from the Sitter Stats Basic table. You can use the timestamp corresponding to each statistic to avoid including any future data.

●   Synthesise your potential insights from the EDA and construct a model which can predict whether a booking is completed versus not completed (where a not completed booking could be a result of an expired or declined booking).

You will need to build a model with whatever machine learning approaches you feel are appropriate. You should evaluate your model(s) on a range of metrics; however, the F1 score (defined below) will be used to evaluate the performance of your final model on the test data. You should follow an industry-recognised approach to Data Science problems (e.g. CRISP-DM) and include a justification for your selected model. You will be required to show the methods you used to prioritise your potential insights and defend the models and results with supporting evidence. You will also be required to submit your booking completion predictions on the test data.
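For reference, the standard F1 score for a binary classifier is the harmonic mean of precision and recall, which is equivalent to 2·TP / (2·TP + FP + FN), where TP, FP and FN count true positives, false positives and false negatives for the positive class (here, completed bookings). The sketch below computes it from scratch; the function name `f1_score_binary` and the sample labels are illustrative assumptions, not part of the provided data.

```python
from typing import Sequence


def f1_score_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN), treating label 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    # Convention: return 0.0 when there are no positives in either vector.
    return 0.0 if denominator == 0 else 2 * tp / denominator


# Made-up labels: 1 = completed booking, 0 = uncompleted booking.
y_true_example = [1, 1, 0, 0, 1]
y_pred_example = [1, 0, 0, 1, 1]
f1_example = f1_score_binary(y_true_example, y_pred_example)  # TP=2, FP=1, FN=1
```

In practice, `sklearn.metrics.f1_score` computes the same quantity for binary labels with its default settings. Because F1 is computed on hard labels, a model that outputs probabilities needs a decision threshold, and that threshold should be chosen on the training data only.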

Important:

1)   Please use the pre-split training and testing sets. Your evaluation metrics on the test set are important.

2)   Please consider which variables are not available at the time of the booking request, and exclude those as predictor variables (because in real life, your model won’t have them available when making predictions!). You can read more about data leakage here: https://www.kaggle.com/code/alexisbcook/data-leakage
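One way to apply the timestamp rule for the Sitter Stats Basic table is a point-in-time lookup: for each booking, use only the most recent statistic calculated strictly before the booking was created. The sketch below illustrates the idea with the standard library. The tuple layout, the sitter IDs, the dates and the function names `build_stats_index` and `latest_stat_before` are all hypothetical; the real column names must be taken from the provided files.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import date

# Hypothetical rows: (sitter_id, date the statistic was calculated, completion_rate).
sitter_stats = [
    ("s1", date(2023, 1, 1), 0.50),
    ("s1", date(2023, 3, 1), 0.80),
    ("s2", date(2023, 2, 1), 0.90),
]


def build_stats_index(stats):
    """Group statistics by sitter, sorted by calculation date."""
    index = defaultdict(list)
    for sitter_id, stat_date, rate in sorted(stats, key=lambda row: (row[0], row[1])):
        index[sitter_id].append((stat_date, rate))
    return index


def latest_stat_before(index, sitter_id, booking_date):
    """Return the latest rate calculated strictly before booking_date, else None."""
    rows = index.get(sitter_id, [])
    dates = [stat_date for stat_date, _ in rows]
    # bisect_left counts the rows whose date is strictly earlier than booking_date.
    position = bisect_left(dates, booking_date)
    return rows[position - 1][1] if position > 0 else None


stats_index = build_stats_index(sitter_stats)
rate_at_booking = latest_stat_before(stats_index, "s1", date(2023, 2, 15))
```

Here the booking created on 15 February only sees the January statistic, not the March one. Using a strict "before" comparison is a conservative assumption: a statistic stamped on the booking date itself might already reflect that booking. With pandas, `pandas.merge_asof` with `direction="backward"` and a `by` key performs the same kind of lookup on DataFrames.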

●   Based on your analysis, deliver recommendations for Mad Paws to take advantage of the key behaviours and attributes that can help increase bookings.

Specifically, the deliverable for this group project will be an algorithm that is capable of predicting the likelihood of booking completion between an owner and a sitter. As a final task, you will be required to present your algorithm, your insights relating to the characteristics of highly ranked sitters, and estimates of the increase in bookings from implementing your algorithm and/or recommendations, supported by your analysis and/or any assumptions.

Please limit the number of recommended projects to 1-2. Also note that it is ideal for groups to recommend deployment of their model; however, groups can also leverage model insights for recommendations, as long as the recommendations are closely linked to the insights and not overly general in nature (e.g. general app redevelopment or event).

Data Description

The Mad Paws dataset is organised into 6 CSV files:

Training Response: ~55.3k rows for the training data where each row contains a pet owner and sitter pair, the booking created date and the response (1 for completed booking and 0 for uncompleted booking).

Test Response: ~17.2k rows for the testing data. The features provided are the same as the Training Response table.

Sitter Stats Basic: ~76.9k rows containing basic statistics for a sample of pet sitters such as their completion rate, response rate, total reviews etc. The statistics also have a corresponding timestamp in each row to indicate when the statistic was calculated.

Owners Pet Detail: ~6.3k rows containing details of the pet owner and their pet such as the owner’s location, number of pets they have, their pet breed, weight etc.

Sitter Locations: ~22.9k rows containing the location (suburb and postcode) of each pet sitter.

Booking Data: ~1.2 million rows where each row is a unique booking between a pet owner and sitter; provides information related to the time of booking, pet service requested, booking status and pet details.