
AD654: Marketing Analytics

Assignment V:  Handling Time Series Data & Modeling with an Interaction Term

Once you have completed this assignment, you will upload two files into Blackboard: ideally, one .ipynb plus one PDF.

For any question that asks you to perform some particular task, you just need to show your input and output in Jupyter Notebook or Colab. Tasks will always be written in regular, non-italicized font.

For any question that asks you to include interpretation, write your answer in a Markdown cell in Jupyter Notebook. Any homework question that needs interpretation will be written in italicized font. Do not simply write your answer in a code cell as a comment; use a Markdown cell instead.

Remember to be resourceful!  There are many helpful resources available to you, including the video library, the class slides, the recitation sessions, the Zoom office hours sessions, and the web.

Part I:  Working with Time Series Data (4 points)

A.  Pick any publicly-traded company that trades on the Nasdaq or the NYSE.

a.   What company did you select, and what is its ticker symbol?

B.   Go to Yahoo! Finance: finance.yahoo.com. Enter your company’s ticker symbol in the search bar near the top of your screen. Next, click on “Historical Data” and then “Download.” This will automatically download a .csv with one year’s worth of the company’s data onto your computer.

C.  Bring the dataset into your environment using read_csv() from pandas, but now add some extra parameters to that function: index_col='Date' and parse_dates=True.

a.   Use the head() function to explore the variables, and show your results.

b.  Next, call the info() function on your dataset, and show your results.
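As an illustration of step C, here is a minimal sketch that uses a tiny made-up CSV in place of the real Yahoo! Finance download. The column names and values are assumptions for demonstration only; your downloaded file may differ.

```python
import io
import pandas as pd

# Hypothetical stand-in for a downloaded file; names and values are made up.
csv_text = """Date,Open,High,Low,Close,Volume
2023-01-03,100.0,102.0,99.0,101.0,1000000
2023-01-04,101.0,103.5,100.5,103.0,1200000
2023-01-05,103.0,104.0,101.0,102.0,900000
"""

# index_col="Date" makes the Date column the row index;
# parse_dates=True asks pandas to parse that index as datetimes.
df = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

print(df.head())       # first rows, now labeled by date
df.info()              # the first line of output reports the index type
print(type(df.index))  # a DatetimeIndex when parsing succeeded
```

With a real file, you would pass the file path to read_csv() instead of the in-memory text.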

D. Is this dataframe indexed by time values? How do you know this?

E.   In your Jupyter Notebook, view the index attribute of your time series.

a.   Now, view the max and min value of your index attribute.

b.   Now, view the argmax and argmin values of your index attribute.

c. What do the results of max, min, argmax, and argmin represent?
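To see how these four calls behave, here is a small sketch on a made-up, deliberately unsorted index (the dates are illustrative, not from any real ticker). It shows that max and min return timestamps, while argmax and argmin return integer positions.

```python
import pandas as pd

# Three made-up dates, deliberately out of order.
idx = pd.DatetimeIndex(["2023-01-05", "2023-01-03", "2023-01-04"])

print(idx.max())     # the latest timestamp in the index
print(idx.min())     # the earliest timestamp in the index
print(idx.argmax())  # integer position of the latest timestamp -> 0
print(idx.argmin())  # integer position of the earliest timestamp -> 1
```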

F.    Let’s visualize the entire time series.

a.   First, just call .plot() on your dataframe object.

i. Describe what you see here.   Why is this a challenging graph to interpret? What would make it easier to understand?

b.   Now,  re-run the  .plot() function,  but this time, call that function on the ‘Close’ variable only.

i.     Now, in a couple of sentences, describe what you see. Why is this graph more easily interpretable than the one you plotted in the previous step?

c.   Plotting a subset of your data

i.      Using a slice operation, plot the daily ‘Close’ variable from your dataset for any one-month period of your choice.

ii.      Now, show the plot you drew with the previous step, but with a new figsize, line color, and style.
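One possible way to take a one-month slice is sketched below on a synthetic ‘Close’ series (the dates and prices are invented). It relies on label-based slicing of a DatetimeIndex with .loc, which accepts a partial date string such as a year and month.

```python
import numpy as np
import pandas as pd

# Synthetic business-day series standing in for real closing prices.
dates = pd.date_range("2023-01-02", periods=60, freq="B")
close = pd.Series(np.linspace(100, 110, 60), index=dates, name="Close")

# A partial date string selects every row in that month.
feb = close.loc["2023-02"]

# An explicit start:end label slice gives the same rows (both ends inclusive).
same = close.loc["2023-02-01":"2023-02-28"]

print(feb.index.min(), feb.index.max(), len(feb))
```

Assuming matplotlib is installed, calling something like feb.plot(figsize=(10, 4), color="green", style="--") on the slice would draw it with a custom size, color, and line style.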

G.  Rolling windows

a.   Generate a 10-period moving average for your ‘Close’ variable, and create a plot that overlays this 10-period average atop the actual daily closing prices.

b.   Next, generate a 50-period moving average for your ‘Close’ variable, and create a plot that overlays this 50-period average atop the actual daily closing prices.

c. How are your two moving average plots different from one another? What are some pros and cons of shorter and longer moving average windows?
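The rolling-window calculation can be sketched as follows on a synthetic random-walk series (invented data, not real prices). The sketch only computes the averages; the overlay plot is left to you.

```python
import numpy as np
import pandas as pd

# Synthetic random-walk "closing prices" on business days.
dates = pd.date_range("2023-01-02", periods=120, freq="B")
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, size=120).cumsum(),
                  index=dates, name="Close")

# Each value is the mean of the current row and the preceding window-1 rows.
ma10 = close.rolling(window=10).mean()
ma50 = close.rolling(window=50).mean()

# The first window-1 entries are NaN because the window is not yet full.
print(ma10.isna().sum())  # 9
print(ma50.isna().sum())  # 49
```

Because both averages share the index of the original series, they can be drawn on the same axes as the daily values.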

H.  Next, we will try something called resampling.

a.   Resample your time series so that its values are based on quarterly time periods’ mean values for ‘Close’, rather than daily periods.

i.      Plot this newly-resampled time series, with the dates on the x-axis, and the Close values on the y-axis.

ii. Provide an example that explains why someone might care about resampling a time series. To answer this, you may use ANY example that you can think of, or discover, from any field that uses time series data (health, weather, market forecasting, etc.). You don’t need to perform any outside research or go too deeply into domain knowledge here -- 3-4 thoughtful sentences are all you need.
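A minimal resampling sketch, on made-up daily data, is shown below. It uses the "QS" (quarter-start) frequency alias, which labels each quarter by its first day; this alias is a choice made here for illustration, and other quarterly aliases exist depending on your pandas version.

```python
import numpy as np
import pandas as pd

# One year of made-up daily values: 0, 1, 2, ..., 364.
dates = pd.date_range("2023-01-01", "2023-12-31", freq="D")
close = pd.Series(np.arange(len(dates), dtype=float), index=dates, name="Close")

# Group the daily rows into calendar quarters and take each quarter's mean.
quarterly = close.resample("QS").mean()

print(quarterly)
```

The result has one row per quarter instead of one row per day, which is what makes long-run patterns easier to compare.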

Part II:  Marketing Mix Modeling with an Interaction Term (5 points)

For this part, we will use the dataset ad_data.csv.  This dataset can be found in Blackboard -- it is posted in the same area where this assignment prompt is posted.

We will not use a data partition here, as the model we will build is being used for explanatory purposes, rather than predictive purposes.

1.   After reading the file into your environment, the first question that you will explore here is whether there is any relationship between marketing spending and revenue.

a.   To explore this, first create a new variable that shows the total spending. This variable’s value should be the sum of YouTube ad spending, Spotify ad spending, and banner ad spending.

b.   Now, find the correlation between this new total spending variable and Sales.

i.     What is the correlation between these variables?

ii. What does this correlation suggest about the relationship between total marketing spending and sales? Why can’t we conclude from this that more ad spending leads to more sales?

c.   Next, let’s explore the relationship among the YouTube ad spending, Spotify ad spending, and banner ad spending variables. Examine the correlations among these three variables.

i.     Are any of these correlations so high that we might not be able to use them together in a linear model?
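The total-spending and correlation steps can be sketched like this. The data below is randomly generated, and the column names (youtube, spotify, banner, sales) are hypothetical; check the actual names in ad_data.csv before adapting it.

```python
import numpy as np
import pandas as pd

# Randomly generated stand-in for ad_data.csv; column names are hypothetical.
rng = np.random.default_rng(1)
n = 100
ads = pd.DataFrame({
    "youtube": rng.uniform(0, 300, n),
    "spotify": rng.uniform(0, 50, n),
    "banner": rng.uniform(0, 100, n),
})
ads["sales"] = 5 + 0.05 * ads["youtube"] + 0.1 * ads["spotify"] + rng.normal(0, 2, n)

# Row-wise sum of the three spending columns.
spend_cols = ["youtube", "spotify", "banner"]
ads["total_spend"] = ads[spend_cols].sum(axis=1)

# Pearson correlation between total spending and sales.
r = ads["total_spend"].corr(ads["sales"])
print(round(r, 3))

# Pairwise correlations among the three spending variables.
spend_corr = ads[spend_cols].corr()
print(spend_corr)
```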

d.   Now, build a model that uses Sales as the outcome variable, with YouTube ad spending, Spotify ad spending, and banner ad spending as the input variables. Use the statsmodels library for this step, and all of the remaining steps here.

i.     What is the p-value of the F-Statistic for this model? What does this suggest about the model?

ii.     What are the p-values for each of the individual predictors used in this model? What does this suggest about these predictors?

e.   Build yet another model -- this time, you will again use Sales as the outcome variable. Your inputs will be YouTube ad spending, Spotify ad spending, and an interaction variable for YouTube & Spotify ad spending.

i.     What do you notice about the p-values for each of these predictors?

ii.     How does the r-squared of this model compare to the r-squared of a model built to predict sales with only YouTube and Spotify as inputs, without the interaction term? What does this difference suggest about the inclusion of the interaction? (You will need to generate another model to answer this, but it won’t take long.)

iii.     Demonstrate what your model would predict for a marketer using 150 units of YouTube spending and 30 units of Spotify ad spending. What sales outcome should this marketer expect to see?

iv.     In a few sentences, how do you interpret the interaction effect between YouTube ad spending and Spotify ad spending? What is this effect showing us? (No statistical jargon is required here, but this should make sense to someone who knows about marketing, but not about how to interpret a linear model.)
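The workflow in step e can be sketched with the statsmodels formula interface, as below. The data is simulated and the column names (youtube, spotify, sales) are hypothetical, so the numbers it prints say nothing about ad_data.csv. In a formula, youtube:spotify denotes the product of the two variables.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data with a built-in interaction; column names are hypothetical.
rng = np.random.default_rng(42)
n = 200
ads = pd.DataFrame({
    "youtube": rng.uniform(0, 300, n),
    "spotify": rng.uniform(0, 50, n),
})
ads["sales"] = (5 + 0.02 * ads["youtube"] + 0.03 * ads["spotify"]
                + 0.001 * ads["youtube"] * ads["spotify"]
                + rng.normal(0, 1, n))

# Model without and with the interaction term.
main_only = smf.ols("sales ~ youtube + spotify", data=ads).fit()
with_inter = smf.ols("sales ~ youtube + spotify + youtube:spotify", data=ads).fit()

print(main_only.rsquared, with_inter.rsquared)  # compare fit
print(with_inter.f_pvalue)                      # p-value of the F-statistic
print(with_inter.pvalues)                       # per-coefficient p-values

# Prediction for 150 units of YouTube and 30 units of Spotify spending.
new_point = pd.DataFrame({"youtube": [150], "spotify": [30]})
pred = with_inter.predict(new_point)

# The same prediction written out by hand from the fitted coefficients.
p = with_inter.params
manual = (p["Intercept"] + p["youtube"] * 150 + p["spotify"] * 30
          + p["youtube:spotify"] * 150 * 30)
print(pred.iloc[0], manual)
```

Writing the prediction out by hand makes the role of the interaction visible: its coefficient is multiplied by the product of the two spending levels, so the contribution of one channel depends on the level of the other.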

f.    Find an example, or make up an example, of an interaction term in a model (this can be from the world of marketing, or from anywhere else). A very good answer to the last part of this question will include some genuine reaction from you -- finding an example of an interaction effect is only a ‘half-credit’ answer here.

i. In a 3-5 sentence paragraph, describe what you found (or invented). What are the variables that make up this interaction? Is the effect of their interaction on the outcome positive, or negative? How do you feel about the interaction? Does it make sense to you? Or does it surprise you?

Part III:  Wildcard: Google Trends (1 point)

Note: We will work on this in class in the week after Thanksgiving.

First, pick any two rival companies, teams, or organizations.

Using Google Trends, perform a side-by-side comparison for the two terms that you are using.

Where do you see a difference in regional popularity among the terms? (Note that you could use the US map for this purpose, or you could use a world map and make country-to-country comparisons.) Include a screenshot of the map that you used to make this comparison. What might explain some of the regional differences that you see here? You don’t need to turn this into a research project, but just a search or two might be enough to answer this question. A few sentences will be okay here (no need for citations).

Next, take a look at related queries. Pick at least one related query for each of your two terms, and explore it a bit. What is this referring to? What’s the context here? For each of the terms, again, a few sentences will be sufficient, and you do not need citations.

Finally, how could Google Trends data be interesting and useful to a marketer? In your answer, reference at least *something* from your earlier findings that came from analysis of your terms (in other words, don’t just directly copy & paste this answer from ChatGPT).