Download the data file from the course website here (under week 12).
Use readRDS
function to read the file.
facebookdata_marketing <- readRDS("_GIVE_FILE PATH_/facebookdata_marketing.rds")
A manager of a retail company wants to develop a regression model to identify the effect of the following variables (see below) on the total number of likes
, comments
, and shares
on facebook posts.
month: Month the post was published (1, 2, 3, …, 12)
category: Type of the post (1 - Link, 2 - Video, 3 - Picture)
hour: Hour the post was published (0, 1, …24)
paid: If the company paid to Facebook for advertising (0 - No, 1 - Yes)
totalReach: Number of people who saw the page post (unique users).
engagedUsers: Number of people who clicked anywhere in the post (unique users).
postConsumers: Number of people who sent a direct message to the owner of the post.
postConsumption: Number of clicks anywhere in the post.
sawbyLiked: Number of people who saw the page post because they have liked that page.
clickbyLiked: Number of people who have liked the Page and clicked anywhere in the post.
Dependent variable:
smp_size <- 400
## set the seed to make your partition reproducible
set.seed(123)
train_ind <- sample(seq_len(nrow(facebookdata_marketing)), size = smp_size)
train <- facebookdata_marketing[train_ind, ]
test <- facebookdata_marketing[-train_ind, ]
Perform a thorough Exploratory Data Analysis on facebookdata_marketing.rds
.
Develop a suitable regression model to predict total interactions (The sum of “likes,” “comments,” and “shares” of the post).
Test for significance of regression. What conclusions can you draw?
Using \(t\) tests, determine the contribution of the regressors in your final model. Discuss your findings.
Plot 95% confidence interval for the regression coefficients of the model in part 2.
Is multicollinearity a potential concern in the model identified in part 2.
Use the model in part 2 to predict each observation in the test test and calculate the out-of-sample accuracy.
Prepare a brief report presenting your EDA and regression analysis.
Moro, S., Rita, P., & Vala, B. (2016). Predicting social media performance metrics and evaluation of the impact on brand building: A data mining approach. Journal of Business Research, 69(9), 3341-3351.