Coursera Practical Machine Learning All Quizzes & Answers

Let’s discuss the Coursera Course Practical Machine Learning Week 1 Quiz 1 answers with you.
 

Practical Machine Learning Quiz 1 Answer

 
Question 1)
Which of the following are steps in building a machine learning algorithm?
  • Machine learning.
  • Statistical inference.
  • Artificial intelligence.
  • Collecting data to answer the question.
 
 
Question 2)
Suppose we build a prediction algorithm on a data set and it is 100% accurate on that data set. Why might the algorithm not work well if we collect a new data set?
  • We have too few predictors to get good out of sample accuracy.
  • We have used neural networks which has notoriously bad performance.
  • We are not asking a relevant question that can be answered with machine learning.
  • Our algorithm may be overfitting the training data, predicting both the signal and the noise.
 
 
Question 3)
What are typical sizes for the training and test sets?
  • 80% training set, 20% test set.
  • 10% test set, 90% training set.
  • 90% training set, 10% test set.
  • 60% in the training set, 40% in the testing set.
 
 
Question 4)
What are some common error rates for predicting binary variables (i.e. variables with two possible values like yes/no, disease/normal, clicked/didn’t click)?
  • P-values.
  • R^2.
  • Median absolute deviation.
  • Predictive value of a positive.
 
 
Question 5)
Suppose that we have created a machine learning algorithm that predicts whether a link will be clicked with 99% sensitivity and 99% specificity. The rate the link is clicked is 1/1000 of visits to a website. If we predict the link will be clicked on a specific visit, what is the probability it will actually be clicked?
  • 9%.
  • 50%.
  • 99%.
  • 90%.
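
For reference, the positive predictive value behind this question can be checked with a quick Bayes'-rule calculation. The R sketch below assumes a hypothetical population of 100,000 visits purely to get round numbers.

# Hypothetical population of 100,000 visits (an assumption for round numbers).
visits    <- 100000
clicks    <- visits * 1/1000         # 100 visits where the link is clicked
nonClicks <- visits - clicks         # 99,900 visits with no click
truePos   <- clicks * 0.99           # sensitivity 99%: 99 correctly flagged
falsePos  <- nonClicks * (1 - 0.99)  # specificity 99%: 999 falsely flagged
truePos / (truePos + falsePos)       # about 0.09, i.e. roughly 9%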

 

 

Practical Machine Learning Quiz 2 Answer

Let’s spend some time on the Coursera Course Practical Machine Learning Week 2 Quiz 2 answers here.
  
 
Question 1)
Load the Alzheimer’s disease data using the commands:
 
library(AppliedPredictiveModeling)
data(AlzheimerDisease)
 
Which of the following commands will create non-overlapping training
and test sets with about 50% of the observations assigned to each?
 
adData = data.frame(diagnosis,predictors)
trainIndex = createDataPartition(diagnosis, p = 0.50, list = FALSE)
training = adData[trainIndex,]
testing = adData[-trainIndex,]
 
 
adData = data.frame(diagnosis,predictors)
train = createDataPartition(diagnosis, p = 0.50,list=FALSE)
test = createDataPartition(diagnosis, p = 0.50,list=FALSE)
 
 
adData = data.frame(diagnosis,predictors)
trainIndex = createDataPartition(diagnosis, p = 0.50)
training = adData[trainIndex,]
testing = adData[-trainIndex,]
 
 
adData = data.frame(predictors)
trainIndex = createDataPartition(diagnosis, p=0.5, list=FALSE)
training = adData[trainIndex,]
testing = adData[-trainIndex,]
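
As a quick sanity check of the first option (a sketch; it assumes the caret and AppliedPredictiveModeling packages are installed), you can confirm the two sets are non-overlapping and roughly equal in size:

library(caret)
library(AppliedPredictiveModeling)
data(AlzheimerDisease)

adData = data.frame(diagnosis, predictors)
trainIndex = createDataPartition(diagnosis, p = 0.50, list = FALSE)
training = adData[trainIndex,]
testing = adData[-trainIndex,]

dim(training); dim(testing)                     # roughly equal row counts
any(rownames(training) %in% rownames(testing))  # FALSE: the sets do not overlap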
 
 
 
 
Question 2)
Load the cement data using the commands:
 
library(AppliedPredictiveModeling)
data(concrete)
library(caret)
set.seed(1000)
inTrain = createDataPartition(mixtures$CompressiveStrength, p = 3/4)[[1]]
training = mixtures[ inTrain,]
testing = mixtures[-inTrain,]
 
Make a plot of the outcome (CompressiveStrength) versus the index of
the samples. Color by each of the variables in the data set (you may find the cut2() function in the Hmisc package
useful for turning continuous covariates into factors). What do you notice in these plots?
 
  • There is a non-random pattern in the plot of the outcome versus index
    that is perfectly explained by the Age variable.
  • There is a non-random pattern in the plot of the outcome versus
    index.
  • There is a non-random pattern in the plot of the outcome versus index
    that is perfectly explained by the FlyAsh variable.
  • There is a non-random pattern in the plot of the outcome versus
    index that does not appear to be perfectly explained by any predictor suggesting a variable may be
    missing.
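
One possible way to produce the colored index plots is sketched below; it assumes the training set created by the code above and that the Hmisc and ggplot2 packages are installed. FlyAsh and Age are shown only as examples, and in practice you would repeat the plot for every covariate.

library(Hmisc)
library(ggplot2)

index <- seq_len(nrow(training))
# Color the outcome-versus-index plot by a discretised covariate.
qplot(index, CompressiveStrength, data = training, colour = cut2(training$FlyAsh, g = 4))
qplot(index, CompressiveStrength, data = training, colour = cut2(training$Age, g = 4))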
 
 
 
Question 3)
Load the cement data using the commands:
 
library(AppliedPredictiveModeling)
data(concrete)
library(caret)
set.seed(1000)
inTrain = createDataPartition(mixtures$CompressiveStrength, p = 3/4)[[1]]
training = mixtures[ inTrain,]
testing = mixtures[-inTrain,]
 
Make a histogram and confirm the SuperPlasticizer variable is skewed.
Normally you might use the log transform to try to make the data more symmetric. Why would that be a poor choice
for this variable?
  • The log transform is not a monotone transformation of the data.
  • The log transform does not reduce the skewness of the non-zero values
    of SuperPlasticizer
  • The SuperPlasticizer data include negative values so the log
    transform can not be performed.
  • There are a large number of values that are the same and even if
    you took the log(SuperPlasticizer + 1) they would still all be identical so the distribution would not be
    symmetric.
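
A minimal sketch of the check (assuming the training set from the code above; note the column is spelled Superplasticizer in the mixtures data):

hist(training$Superplasticizer, breaks = 30)           # strongly right-skewed
hist(log(training$Superplasticizer + 1), breaks = 30)  # still a large spike at zero
sum(training$Superplasticizer == 0)                    # many identical zero values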
 
 
Question 4)
Load the Alzheimer’s disease data using the commands:
 
library(caret)
library(AppliedPredictiveModeling)
set.seed(3433)
data(AlzheimerDisease)
adData = data.frame(diagnosis,predictors)
inTrain = createDataPartition(adData$diagnosis, p = 3/4)[[1]]
training = adData[ inTrain,]
testing = adData[-inTrain,]
 
Find all the predictor variables in the training set that begin with
IL. Perform principal components on these variables with the preProcess() function from the caret package. Calculate the
number of principal components needed to capture 90% of the variance. How many are there?
  • 7
  • 5
  • 10
  • 9
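
A sketch of one way to answer this with caret's preProcess() (it assumes the training set built by the code above):

ILcols  <- grep("^IL", names(training), value = TRUE)  # predictors beginning with IL
preProc <- preProcess(training[, ILcols], method = "pca", thresh = 0.9)
preProc$numComp                                        # components needed for 90% of the variance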
 
 
 
Question 5)
Load the Alzheimer’s disease data using the commands:
 
library(caret)
library(AppliedPredictiveModeling)
set.seed(3433)
data(AlzheimerDisease)
adData = data.frame(diagnosis,predictors)
inTrain = createDataPartition(adData$diagnosis, p = 3/4)[[1]]
training = adData[ inTrain,]
testing = adData[-inTrain,]
 
Create a training data set consisting of only the predictors with
variable names beginning with IL and the diagnosis.
 
Build two predictive models, one using the predictors as they are and
one using PCA with principal components explaining 80% of the variance in the predictors. Use method=”glm” in
the train function.
 
What is the accuracy of each method in the test set? Which is more
accurate?
 
 
Non-PCA Accuracy: 0.72
PCA Accuracy: 0.65
 
Non-PCA Accuracy: 0.65
PCA Accuracy: 0.72
 
Non-PCA Accuracy: 0.91
PCA Accuracy: 0.93
 
Non-PCA Accuracy: 0.72
PCA Accuracy: 0.93
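
A sketch of the two models (assuming the training and testing sets created above; the exact accuracies depend on package versions):

ILcols  <- grep("^IL", names(training), value = TRUE)
trainIL <- training[, c("diagnosis", ILcols)]
testIL  <- testing[, c("diagnosis", ILcols)]

# Model 1: all IL predictors as they are
fitAll <- train(diagnosis ~ ., data = trainIL, method = "glm")
confusionMatrix(predict(fitAll, testIL), testIL$diagnosis)$overall["Accuracy"]

# Model 2: PCA pre-processing keeping 80% of the variance
fitPCA <- train(diagnosis ~ ., data = trainIL, method = "glm", preProcess = "pca",
                trControl = trainControl(preProcOptions = list(thresh = 0.8)))
confusionMatrix(predict(fitPCA, testIL), testIL$diagnosis)$overall["Accuracy"]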

Practical Machine Learning Quiz 3 Answer

Here are some important Coursera Course Practical Machine Learning Week 3 Quiz answers to practice.
 

 

Question 1)
For this quiz we will be using several R packages. R package versions
change over time, and the right answers have been checked using the following versions of the packages.
 
AppliedPredictiveModeling: v1.1.6
caret: v6.0.47
ElemStatLearn: v2012.04-0
pgmm: v1.1
rpart: v4.1.8
 
If you aren’t using these versions of the packages, your answers may not
exactly match the right answer, but they should hopefully be close.
 
Load the cell segmentation data from the AppliedPredictiveModeling
package using the commands:
 
library(AppliedPredictiveModeling)
data(segmentationOriginal)
library(caret)
Subset the data to a training set and testing set based on the Case
variable in the data set.
 
Set the seed to 125 and fit a CART model with the rpart method using all
predictor variables and default caret settings.
 
In the final model what would be the final model prediction for cases
with the following variable values:
 
  • TotalIntench2 = 23,000; FiberWidthCh1 = 10; PerimStatusCh1 = 2.
  • TotalIntench2 = 50,000; FiberWidthCh1 = 10; VarIntenCh4 = 100.
  • TotalIntench2 = 57,000; FiberWidthCh1 = 8; VarIntenCh4 = 100.
  • FiberWidthCh1 = 8; VarIntenCh4 = 100; PerimStatusCh1 = 2.
 
Answer:
PS
WS
PS
Not possible to predict
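
A sketch of the fit (assuming the packages loaded above; the quiz cases are then traced by hand through the printed or plotted tree, since they do not supply every predictor that predict() would require):

training <- subset(segmentationOriginal, Case == "Train")
testing  <- subset(segmentationOriginal, Case == "Test")

set.seed(125)
cartFit <- train(Class ~ ., data = training, method = "rpart")
print(cartFit$finalModel)                # read the split rules off the tree
plot(cartFit$finalModel, uniform = TRUE)
text(cartFit$finalModel, use.n = TRUE, cex = 0.8)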
 
 
 
Question 2)
If K is small in a K-fold cross validation, is the bias in the estimate of
out-of-sample (test set) accuracy smaller or bigger?
 
If K is small, is the variance in the estimate of out-of-sample (test set)
accuracy smaller or bigger? Is K large or small in leave one out cross validation?
  • The bias is smaller and the variance is bigger. Under leave one out
    cross validation K is equal to one.
  • The bias is smaller and the variance is smaller. Under leave one out
    cross validation K is equal to one.
  • The bias is smaller and the variance is smaller. Under leave one out
    cross validation K is equal to the sample size.
  • The bias is larger and the variance is smaller. Under leave one out
    cross validation K is equal to the sample size.
 
 
Question 3)
Load the olive oil data using the commands:
library(pgmm)
data(olive)
olive = olive[,-1]
 
(NOTE: If you have trouble installing the pgmm package, you can download
the olive dataset here: olive_data.zip
(https://d396qusza40orc.cloudfront.net/predmachlearn/data/olive_data.zip).
After unzipping the archive, you can load the file using the load() function in
R.)
 
These data contain information on 572 different Italian olive oils from
multiple regions in Italy. Fit a classification tree where Area is the outcome variable. Then predict the value of Area for
the following data frame using the tree command with all defaults:
 
newdata = as.data.frame(t(colMeans(olive)))
 
What is the resulting prediction? Is the resulting prediction strange?
Why or why not?
  • 4.59965. There is no reason why the result is strange.
  • 2.783. There is no reason why this result is strange.
  • 2.783. It is strange because Area should be a qualitative variable –
    but tree is reporting the average value of Area as a numeric variable in the leaf predicted for newdata
  • 0.005291005 0 0.994709 0 0 0 0 0 0. The result is strange because Area
    is a numeric variable and we should get the average within each leaf.
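
A sketch of one way to reproduce this (it assumes the olive data loaded above; a caret rpart fit is shown here, which treats Area as numeric exactly as the question describes):

library(caret)
treeFit <- train(Area ~ ., data = olive, method = "rpart")
newdata <- as.data.frame(t(colMeans(olive)))
predict(treeFit, newdata)  # a single numeric value, because Area is stored as numeric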
 
 
 
Question 4)
Load the South Africa Heart Disease Data and create training and test
sets with the following code:
 
library(ElemStatLearn)
data(SAheart)
set.seed(8484)
train = sample(1:dim(SAheart)[1],size=dim(SAheart)[1]/2,replace=F)
trainSA = SAheart[train,]
testSA = SAheart[-train,]
 
Then set the seed to 13234 and fit a logistic regression model
(method=”glm”, be sure to specify family=”binomial”) with Coronary Heart Disease (chd) as the outcome and age at onset,
current alcohol consumption, obesity levels, cumulative tobacco, type-A behavior, and low density lipoprotein
cholesterol as predictors. Calculate the misclassification rate for your model using this function and a prediction
on the “response” scale:
 
missClass = function(values, prediction) {
  sum(((prediction > 0.5) * 1) != values) / length(values)
}
 
What is the misclassification rate on the training set? What is the
misclassification rate on the test set?
 
Test Set Misclassification: 0.35
Training Set: 0.31
 
Test Set Misclassification: 0.27
Training Set: 0.31
 
Test Set Misclassification: 0.31
Training Set: 0.27
 
Test Set Misclassification: 0.43
Training Set: 0.31
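
A sketch of the fit and the two misclassification rates (assuming the data split and the missClass() function given above; caret's glm model returns predictions on the response scale for a numeric 0/1 outcome):

library(caret)
set.seed(13234)
glmFit <- train(chd ~ age + alcohol + obesity + tobacco + typea + ldl,
                data = trainSA, method = "glm", family = "binomial")

missClass(trainSA$chd, predict(glmFit, trainSA))  # training set misclassification
missClass(testSA$chd, predict(glmFit, testSA))    # test set misclassification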
 
 
 
Question 5)
Load the vowel.train and vowel.test data sets:
 
library(ElemStatLearn)
data(vowel.train)
data(vowel.test)
 
Set the variable y to be a factor variable in both the training and test
set. Then set the seed to 33833. Fit a random forest predictor relating the factor variable y to the remaining
variables. Read about variable importance in random forests here:
http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#ooberr
By default, the caret package uses the Gini importance.
 
Calculate the variable importance using the varImp function in the caret
package. What is the order of variable importance?
 
 
The order of the variables is:
x.10, x.7, x.9, x.5, x.8, x.4, x.6, x.3, x.1,x.2
 
 
The order of the variables is:
x.1, x.2, x.3, x.8, x.6, x.4, x.5, x.9, x.7,x.10
 
The order of the variables is:
x.2, x.1, x.5, x.6, x.8, x.4, x.9, x.3, x.7,x.10
 
 
 
The order of the variables is:
x.2, x.1, x.5, x.8, x.6, x.4, x.3, x.9, x.7,x.10
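
A sketch of the random forest fit and the importance call (assuming ElemStatLearn, caret, and randomForest are installed; the ranking is read off the varImp() output):

library(ElemStatLearn)
library(caret)
data(vowel.train)
data(vowel.test)

vowel.train$y <- factor(vowel.train$y)
vowel.test$y <- factor(vowel.test$y)

set.seed(33833)
rfFit <- train(y ~ ., data = vowel.train, method = "rf")
varImp(rfFit)  # variables listed from most to least important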

Practical Machine Learning Quiz 4 Answer

In this article I am going to share the Coursera Course Practical Machine Learning Week 4 Quiz answers with you.
 
 
 
 
Question 1)
For this quiz we will be using several R packages. R package versions
change over time, and the right answers have been checked using the following versions of the packages.
 
AppliedPredictiveModeling: v1.1.6
caret: v6.0.47
 
ElemStatLearn: v2012.04-0
 
pgmm: v1.1
 
rpart: v4.1.8
 
gbm: v2.1
 
lubridate: v1.3.3
 
forecast: v5.6
 
e1071: v1.6.4
 
 
If you aren’t using these versions of the packages, your answers may
not exactly match the right answer, but hopefully should be close.
 
Load the vowel.train and vowel.test data sets:
 
library(ElemStatLearn)
data(vowel.train)
data(vowel.test)
 
 
Set the variable y to be a factor variable in both the training and
test set. Then set the seed to 33833. Fit (1) a random forest predictor relating the factor variable y to the remaining
variables and (2) a boosted predictor using the “gbm” method. Fit these both with the train() command in the caret
package.
 
What are the accuracies for the two approaches on the test data set?
What is the accuracy among the test set samples where the two methods agree?
 
RF Accuracy = 0.9987
GBM Accuracy = 0.5152
Agreement Accuracy = 0.9985
 
RF Accuracy = 0.6082
GBM Accuracy = 0.5152
Agreement Accuracy = 0.5152
 
RF Accuracy = 0.9881
GBM Accuracy = 0.8371
Agreement Accuracy = 0.9983
 
RF Accuracy = 0.6082
GBM Accuracy = 0.5152
Agreement Accuracy = 0.6361
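
A sketch of the two fits and the agreement accuracy (assuming the data loading described above; exact numbers vary with package versions):

vowel.train$y <- factor(vowel.train$y)
vowel.test$y <- factor(vowel.test$y)

set.seed(33833)
rfFit  <- train(y ~ ., data = vowel.train, method = "rf")
gbmFit <- train(y ~ ., data = vowel.train, method = "gbm", verbose = FALSE)

rfPred  <- predict(rfFit, vowel.test)
gbmPred <- predict(gbmFit, vowel.test)

confusionMatrix(rfPred, vowel.test$y)$overall["Accuracy"]   # RF accuracy
confusionMatrix(gbmPred, vowel.test$y)$overall["Accuracy"]  # GBM accuracy

agree <- rfPred == gbmPred                                  # samples where both methods agree
confusionMatrix(rfPred[agree], vowel.test$y[agree])$overall["Accuracy"]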
 
 
 
Question 2)
Load the Alzheimer’s data using the following commands
 
library(caret)
library(gbm)
set.seed(3433)
library(AppliedPredictiveModeling)
data(AlzheimerDisease)
adData = data.frame(diagnosis,predictors)
inTrain = createDataPartition(adData$diagnosis, p = 3/4)[[1]]
training = adData[ inTrain,]
testing = adData[-inTrain,]
 
Set the seed to 62433 and predict diagnosis with all the other
variables using a random forest (“rf”), boosted trees (“gbm”) and linear discriminant analysis (“lda”) model. Stack the
predictions together using random forests (“rf”). What is the resulting accuracy on the test set? Is it better or worse than
each of the individual predictions?
 
  • Stacked Accuracy: 0.80 is better than all three other methods
  • Stacked Accuracy: 0.88 is better than all three other methods
  • Stacked Accuracy: 0.80 is worse than all the other methods
  • Stacked Accuracy: 0.80 is better than random forests and lda and
    the same as boosting.
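
A sketch of the stacked model (assuming the data split above; one common reading of the quiz is to train the stacking model on the test-set predictions, as shown here):

set.seed(62433)
rfFit  <- train(diagnosis ~ ., data = training, method = "rf")
gbmFit <- train(diagnosis ~ ., data = training, method = "gbm", verbose = FALSE)
ldaFit <- train(diagnosis ~ ., data = training, method = "lda")

predRF  <- predict(rfFit, testing)
predGBM <- predict(gbmFit, testing)
predLDA <- predict(ldaFit, testing)

# Stack the three sets of predictions with a random forest
stackDF  <- data.frame(predRF, predGBM, predLDA, diagnosis = testing$diagnosis)
stackFit <- train(diagnosis ~ ., data = stackDF, method = "rf")

confusionMatrix(predict(stackFit, stackDF), testing$diagnosis)$overall["Accuracy"]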
 
 
Question 3)
Load the concrete data with the commands:
 
set.seed(3523)
library(AppliedPredictiveModeling)
library(caret)  # needed for createDataPartition()
data(concrete)
inTrain = createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
training = concrete[ inTrain,]
testing = concrete[-inTrain,]
 
Set the seed to 233 and fit a lasso model to predict Compressive
Strength. Which variable is the last coefficient to be set to zero as the penalty increases? (Hint: it may be useful to look
up ?plot.enet).
  • Cement
  • Water
  • Age
  • CoarseAggregate
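
A sketch of the lasso fit (assuming the training set above; caret's "lasso" method wraps the elasticnet package, whose plot method for enet objects draws the coefficient paths):

library(elasticnet)
set.seed(233)
lassoFit <- train(CompressiveStrength ~ ., data = training, method = "lasso")
plot(lassoFit$finalModel, xvar = "penalty", use.color = TRUE)  # see ?plot.enet
# The coefficient whose path is the last to reach zero as the penalty grows is the answer.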
 
 
 
Question 4)
Load the data on the number of visitors to the instructor's blog from
here:
https://d396qusza40orc.cloudfront.net/predmachlearn/gaData.csv
 
Using the commands:
 
library(lubridate) # For year() function below
dat = read.csv("~/Desktop/gaData.csv")
training = dat[year(dat$date) < 2012,]
testing = dat[(year(dat$date)) > 2011,]
tstrain = ts(training$visitsTumblr)
 
Fit a model using the bats() function in the forecast package to the
training time series. Then forecast this model for the remaining time points. For how many of the testing points is the
true value within the 95% prediction interval bounds?
  • 96%
  • 94%
  • 100%
  • 92%
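
A sketch of the forecast and the interval-coverage check (assuming the training and testing data frames from the code above and the forecast package):

library(forecast)
batsFit <- bats(tstrain)
fc <- forecast(batsFit, h = nrow(testing), level = 95)

# Fraction of test points whose true value falls inside the 95% prediction interval
inside <- testing$visitsTumblr >= fc$lower & testing$visitsTumblr <= fc$upper
mean(inside)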
 
Question 5)
Load the concrete data with the commands:
 
set.seed(3523)
library(AppliedPredictiveModeling)
library(caret)  # needed for createDataPartition()
data(concrete)
inTrain = createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
training = concrete[ inTrain,]
testing = concrete[-inTrain,]
 
Set the seed to 325 and fit a support vector machine using the e1071
package to predict Compressive Strength with the default settings. Predict on the testing set. What is the
RMSE?
  • 6.93
  • 45.09
  • 6.72
  • 11543.39
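
A sketch of the SVM fit and the test-set RMSE (assuming the training and testing sets created above):

library(e1071)
set.seed(325)
svmFit  <- svm(CompressiveStrength ~ ., data = training)
svmPred <- predict(svmFit, testing)
sqrt(mean((svmPred - testing$CompressiveStrength)^2))  # RMSE on the test set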
