Machine Learning Foundations: A Case Study Approach Quiz Answer

Coursera was launched in 2012 by Daphne Koller and Andrew Ng with the goal of bringing life-changing learning experiences to students around the world. Today, Coursera is a global online learning platform that gives anyone, anywhere access to online courses and degrees from top institutions and companies.

Week 1: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: SFrames

Q 1: Download the Wiki People SFrame. Then open a new Jupyter notebook, import TuriCreate, and read the SFrame data.

Answer: Click here

Q 2: How many rows are in the SFrame? (Do NOT use commas or periods.)

Answer: 59071

Q 3: Which name is in the last row?

Conradign Netzer
Cathy Caruth
Fawaz Damrah

Q 4: Read the text column for Harpdog Brown. He was honored with:

A Grammy award for his latest blues album.
A gold harmonica to recognize his innovative playing style.
A lifetime membership in the Hamilton Blues Society.

Q 5: Sort the SFrame according to the text column, in ascending order. What is the name entry in the first row?

Zygfryd Szo
Digby Morrell
007 James Bond
108 (artist)
8 Ball Aitken

Week 2: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Regression

Q 1: Which figure represents an overfitted model?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/czbfW1vMEeWVtgr31Ad8Fw_e76f287b4f43f46f9afd6a29ccae1ead_Reg1a.png?expiry=1658620800000&hmac=x156QPF_Btk-yNR45SL6zO9sIUjsN6f2nMLs2tBIhGM>

Q 2: True or false: The model that best minimizes training error is the one that will perform best for the task of prediction on new data.

True
False

Q 3: The following table illustrates the results of evaluating 4 models with different parameter choices on some data set. Which of the following models fits this data the best?

Model index	Parameters (intercept, slope)	Residual sum of squares (RSS)
1	(0,1.4)	20.51
2	(3.1,1.4)	15.23
3	(2.7, 1.9)	13.67
4	(0, 2.3)	18.99

Model 1
Model 2
Model 3
Model 4
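
Because RSS is the total squared error of a fit on this data, the comparison in Q 3 comes down to finding the smallest RSS. A minimal Python sketch of that comparison, using the values from the table above (the variable names are illustrative assumptions):

```python
# Model index -> residual sum of squares (RSS), copied from the table in Q 3.
rss_by_model = {
    1: 20.51,
    2: 15.23,
    3: 13.67,
    4: 18.99,
}

# The model that fits this data best is the one with the lowest RSS.
best_model = min(rss_by_model, key=rss_by_model.get)
print(f"Model {best_model} has the lowest RSS ({rss_by_model[best_model]})")
```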

Q 4: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/8CcG-lvREeWzLwrzeFOkAw_5c39244e7608d47a3a43d6019c0df631_Reg4a.png?expiry=1658620800000&hmac=rX__IhYjQyRZNqr6-abWP3aLnbB2EsxqlnbGVuNvEbE>

w0
w1
w2
none of the above

Q 5: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/Em2X3FvSEeWMhg7baGhc3w_3187d5cb269bf4e998d6f92493793a88_Reg4b.png?expiry=1658620800000&hmac=d8-AWZlSVy2LL00Fel3bLIJxrGraECEA1wnf176E_bs>

w0
w1
w2
none of the above

Q 6: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/LD5UH1vSEeWhtQ48PjS6Pw_9ac59a77ea836dd248a38ebde9f2d11f_Reg4c.png?expiry=1658620800000&hmac=tqED_QOOZtkR1F5aQCVc3pgO0Xu2HVtDF9_NMbvR1u0>

w0
w1
w2
none of the above

Q 7: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/RIlaI1vSEeWVtgr31Ad8Fw_b7f7b633af94820bc5992c6975d8dc4d_Reg4d.png?expiry=1658620800000&hmac=XBE_l2ZDn-T0jJQGlWbqmy6Hp4yVaI4vsp1nz2zLEn4>

w0
w1
w2
none of the above

Q 8: Which of the following plots would you not expect to see as a plot of training and test error curves?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/yCCWHFvNEeWSuhJSxsy6bQ_33196504673be40e26a66fe9994b80f7_Reg5b.png?expiry=1658620800000&hmac=b0scdHXjrxJqdfyS4DxmTDVGOqfpnSuVk4xd9Trs4dQ>

Q 9: True or false: One always prefers to use a model with more features since it better captures the true underlying process.

True
False

Quiz 2: Predicting house prices

Q 1: Selection and summary statistics: We found the zip code with the highest average house price. What is the average house price of that zip code?

$75,000
$7,700,000
$540,088
$2,160,607

Q 2: Filtering data: What fraction of the houses have living space between 2000 sq.ft. and 4000 sq.ft.?

Between 0.2 and 0.29
Between 0.3 and 0.39
Between 0.4 and 0.49
Between 0.5 and 0.59
Between 0.6 and 0.69

Q 3: Building a regression model with several more features: What is the difference in RMSE between the model trained with my_features and the one trained with advanced_features?

the RMSE of the model with advanced_features lower by less than $25,000
the RMSE of the model with advanced_features lower by between $25,001 and $35,000
the RMSE of the model with advanced_features lower by between $35,001 and $45,000
the RMSE of the model with advanced_features lower by between $45,001 and $55,000
the RMSE of the model with advanced_features lower by more than $55,000

Week 3: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Classification

Q 1: The simple threshold classifier for sentiment analysis described in the video (check all that apply):

Must have pre-defined positive and negative attributes
Must either count attributes equally or pre-define weights on attributes
Defines a possibly non-linear decision boundary

Q 2: For a linear classifier classifying between “positive” and “negative” sentiment in a review x, Score(x) = 0 implies (check all that apply):

The review is very clearly “negative”
We are uncertain whether the review is “positive” or “negative”
We need to retrain our classifier because an error has occurred
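
One way to read Score(x) = 0 in Q 2: assuming the score is converted to a probability with the logistic (sigmoid) function, as in logistic regression, a score of 0 sits exactly on the decision boundary. The sketch below illustrates that assumption; the helper name `sigmoid` is chosen here for illustration.

```python
import math

def sigmoid(score: float) -> float:
    """Map a linear classifier score to an estimated P(positive | x)."""
    return 1.0 / (1.0 + math.exp(-score))

# Scores far from 0 give confident predictions; a score of 0 does not.
for score in (-3.0, 0.0, 3.0):
    print(score, round(sigmoid(score), 3))
```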

Q 3: For which of the following datasets would a linear classifier perform perfectly?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/D_IigVvQEeWVtgr31Ad8Fw_267aaadfe8ea97a30533a6712d23b0de_Class3b.png?expiry=1658620800000&hmac=OnudUFhXcrzCU-3gBveA322w4shtxUpBhScnIFSD1rE>

Q 4: True or false: High classification accuracy always indicates a good classifier.

True
False

Q 5: True or false: For a classifier classifying between 5 classes, there always exists a classifier with accuracy greater than 0.18.

True
False
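
A rough way to reason about Q 5: with k classes, the most common class must cover at least 1/k of the examples, so always predicting it scores at least 0.2 when k = 5. The sketch below checks this on two hypothetical label sets; the function name and the data are assumptions made for illustration.

```python
from collections import Counter

def majority_class_accuracy(labels):
    """Accuracy of always predicting the most common label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)

# Hypothetical label sets over 5 classes: perfectly balanced and skewed.
balanced = [0, 1, 2, 3, 4] * 20
skewed = [0] * 60 + [1, 2, 3, 4] * 10

print(majority_class_accuracy(balanced))
print(majority_class_accuracy(skewed))
```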

Q 6: True or false: A false negative is always worse than a false positive.

True
False

Q 7: Which of the following statements are true? (Check all that apply)

Test error tends to decrease with more training data until a point, and then does not change (i.e., curve flattens out)
Test error always goes to 0 with an unboundedly large training dataset
Test error is never a function of the amount of training data

Quiz 2: Analyzing product sentiment

Q 1: Out of the 11 words in selected_words, which one is most used in the reviews in the dataset?

awesome
love
hate
bad
great

Q 2: Out of the 11 words in selected_words, which one is least used in the reviews in the dataset?

wow
amazing
terrible
awful
love

Q 3: Out of the 11 words in selected_words, which one got the most positive weight in the selected_words_model?

(Tip: when printing the list of coefficients, make sure to use print_rows(rows=12) to print ALL coefficients.)

amazing
awesome
love
fantastic
terrible

Q 4: Out of the 11 words in selected_words, which one got the most negative weight in the selected_words_model?

(Tip: when printing the list of coefficients, make sure to use print_rows(rows=12) to print ALL coefficients.)

horrible
terrible
awful
hate
love

Q 5: Which of the following ranges contains the accuracy of the selected_words_model on the test_data?

0.811 to 0.841
0.841 to 0.871
0.871 to 0.901
0.901 to 0.931

Q 6: Which of the following ranges contains the accuracy of the sentiment_model in the IPython Notebook from lecture on the test_data?

0.811 to 0.841
0.841 to 0.871
0.871 to 0.901
0.901 to 0.931

Q 7: Which of the following ranges contains the accuracy of the majority class classifier, which simply predicts the majority class on the test_data?

0.811 to 0.843
0.843 to 0.871
0.871 to 0.901
0.901 to 0.931

Q 8: How do you compare the different learned models with the baseline approach where we are just predicting the majority class?

They all performed about the same.
The model learned using all words performed much better than the one using only the selected_words. And, the model learned using the selected_words performed much better than just predicting the majority class.
The model learned using all words performed much better than the other two. The other two approaches performed about the same.
Simply predicting the majority class performed much better than the other two models.

Q 9: Which of the following ranges contains the ‘predicted_sentiment’ for the most positive review for ‘Baby Trend Diaper Champ’, according to the sentiment_model from the IPython Notebook from lecture?

Below 0.7
0.7 to 0.8
0.8 to 0.9
0.9 to 1.0

Q 10: Consider the most positive review for ‘Baby Trend Diaper Champ’ according to the sentiment_model from the IPython Notebook from lecture. Which of the following ranges contains the predicted_sentiment for this review, if we use the selected_words_model to analyze it?

Below 0.7
0.7 to 0.8
0.8 to 0.9
0.9 to 1.0

Q 11: Why is the value of the predicted_sentiment for the most positive review found using the sentiment_model much more positive than the value predicted using the selected_words_model?

The sentiment_model is just too positive about everything.
The selected_words_model is just too negative about everything.
This review was positive, but used too many of the negative words in selected_words.
None of the selected_words appeared in the text of this review.

Week 4: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Clustering and Similarity

Q 1: A country, called Simpleland, has a language with a small vocabulary of just “the”, “on”, “and”, “go”, “round”, “bus”, and “wheels”. For a word count vector with indices ordered as the words appear above, what is the word count vector for a document that simply says “the wheels on the bus go round and round”?

Please enter the vector of counts as follows: If the counts were [“the”=1, “on”=3, “and”=2, “go”=1, “round”=2, “bus”=1, “wheels”=1], enter 1321211.

Answer: 2111211
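
The count vector can be checked by counting each vocabulary word in the sentence and reading the counts off in vocabulary order, which gives one digit per word (seven in total). A minimal sketch, with variable names chosen here for illustration:

```python
import re
from collections import Counter

vocabulary = ["the", "on", "and", "go", "round", "bus", "wheels"]
document = "the wheels on the bus go round and round."

# Lowercase the text, split it into words, and count each word.
counts = Counter(re.findall(r"[a-z]+", document.lower()))

# Read the counts off in vocabulary order, then join them as the quiz asks.
count_vector = [counts[word] for word in vocabulary]
encoded = "".join(str(c) for c in count_vector)
print(count_vector, encoded)
```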

Question 2: In Simpleland, a reader is enjoying a document with a representation: [1 3 2 1 2 1 1]. Which of the following articles would you recommend to this reader next?

[7 0 2 1 0 0 1]
[1 7 0 0 2 0 1]
[1 0 0 0 7 1 2]
[0 2 0 0 7 1 1]
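
One way to compare the candidate articles is to score each against the reader's document. The sketch below assumes similarity is measured by the dot product of the word count vectors and also computes cosine similarity for comparison; the helper names are illustrative assumptions.

```python
import math

reader_doc = [1, 3, 2, 1, 2, 1, 1]
candidates = [
    [7, 0, 2, 1, 0, 0, 1],
    [1, 7, 0, 0, 2, 0, 1],
    [1, 0, 0, 0, 7, 1, 2],
    [0, 2, 0, 0, 7, 1, 1],
]

def dot(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product normalized by the lengths of both vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

dot_scores = [dot(reader_doc, c) for c in candidates]
cos_scores = [cosine_similarity(reader_doc, c) for c in candidates]

# Index (0-based) of the highest-scoring candidate under each measure.
best_by_dot = max(range(len(candidates)), key=dot_scores.__getitem__)
best_by_cosine = max(range(len(candidates)), key=cos_scores.__getitem__)
print(dot_scores, candidates[best_by_dot], candidates[best_by_cosine])
```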

Question 3: A corpus in Simpleland has 99 articles. If you pick one article and perform 1-nearest neighbor search to find the closest article to this query article, how many times must you compute the similarity between two articles?

98
98*2 = 196
98/2 = 49
(98)^2
99

Question 4: For the TF-IDF representation, does the relative importance of words in a document depend on the base of the logarithm used? For example, take the words “bus” and “wheels” in a particular document. Is the ratio between the TF-IDF values for “bus” and “wheels” different when computed using log base 2 versus log base 10?

Yes
No
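
Changing the logarithm base multiplies every IDF value by the same constant, so it cancels in a ratio of two TF-IDF values from the same document. The sketch below illustrates this with hypothetical corpus statistics and assumes the form tf * log(N / (1 + df)); the function names and numbers are not from the course data.

```python
import math

def tf_idf(term_count, num_docs, docs_with_term, base):
    """TF-IDF computed as tf * log_base(num_docs / (1 + docs_with_term))."""
    return term_count * math.log(num_docs / (1 + docs_with_term), base)

# Hypothetical statistics: (count in this document, documents containing the word).
NUM_DOCS = 99
BUS = (3, 10)
WHEELS = (1, 4)

def bus_to_wheels_ratio(base):
    """Ratio of the TF-IDF of "bus" to that of "wheels" for a given log base."""
    bus_score = tf_idf(BUS[0], NUM_DOCS, BUS[1], base)
    wheels_score = tf_idf(WHEELS[0], NUM_DOCS, WHEELS[1], base)
    return bus_score / wheels_score

ratio_base_2 = bus_to_wheels_ratio(2)
ratio_base_10 = bus_to_wheels_ratio(10)
print(ratio_base_2, ratio_base_10)
```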

Question 5: Which of the following statements are true? (Check all that apply):

Deciding whether an email is spam or not spam using the text of the email and some spam / not spam labels is a supervised learning problem.
Dividing emails into two groups based on the text of each email is a supervised learning problem.
If we are performing clustering, we typically assume we either do not have or do not use class labels in training the model.

Question 6: Which of the following pictures represents the best k-means solution? (Squares represent observations, plus signs are cluster centers, and colors indicate assignments of observations to cluster centers.)

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/AW6FxVvVEeWzLwrzeFOkAw_3e7caa843845e525f9275753265c0900_Clust5b.png?expiry=1658620800000&hmac=XDtBpsTCunhlQ9O9-DRPncW6PNGZ83Dd9PQFRx1O-Go>

Quiz 2: Retrieving Wikipedia articles

Q 1: Top word count words for Elton John

(the, john, singer)
(england, awards, musician)
(the, in, and)
(his, the, since)
(rock, artists, best)

Question 2: Top TF-IDF words for Elton John

(furnish,elton,billboard)
(john,elton,fivedecade)
(the,of,has)
(awards,rock,john)
(elton,john,singer)

Question 3: The cosine distance between the articles for ‘Elton John’ and ‘Victoria Beckham’ (represented with TF-IDF) falls within which range?

0.1 to 0.29
0.3 to 0.49
0.5 to 0.69
0.7 to 0.89
0.9 to 1.0

Question 4: The cosine distance between the articles for ‘Elton John’ and ‘Paul McCartney’ (represented with TF-IDF) falls within which range?

0.1 to 0.29
0.3 to 0.49
0.5 to 0.69
0.7 to 0.89
0.9 to 1

Question 5: Who is closer to ‘Elton John’, ‘Victoria Beckham’ or ‘Paul McCartney’?

Victoria Beckham
Paul McCartney

Question 6: Who is the nearest cosine-distance neighbor to ‘Elton John’ using raw word counts?

Billy Joel
Cliff Richard
Roger Daltrey
George Bush

Question 7: Who is the nearest cosine-distance neighbor to ‘Elton John’ using TF-IDF?

Roger Daltrey
Rod Stewart
Tommy Haas
Elvis Presley

Question 8: Who is the nearest cosine-distance neighbor to ‘Victoria Beckham’ using raw word counts?

Stephen Dow Beckham
Louis Molloy
Adrienne Corri
Mary Fitzgerald (artist)

Question 9: Who is the nearest cosine-distance neighbor to ‘Victoria Beckham’ using TF-IDF?

Mel B
Caroline Rush
David Beckham
Carrie Reichardt

Week 5: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Recommender Systems

Q 1: Recommending items based on global popularity can (check all that apply):

provide personalization
capture context (e.g., time of day)
none of the above

Question 2: Recommending items using a classification approach can (check all that apply):

provide personalization
capture context (e.g., time of day)
none of the above

Question 3: Recommending items using a simple count based co-occurrence matrix can (check all that apply):

provide personalization
capture context (e.g., time of day)
none of the above

Question 4: Recommending items using featurized matrix factorization can (check all that apply):

provide personalization
capture context (e.g., time of day)
none of the above

Question 5: Normalizing co-occurrence matrices is used primarily to account for:

people who purchased many items
items purchased by many
eliminating rare products
none of the above

Question 6: A store has 3 customers and 3 products. Below are the learned feature vectors for each user and product. Based on this estimated model, which product would you recommend most highly to User #2?

User ID	Feature vector
1	(1.73, 0.01, 5.22)
2	(0.03, 4.41, 2.05)
3	(1.13, 0.89, 3.76)
Product ID	Feature vector
1	(3.29, 3.44, 3.67)
2	(0.82, 9.71, 3.88)
3	(8.34, 1.72, 0.02)

Product #1
Product #2
Product #3
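
In a matrix factorization model, the predicted score for a user and a product is the dot product of their feature vectors, so the products can be ranked for User #2 by computing that score for each one. A minimal sketch using the vectors from the tables above (the variable and function names are illustrative assumptions):

```python
users = {
    1: (1.73, 0.01, 5.22),
    2: (0.03, 4.41, 2.05),
    3: (1.13, 0.89, 3.76),
}
products = {
    1: (3.29, 3.44, 3.67),
    2: (0.82, 9.71, 3.88),
    3: (8.34, 1.72, 0.02),
}

def predicted_rating(user_vec, product_vec):
    """Dot product of a user vector and a product vector."""
    return sum(u * p for u, p in zip(user_vec, product_vec))

# Score every product for User #2 and pick the highest-scoring one.
scores_for_user_2 = {
    product_id: predicted_rating(users[2], vec)
    for product_id, vec in products.items()
}
recommended = max(scores_for_user_2, key=scores_for_user_2.get)
print(scores_for_user_2, recommended)
```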

Question 7: For the liked and recommended items displayed below, calculate the recall and round to 2 decimal points. (As in the lesson, green squares indicate recommended items, magenta squares are liked items. Items not recommended are grayed out for clarity.) Note: enter your answer in American decimal format (e.g. enter 0.98, not 0,98)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/C0Ri1FvZEeWMhg7baGhc3w_290d82e965c33e663968151f43a71743_Rec8.png?expiry=1658620800000&hmac=ro8CVcehdzhMoZDhUaIZXJOqieK7dJ0XcGNb2DHCFzw>

Answer: 0.33

Question 8: For the liked and recommended items displayed below, calculate the precision and round to 2 decimal points. (As in the lesson, green squares indicate recommended items, magenta squares are liked items. Items not recommended are grayed out for clarity.) Note: enter your answer in American decimal format (e.g. enter 0.98, not 0,98)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/QkZrJ1vZEeWZgBLZEKssZQ_f80562a68423c8ffe11565327abee8c8_Rec8.png?expiry=1658620800000&hmac=wdW97z3_apaxidVHhNYrLVtPmk6ryAf1fNgOSyvdLjw>

Answer: 0.25
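
For reference, recall is the fraction of liked items that were recommended, and precision is the fraction of recommended items that were liked. The sketch below computes both for hypothetical item sets; the item IDs are made up for illustration and are not read from the quiz figures.

```python
def precision_recall(recommended, liked):
    """Return (precision, recall) for sets of recommended and liked items."""
    hits = len(recommended & liked)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(liked) if liked else 0.0
    return precision, recall

# Hypothetical item IDs, chosen only to illustrate the two formulas.
liked_items = {"a", "b", "c", "d", "e"}
recommended_items = {"a", "b", "x", "y"}

precision, recall = precision_recall(recommended_items, liked_items)
print(round(precision, 2), round(recall, 2))
```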

Question 9: Based on the precision-recall curves in the figure below, which recommender would you use?

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/JaMj1VvYEeWSuhJSxsy6bQ_648fbff528d436fc414fd485af5cb56d_Rec9.png?expiry=1658620800000&hmac=TdvA-JDmDM9SVzTbUD9UEMPc-crG42GgkFl6spDyve8>

RecSys #1
RecSys #2
RecSys #3

Quiz 2: Recommending songs

Question 1: Which of the artists below have had the most unique users listening to their songs?

Kanye West
Foo Fighters
Taylor Swift
Lady GaGa

Question 2: Which of the artists below is the most popular artist, the one with highest total listen_count, in the data set?

Taylor Swift
Kings of Leon
Coldplay
Lady GaGa

Question 3: Which of the artists below is the least popular artist, the one with smallest total listen_count, in the data set?

William Tabbert
Velvet Underground & Nico
Kanye West
The Cool Kids

Week 6: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Deep Learning

Question 1: Which of the following statements are true? (Check all that apply)

Linear classifiers are never useful, because they cannot represent XOR.
Linear classifiers are useful, because, with enough data, they can represent anything.
Having good non-linear features can allow us to learn very accurate linear classifiers.
none of the above

Question 2: A simple linear classifier can represent which of the following functions? (Check all that apply)

x1 OR x2 OR NOT x3
x1 AND x2 AND NOT x3
x1 OR (x2 AND NOT x3)
none of the above
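
A single linear threshold unit can represent a Boolean function only if some choice of weights and bias separates its true inputs from its false ones. The brute-force search below is a rough sketch of that idea; the function name, the weight grid, and the bias grid are assumptions made for illustration, and a negative result only means nothing was found within that small grid.

```python
from itertools import product

def is_linearly_representable(fn, n_inputs=3):
    """Search a small grid of weights and biases for a matching threshold unit."""
    inputs = list(product([0, 1], repeat=n_inputs))
    weight_grid = [-2, -1, 0, 1, 2]
    biases = [b + 0.5 for b in range(-4, 4)]  # half-integers avoid ties at 0
    for weights in product(weight_grid, repeat=n_inputs):
        for bias in biases:
            if all(
                (sum(w * x for w, x in zip(weights, xs)) + bias > 0) == bool(fn(*xs))
                for xs in inputs
            ):
                return True
    return False

def or_fn(x1, x2, x3):
    return x1 or x2 or not x3

def and_fn(x1, x2, x3):
    return x1 and x2 and not x3

def mixed_fn(x1, x2, x3):
    return x1 or (x2 and not x3)

def xnor_fn(x1, x2):
    return (x1 and x2) or (not x1 and not x2)

for name, fn, n in [("OR", or_fn, 3), ("AND", and_fn, 3),
                    ("mixed", mixed_fn, 3), ("XNOR", xnor_fn, 2)]:
    print(name, is_linearly_representable(fn, n))
```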

Question 3: Which of the following neural networks can represent the following function? Select all that apply.

(x1 AND x2) OR (NOT x1 AND NOT x2)

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/FG_Wy1vaEeWhtQ48PjS6Pw_d8ed3b37fc1e16f793f6a3c7fbb1531b_Deep3d.png?expiry=1658620800000&hmac=Y13fXXF0RyLZ9QsOvSEhdLZ25HwPcUk6Ek3VVhjTCMs>
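
The function in Question 3 is true exactly when x1 and x2 are equal, which a single linear unit cannot represent but a network with one hidden layer can. The sketch below shows one such network built from step units; the weights and names are illustrative assumptions and are not read from the quiz image.

```python
from itertools import product

def step(z):
    """Threshold activation: 1 if the input is positive, else 0."""
    return 1 if z > 0 else 0

def xnor_network(x1, x2):
    h_and = step(x1 + x2 - 1.5)       # fires only when x1 = x2 = 1
    h_nor = step(-x1 - x2 + 0.5)      # fires only when x1 = x2 = 0
    return step(h_and + h_nor - 0.5)  # OR of the two hidden units

def target(x1, x2):
    """(x1 AND x2) OR (NOT x1 AND NOT x2), returned as 0 or 1."""
    return int((x1 and x2) or (not x1 and not x2))

for x1, x2 in product([0, 1], repeat=2):
    print(x1, x2, xnor_network(x1, x2), target(x1, x2))
```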

Question 4: Which of the following statements is true? (Check all that apply)

Features in computer vision act like local detectors.
Deep learning has had impact in computer vision, because it’s used to combine all the different hand-created features that already exist.
By learning non-linear features, neural networks have allowed us to automatically learn detectors for computer vision.
none of the above

Question 5: If you have lots of images of different types of plankton labeled with their species name, and lots of computational resources, what would you expect to perform better predictions:

a deep neural network trained on this data.
a simple classifier trained on this data, using deep features as input, which were trained using ImageNet data.

Question 6: If you have a few images of different types of plankton labeled with their species name, what would you expect to perform better predictions:

a deep neural network trained on this data.
a simple classifier trained on this data, using deep features as input, which were trained using ImageNet data.

Quiz 2: Deep features for image retrieval

Question 1: What’s the least common category in the training data?

bird
dog
cat
automobile

Question 2: Of the images below, which is the nearest ‘cat’ labeled image in the training data to the first image in the test data (image_test[0:1])?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/xlEzz2DcEeW67AqL8VPUFQ_b58f25deeeb2bb4b4603fee6597ad3fd_cat_correct.png?expiry=1658620800000&hmac=Gn9tCJyaaZlS-Yj4IBx711HGqJQvdOTiJwrmA1cfM-I>

Question 3: Of the images below, which is the nearest ‘dog’ labeled image in the training data to the first image in the test data (image_test[0:1])?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/2KmNYGDcEeWSthLJWZH1gw_302e98a3196d8bf12bf7be8950ad77dd_dog_correct.png?expiry=1658620800000&hmac=MwbQ389JZJvXqH8bPWBjWmZJa-z7vdqxsEXShL2XYCI>

Question 4: For the first image in the test data, in what range is the mean distance between this image and its 5 nearest neighbors that were labeled ‘cat’ in the training data?

33 to 35
35 to 37
37 to 39
39 to 41
Above 41
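
The quantity asked for here is the average of the 5 smallest distances from the query image to the images of one class. The sketch below computes it for hypothetical 2-D feature vectors and assumes Euclidean distance; the function name, the distance choice, and the data are assumptions for illustration, not the deep features or model from the notebook.

```python
import math

def mean_knn_distance(query, reference_vectors, k=5):
    """Mean Euclidean distance from query to its k nearest reference vectors."""
    distances = sorted(math.dist(query, v) for v in reference_vectors)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)

# Hypothetical feature vectors standing in for one class of training images.
query_vector = (0.0, 0.0)
class_vectors = [(3, 4), (6, 8), (0, 1), (0, 2), (0, 3), (0, 4), (9, 12)]

mean_distance = mean_knn_distance(query_vector, class_vectors, k=5)
print(mean_distance)
```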

Question 5: For the first image in the test data, in what range is the mean distance between this image and its 5 nearest neighbors that were labeled ‘dog’ in the training data?

33 to 35
35 to 37
37 to 39
39 to 41
Above 41

Question 6: On average, is the first image in the test data closer to its 5 nearest neighbors in the ‘cat’ data or in the ‘dog’ data?

cat
dog

Question 7: In what range is the accuracy of the 1-nearest neighbor classifier at classifying ‘dog’ images from the test set?

50 to 60
60 to 70
70 to 80
80 to 90
90 to 100

Review:

Based on our experience, we recommend enrolling in this course to pick up new skills from specialists. We trust it will be worthwhile.
