Machine Learning Foundations: A Case Study Approach Quiz Answer

Coursera was launched in 2012 by Daphne Koller and Andrew Ng with the goal of providing life-changing learning experiences to students around the world. Today, Coursera is a global online learning platform that offers anyone, anywhere access to online courses and degrees from leading universities and companies.


Week 1: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: SFrames

Q 1: Download the Wiki People SFrame. Then open a new Jupyter notebook, import TuriCreate, and read the SFrame data.

Answer: Click here

Q 2: How many rows are in the SFrame? (Do NOT use commas or periods.)

Answer: 59071

Q 3: Which name is in the last row?

  • Conradign Netzer
  • Cathy Caruth
  • Fawaz Damrah

Q 4: Read the text column for Harpdog Brown. He was honored with:

  • A Grammy award for his latest blues album.
  • A gold harmonica to recognize his innovative playing style.
  • A lifetime membership in the Hamilton Blues Society.

Q 5: Sort the SFrame according to the text column, in ascending order. What is the name entry in the first row?

  • Zygfryd Szo
  • Digby Morrell
  • 007 James Bond
  • 108 (artist)
  • 8 Ball Aitken
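The SFrame operations these questions exercise can be sketched in plain Python on toy data (the real quiz loads the Wiki People data with turicreate.SFrame; the rows below are illustrative, not the actual dataset):

```python
# Toy stand-in for the SFrame exercises above (hypothetical rows).
rows = [
    {"name": "Digby Morrell", "text": "australian rules footballer"},
    {"name": "Harpdog Brown", "text": "canadian blues singer"},
    {"name": "Paul Coutts", "text": "scottish footballer"},
]

num_rows = len(rows)                             # Q2-style row count
last_name = rows[-1]["name"]                     # Q3-style last row lookup
by_text = sorted(rows, key=lambda r: r["text"])  # Q5-style ascending sort on text
first_after_sort = by_text[0]["name"]
```

With TuriCreate the same steps would be `len(sf)`, `sf[-1]['name']`, and `sf.sort('text')`.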

Week 2: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Regression

Q 1: Which figure represents an overfitted model?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/czbfW1vMEeWVtgr31Ad8Fw_e76f287b4f43f46f9afd6a29ccae1ead_Reg1a.png?expiry=1658620800000&hmac=x156QPF_Btk-yNR45SL6zO9sIUjsN6f2nMLs2tBIhGM>

Q 2: True or false: The model that best minimizes training error is the one that will perform best for the task of prediction on new data.

  • True
  • False

Q 3: The following table illustrates the results of evaluating 4 models with different parameter choices on some data set. Which of the following models fits this data the best?

Model index | Parameters (intercept, slope) | Residual sum of squares (RSS)
1           | (0, 1.4)                      | 20.51
2           | (3.1, 1.4)                    | 15.23
3           | (2.7, 1.9)                    | 13.67
4           | (0, 2.3)                      | 18.99
  • Model 1
  • Model 2
  • Model 3
  • Model 4
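For questions like this, the model that fits the training data best is simply the one with the smallest RSS. A minimal sketch (the `rss` helper shows how the table values would be computed from data):

```python
# RSS of a fitted line w0 + w1*x over a dataset.
def rss(w0, w1, xs, ys):
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(xs, ys))

# RSS values from the table above; the best fit minimizes RSS.
model_rss = {1: 20.51, 2: 15.23, 3: 13.67, 4: 18.99}
best_model = min(model_rss, key=model_rss.get)
```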

Q 4: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/8CcG-lvREeWzLwrzeFOkAw_5c39244e7608d47a3a43d6019c0df631_Reg4a.png?expiry=1658620800000&hmac=rX__IhYjQyRZNqr6-abWP3aLnbB2EsxqlnbGVuNvEbE>

  • w0
  • w1
  • w2
  • none of the above

Q 5: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/Em2X3FvSEeWMhg7baGhc3w_3187d5cb269bf4e998d6f92493793a88_Reg4b.png?expiry=1658620800000&hmac=d8-AWZlSVy2LL00Fel3bLIJxrGraECEA1wnf176E_bs>

  • w0
  • w1
  • w2
  • none of the above

Q 6: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/LD5UH1vSEeWhtQ48PjS6Pw_9ac59a77ea836dd248a38ebde9f2d11f_Reg4c.png?expiry=1658620800000&hmac=tqED_QOOZtkR1F5aQCVc3pgO0Xu2HVtDF9_NMbvR1u0>

  • w0
  • w1
  • w2
  • none of the above

Q 7: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/RIlaI1vSEeWVtgr31Ad8Fw_b7f7b633af94820bc5992c6975d8dc4d_Reg4d.png?expiry=1658620800000&hmac=XBE_l2ZDn-T0jJQGlWbqmy6Hp4yVaI4vsp1nz2zLEn4>

  • w0
  • w1
  • w2
  • none of the above

Q 8: Which of the following plots would you not expect to see as a plot of training and test error curves?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/yCCWHFvNEeWSuhJSxsy6bQ_33196504673be40e26a66fe9994b80f7_Reg5b.png?expiry=1658620800000&hmac=b0scdHXjrxJqdfyS4DxmTDVGOqfpnSuVk4xd9Trs4dQ>

Q 9: True or false: One always prefers to use a model with more features since it better captures the true underlying process.

  • True
  • False

Quiz 2: Predicting house prices

Q 1: Selection and summary statistics: We found the zip code with the highest average house price. What is the average house price of that zip code?

  • $75,000
  • $7,700,000
  • $540,088
  • $2,160,607

Q 2: Filtering data: What fraction of the houses have living space between 2000 sq.ft. and 4000 sq.ft.?

  • Between 0.2 and 0.29
  • Between 0.3 and 0.39
  • Between 0.4 and 0.49
  • Between 0.5 and 0.59
  • Between 0.6 and 0.69

Q 3: Building a regression model with several more features: What is the difference in RMSE between the model trained with my_features and the one trained with advanced_features?

  • the RMSE of the model with advanced_features lower by less than $25,000
  • the RMSE of the model with advanced_features lower by between $25,001 and $35,000
  • the RMSE of the model with advanced_features lower by between $35,001 and $45,000
  • the RMSE of the model with advanced_features lower by between $45,001 and $55,000
  • the RMSE of the model with advanced_features lower by more than $55,000
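The comparison above is in terms of root mean squared error on held-out data. A minimal sketch of RMSE itself (the feature sets `my_features` and `advanced_features` come from the course notebook; the helper below is generic):

```python
import math

# Root mean squared error between true and predicted values.
def rmse(y_true, y_pred):
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
```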

Week 3: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Classification

Q 1: The simple threshold classifier for sentiment analysis described in the video (check all that apply):

  • Must have pre-defined positive and negative attributes
  • Must either count attributes equally or pre-define weights on attributes
  • Defines a possibly non-linear decision boundary

Q 2: For a linear classifier classifying between “positive” and “negative” sentiment in a review x, Score(x) = 0 implies (check all that apply):

  • The review is very clearly “negative”
  • We are uncertain whether the review is “positive” or “negative”
  • We need to retrain our classifier because an error has occurred
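A linear classifier scores a review as a weighted sum of its features; the sign of the score decides the label, and Score(x) = 0 lands exactly on the decision boundary. A sketch (weights and features below are illustrative, not learned):

```python
# Score(x) = w . x for a linear classifier.
def score(weights, features):
    return sum(w * f for w, f in zip(weights, features))

# Sign of the score decides the label; zero is the decision boundary.
def predict(s):
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "uncertain"  # Score(x) = 0: maximally uncertain
```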

Q 3: For which of the following datasets would a linear classifier perform perfectly?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/D_IigVvQEeWVtgr31Ad8Fw_267aaadfe8ea97a30533a6712d23b0de_Class3b.png?expiry=1658620800000&hmac=OnudUFhXcrzCU-3gBveA322w4shtxUpBhScnIFSD1rE>

Q 4: True or false: High classification accuracy always indicates a good classifier.

  • True
  • False

Q 5: True or false: For a classifier classifying between 5 classes, there always exists a classifier with accuracy greater than 0.18.

  • True
  • False
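The reasoning behind Q5: with 5 classes, the most frequent class covers at least 1/5 of the data, so always predicting it gives accuracy of at least 0.2 > 0.18. A sketch on toy labels:

```python
from collections import Counter

# Accuracy of always predicting the most common label.
def majority_class_accuracy(labels):
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)

labels = ["a", "b", "c", "d", "e", "a"]  # toy labels over 5 classes
acc = majority_class_accuracy(labels)    # 2/6, which is >= 1/5
```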

Q 6: True or false: A false negative is always worse than a false positive.

  • True
  • False

Q 7: Which of the following statements are true? (Check all that apply)

  • Test error tends to decrease with more training data until a point, and then does not change (i.e., curve flattens out)
  • Test error always goes to 0 with an unboundedly large training dataset
  • Test error is never a function of the amount of training data

Quiz 2: Analyzing product sentiment

Q 1: Out of the 11 words in selected_words, which one is most used in the reviews in the dataset?

  • awesome
  • love
  • hate
  • bad
  • great

Q 2: Out of the 11 words in selected_words, which one is least used in the reviews in the dataset?

  • wow
  • amazing
  • terrible
  • awful
  • love

Q 3: Out of the 11 words in selected_words, which one got the most positive weight in the selected_words_model?

(Tip: when printing the list of coefficients, make sure to use print_rows(rows=12) to print ALL coefficients.)

  • amazing
  • awesome
  • love
  • fantastic
  • terrible

Q 4: Out of the 11 words in selected_words, which one got the most negative weight in the selected_words_model?

(Tip: when printing the list of coefficients, make sure to use print_rows(rows=12) to print ALL coefficients.)

  • horrible
  • terrible
  • awful
  • hate
  • love

Q 5: Which of the following ranges contains the accuracy of the selected_words_model on the test_data?

  • 0.811 to 0.841
  • 0.841 to 0.871
  • 0.871 to 0.901
  • 0.901 to 0.931

Q 6: Which of the following ranges contains the accuracy of the sentiment_model in the IPython Notebook from lecture on the test_data?

  • 0.811 to 0.841
  • 0.841 to 0.871
  • 0.871 to 0.901
  • 0.901 to 0.931

Q 7: Which of the following ranges contains the accuracy of the majority class classifier, which simply predicts the majority class on the test_data?

  • 0.811 to 0.843
  • 0.843 to 0.871
  • 0.871 to 0.901
  • 0.901 to 0.931

Q 8: How do you compare the different learned models with the baseline approach where we are just predicting the majority class?

  • They all performed about the same.
  • The model learned using all words performed much better than the one using only the selected_words. And, the model learned using the selected_words performed much better than just predicting the majority class.
  • The model learned using all words performed much better than the other two. The other two approaches performed about the same.
  • Simply predicting the majority class performed much better than the other two models.

Q 9: Which of the following ranges contains the ‘predicted_sentiment’ for the most positive review for ‘Baby Trend Diaper Champ’, according to the sentiment_model from the IPython Notebook from lecture?

  • Below 0.7
  • 0.7 to 0.8
  • 0.8 to 0.9
  • 0.9 to 1.0

Q 10: Consider the most positive review for ‘Baby Trend Diaper Champ’ according to the sentiment_model from the IPython Notebook from lecture. Which of the following ranges contains the predicted_sentiment for this review, if we use the selected_words_model to analyze it?

  • Below 0.7
  • 0.7 to 0.8
  • 0.8 to 0.9
  • 0.9 to 1.0

Q 11: Why is the value of the predicted_sentiment for the most positive review found using the sentiment_model much more positive than the value predicted using the selected_words_model?

  • The sentiment_model is just too positive about everything.
  • The selected_words_model is just too negative about everything.
  • This review was positive, but used too many of the negative words in selected_words.
  • None of the selected_words appeared in the text of this review.

Week 4: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Clustering and Similarity

Q 1: A country, called Simpleland, has a language with a small vocabulary of just “the”, “on”, “and”, “go”, “round”, “bus”, and “wheels”. For a word count vector with indices ordered as the words appear above, what is the word count vector for a document that simply says “the wheels on the bus go round and round.”?

Please enter the vector of counts as follows: If the counts were [“the”=1, “on”=3, “and”=2, “go”=1, “round”=2, “bus”=1, “wheels”=1], enter 1321211.

Answer: 2111211
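The count vector can be checked with a few lines of plain Python (vocabulary order as given in the question):

```python
# Simpleland vocabulary, indices ordered as the words appear in the question.
VOCAB = ["the", "on", "and", "go", "round", "bus", "wheels"]

# Build the word count vector for a document.
def count_vector(doc):
    words = doc.lower().replace(".", "").split()
    return [words.count(w) for w in VOCAB]

vec = count_vector("the wheels on the bus go round and round.")
```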

Question 2: In Simpleland, a reader is enjoying a document with a representation: [1 3 2 1 2 1 1]. Which of the following articles would you recommend to this reader next?

  • [7 0 2 1 0 0 1]
  • [1 7 0 0 2 0 1]
  • [1 0 0 0 7 1 2]
  • [0 2 0 0 7 1 1]
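One way to rank the candidate articles, assuming cosine similarity between count vectors as in the lectures (the plain dot product gives the same ordering for these particular vectors):

```python
import math

# Cosine similarity between two count vectors.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

reader = [1, 3, 2, 1, 2, 1, 1]
candidates = [
    [7, 0, 2, 1, 0, 0, 1],
    [1, 7, 0, 0, 2, 0, 1],
    [1, 0, 0, 0, 7, 1, 2],
    [0, 2, 0, 0, 7, 1, 1],
]
best = max(candidates, key=lambda c: cosine(reader, c))
```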

Question 3: A corpus in Simpleland has 99 articles. If you pick one article and perform 1-nearest neighbor search to find the closest article to this query article, how many times must you compute the similarity between two articles?

  • 98
  • 98*2 = 196
  • 98/2 = 49
  • (98)^2
  • 99

Question 4: For the TF-IDF representation, does the relative importance of words in a document depend on the base of the logarithm used? For example, take the words “bus” and “wheels” in a particular document. Is the ratio between the TF-IDF values for “bus” and “wheels” different when computed using log base 2 versus log base 10?

  • Yes
  • No
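The key observation for Q4: idf in base b is log_b(N/df) = ln(N/df)/ln(b), so changing the base rescales every TF-IDF value by the same constant, leaving ratios between words unchanged. A numeric check (the term and document frequencies below are hypothetical):

```python
import math

# TF-IDF with an explicit log base: tf * log_base(N / df).
def tfidf(tf, n_docs, df, base):
    return tf * (math.log(n_docs / df) / math.log(base))

# Hypothetical counts for "bus" and "wheels" in one document.
ratio_base2 = tfidf(3, 100, 5, 2) / tfidf(2, 100, 20, 2)
ratio_base10 = tfidf(3, 100, 5, 10) / tfidf(2, 100, 20, 10)
# The two ratios agree (up to floating point), whatever the base.
```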

Question 5: Which of the following statements are true? (Check all that apply)

  • Deciding whether an email is spam or not spam using the text of the email and some spam / not spam labels is a supervised learning problem.
  • Dividing emails into two groups based on the text of each email is a supervised learning problem.
  • If we are performing clustering, we typically assume we either do not have or do not use class labels in training the model.

Question 6: Which of the following pictures represents the best k-means solution? (Squares represent observations, plus signs are cluster centers, and colors indicate assignments of observations to cluster centers.)

Answer

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/AW6FxVvVEeWzLwrzeFOkAw_3e7caa843845e525f9275753265c0900_Clust5b.png?expiry=1658620800000&hmac=XDtBpsTCunhlQ9O9-DRPncW6PNGZ83Dd9PQFRx1O-Go>

Quiz 2: Retrieving Wikipedia articles

Q 1: Top word count words for Elton John

  • (the, john, singer)
  • (england, awards, musician)
  • (the, in, and)
  • (his, the, since)
  • (rock, artists, best)

Question 2: Top TF-IDF words for Elton John

  • (furnish,elton,billboard)
  • (john,elton,fivedecade)
  • (the,of,has)
  • (awards,rock,john)
  • (elton,john,singer)

Question 3: The cosine distance between ‘Elton John’s and ‘Victoria Beckham’s articles (represented with TF-IDF) falls within which range?

  • 0.1 to 0.29
  • 0.3 to 0.49
  • 0.5 to 0.69
  • 0.7 to 0.89
  • 0.9 to 1.0

Question 4: The cosine distance between ‘Elton John’s and ‘Paul McCartney’s articles (represented with TF-IDF) falls within which range?

  • 0.1 to 0.29
  • 0.3 to 0.49
  • 0.5 to 0.69
  • 0.7 to 0.89
  • 0.9 to 1.0

Question 5: Who is closer to ‘Elton John’, ‘Victoria Beckham’ or ‘Paul McCartney’?

  • Victoria Beckham
  • Paul McCartney

Question 6: Who is the nearest cosine-distance neighbor to ‘Elton John’ using raw word counts?

  • Billy Joel
  • Cliff Richard
  • Roger Daltrey
  • George Bush

Question 7: Who is the nearest cosine-distance neighbor to ‘Elton John’ using TF-IDF?

  • Roger Daltrey
  • Rod Stewart
  • Tommy Haas
  • Elvis Presley

Question 8: Who is the nearest cosine-distance neighbor to ‘Victoria Beckham’ using raw word counts?

  • Stephen Dow Beckham
  • Louis Molloy
  • Adrienne Corri
  • Mary Fitzgerald (artist)

Question 9: Who is the nearest cosine-distance neighbor to ‘Victoria Beckham’ using TF-IDF?

  • Mel B
  • Caroline Rush
  • David Beckham
  • Carrie Reichardt

Week 5: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Recommender Systems

Q 1: Recommending items based on global popularity can (check all that apply):

  • provide personalization
  • capture context (e.g., time of day)
  • none of the above

Question 2: Recommending items using a classification approach can (check all that apply):

  • provide personalization
  • capture context (e.g., time of day)
  • none of the above

Question 3: Recommending items using a simple count-based co-occurrence matrix can (check all that apply):

  • provide personalization
  • capture context (e.g., time of day)
  • none of the above

Question 4: Recommending items using featurized matrix factorization can (check all that apply):

  • provide personalization
  • capture context (e.g., time of day)
  • none of the above

Question 5: Normalizing co-occurrence matrices is used primarily to account for:

  • people who purchased many items
  • items purchased by many
  • eliminating rare products
  • none of the above

Question 6: A store has 3 customers and 3 products. Below are the learned feature vectors for each user and product. Based on this estimated model, which product would you recommend most highly to User #2?

User ID | Feature vector
1       | (1.73, 0.01, 5.22)
2       | (0.03, 4.41, 2.05)
3       | (1.13, 0.89, 3.76)

Product ID | Feature vector
1          | (3.29, 3.44, 3.67)
2          | (0.82, 9.71, 3.88)
3          | (8.34, 1.72, 0.02)
  • Product #1
  • Product #2
  • Product #3
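In a matrix factorization model, the predicted affinity of a user for a product is the dot product of their feature vectors, so the recommendation is the product with the highest score for User #2. The vectors below are copied from the tables above:

```python
# Predicted affinity = dot product of user and product feature vectors.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

user2 = (0.03, 4.41, 2.05)
products = {
    1: (3.29, 3.44, 3.67),
    2: (0.82, 9.71, 3.88),
    3: (8.34, 1.72, 0.02),
}
scores = {pid: dot(user2, vec) for pid, vec in products.items()}
best_product = max(scores, key=scores.get)
```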

Question 7: For the liked and recommended items displayed below, calculate the recall and round to 2 decimal points. (As in the lesson, green squares indicate recommended items, magenta squares are liked items. Items not recommended are grayed out for clarity.) Note: enter your answer in American decimal format (e.g. enter 0.98, not 0,98)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/C0Ri1FvZEeWMhg7baGhc3w_290d82e965c33e663968151f43a71743_Rec8.png?expiry=1658620800000&hmac=ro8CVcehdzhMoZDhUaIZXJOqieK7dJ0XcGNb2DHCFzw>

Answer: 0.33

Question 8: For the liked and recommended items displayed below, calculate the precision and round to 2 decimal points. (As in the lesson, green squares indicate recommended items, magenta squares are liked items. Items not recommended are grayed out for clarity.) Note: enter your answer in American decimal format (e.g. enter 0.98, not 0,98)

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/QkZrJ1vZEeWZgBLZEKssZQ_f80562a68423c8ffe11565327abee8c8_Rec8.png?expiry=1658620800000&hmac=wdW97z3_apaxidVHhNYrLVtPmk6ryAf1fNgOSyvdLjw>

Answer: 0.25
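Precision and recall reduce to counting the overlap between recommended and liked items. A sketch (the item sets below are hypothetical; the quiz reads them off the green/magenta grid figures):

```python
# precision = |recommended AND liked| / |recommended|
# recall    = |recommended AND liked| / |liked|
def precision_recall(recommended, liked):
    true_positives = len(set(recommended) & set(liked))
    return (true_positives / len(recommended),
            true_positives / len(liked))

recommended = {"i1", "i2", "i3", "i4"}   # hypothetical recommended items
liked = {"i1", "i9", "i10"}              # hypothetical liked items
prec, rec = precision_recall(recommended, liked)
```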

Question 9: Based on the precision-recall curves in the figure below, which recommender would you use?

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/JaMj1VvYEeWSuhJSxsy6bQ_648fbff528d436fc414fd485af5cb56d_Rec9.png?expiry=1658620800000&hmac=TdvA-JDmDM9SVzTbUD9UEMPc-crG42GgkFl6spDyve8>

  • RecSys #1
  • RecSys #2
  • RecSys #3

Quiz 2: Recommending songs

Question 1: Which of the artists below have had the most unique users listening to their songs?

  • Kanye West
  • Foo Fighters
  • Taylor Swift
  • Lady GaGa

Question 2: Which of the artists below is the most popular artist, the one with highest total listen_count, in the data set?

  • Taylor Swift
  • Kings of Leon
  • Coldplay
  • Lady GaGa

Question 3: Which of the artists below is the least popular artist, the one with smallest total listen_count, in the data set?

  • William Tabbert
  • Velvet Underground & Nico
  • Kanye West
  • The Cool Kids

Week 6: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Deep Learning

Question 1: Which of the following statements are true? (Check all that apply)

  • Linear classifiers are never useful, because they cannot represent XOR.
  • Linear classifiers are useful, because, with enough data, they can represent anything.
  • Having good non-linear features can allow us to learn very accurate linear classifiers.
  • none of the above

Question 2: A simple linear classifier can represent which of the following functions? (Check all that apply)

  • x1 OR x2 OR NOT x3
  • x1 AND x2 AND NOT x3
  • x1 OR (x2 AND NOT x3)
  • none of the above

Question 3: Which of the following neural networks can represent the function below? Select all that apply.

(x1 AND x2) OR (NOT x1 AND NOT x2)

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/FG_Wy1vaEeWhtQ48PjS6Pw_d8ed3b37fc1e16f793f6a3c7fbb1531b_Deep3d.png?expiry=1658620800000&hmac=Y13fXXF0RyLZ9QsOvSEhdLZ25HwPcUk6Ek3VVhjTCMs>
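The function (x1 AND x2) OR (NOT x1 AND NOT x2) is XNOR, which no single linear unit can represent, but a two-layer network of threshold units can: one hidden unit per conjunction, then an OR output. A sketch with hand-picked weights:

```python
# Threshold (step) activation.
def step(z):
    return 1 if z >= 0 else 0

def xnor_net(x1, x2):
    h_and = step(x1 + x2 - 1.5)       # x1 AND x2
    h_nor = step(-x1 - x2 + 0.5)      # NOT x1 AND NOT x2
    return step(h_and + h_nor - 0.5)  # h_and OR h_nor
```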

Question 4: Which of the following statements is true? (Check all that apply)

  • Features in computer vision act like local detectors.
  • Deep learning has had impact in computer vision, because it’s used to combine all the different hand-created features that already exist.
  • By learning non-linear features, neural networks have allowed us to automatically learn detectors for computer vision.
  • none of the above

Question 5: If you have lots of images of different types of plankton labeled with their species name, and lots of computational resources, which would you expect to make better predictions:

  • a deep neural network trained on this data.
  • a simple classifier trained on this data, using deep features as input, which were trained using ImageNet data.

Question 6: If you have only a few images of different types of plankton labeled with their species name, which would you expect to make better predictions:

  • a deep neural network trained on this data.
  • a simple classifier trained on this data, using deep features as input, which were trained using ImageNet data.

Quiz 2: Deep features for image retrieval

Question 1: What’s the least common category in the training data?

  • bird
  • dog
  • cat
  • automobile

Question 2: Of the images below, which is the nearest ‘cat’ labeled image in the training data to the first image in the test data (image_test[0:1])?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/xlEzz2DcEeW67AqL8VPUFQ_b58f25deeeb2bb4b4603fee6597ad3fd_cat_correct.png?expiry=1658620800000&hmac=Gn9tCJyaaZlS-Yj4IBx711HGqJQvdOTiJwrmA1cfM-I>

Question 3: Of the images below, which is the nearest ‘dog’ labeled image in the training data to the first image in the test data (image_test[0:1])?

Answer:

<image: https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/2KmNYGDcEeWSthLJWZH1gw_302e98a3196d8bf12bf7be8950ad77dd_dog_correct.png?expiry=1658620800000&hmac=MwbQ389JZJvXqH8bPWBjWmZJa-z7vdqxsEXShL2XYCI>

Question 4: For the first image in the test data, in what range is the mean distance between this image and its 5 nearest neighbors that were labeled ‘cat’ in the training data?

  • 33 to 35
  • 35 to 37
  • 37 to 39
  • 39 to 41
  • Above 41

Question 5: For the first image in the test data, in what range is the mean distance between this image and its 5 nearest neighbors that were labeled ‘dog’ in the training data?

  • 33 to 35
  • 35 to 37
  • 37 to 39
  • 39 to 41
  • Above 41

Question 6: On average, is the first image in the test data closer to its 5 nearest neighbors in the ‘cat’ data or in the ‘dog’ data?

  • cat
  • dog

Question 7: In what range is the accuracy of the 1-nearest neighbor classifier at classifying ‘dog’ images from the test set?

  • 50 to 60
  • 60 to 70
  • 70 to 80
  • 80 to 90
  • 90 to 100
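The 1-nearest-neighbor classifier behind these questions just returns the label of the closest training point under Euclidean distance. A sketch on toy 2-D points (standing in for the high-dimensional deep features of the course notebook):

```python
import math

# Euclidean distance between two feature vectors.
def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# 1-NN: label of the closest (point, label) pair in the training data.
def nearest_label(query, labeled_points):
    return min(labeled_points, key=lambda lp: euclidean(query, lp[0]))[1]

train = [((0.0, 0.0), "cat"), ((10.0, 10.0), "dog"), ((9.0, 11.0), "dog")]
label = nearest_label((8.5, 10.0), train)  # closest training point is a dog
```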

Review:

Based on our experience, we encourage you to enroll in this course and learn new skills from the experts. We trust it will be worth your time.
