Pattern Discovery in Data Mining ALL Weeks Quiz Answer

Week 1 Quiz Answer

 
 

Lesson 1 Quiz

 
Question 1)
Table 1: Transactions from a database
Given the transactions in Table 1 and minimum support (minsup) s = 50%,
which of the following is not a frequent itemset?
  • {Coffee}
  • {Beer}
  • {Eggs}
  • {Beer, Diapers}
Question 2)
Table 1: Transactions from a database
Given the transactions in Table 1, what is the confidence and
relative support of the association rule {Diapers} ⇒ {Coffee,
Nuts}?
  • support s = 0.4, confidence c = 0.5
  • support s = 0.8, confidence c = 0.5
  • support s = 0.4, confidence c = 1
  • support s = 0.8, confidence c = 1
  • None of the above
Question 3)
Consider the database containing the transactions T1 : {a1, a2, a3},
T2 : {a2, a3, a4}, T3 : {a1, a3, a4}. Let minimum support (minsup) = 2.
Which of the following frequent patterns is closed?
  • {a2}
  • {a1}
  • {a1, a3}
  • {a4}
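
For a database this small, the definitions of frequent, closed, and max patterns can be checked by brute-force enumeration. The following is a minimal sketch, not course code; the helper names (`support`, `frequent_patterns`, `closed_patterns`, `max_patterns`) are illustrative assumptions.

```python
from itertools import combinations

# Transactions from the question; minsup is an absolute count.
DB = [{"a1", "a2", "a3"}, {"a2", "a3", "a4"}, {"a1", "a3", "a4"}]
MINSUP = 2

def support(itemset, db):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t)

def frequent_patterns(db, minsup):
    """Map each frequent itemset (as a frozenset) to its support."""
    items = sorted(set().union(*db))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = support(set(combo), db)
            if s >= minsup:
                result[frozenset(combo)] = s
    return result

def closed_patterns(freq):
    """Frequent itemsets with no proper superset of equal support."""
    return {p for p, s in freq.items()
            if not any(p < q and s == freq[q] for q in freq)}

def max_patterns(freq):
    """Frequent itemsets with no frequent proper superset."""
    return {p for p in freq if not any(p < q for q in freq)}

freq = frequent_patterns(DB, MINSUP)
closed = closed_patterns(freq)
maximal = max_patterns(freq)
print(len(freq), len(closed), len(maximal))
```

For this database the sketch finds 7 frequent, 4 closed, and 3 max patterns: every max pattern is closed, and every closed pattern is frequent.
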
Question 4)
Consider the database containing the transactions T1 : {a1, …, a3},
T2 : {a2, …, a4}. Let minsup = 1. What fraction of all frequent
patterns are max frequent patterns?
  • 1/11
  • 2/11
  • 1/3
  • There are no max frequent patterns for the given minsup.
  • 3/11
Question 5)
Rank the following sets by their cardinality for a given database:
{all frequent patterns}, {closed frequent patterns}, and {max frequent
patterns}.
  • {all frequent patterns} ≥ {closed frequent patterns} ≥ {max frequent
    patterns}
  • {all frequent patterns} ≥ {max frequent patterns} ≥ {closed frequent
    patterns}
  • {all frequent patterns} ≥ {max frequent patterns} = {closed
    frequent patterns}, i.e. the set of max frequent patterns and the
    set of closed frequent patterns are identical.
  • {all frequent patterns} ≥ {max frequent patterns}, {all frequent
    patterns} ≥ {closed frequent patterns}, but the order of {max
    frequent patterns} and {closed frequent patterns} cannot be
    determined without further information.
  • Ranking is impossible without further information.
Question 6)
Table 1: Transactions from a database
Given the transactions in Table 1 and minimum support (minsup) s = 40%,
which of the following is a length-3 frequent itemset?
  • Beer, Nuts, Eggs
  • Beer, Coffee, Milk
  • Coffee, Diapers, Eggs
  • Beer, Nuts, Diapers
Question 7)
 
 
A strong association rule satisfies both the minimum support (minsup)
and minconf thresholds. Given the transactions in Table 1, minimum
support (minsup) s = 50%, and minconf c = 50%, which of the following is
not a strong association rule?
  • {Beer} ⇒ {Diapers}
  • {Beer, Nuts} ⇒ {Diapers}
  • {Diapers} ⇒ {Nuts}
  • {Nuts} ⇒ {Diapers}
  • {Diapers} ⇒ {Beer}
Question 8)
Consider the database containing the transactions T1 : {a1, a2, a3},
T2 : {a2, a3, a4}, T3 : {a1, a3, a4}. Let minimum support (minsup) = 2.
Which of the following frequent patterns is NOT closed?
  • {a2}
  • {a1, a3}
  • {a3}
  • {a3, a4}
Question 9)
Consider the database containing the transactions T1 : {a1, a2, a3,
a4, a5}, T2 : {a2, a3, a4, a5, a6}. Let minsup = 1. Which of the
following is both a max frequent and a closed frequent pattern? Select
all that apply.
  • {a2, a3, a4, a5}
  • {a2, a5}
  • {a1, a2, a3, a4, a5}
  • {a2, a3, a4, a5, a6}
  • {a1, a2, a3, a4, a5, a6}
Question 10)
Given the set of closed frequent patterns, we can ___________. Select
all that apply.
  • Recover all transactions in the database
  • Find the set of max frequent patterns
  • Recover the set of all frequent patterns and their support in some
    situations but not all

  • Always recover the set of all frequent patterns and their support
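
The lossless property of closed patterns can be sketched in a few lines: a pattern is frequent exactly when some closed pattern contains it, and its support is the largest support among those closed supersets. The closed patterns below are those of the two-transaction database T1 = {a1, a2, a3}, T2 = {a2, a3, a4} with minsup = 1; the function name `recover_support` is an illustrative assumption.

```python
# Closed frequent patterns and their supports for
# T1 = {a1, a2, a3}, T2 = {a2, a3, a4}, minsup = 1.
CLOSED = {
    frozenset({"a1", "a2", "a3"}): 1,
    frozenset({"a2", "a3", "a4"}): 1,
    frozenset({"a2", "a3"}): 2,
}

def recover_support(pattern, closed):
    """Support of pattern, or 0 if it is not frequent.

    A pattern is frequent iff some closed pattern contains it, and its
    support is the maximum support over those closed supersets.
    """
    pattern = frozenset(pattern)
    return max((s for c, s in closed.items() if pattern <= c), default=0)

print(recover_support({"a2"}, CLOSED))
print(recover_support({"a1", "a2"}, CLOSED))
print(recover_support({"a1", "a4"}, CLOSED))
```

The same lookup over max patterns alone would only tell us whether a pattern is frequent, not what its support is.
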
Question 11)
 
 
Given the transactions in Table 1, minimum support (minsup) s = 50%, and
minconf c = 50%, which of the following is an association rule? Select
all that apply.
  • Nuts ⇒ Eggs
  • Coffee ⇒ Milk
  • Diapers ⇒ Eggs
  • Nuts ⇒ Diapers
  • Beer ⇒ Nuts
Question 12)
Which of the following statements is true?
  • The set of closed frequent patterns is always the same as the set of
    max frequent patterns.
  • Since both closed and max frequent patterns are a subset of all
    frequent patterns, we cannot recover all frequent patterns and their
    supports given just the closed and max frequent patterns.
  • Closed frequent patterns can always be determined from the set of
    max frequent patterns.
  • We can recover all frequent patterns and their supports from the set
    of max frequent patterns.
  • We can recover all frequent patterns and their supports from the
    set of closed frequent patterns.
 

Lesson 2 Quiz

Question 1)
If we know the support of itemset {a, b} is 10, which of the
following numbers are the possible supports of itemset {a, b, c}?
Select all that apply.
  • 11
  • 9
  • 10
Question 2)
If we know the support of itemset {a} is 50 and the support of
itemset {a, b, c} is 30, which of the following numbers are the
possible supports of itemset {a, b}? Select all that apply.
  • 10
  • 5
  • 30
  • 100
  • 50
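
Because support can only stay the same or shrink as items are added, sup({a, b, c}) ≤ sup({a, b}) ≤ sup({a}). A small sketch of that bound check follows; the function name is an illustrative assumption.

```python
def possible_supports(candidates, sup_subset, sup_superset):
    """Keep the candidates c with sup_superset <= c <= sup_subset."""
    return [c for c in candidates if sup_superset <= c <= sup_subset]

# sup({a}) = 50, sup({a, b, c}) = 30; candidate values for sup({a, b}).
print(possible_supports([10, 5, 30, 100, 50], sup_subset=50, sup_superset=30))
```
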
Question 3)
Considering the Apriori algorithm, assume we have obtained all size-2
(i.e., containing 2 items, e.g. {A, B}) frequent itemsets. They are
{A, B}, {A, C}, {A, D}, {B, C}, {B, E}, and {C, E}. In the following
size-3 itemsets, which of them should be considered, i.e., have
potential to be size-3 frequent itemsets? Select all that apply.
  • {A, B, D}
  • {A, C, D}
  • {B, C, E}
  • {A, B, C}
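
The candidate generation step of Apriori (join, then prune) can be sketched as follows. This is a minimal illustration under the usual textbook formulation; the function name `apriori_gen` is an assumption, and the input is assumed to be non-empty.

```python
from itertools import combinations

def apriori_gen(frequent_k):
    """Generate size-(k+1) candidates from size-k frequent itemsets.

    Join: merge two itemsets that share their first k-1 items (sorted).
    Prune: drop a candidate if any of its size-k subsets is not frequent.
    """
    frequent = {tuple(sorted(s)) for s in frequent_k}
    k = len(next(iter(frequent)))
    candidates = set()
    for a, b in combinations(sorted(frequent), 2):
        if a[:-1] == b[:-1]:
            cand = a + (b[-1],)
            if all(sub in frequent for sub in combinations(cand, k)):
                candidates.add(cand)
    return sorted(candidates)

L2 = [{"A", "B"}, {"A", "C"}, {"A", "D"}, {"B", "C"}, {"B", "E"}, {"C", "E"}]
print(apriori_gen(L2))
```

Here {A, B, D} and {A, C, D} are produced by the join but pruned, because {B, D} and {C, D} are not frequent.
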
Question 4)
Given the FP-tree as shown in Figure 1, how many transactions do we
have in total?
  • 4
  • 5
  • 3
  • 1
  • 2
Question 5)
If we know the support of itemset {a} is 50 and the support of
itemset {a, b, c} is 10, which of the following numbers are the
possible supports of itemset {a, b}? Select all that apply.
  • 5
  • 10
  • 50
  • 30
  • 100
Question 6)
Considering the Apriori algorithm, assume we have 5 items (A to E) in
total. In the 1st scan, we find out all frequent items A, B, C, and E.
How many size-2 (i.e., containing 2 items, e.g. A, B) itemsets should
be considered in the 2nd scan, i.e., have potential to be size-2
frequent itemsets? Select all that apply.
  • 10
  • 25
  • 4
  • 6
Question 7)
Given the FP-tree as shown in Figure 1, which of the following choices
is in the f-conditional database? Select all that apply.
  • {c, a, b, m} : 1
  • {c, b, p} : 1
  • {b} : 1
  • {c, a, m, p} : 2

 

Extra Question

Question 1)
Which of the following tasks does not fall under the scope of data
mining? Select all that apply.
  • Data entry.
  • Data Cleaning.
 
 
 
Question 2)
 
 
Given the transaction in Table 1 and minsup s = 50%, how many
frequent 3-itemsets are there?
 
Answer:
  • 0
 
 
 
Question 3)
 

 

A strong association rule satisfies both the minsup and minconf
thresholds. Given the transactions in Table 1, minsup s = 50%, and
minconf c = 50%, how many strong association rules are there? Note
that the association rule A => B and B => A are distinct.
 
Answer:
  • 6
 
 
 
Question 4)
 

 

Given the transactions in Table 1, minsup s = 50%, and minconf c =
50%, which of the following is an association rule? Select all that
apply.
 
Answer:
  • Beer => Nuts
  • Nuts => Diapers
 
 
 
Question 5)
Consider the database containing the transaction T1 : {a1, …, a5},
T2 : {a1, …, a1}, T3 : {a3, …, a7}, T4 : {a4, …, a8}. For what
value of minsup do we have the most number of closed frequent
patterns?
 
Answer:
  • minsup = 1
 
 
 
Question 6)
Consider the database containing the transactions T1 : {a1, …, a3},
T2 : {a2, …, a4}. Let minsup = 1. What fraction of all frequent
patterns is max frequent patterns?
 
Answer:
  • 2/11
 
 
Question 7)
Consider the database containing the transaction T1 : {a1, a2, a3},
T2 : {a2, a3, a4}. Let minsup = 1. What fraction of all frequent
patterns is closed?
 
Answer:
  • 3/11
 
 
Question 8)
Rank the following sets by their cardinality for a given database:
{all frequent patterns}, {closed frequent patterns}, {max frequent
patterns}
 
Answer:
  • {all frequent patterns} >= {closed frequent patterns} >= {max
    frequent patterns}
 
 
 
Question 9)
Which of the following statements is true?
 
Answer:
  • We can recover all frequent patterns from the set of closed
    frequent patterns.
 
 
Question 10)
If we know the support of itemset {a, b, c} is 10, which of the
following numbers are the possible supports of the itemset {a, b}?
 
Answer:
  • 10
  • 11
 
 
 
Question 11)
If we know the support of itemset {a, b} is 10, which of the
following numbers are the possible supports of itemset {a, b, c}?
 
Answer:
  • 9
  • 10
 
 
 
Question 12)
If we know the support of itemset {a} is 50, and the support of
itemset {a, b, c} is 10, which of the following numbers are the
possible supports of itemset {a, d}?
 
Answer:
  • 5
  • 50
  • 30
  • 10
 
 
Question 13)
Considering the Apriori algorithm, assume we have 5 items (A to E) in
total. In the 1st scan, we find out all frequent items A, B, C, and
E. How many size-2 (i.e., containing 2 items, e.g. A, B) itemsets
should be considered in the 2nd scan, i.e., have potential to be size-2
frequent itemsets?
 
Answer:
  • 6
 
 
 
Question 14)
Considering the Apriori algorithm, assume we have obtained all size-2
(i.e., containing 2 items, e.g. {A, B}) frequent itemsets. They are {A,
B}, {A, C}, {A, D}, {B, C}, {B, E}, {C, E}. In the following size-3
itemsets, which of them should be considered, i.e., have potential to be
size-3 frequent itemsets?
 
Answer:
  • {A, B, C}
  • {B, C, E}
 
 
 
Question 15)
 

 

 
Given the FP-tree as shown in Figure 1, what is the support of {c,
p}?
 
Answer:
  • 3

Pattern Discovery in Data Mining

Week 2 Quiz Answer

 
 

Lesson 3 Quiz

 
Question 1)
What is the value range of the Kulczynski measure?
  • (-∞, +∞)
  • [-1, 1]
  • [0, 1]
  • [0, +∞)
Question 2)
What is the value range of the χ2 measure?
  • (-∞, +∞)
  • [-1, 1]
  • [0, 1]
  • [0, +∞)
Question 6)
Which of the following measures is NOT null invariant?
  • Cosine
  • Lift
  • All confidence
  • Kulczynski
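
A measure is null invariant if its value does not change when transactions containing neither item are added. The sketch below computes the four measures from raw counts; the counts are made up for illustration and the function name is an assumption.

```python
from math import sqrt

def measures(n_ab, n_a, n_b, n_total):
    """Interestingness measures from counts.

    n_ab: transactions with both A and B; n_a, n_b: with A, with B;
    n_total: all transactions, including null ones (neither A nor B).
    """
    p_ab, p_a, p_b = n_ab / n_total, n_a / n_total, n_b / n_total
    return {
        "lift": p_ab / (p_a * p_b),
        "cosine": n_ab / sqrt(n_a * n_b),
        "all_confidence": n_ab / max(n_a, n_b),
        "kulczynski": 0.5 * (n_ab / n_a + n_ab / n_b),
    }

# Hypothetical counts; only the number of null transactions differs.
few_nulls = measures(n_ab=100, n_a=1000, n_b=1000, n_total=2_000)
many_nulls = measures(n_ab=100, n_a=1000, n_b=1000, n_total=200_000)
for name in few_nulls:
    print(name, few_nulls[name], many_nulls[name])
```

Only lift depends on `n_total`, so only lift changes between the two runs; the other three are computed from counts that null transactions do not touch.
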
Question 7)
 Suppose we are interested in analyzing the purchase of comics
(CM) and fiction (FC) in the transaction history of a bookstore. We have
the following 2 × 2 contingency table summarizing the transactions. If
χ2 is used to measure the correlation between CM and FC, what is the χ2
score?
  • -240
  • -80
  • 80
  • 240
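
The χ2 score for a 2 × 2 contingency table sums (observed − expected)² / expected over the four cells, so it can never be negative. A minimal sketch follows; the two tables are hypothetical and are not the table from the quiz.

```python
def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table [[n11, n10], [n01, n00]]."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    total = sum(row)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns.
            expected = row[i] * col[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2

# Hypothetical counts, not the table from the quiz.
independent = [[10, 20], [30, 60]]
skewed = [[40, 10], [10, 40]]
print(chi_square_2x2(independent), chi_square_2x2(skewed))
```
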
Question 7)
What is the value range of the Kulczynski measure?
  • [0, 1]
  • (-∞, +∞)
  • [-1, 1]
  • [0, +∞)
Question 10)
Suppose we are interested in analyzing the purchase of comics (CM) and
fiction (FC) in the transaction history of a bookstore. We have the
following 2 × 2 contingency table summarizing the transactions. If lift
is used to measure the correlation between CM and FC, what is the value
for lift(CM, FC)?
  • -0.6
  • 0.6
  • -2e-4
  • 2e-4
Question 11)
Suppose we are interested in analyzing the transaction history of
several supermarkets with respect to purchase of apples (A) and bananas
(B). We have the following table summarizing the transactions.
Which of the following measures would you use to determine the
correlation of purchases between apples and bananas across all these
supermarkets?
  • χ2
  • Kulczynski
  • Lift
  • Cosine
Question 12)
Suppose a school collected some data on students’ preference for hot
dogs (HD) vs. hamburgers (HM). We have the following 2×2 contingency
table summarizing the statistics. If χ2 is used to measure the
correlation between HD and HM, what is the χ2 score?
  • 0
  • -1
  • -∞
  • 1

Lesson 4 Quiz

Question 1)
Suppose one needs to mine frequent patterns at two different levels, with
minimum support (minsup) of 5% (higher level) and 3% (lower level),
respectively. If using shared multi-level mining, which minimum support
(minsup) threshold should be used to generate candidate patterns for
the higher level?
  • 3%
  • 1%
  • 8%
  • 5%
Question 2)
A store had 100,000 total transactions in Q4 2014. 10,000
transactions contained eggs, while 5,000 contained bacon. 2000
transactions contained both eggs and bacon. Which of the following
choices for the value of ε is the smallest such that {eggs, bacon} is
considered a negative pattern under the null-invariant definition?
  • 0.1
  • 0.81
  • 0.5
  • 01
  • A value for ε such that {eggs, bacon} is a negative pattern under
    the null-invariant definition does not exist.
Question 3)
Below is a table of transactions. According to the introduced pattern
distance measure, what is the distance between pattern “abc” and
pattern “abd”?
  • 0
  • 0.5
  • 0.2
  • 0.333
Question 4) 
Given the itemsets in Table 1 and a cluster quality measure δ =
0.001, what could be a set of representative patterns that covers all
itemsets in Table 1?
Hint: The pattern with the least support is {F, A, C, E, T, S}. Consider
which pattern in the table may δ-cover the pattern {F, A, C, E, T, S}.
  • {{F, A, C, E, T, S}}
  • {{F, A, C, E, S}, {A, C, E, S}}
  • {{F, A, C, E, S}, {F, A, C, T, S}}
  • {{F, A, C, E, S}, {F, A, C, E, T, S}, {F, A, C, T, S}}
  • {{A, C, E, S}, {A, C, T, S}}
Question 5)
A store had 100,000 total transactions in Q4 2014. 10,000
transactions contained beer, while 5,000 contained frying pans. 600
transactions contained both beer and frying pans. Which of the
following is true?
  • More information is needed to determine if {beer, frying pans} is a
    negative pattern.
  • {beer, frying pans} is a negative pattern under the support-based
    definition of negatively correlated patterns.
  • For ε = 0.1, {beer, frying pans} is a negative pattern under the
    null-invariant definition of negatively correlated patterns.
  • There does not exist a value for ε such that {beer, frying pans} is
    a negative pattern by the null-invariant definition of negative
    patterns.
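
Both definitions of a negative pattern can be evaluated directly from the counts. The sketch below assumes the usual formulations: the null-invariant definition calls {A, B} negative when (P(A|B) + P(B|A)) / 2 < ε, and the support-based definition calls it negative when sup(A ∪ B) is much smaller than sup(A) × sup(B). Function names are illustrative.

```python
def kulczynski(n_ab, n_a, n_b):
    """Average of the two conditional probabilities P(B|A) and P(A|B)."""
    return 0.5 * (n_ab / n_a + n_ab / n_b)

def is_negative_null_invariant(n_ab, n_a, n_b, eps):
    """Null-invariant test: negative if the Kulczynski value is below eps."""
    return kulczynski(n_ab, n_a, n_b) < eps

def support_ratio(n_ab, n_a, n_b, n_total):
    """sup(A ∪ B) / (sup(A) * sup(B)); a value far below 1 suggests a
    negative pattern under the support-based definition."""
    return (n_ab / n_total) / ((n_a / n_total) * (n_b / n_total))

# Beer (10,000), frying pans (5,000), both (600), out of 100,000.
print(kulczynski(600, 10_000, 5_000))
print(support_ratio(600, 10_000, 5_000, 100_000))
# Eggs (10,000), bacon (5,000), both (2,000).
print(kulczynski(2_000, 10_000, 5_000))
```

For beer and frying pans the Kulczynski value is 0.09, which is below ε = 0.1, while the support ratio is 1.2, so the support-based test does not flag the pair. For eggs and bacon the Kulczynski value is 0.3.
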
Question 6)  
Given the itemsets in Table 1, which of the following patterns are in
the δ-cluster containing the pattern {A, C, E, S} for δ = 0.0001?
Hint: Consider two patterns P1 and P2 such that O(P1) ⊆ O(P2), where
O(Pi) is the corresponding itemset of pattern Pi . Take a second to
convince yourself that the following is true:
  • {A, C, T, S}
  • {F, A, C, E, S}
  • {F, A, C, T, S}
  • {F, A, C, E, T, S}
Question 7) 
Consider two patterns P1 and P2 such that O(P1) ⊆ O(P2), where O(Pi)
is the corresponding itemset of pattern Pi. Take a second to convince
yourself that the following is true:
Which of the following patterns in Table 1 is δ-covered by {F, A, C,
E, T, S} for δ=0.4? Select all that apply.
  • {A, C, E, S}
  • {F, A, C, T, S}
  • {A, C, T, S}
  • {F, A, C, E, S}

 

Extra Questions

Question 1)
Suppose a school collected some data on students’ preference for
hot dogs(HD) vs. hamburgers (HM). We have the following 2×2
contingency table summarizing the statistics. If lift is used to
measure the correlation between HD and HM, what is the value for
lift(HD, HM)?

 

 
Answer:
  • 1
  •  -∞
  •  0
  •  -1
 
 
 
 
Question 2)
Suppose Coursera collected statistics on the number of students
who take courses on data mining (DM) and machine learning (ML). We
have the following 2×2 contingency table summarizing the
statistics. If χ2 is used to measure the correlation between DM
and ML, what is the χ2 score?
 

 

Answer:
  • 562.5
  • -562.5
  • -225
  • 225
 
 
 
Question 3)
What is the value range of the Lift measure?
 
Answer:
  • [0, +∞)
  • [0, 1]
  • (-∞, +∞)
  • [-1, 1]
 
 
 
Question 4)
Which of the following measures is NOT null invariant?
 
Answer:
  • χ2
 
 
Question 5)
Suppose we are interested in analyzing the transaction history of
several supermarkets with respect to purchase of apples(A) and
bananas(B). We have the following table summarizing the
transactions.
 

 

Denote li as the lift measure and ki as the Kulczynski measure
for supermarket Si (i = 1, 2). Which of the following is
correct?
 
Answer:
  • l1 ≠ l2, k1 = k2
 
 

Question 6)
A store had 100,000 total transactions in Q4 2014. 10,000
transactions contained eggs, while 5,000 contained bacon. 2000
transactions contained both eggs and bacon. Which of the following
choices for the value of ε is the smallest such that {eggs, bacon}
is considered a negative pattern under the null-invariant
definition?
 
Answer:
  • 0.5
 
 
 
 
Question 7)
 
 
Given the itemsets in Table 1, which of the following patterns
are in the δ-cluster containing the pattern {A, C, E, S} for δ =
0.0001?
 
Answer:
  • {F, A, C, E, S}
 
 
 
Question 8)
Given the transactions in Table 2, which of the following is a
(1, 0.5)-robust pattern in the database? Select all that apply.
 
Answer:
  • None of the other options are correct.
 
 
 
Question 9)
A constraint is anti-monotone if, whenever an itemset S violates the
constraint, so do all of its supersets. Which of the following
constraints is anti-monotone?
 
Answer:
  • range(S.price) < 10
 
 
Question 10)
A constraint is monotone if, whenever an itemset S satisfies the
constraint, so do all of its supersets. Which of the following
constraints is monotone?
 
Answer:
  • min(S.price) < 15
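
The monotone and anti-monotone definitions can be tested by brute force on a small price table. The prices and helper names below are hypothetical. A check like this over one table can refute a property but cannot prove it for all data.

```python
from itertools import combinations

# Hypothetical item prices for illustration only.
PRICE = {"a": 5, "b": 12, "c": 20, "d": 30}

def itemsets(items):
    """All non-empty subsets of items, as frozensets."""
    return [frozenset(c) for k in range(1, len(items) + 1)
            for c in combinations(sorted(items), k)]

def is_anti_monotone(constraint, items):
    """If S violates the constraint, every superset must violate it too."""
    sets = itemsets(items)
    return all(not constraint(t)
               for s in sets if not constraint(s)
               for t in sets if s < t)

def is_monotone(constraint, items):
    """If S satisfies the constraint, every superset must satisfy it too."""
    sets = itemsets(items)
    return all(constraint(t)
               for s in sets if constraint(s)
               for t in sets if s < t)

def price_range_below_10(s):
    prices = [PRICE[i] for i in s]
    return max(prices) - min(prices) < 10

def min_price_below_15(s):
    return min(PRICE[i] for i in s) < 15

print(is_anti_monotone(price_range_below_10, PRICE))
print(is_monotone(min_price_below_15, PRICE))
```

On this table, `range(S.price) < 10` passes the anti-monotone check but fails the monotone one, and `min(S.price) < 15` does the opposite.
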
 
 
 
Question 11)
A constraint is succinct if the constraint c can be enforced by
directly manipulating the data. Which of the following constraints is
succinct?
 
Answer:
  • max(S.price) > 20

    Week 3 Quiz Answer

     
     

    Lesson 5 Quiz

     
    Question 1)
    Given a sequence database, as shown in Table 3, with support
    threshold minsup = 3, which of the following sequences are
    frequent?
     

     

    • <abc>
    • < a(bc) >
    • <ade>
    • <acf>
    • None of the above
    Question 2)
    Suppose we use Generalized Sequential Patterns (GSP) to find the
    frequent sequential patterns. After scanning the database once, we
    find the frequent singleton sequences are: a, b, d. Which of the
    following could be possible length-2 candidate sequences?
    • <(ac)>
    • <ab>
    • <ad>
    • <(bd)>
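
In GSP, length-2 candidates are built only from the frequent single items. With n frequent items there are n × n candidates of the form <xy> (two elements in sequence) and n(n − 1)/2 of the form <(xy)> (both items in one element). A minimal sketch, with an illustrative function name:

```python
from itertools import combinations, product

def gsp_length2_candidates(frequent_items):
    """Length-2 candidates built only from frequent single items.

    <xy>:   two elements in order (x may equal y) -> n * n candidates
    <(xy)>: one element holding both items, x < y -> n * (n - 1) / 2
    """
    items = sorted(frequent_items)
    sequential = [f"<{x}{y}>" for x, y in product(items, repeat=2)]
    itemset = [f"<({x}{y})>" for x, y in combinations(items, 2)]
    return sequential + itemset

cands = gsp_length2_candidates(["a", "b", "d"])
print(len(cands), cands)
```

Any candidate that contains c is excluded, because c is not a frequent single item.
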
    Question 3)
    Given a sequence database, as shown in the following table, suppose
    we use the SPADE algorithm to find the frequent sequential patterns.
    Which of the following sequences (in the format of <SID, EID>)
    belong to the mapped database of item b?
     

     

    • <3, 1>
    • <3, 2>
    • <4, 1>
    • <1, 2>
    Question 4)
    Given the sequence database shown in Table 10, suppose min_sup = 1.
    Which of the following does not belong to the <a>-projected
    database?
     

     

    • < b(bd) >
    • < f(e)(cdeh)cfg(abe) >
    • < d(bc)c(fg)(ch) >
    • < (_d)ebf(cdfgh) >
    • All of the above belong to <a>-projected database.
    Question 5)
    Suppose we use the CloSpan algorithm to find all closed sequential
    patterns from a sequence database with minimum support 15. During
    the mining process, we derive the following sequences along with the
    sizes of their projected DBs: <c>: 50, <ac>: 45,
    <b>: 30, <bc>: 30. Then we use the backward sub-pattern
    rule and the backward super-pattern rule to prune redundant search
    space. Which of the projected DBs will remain after the pruning?
    • <c>
    • <bc>
    • <b>
    • <ac>
    Question 6)
    Given a sequence database, as shown in Table 2, with support
    threshold minsup = 3, which of the following sequences are
    frequent?
     

     

    • <abc>
    • < f(ab) >
    • < (bd)b >
    • < (ae)c >
    • None of the above
    Question 7)
    Given a sequence database as shown in the following table, suppose
    we use the SPADE algorithm to find the frequent sequential patterns.
    Which of the following sequences (in the format of <SID, EID>)
    belong to the mapped database of item a?

     

    • <4, 1>
    • <3, 2>
    • <1, 2>
    • <1, 1>
    Question 9)
    Suppose we use the CloSpan algorithm to find all closed sequential
    patterns from a sequence database with minimum support 15. During
    the mining process, we derive the following sequences along with the
    sizes of their projected DBs: <c>: 50, <ac>: 40,
    <ab>: 30, <bc>: 50. Then we use the backward sub-pattern
    rule and the backward super-pattern rule to prune redundant search
    space. Which of the projected DBs will remain after the pruning?
    • <c>
    • <ab>
    • <ac>
    • <bc>
    Question 10)
    Given the sequence database shown in Table 11, suppose min_sup = 1.
    Which of the following does not belong to the <d>-projected
    database?
    • < de >
    • < (bc)c(fg)(ch) >
    • <ebf(cdfgh) >
    • < (c_eh)cfg(abe) >
    Question 11)
    Suppose we use the CloSpan algorithm to find all closed sequential
    patterns from a sequence database with minimum support 15. During
    the mining process, we derive the following sequences along with the
    sizes of their projected DBs: <c>: 50, <ac>: 50,
    <ab>: 30, <bc>: 30. Then we use the backward sub-pattern
    rule and the backward super-pattern rule to prune redundant search
    space. Which of the projected DBs will remain after the pruning?
    • <c>
    • <ac>
    • <ab>
    • <bc>
    Question 12)
    Given a sequence database as shown in Table 1 with support
    threshold minsup = 3, which of the following
    sequences is frequent?
     

     

    • <abc>
    • < (ab)f >
    • < f(bd) >
    • < a (bf) >
    Question 13)
    Suppose we use Generalized Sequential Patterns (GSP) to find the
    frequent sequential patterns. After scanning the database once, we
    find the frequent singleton sequences are: a, b, d. Which of the
    following could be possible length-2 candidate sequences?
    • <ac>
    • <ab>
    • <(bc)>
    • <(bd)>
    Question 14)
    Given the sequence database shown in Table 12, suppose min_sup = 1.
    Which of the following does not belong to the <e>-projected
    database?
     

     

    • < bf(cdfgh) >
    • < (_g)(adf)gh>
    • < (cdeh)cfg(abe) >
    • < (_h)cfg(abe) >
    Question 14)
    Given a sequence database, as shown in the following table, suppose
    we use the SPADE algorithm to find the frequent sequential patterns.
    Which of the following sequences (in the format of <SID, EID>)
    belong to the mapped database of item c?
     

     

    • <1, 3>
    • <3, 2>
    • <1, 1>
    • <4, 1>
    Question 15)
    Given the sequence database shown in Table 11, suppose min_sup = 1.
    Which of the following does not belong to the <d>-projected
    database?
     

     

    • < de >
    • <ebf(cdfgh) >
    • < (bc)c(fg)(ch) >
    • < (c_eh)cfg(abe) >

    Lesson 6 Quiz

    Question 1)
    Which of the following is true about spatial association mining?
    Select all that apply.
    • A rule is called a spatial association as long as its confidence
      is no less than the given confidence threshold.
    • In the progressive refinement framework, the result associations
      will be the refinement of the rough patterns obtained in the
      first step.
    • There is no difference between mining spatial associations and
      mining classic association rules.
    • A rule is called a spatial association if its support is no less
      than the given support threshold and its confidence is no less
      than the given confidence threshold.
    Question 2)
    Consider a spatial database that consists of 1000 records. If an
    item A appears 200 times in the database and the rule “if A, then
    B” appears 100 times, what are the support and confidence for the
    rule “if A, then B”?
    • support: 20%; confidence: 20%
    • support: 20%; confidence: 10%
    • support: 10%; confidence: 50%
    • support: 20%; confidence: 50%
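
Support and confidence follow directly from the three counts in the question: support is the share of all records in which the rule holds, and confidence is the share of records containing A in which it holds. A small sketch, with an illustrative function name:

```python
def rule_support_confidence(n_total, n_antecedent, n_rule):
    """Support and confidence of A => B from raw counts.

    n_rule is the number of records containing both A and B.
    """
    return n_rule / n_total, n_rule / n_antecedent

support, confidence = rule_support_confidence(1000, 200, 100)
print(f"support = {support:.0%}, confidence = {confidence:.0%}")
```
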
    Question 3)
    For a frequent trajectory pattern, we require that the
    consecutive places in the trajectory pattern have a time gap no
    larger than the time constraint. Given a time constraint of 30 min
    and a support threshold of 5%, which of the following are valid
    frequent trajectory patterns?
    • Railway Station —15min→ Castle Square —15min→ Museum [Support:
      3%]
    • Railway Station —15min→ Castle Square —45min→ Museum [Support:
      6%]
    • Railway Station —15min→ Castle Square —2h15min→ Museum [Support:
      7%]
    • Railway Station —10min→ Middle Bridge —10min → Campus
      [Support: 7%]
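
The two conditions for a valid frequent trajectory pattern (every consecutive time gap within the constraint, and support at or above the threshold) can be written as a simple filter. The tuple layout and the shortened place names below are illustrative assumptions.

```python
def is_valid_pattern(gaps_min, support, max_gap_min, min_support):
    """Valid if every consecutive gap is within the time constraint and
    the support reaches the threshold."""
    return all(g <= max_gap_min for g in gaps_min) and support >= min_support

# (name, gaps between consecutive places in minutes, support)
PATTERNS = [
    ("Station-Castle Square-Museum", [15, 15], 0.03),
    ("Station-Castle Square-Museum", [15, 45], 0.06),
    ("Station-Castle Square-Museum", [15, 135], 0.07),
    ("Station-Middle Bridge-Campus", [10, 10], 0.07),
]
valid = [p for p in PATTERNS if is_valid_pattern(p[1], p[2], 30, 0.05)]
print(valid)
```
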
    Question 4)
    For mining semantics-rich movement patterns, which of the
    following statements are true about the top-down mining approach
    Splitter? Select all that apply.
    • The top-down mining approach can effectively reduce the search
      space of movement patterns.
    • The final movement patterns reflect only people’s spatial
      transitions from one region to another.
    • The coarse patterns generated by the first step mainly reflect
      people’s semantics-level transitions.
    • When grouping the places in the first step, the places having
      the same semantic category should be put into the same group.
    Question 5)
    For a frequent trajectory pattern, we require that the
    consecutive places in the trajectory pattern have a time gap no
    larger than the time constraint. Given a time constraint of 30 min
    and a support threshold of 8%, which of the following are valid
    frequent trajectory patterns?
    • Railway Station —45min→ Castle Square —15min→ Museum [Support:
      15%]
    • Railway Station —20min→ Middle Bridge —10min → Campus
      [Support: 8%]
    • Railway Station —35min→ Castle Square —15min→ Museum [Support:
      3%]
    • Railway Station —55min→ Castle Square —15min→ Museum [Support:
      10%]
    Question 6)
    For mining semantics-rich movement patterns, which of the
    following statements are true about the top-down mining approach
    Splitter? Select all that apply.
    • The top-down mining approach can effectively reduce the search
      space of movement patterns.
    • The final movement patterns reflect only people’s spatial
      transitions from one region to another.
    • Given a support threshold d, the support of any result movement
      patterns must be no less than d.
    • In this approach, similar places should be put into the same
      group to collectively meet the support threshold.
    Question 7)
    Which of the following is true about spatial association
    mining?
    • There is no difference between mining spatial associations and
      mining classic association rules.
    • A rule is called a spatial association as long as its support is
      no less than the given support threshold.
    • A rule is called a spatial association as long as its confidence
      is no less than the given confidence threshold.
    • For mining spatial associations, the hierarchy of spatial
      relationship can be used to speed up the mining process.
    Question 8)
    Which of the following is true about spatial association mining?
    Select all that apply.
    • The progressive refinement framework is mainly for visualization
      purposes.
    • The progressive refinement framework can reduce the search
      space of spatial associations.
    • A rule is called a spatial association as long as its support is
      no less than the given support threshold.
    • A rule is called a spatial association if its support is no
      less than the given support threshold and its confidence is no
      less than the given confidence threshold.

     

Extra Questions

Question 1)
Given a sequence database, as shown in Table 2, with support threshold
min-sup = 3, which of the following sequences are frequent?
Answer:

 

Question 2)
Given a sequence database, as shown in Table 5, and support threshold
min-sup = 4, use Generalized Sequential Patterns (GSP) to find the
frequent sequential patterns. After scanning the database once, how many
length-2 candidate sequences will be generated after Apriori pruning?
How many length-2 candidate sequences will be generated if not using
Apriori pruning?

 

Answer:
  • 22; 51
Question 3)
Given a sequence database, as shown in Table 8, and support threshold
min-sup = 4, use Generalized Sequential Patterns (GSP) to find the
frequent sequential patterns. What is the minimum number of times we
need to scan the database in order to find all the frequent sequential
patterns?

 

Answer:
  • 2
Question 4)
Given a sequence database, as shown in Table 10, and min-sup = 1, which
of the following does not belong to the <a>-projected database?
Answer:
  • <b(bd)>
Question 5)
Given a sequence database, as shown in Table 15, which of the following
sequential patterns are closed?
Answer:

Question 6)
In our database, we have the following three graphs:

 

If we set the support threshold min-sup = 3, which of the following
sequences is NOT a frequent graph pattern?

 

Answer:

 

Question 7)
When we use the Apriori-based approach to find the frequent graph
pattern for a candidate graph, we need to check all of its subgraphs.
Given the following graph, how many distinct subgraphs with seven
vertices are there?

 

Answer:
  • 1
Question 8)
In our database, we have the following three graphs:
What is the support of the following graph?
Answer:
  • 1
Question 9)
Suppose we have learned two ranked rules as follows (the default is
Type 2):
{“ipad”, “iphone”} -> Type 1
{“kindle”, “iphone”} -> Type 2
{“ipad”} -> Type 1
For the people who have {“kindle”, “iphone”}, which type will they be classified as by CBA algorithm?
Answer:
  • Type 2
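
Classification with ranked rules, as in CBA, applies the first rule in rank order whose items are all present and falls back to the default class otherwise. A minimal sketch using the rules from this question, with illustrative names:

```python
# Ranked rules from the question: (required items, predicted class).
RULES = [
    ({"ipad", "iphone"}, "Type 1"),
    ({"kindle", "iphone"}, "Type 2"),
    ({"ipad"}, "Type 1"),
]
DEFAULT = "Type 2"

def classify(items, rules=RULES, default=DEFAULT):
    """Return the class of the first rule whose items are all present."""
    for antecedent, label in rules:
        if antecedent <= set(items):
            return label
    return default

print(classify({"kindle", "iphone"}))
```
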

    Week 4 Quiz Answer

     
     

    Lesson 7 Quiz

    Question 1)
    For mining text data, which of the following algorithms will not
    output phrases?
    • KERT
    • SegPhrase
    • LDA
    • ToPMine
    Question 2)
    Given a text corpus, which of the following can be used for
    measuring the collocation strength for a pair of words? Select all
    that apply.
    • Z-test
    • T-test
    • Edit distance
    • Mutual information
    Question 3)
    Suppose we want to use contiguous pattern mining to extract
    candidate phrases. Given the five statements below and a support
    threshold 3, which of the given phrases can be considered as
    candidates? Select all that apply.
    (1) Support vector machine is a classifier.
    (2) Neural network performs equally well as support vector machine.
    (3) We propose a method that combines support vector machine with
    kernel method.
    (4) Neural network is harder to tune than support vector machine.
    (5) Support vector machine is important for regression.
    • vector machine
    • neural network
    • support vector
    • support vector machine
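
Contiguous pattern mining over these five statements can be sketched by counting, for each run of adjacent words, how many statements contain it. The sketch assumes support is counted once per statement, ignores case and punctuation, and limits phrases to two or three words; the function name is illustrative.

```python
import re
from collections import Counter

SENTENCES = [
    "Support vector machine is a classifier.",
    "Neural network performs equally well as support vector machine.",
    "We propose a method that combines support vector machine with kernel method.",
    "Neural network is harder to tune than support vector machine.",
    "Support vector machine is important for regression.",
]

def phrase_support(sentences, max_len=3):
    """Count, for each contiguous word n-gram (2..max_len), the number
    of sentences that contain it."""
    counts = Counter()
    for sentence in sentences:
        words = re.findall(r"[a-z]+", sentence.lower())
        seen = set()
        for n in range(2, max_len + 1):
            for i in range(len(words) - n + 1):
                seen.add(" ".join(words[i:i + n]))
        counts.update(seen)  # each phrase counted once per sentence
    return counts

support = phrase_support(SENTENCES)
candidates = sorted(p for p, s in support.items() if s >= 3)
print(candidates)
```

With a threshold of 3, "neural network" (support 2) and "kernel method" (support 1) fall short, while the three phrases built from "support vector machine" each reach support 5.
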
    Question 4)
    Which of the following measures has been used for ranking phrases
    in KERT? Select all that apply.
    • Completeness
    • Popularity 
    • Informativeness
    • Likelihood ratio
    Question 5)
    Suppose we want to use contiguous pattern mining to extract
    candidate phrases. Given the five statements below and a support
    threshold 3, which of the given phrases can be considered as
    candidates?
    (1) Support vector machine is a classifier.
    (2) Neural network performs equally well as support vector machine.
    (3) We propose a method that combines support vector machine with
    kernel method.
    (4) Neural network is harder to tune than support vector machine.
    (5) Support vector machine is important for regression.
    • kernel method
    • neural network
    • support machine
    • support vector machine
    Question 6)
    Which of the following measures has been used for ranking phrases
    in KERT? Select all that apply.
    • KL divergence
    • Popularity
    • Completeness
    • Mutual information

    Lesson 8 Quiz

    Question 1)
    Which of the following algorithms is not designed for frequent
    pattern mining in stream data with approximation?
    • FP-growth
    • Space saving algorithm
    • Sticky sampling algorithm
    • Lossy counting algorithm
    Question 2)
    A data scientist is applying the lossy counting algorithm to a
    transactional data stream in order to obtain the counts of
    different items. If the bucket size is set to 1000, the total
    length of the transactional data stream is 10000, and the true
    count of an item A is 100, which of the following could be the
    possible outputs of item A’s count by lossy counting? Select all
    that apply.
    • 98
    • 90
    • 80
    • 102
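To see what range of outputs lossy counting can produce, the following sketch implements the standard formulation of the algorithm: each tracked item stores a frequency and a maximum possible undercount, and entries are pruned at every bucket boundary. In that formulation the reported count never exceeds the true count and falls short of it by at most N / w, where N is the stream length and w the bucket size. The stream below is hypothetical and is built only to make item A lose one occurrence.

```python
def lossy_count(stream, bucket_width):
    """Return ({item: estimated_count}, stream_length) using lossy counting."""
    counts = {}  # item -> [frequency, max possible undercount (delta)]
    n = 0
    for item in stream:
        n += 1
        bucket_id = (n - 1) // bucket_width + 1  # ceil(n / bucket_width)
        if item in counts:
            counts[item][0] += 1
        else:
            counts[item] = [1, bucket_id - 1]
        if n % bucket_width == 0:
            # Prune entries whose count plus delta is at most the bucket id.
            stale = [k for k, (f, d) in counts.items() if f + d <= bucket_id]
            for key in stale:
                del counts[key]
    return {k: f for k, (f, d) in counts.items()}, n

# Hypothetical stream of length 10000: A once at the start, 9900 distinct
# filler items, then A 99 more times. The true count of A is 100.
stream = ["A"] + [f"x{i}" for i in range(9900)] + ["A"] * 99
estimates, length = lossy_count(stream, bucket_width=1000)
true_count = stream.count("A")
max_error = length / 1000
print(estimates["A"], true_count, max_error)
```

With bucket size 1000 and stream length 10000 the error bound is 10, so for a true count of 100 the reported count lies between 90 and 100 under this formulation.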
    Question 3)
    In CP-Miner, we use constraint-based sequential pattern mining
    to obtain the frequent sequences. Let us consider a source file,
    which has been transformed into a sequence DB after tokenization
    and hashing. If we set the max gap to 2 (the index difference
    between two items is no larger than 2) and the support threshold
    to 0.6, which of the following can be the frequent sequences
    output by CP-Miner? Select all that apply.
    (1) <1, 2, 1, 3>
    (2) <2, 3, 4, 1>
    (3) <1, 2, 4, 3>
    (4) <3, 2, 4, 3>
    (5) <1, 2, 5, 4>
    • <1, 2>
    • <2, 4>
    • <3, 4>
    • <1, 3>
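The gap constraint and support threshold in this question can be checked mechanically. The sketch below is not CP-Miner itself. It only tests whether a candidate pattern occurs in each sequence with consecutive matched positions at most `max_gap` indices apart, then reports the fraction of sequences that contain it. The helper names are illustrative.

```python
def occurs_with_gap(sequence, pattern, max_gap):
    """True if pattern is a subsequence of sequence in which consecutive
    matched positions differ by at most max_gap."""
    def extend(pattern_pos, last_index):
        if pattern_pos == len(pattern):
            return True
        if last_index is None:
            candidates = range(len(sequence))
        else:
            stop = min(len(sequence), last_index + max_gap + 1)
            candidates = range(last_index + 1, stop)
        return any(
            sequence[i] == pattern[pattern_pos] and extend(pattern_pos + 1, i)
            for i in candidates
        )
    return extend(0, None)

def relative_support(database, pattern, max_gap):
    """Fraction of sequences that contain pattern under the gap constraint."""
    hits = sum(occurs_with_gap(seq, pattern, max_gap) for seq in database)
    return hits / len(database)

# Sequence database from the question above.
database = [
    (1, 2, 1, 3),
    (2, 3, 4, 1),
    (1, 2, 4, 3),
    (3, 2, 4, 3),
    (1, 2, 5, 4),
]

for pattern in [(1, 2), (2, 4), (3, 4), (1, 3)]:
    sup = relative_support(database, pattern, max_gap=2)
    print(pattern, sup, sup >= 0.6)
```

A pattern counts as frequent here when at least 3 of the 5 sequences contain it within the allowed gap.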
    Question 4)
    Which of the following are designed for preserving data
    privacy? Select all that apply.
    • σ-frequent
    • t-closeness
    • K-anonymity
    • Differential privacy
    Question 5)
    Which of the following algorithms is not designed for frequent
    pattern mining in stream data with approximation?
    • CloSpan
    • Space saving algorithm
    • Sticky sampling algorithm
    • Lossy counting algorithm
    Question 6)
    A data scientist is applying the lossy counting algorithm to a
    transactional data stream in order to obtain the counts of
    different items. If the bucket size is set to 1000, the total
    length of the transactional data stream is 10000, and the true
    count of an item A is 100, which of the following could be the
    possible outputs of item A’s count by lossy counting?
    • 85
    • 110
    • 95
    • 105
    Question 7)
    In CP-Miner, we use constraint-based sequential pattern mining
    to obtain the frequent sequences. Let us consider a source file,
    which has been transformed into a sequence DB after tokenization
    and hashing. If we set the max gap to 2 (the index difference
    between two items is no larger than 2) and the support threshold
    to 0.6, which of the following can be the frequent sequences
    output by CP-Miner? Select all that apply.
    (1) <1, 2, 1, 3>
    (2) <2, 3, 1, 4>
    (3) <1, 2, 4, 3>
    (4) <3, 2, 4, 3>
    (5) <1, 2, 5, 4>
    • <1, 2>
    • <1, 3>
    • <2, 4>
    • <2, 3>
 

Extra Questions

 
Question 1)
For the task of frequent pattern mining for text data, which of the
following algorithms will NOT output phrases?
Answer:
  • LDA
Question 2)
Which of the following algorithms are not designed for frequent
pattern mining in stream data with approximation?
Answer:
  • FP-growth (Han, Pei & Yin, SIGMOD’00)
Question 3)
Which of the following algorithms are not designed for spatiotemporal
and trajectory pattern mining?
Answer:
  • CP-Miner (Li, Lu, Myagmar, Zhou, OSDI’04)
Question 4)
Which of the following algorithms is NOT designed for
privacy-preserving pattern/association rule mining?
Answer:
  • Co-location Mining Algorithm (Huang, Shekhar & Xiong,
    TKDE’04)
    Pattern Discovery in Data Mining Week 1

    Orientation Quiz Answer

     
    In this article I am going to share the answers to the Coursera course
    Pattern Discovery in Data Mining Week 1 Orientation Quiz with you.
     
     

    Practice Exercise Orientation Quiz 

     
    Question 1)
    This course lasts for ___ weeks.
    Answer:
    • 4
    • 6
    • 10
    • 8
    Question 2)
    I am required to read a textbook for this course.
    • True
    • False
    Question 3)
    Which of the following activities are required to pass this course
    in order to receive the Course Certificate? Check all that apply.
    • Lectures
    • In-lecture questions
    • Eight graded lesson quizzes
    • One required programming assignment
    Question 4)
    The following tools will help me use the discussion forums.
    • “Up-voting” posts that are thoughtful, interesting, or helpful.
    • “Tagging” my posts with keywords other students might use in
      searching the forums.
    • Subscribing to any forums that are particularly interesting to me.
    • All of the other options are correct.
    Question 5)
    If I have a problem in the course I should:
    • Email the instructor
    • Call the instructor
    • Drop the class
    • Report it to the Learner Help Center (if the problem is
      technical) or to the Content Issues Forum (if the problem is an
      error in the course materials).
    Question 6)
    I am required to purchase a textbook for this course.
    Answer
    • True
    • False
    Question 7)
    Which of the following activities are required each week? Check all
    that apply.
    Answer
    • Quiz
    • Lectures
    Question 8)
    The following tools will help me use the discussion forums:
    Answer
    • All of the other options are correct.
    Question 9)
    If I have a problem in the course I should:
    Answer
    • Report it to the Learner Help Center (if the problem is technical)
      or to the Content Issues forum (if the problem is an error in the
      course materials).
