**Week 1 Quiz Answer**

### Lesson 1 Quiz

**Question 1)**

**Table 1: Transactions from a database**

**Given the transactions in Table 1 and minimum support (minsup) s = 50%, which of the following is not a frequent itemset?**

- **{Coffee}**
- {Beer}
- {Eggs}
- {Beer, Diapers}

**Question 2)**

**Table 1: Transactions from a database**

**Given the transactions in Table 1, what are the confidence and relative support of the association rule {Diapers} ⇒ {Coffee, Nuts}?**

- **support s = 0.4, confidence c = 0.5**
- support s = 0.8, confidence c = 0.5
- support s = 0.4, confidence c = 1
- support s = 0.8, confidence c = 1
- None of the above

**Question 3)**

**Consider the database containing the transactions T1 : {a1, a2, a3}, T2 : {a2, a3, a4}, T3 : {a1, a3, a4}. Let minimum support (minsup) = 2. Which of the following frequent patterns is closed?**

- {a2}
- {a1}
- **{a1, a3}**
- {a4}
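The closedness check can be reproduced by brute force. The sketch below is only an illustration (the helper names `support` and `is_closed` are made up here); it enumerates all frequent itemsets of the three transactions and keeps those with no proper superset of equal support.

```python
from itertools import combinations

# Database and threshold copied from the question above.
transactions = [{"a1", "a2", "a3"}, {"a2", "a3", "a4"}, {"a1", "a3", "a4"}]
minsup = 2

def support(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

items = sorted(set().union(*transactions))
frequent = {
    frozenset(c): support(frozenset(c))
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
    if support(frozenset(c)) >= minsup
}

def is_closed(p):
    """Closed: no proper superset has the same support."""
    return not any(p < q and frequent[q] == frequent[p] for q in frequent)

closed = sorted(sorted(p) for p in frequent if is_closed(p))
print(closed)
```

On this database {a1}, {a2}, and {a4} each have support 2, the same as their supersets {a1, a3}, {a2, a3}, and {a3, a4}, so none of the three singletons is closed.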

**Question 4)**

**Consider the database containing the transactions T1 : {a1, …, a3}, T2 : {a2, …, a4}. Let minsup = 1. What fraction of all frequent patterns are max frequent patterns?**

- 1/11
- **2/11**
- 1/3
- There are no max frequent patterns for the given minsup.
- 3/11
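The count behind this fraction can be checked with a short enumeration. This is a sketch under one assumption: "{a1, …, a3}" is read as {a1, a2, a3} and "{a2, …, a4}" as {a2, a3, a4}.

```python
from fractions import Fraction
from itertools import combinations

# Assumption: "{a1, …, a3}" means {a1, a2, a3}, and likewise for T2.
transactions = [frozenset({"a1", "a2", "a3"}), frozenset({"a2", "a3", "a4"})]

# With minsup = 1, every non-empty subset of a transaction is frequent.
frequent = {
    frozenset(c)
    for t in transactions
    for k in range(1, len(t) + 1)
    for c in combinations(sorted(t), k)
}

# Max frequent: no proper superset is frequent.
max_patterns = {p for p in frequent if not any(p < q for q in frequent)}

print(len(frequent), len(max_patterns), Fraction(len(max_patterns), len(frequent)))
```

Each transaction contributes 7 non-empty subsets, and the 3 subsets of {a2, a3} are shared, which gives 7 + 7 − 3 = 11 frequent patterns, of which only the two full transactions are maximal.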

**Question 5)**

**Rank the following sets by their cardinality for a given database: {all frequent patterns}, {closed frequent patterns}, and {max frequent patterns}.**

- **{all frequent patterns} ≥ {closed frequent patterns} ≥ {max frequent patterns}**
- {all frequent patterns} ≥ {max frequent patterns} ≥ {closed frequent patterns}
- {all frequent patterns} ≥ {max frequent patterns} = {closed frequent patterns}, i.e. the set of max frequent patterns and the set of closed frequent patterns are identical.
- {all frequent patterns} ≥ {max frequent patterns}, {all frequent patterns} ≥ {closed frequent patterns}, but the order of {max frequent patterns} and {closed frequent patterns} cannot be determined without further information.
- Ranking is impossible without further information.

**Question 6)**

**Table 1: Transactions from a database**

**Given the transactions in Table 1 and minimum support (minsup) s = 40%, which of the following is a length-3 frequent itemset?**

- Beer, Nuts, Eggs
- Beer, Coffee, Milk
- Coffee, Diapers, Eggs
- **Beer, Nuts, Diapers**

**Question 7)**

**A strong association rule satisfies both the minimum support (minsup) and minconf thresholds. Given the transactions in Table 1, minimum support (minsup) s = 50%, and minconf c = 50%, which of the following is not a strong association rule?**

- {Beer} ⇒ {Diapers}
- **{Beer, Nuts} ⇒ {Diapers}**
- {Diapers} ⇒ {Nuts}
- {Nuts} ⇒ {Diapers}
- {Diapers} ⇒ {Beer}

**Question 8)**

**Consider the database containing the transactions T1 : {a1, a2, a3}, T2 : {a2, a3, a4}, T3 : {a1, a3, a4}. Let minimum support (minsup) = 2. Which of the following frequent patterns is NOT closed?**

- **{a2}**
- {a1, a3}
- {a3}
- {a3, a4}

**Question 9)**

**Consider the database containing the transactions T1 : {a1, a2, a3, a4, a5}, T2 : {a2, a3, a4, a5, a6}. Let minsup = 1. Which of the following is both a max frequent and a closed frequent pattern? Select all that apply.**

- {a2, a3, a4, a5}
- {a2, a5}
- **{a1, a2, a3, a4, a5}**
- **{a2, a3, a4, a5, a6}**
- {a1, a2, a3, a4, a5, a6}

**Question 10)**

**Given the set of closed frequent patterns, we can ___________. Select all that apply.**

- Recover all transactions in the database
- **Find the set of max frequent patterns**
- Recover the set of all frequent patterns and their support in some situations but not all
- **Always recover the set of all frequent patterns and their support**

**Question 11)**

**Given the transactions in Table 1, minimum support (minsup) s = 50%, and minconf c = 50%, which of the following is an association rule? Select all that apply.**

- Nuts ⇒ Eggs
- Coffee ⇒ Milk
- Diapers ⇒ Eggs
- Nuts ⇒ Diapers
- Beer ⇒ Nuts

**Question 12)**

**Which of the following statements is true?**

- The set of closed frequent patterns is always the same as the set of max frequent patterns.
- Since both closed and max frequent patterns are a subset of all frequent patterns, we cannot recover all frequent patterns and their supports given just the closed and max frequent patterns.
- Closed frequent patterns can always be determined from the set of max frequent patterns.
- We can recover all frequent patterns and their supports from the set of max frequent patterns.
- **We can recover all frequent patterns and their supports from the set of closed frequent patterns.**

**Question 13)**

**Given the transactions in Table 1, what are the confidence and relative support of the association rule {Diapers} ⇒ {Coffee, Nuts}?**

- **support s = 0.4, confidence c = 0.5**
- support s = 0.8, confidence c = 0.5
- support s = 0.4, confidence c = 1
- support s = 0.8, confidence c = 1
- None of the above

### Lesson 2 Quiz

**Question 1)**

**If we know the support of itemset {a, b} is 10, which of the following numbers are the possible supports of itemset {a, b, c}? Select all that apply.**

- 11
- **9**
- **10**

**Question 2)**

**If we know the support of itemset {a} is 50 and the support of itemset {a, b, c} is 30, which of the following numbers are the possible supports of itemset {a, b}? Select all that apply.**

- 10
- 5
- **30**
- 100
- **50**
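Because support can only shrink as items are added, sup({a, b, c}) ≤ sup({a, b}) ≤ sup({a}). A minimal sketch of that bound check follows; the function name `possible_supports` is made up for illustration.

```python
def possible_supports(candidates, lower, upper):
    """Keep the candidate supports c that satisfy lower <= c <= upper."""
    return [c for c in candidates if lower <= c <= upper]

# sup({a, b, c}) = 30 <= sup({a, b}) <= sup({a}) = 50
print(possible_supports([10, 5, 30, 100, 50], lower=30, upper=50))
```

Only 30 and 50 fall inside the interval, so a value such as 10 cannot be the support of {a, b} here.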

**Question 3)**

**Considering the Apriori algorithm, assume we have obtained all size-2 (i.e., containing 2 items, e.g. {A, B}) frequent itemsets. They are {A, B}, {A, C}, {A, D}, {B, C}, {B, E}, and {C, E}. In the following size-3 itemsets, which of them should be considered, i.e., have potential to be size-3 frequent itemsets? Select all that apply.**

- {A, B, D}
- {A, C, D}
- **{B, C, E}**
- **{A, B, C}**
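The pruning step can be sketched in a few lines. This is a simplified illustration, not a full Apriori implementation: it joins itemsets by plain set union rather than the usual prefix-based join, and the function name `apriori_candidates` is an assumption.

```python
from itertools import combinations

def apriori_candidates(frequent_k):
    """Generate size-(k+1) candidates whose size-k subsets are all frequent."""
    frequent_k = {frozenset(s) for s in frequent_k}
    k = len(next(iter(frequent_k)))
    candidates = set()
    for a, b in combinations(frequent_k, 2):
        union = a | b
        # Keep the union only if it grew by one item and survives the subset check.
        if len(union) == k + 1 and all(
            frozenset(sub) in frequent_k for sub in combinations(union, k)
        ):
            candidates.add(union)
    return candidates

size2 = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "E"), ("C", "E")]
print(sorted("".join(sorted(c)) for c in apriori_candidates(size2)))
```

{A, B, D} and {A, C, D} are pruned because {B, D} and {C, D} are not frequent, which leaves {A, B, C} and {B, C, E}.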

**Question 4)**

**Given the FP-tree as shown in Figure 1, how many transactions do we have in total?**

- 4
- **5**
- 3
- 1
- 2

**Question 5)**

**If we know the support of itemset {a} is 50 and the support of itemset {a, b, c} is 10, which of the following numbers are the possible supports of itemset {a, b}? Select all that apply.**

- 5
- **10**
- **50**
- **30**
- 100

**Question 6)**

**Considering the Apriori algorithm, assume we have 5 items (A to E) in total. In the 1st scan, we find out all frequent items A, B, C, and E. How many size-2 (i.e., containing 2 items, e.g. A, B) itemsets should be considered in the 2nd scan, i.e., have potential to be size-2 frequent itemsets? Select all that apply.**

- 10
- 25
- 4
- **6**

**Question 7)**

**Given the FP-tree as shown in Figure 1, which of the following choices is in the f-conditional database? Select all that apply.**

- **{c, a, b, m} : 1**
- {c, b, p} : 1
- **{b} : 1**
- **{c, a, m, p} : 2**

### Extra Questions

**Question 1)**

**Which of the following tasks does not fall under the scope of data mining? Select all that apply.**

- **Data entry.**
- **Data cleaning.**

**Question 2)**

**Given the transactions in Table 1 and minsup s = 50%, how many frequent 3-itemsets are there?**

**Answer:**

**0**

**Question 3)**

**A strong association rule satisfies both the minsup and minconf thresholds. Given the transactions in Table 1, minsup s = 50%, and minconf c = 50%, how many strong association rules are there? Note that the association rules A => B and B => A are distinct.**

**Answer:**

**6**

**Question 4)**

**Given the transactions in Table 1, minsup s = 50%, and minconf c = 50%, which of the following is an association rule? Select all that apply.**

**Answer:**

- **Beer => Nuts**
- **Nuts => Diaper**

**Question 5)**

**Consider the database containing the transactions T1 : {a1, …, a5}, T2 : {a1, …, a1}, T3 : {a3, …, a7}, T4 : {a4, …, a8}. For what value of minsup do we have the largest number of closed frequent patterns?**

**Answer:**

**minsup = 1**

**Question 6)**

**Consider the database containing the transactions T1 : {a1, …, a3}, T2 : {a2, …, a4}. Let minsup = 1. What fraction of all frequent patterns are max frequent patterns?**

**Answer:**

**2/11**

**Question 7)**

**Consider the database containing the transactions T1 : {a1, a2, a3}, T2 : {a2, a3, a4}. Let minsup = 1. What fraction of all frequent patterns are closed?**

**Answer:**

**3/11**

**Question 8)**

**Rank the following sets by their cardinality for a given database: {all frequent patterns}, {closed frequent patterns}, {max frequent patterns}.**

**Answer:**

**{all frequent patterns} >= {closed frequent patterns} >= {max frequent patterns}**

**Question 9)**

**Which of the following statements is true?**

**Answer:**

**We can recover all frequent patterns from the set of closed frequent patterns.**

**Question 10)**

**If we know the support of itemset {a, b, c} is 10, which of the following numbers are the possible supports of the itemset {a, b}?**

**Answer:**

- **10**
- **11**

**Question 11)**

**If we know the support of itemset {a, b} is 10, which of the following numbers are the possible supports of itemset {a, b, c}?**

**Answer:**

- **9**
- **10**

**Question 12)**

**If we know the support of itemset {a} is 50, and the support of itemset {a, b, c} is 10, which of the following numbers are the possible supports of itemset {a, d}?**

**Answer:**

- **5**
- **50**
- **30**
- **10**

**Question 13)**

**Considering the Apriori algorithm, assume we have 5 items (A to E) in total. In the 1st scan, we find out all frequent items A, B, C, and E. How many size-2 (i.e. containing 2 items, e.g. A, B) itemsets should be considered in the 2nd scan, i.e. have potential to be size-2 frequent itemsets?**

**Answer:**

**6**

**Question 14)**

**Considering the Apriori algorithm, assume we have obtained all size-2 (i.e. containing 2 items, e.g. {A, B}) frequent itemsets. They are {A, B}, {A, C}, {A, D}, {B, C}, {B, E}, {C, E}. In the following size-3 itemsets, which of them should be considered, i.e. have potential to be size-3 frequent itemsets?**

**Answer:**

- **{A, B, C}**
- **{B, C, E}**

**Question 15)**

**Given the FP-tree as shown in Figure 1, what is the support of {c, p}?**

**Answer:**

**3**

Pattern Discovery in Data Mining

**Week 2 Quiz Answer**

### Lesson 3 Quiz

**Question 1)**

**What is the value range of the Kulczynski measure?**

- (-∞, +∞)
- [-1, 1]
- **[0, 1]**
- [0, +∞)

**Question 2)**

**What is the value range of the χ² measure?**

- (-∞, +∞)
- [-1, 1]
- [0, 1]
- **[0, +∞)**

**Question 6)**

**Which of the following measures is NOT null invariant?**

- Cosine
- **Lift**
- All confidence
- Kulczynski
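Null invariance can be checked numerically: a null-invariant measure does not change when transactions containing neither item are added. The sketch below uses hypothetical counts (they do not come from any table in this quiz) and made-up function names.

```python
def lift(n_ab, n_a, n_b, n_total):
    """lift(A, B) = P(A and B) / (P(A) * P(B)); depends on the total count."""
    return (n_ab / n_total) / ((n_a / n_total) * (n_b / n_total))

def kulczynski(n_ab, n_a, n_b):
    """Kulc(A, B) = (P(A|B) + P(B|A)) / 2; the total count never appears."""
    return 0.5 * (n_ab / n_a + n_ab / n_b)

# Hypothetical counts: 100 transactions with both items, 1,000 with A, 1,000 with B.
n_ab, n_a, n_b = 100, 1_000, 1_000
for n_total in (2_000, 200_000):  # the larger total only adds null transactions
    print(n_total, lift(n_ab, n_a, n_b, n_total), kulczynski(n_ab, n_a, n_b))
```

With these counts, lift moves from below 1 to far above 1 purely because null transactions were added, while the Kulczynski value stays the same.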

**Question 7)**

**Suppose we are interested in analyzing the purchase of comics (CM) and fiction (FC) in the transaction history of a bookstore. We have the following 2 × 2 contingency table summarizing the transactions. If χ² is used to measure the correlation between CM and FC, what is the χ² score?**

- -240
- -80
- 80
- **240**

**Question 7)**

**What is the value range of the Kulczynski measure?**

- **[0, 1]**
- (-∞, +∞)
- [-1, 1]
- [0, +∞)

**Question 10)**

**Suppose we are interested in analyzing the purchase of comics (CM) and fiction (FC) in the transaction history of a bookstore. We have the following 2 × 2 contingency table summarizing the transactions. If lift is used to measure the correlation between CM and FC, what is the value for lift(CM, FC)?**

- -0.6
- **0.6**
- -2e-4
- 2e-4

**Question 11)**

**Suppose we are interested in analyzing the transaction history of several supermarkets with respect to purchase of apples (A) and bananas (B). We have the following table summarizing the transactions.**

**Which of the following measures would you use to determine the correlation of purchases between apples and bananas across all these supermarkets?**

- χ²
- **Kulczynski**
- Lift
- Cosine

**Question 12)**

**Suppose a school collected some data on students’ preference for hot dogs (HD) vs. hamburgers (HM). We have the following 2×2 contingency table summarizing the statistics. If χ² is used to measure the correlation between HD and HM, what is the χ² score?**

- 0
- -1
- -∞
- 1

### Lesson 4 Quiz

**Question 1)**

**Suppose one needs to mine frequent patterns at two different levels, with minimum support (minsup) of 5% (higher level) and 3% (lower level), respectively. If using shared multi-level mining, which minimum support (minsup) threshold should be used to generate candidate patterns for the higher level?**

- **3%**
- 1%
- 8%
- 5%

**Question 2)**

**A store had 100,000 total transactions in Q4 2014. 10,000 transactions contained eggs, while 5,000 contained bacon. 2,000 transactions contained both eggs and bacon. Which of the following choices for the value of ε is the smallest such that {eggs, bacon} is considered a negative pattern under the null-invariant definition?**

- 0.1
- 0.81
- **0.5**
- 01
- A value for ε such that {eggs, bacon} is a negative pattern under the null-invariant definition does not exist.

**Question 3)**

**Below is a table of transactions. According to the introduced pattern distance measure, what is the distance between pattern “abc” and pattern “abd”?**

- 0
- 0.5
- 0.2
- 0.333

**Question 4)**

**Given the itemsets in Table 1 and a cluster quality measure δ = 0.001, what could be a set of representative patterns that covers all itemsets in Table 1?**

Hint: The pattern with the least support is {F, A, C, E, T, S}. Consider which pattern in the table may δ-cover the pattern {F, A, C, E, T, S}.

- {{F, A, C, E, T, S}}
- {{F, A, C, E, S}, {A, C, E, S}}
- {{F, A, C, E, S}, {F, A, C, T, S}}
- {{F, A, C, E, S}, {F, A, C, E, T, S}, {F, A, C, T, S}}
- {{A, C, E, S}, {A, C, T, S}}

**Question 5)**

**A store had 100,000 total transactions in Q4 2014. 10,000 transactions contained beer, while 5,000 contained frying pans. 600 transactions contained both beer and frying pans. Which of the following is true?**

- More information is needed to determine if {beer, frying pans} is a negative pattern.
- {beer, frying pans} is a negative pattern under the support-based definition of negatively correlated patterns.
- **For ε = 0.1, {beer, frying pans} is a negative pattern under the null-invariant definition of negatively correlated patterns.**
- There does not exist a value for ε such that {beer, frying pans} is a negative pattern by the null-invariant definition of negative patterns.
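The two definitions can be compared directly on the numbers in this question. The sketch assumes the usual forms of the definitions: the support-based one compares sup(A ∪ B) with sup(A) × sup(B), and the null-invariant one requires (P(A|B) + P(B|A)) / 2 < ε.

```python
n_total, n_beer, n_pan, n_both = 100_000, 10_000, 5_000, 600

# Support-based view: compare s(A ∪ B) with s(A) * s(B).
s_both = n_both / n_total
s_expected = (n_beer / n_total) * (n_pan / n_total)

# Null-invariant view: Kulczynski measure compared against ε.
kulc = 0.5 * (n_both / n_beer + n_both / n_pan)
epsilon = 0.1

print(s_both, s_expected, kulc, kulc < epsilon)
```

The joint support (0.006) is slightly above the product of the individual supports (0.005), so the support-based definition does not flag the pair, while the Kulczynski value of about 0.09 falls below ε = 0.1.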

**Question 6)**

**Given the itemsets in Table 1, which of the following patterns are in the δ-cluster containing the pattern {A, C, E, S} for δ = 0.0001?**

Hint: Consider two patterns P1 and P2 such that O(P1) ⊆ O(P2), where O(Pi) is the corresponding itemset of pattern Pi. Take a second to convince yourself that the following is true:

- {A, C, T, S}
- **{F, A, C, E, S}**
- {F, A, C, T, S}
- {F, A, C, E, T, S}
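The formula that the hint points to did not survive in this copy. Assuming the course uses the usual Jaccard-style pattern distance over supporting transaction sets, with T(P) denoting the set of transactions that contain pattern P (notation assumed here), the identity is most likely the following:

```latex
\[
\mathrm{Dist}(P_1, P_2) = 1 - \frac{|T(P_1) \cap T(P_2)|}{|T(P_1) \cup T(P_2)|}
\]
\[
O(P_1) \subseteq O(P_2)
\;\Rightarrow\; T(P_2) \subseteq T(P_1)
\;\Rightarrow\; \mathrm{Dist}(P_1, P_2) = 1 - \frac{|T(P_2)|}{|T(P_1)|}
\]
```

Under that reading, the distance between a pattern and one of its super-patterns depends only on the ratio of their supports.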

**Question 7)**

**Consider two patterns P1 and P2 such that O(P1) ⊆ O(P2), where O(Pi) is the corresponding itemset of pattern Pi. Take a second to convince yourself that the following is true:**

**Which of the following patterns in Table 1 is δ-covered by {F, A, C, E, T, S} for δ = 0.4? Select all that apply.**

- {A, C, E, S}
- {F, A, C, T, S}
- {A, C, T, S}
- {F, A, C, E, S}

### Extra Questions

**Question 1)**

**Suppose a school collected some data on students’ preference for hot dogs (HD) vs. hamburgers (HM). We have the following 2×2 contingency table summarizing the statistics. If lift is used to measure the correlation between HD and HM, what is the value for lift(HD, HM)?**

**Answer:**

- **1**
- -∞
- 0
- -1

**Question 2)**

**Suppose Coursera collected statistics on the number of students who take courses on data mining (DM) and machine learning (ML). We have the following 2×2 contingency table summarizing the statistics. If χ² is used to measure the correlation between DM and ML, what is the χ² score?**

**Answer:**

- **562.5**
- -562.5
- -225
- 225

**Question 3)**

**What is the value range of the lift measure?**

**Answer:**

- **[0, +∞)**
- [0, 1]
- (-∞, +∞)
- [-1, 1]

**Question 4)**

**Which of the following measures is NOT null invariant?**

**Answer:**

**χ²**

**Question 5)**

**Suppose we are interested in analyzing the transaction history of several supermarkets with respect to purchase of apples (A) and bananas (B). We have the following table summarizing the transactions.**

**Denote li as the lift measure and ki as the Kulczynski measure for supermarket Si (i = 1, 2). Which of the following is correct?**

**Answer:**

**l1 ≠ l2, k1 = k2**

**Question 6)**

**A store had 100,000 total transactions in Q4 2014. 10,000 transactions contained eggs, while 5,000 contained bacon. 2,000 transactions contained both eggs and bacon. Which of the following choices for the value of ε is the smallest such that {eggs, bacon} is considered a negative pattern under the null-invariant definition?**

**Answer:**

**0.5**

**Question 7)**

**Given the itemsets in Table 1, which of the following patterns are in the δ-cluster containing the pattern {A, C, E, S} for δ = 0.0001?**

**Answer:**

**{F, A, C, E, S}**

**Question 8)**

**Given the transactions in Table 2, which of the following is a (1, 0.5)-robust pattern in the database? Select all that apply.**

**Answer:**

**None of the other options are correct.**

**Question 9)**

**A constraint is anti-monotone if, whenever an itemset S violates the constraint, so do all of its supersets. Which of the following constraints is anti-monotone?**

**Answer:**

**range(S.price) < 10**

**Question 10)**

**A constraint is monotone if, whenever an itemset S satisfies the constraint, so do all of its supersets. Which of the following constraints is monotone?**

**Answer:**

**min(S.price) < 15**
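Both definitions can be tested by brute force on a small example. The item prices below are hypothetical and the function names are made up; the check only confirms the properties on this finite item set and is not a proof.

```python
from itertools import combinations

# Hypothetical item prices, chosen only for illustration.
prices = {"a": 5, "b": 12, "c": 18, "d": 30}

def range_lt_10(s):
    vals = [prices[i] for i in s]
    return max(vals) - min(vals) < 10

def min_lt_15(s):
    return min(prices[i] for i in s) < 15

itemsets = [frozenset(c) for k in range(1, 5) for c in combinations(prices, k)]

def is_anti_monotone(c):
    """Every superset of a violating itemset also violates c."""
    return all(not c(t) for s in itemsets if not c(s) for t in itemsets if s < t)

def is_monotone(c):
    """Every superset of a satisfying itemset also satisfies c."""
    return all(c(t) for s in itemsets if c(s) for t in itemsets if s < t)

print(is_anti_monotone(range_lt_10), is_monotone(min_lt_15))
```

Adding items can only widen the price range and can only lower the minimum price, which is why the first constraint behaves anti-monotonically and the second monotonically.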

**Question 11)**

**A constraint is succinct if the constraint c can be enforced by directly manipulating the data. Which of the following constraints is succinct?**

**Answer:**

**max(S.price) > 20**

**Week 3 Quiz Answer**

### Lesson 5 Quiz

**Question 1)**

**Given a sequence database, as shown in Table 3, with support threshold minsup = 3, which of the following sequences are frequent?**

- **<abc>**
- < a(bc) >
- <ade>
- <acf>
- None of the above

**Question 2)**

**Suppose we use Generalized Sequential Patterns (GSP) to find the frequent sequential patterns. After scanning the database once, we find the frequent singleton sequences are: a, b, d. Which of the following could be possible length-2 candidate sequences?**

- <(ac)>
- **<ab>**
- **<ad>**
- **<(bd)>**
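GSP builds its length-2 candidates only from the frequent single items: every ordered pair gives a two-element sequence <xy>, and every unordered pair gives a single-element sequence <(xy)>. A minimal sketch follows; the function name `gsp_length2_candidates` is made up for illustration.

```python
from itertools import combinations, product

def gsp_length2_candidates(frequent_items):
    """Length-2 candidates: <xy> for every ordered pair, <(xy)> for every unordered pair."""
    sequential = [f"<{x}{y}>" for x, y in product(frequent_items, repeat=2)]
    itemset = [f"<({x}{y})>" for x, y in combinations(sorted(frequent_items), 2)]
    return sequential + itemset

candidates = gsp_length2_candidates(["a", "b", "d"])
print(len(candidates))
for option in ["<(ac)>", "<ab>", "<ad>", "<(bd)>"]:
    print(option, option in candidates)
```

Three frequent items yield 3 × 3 + 3 = 12 candidates. Any option containing c is excluded because c is not a frequent item.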

**Question 3)**

**Given a sequence database, as shown in the following table, suppose we use the SPADE algorithm to find the frequent sequential patterns. Which of the following sequences (in the format of <SID, EID>) belong to the mapped database of item b?**

- **<3, 1>**
- <3, 2>
- <4, 1>
- **<1, 2>**
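SPADE works on a vertical format in which each item maps to a list of <SID, EID> pairs. The quiz's table is not reproduced here, so the sketch below uses a small hypothetical database and a made-up function name purely to show the mapping.

```python
from collections import defaultdict

# Hypothetical sequence database (not the quiz's table): SID -> list of elements,
# where each element is the set of items occurring together at one event.
sequences = {
    1: [{"a"}, {"a", "b", "c"}, {"a", "c"}],
    2: [{"a", "d"}, {"c"}, {"b", "c"}],
}

def vertical_format(db):
    """Map each item to its id-list of (SID, EID) pairs, with EIDs starting at 1."""
    id_lists = defaultdict(list)
    for sid, elements in sorted(db.items()):
        for eid, element in enumerate(elements, start=1):
            for item in sorted(element):
                id_lists[item].append((sid, eid))
    return dict(id_lists)

print(vertical_format(sequences)["b"])
```

In this made-up database, item b occurs in the second element of sequence 1 and the third element of sequence 2, so its id-list is [(1, 2), (2, 3)].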

**Question 4)**

**Given the sequence database shown in Table 10, suppose min_sup = 1. Which of the following does not belong to the <a>-projected database?**

- < b(bd) >
- < f(e)(cdeh)cfg(abe) >
- < d(bc)c(fg)(ch) >
- < (_d)ebf(cdfgh) >
- All of the above belong to the <a>-projected database.
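A projected database keeps, for each sequence containing the prefix, the suffix that follows its first occurrence; items left over in the same element are written with a leading underscore. The sketch below handles single-item prefixes only and uses a hypothetical database rather than Table 10, with made-up function names.

```python
# Hypothetical sequence database (not the quiz's table). Each sequence is a list
# of elements; each element is a sorted string of items that occur together.
database = [
    ["a", "abc", "ac", "d", "cf"],
    ["ad", "c", "bc", "ae"],
    ["ef", "ab", "df", "c", "b"],
    ["e", "g", "af", "c", "b", "c"],
]

def project(sequence, item):
    """Suffix of `sequence` after the first occurrence of `item`, or None."""
    for i, element in enumerate(sequence):
        if item in element:
            rest = element[element.index(item) + 1:]  # items after `item` in the same element
            head = ["_" + rest] if rest else []
            return head + sequence[i + 1:]
    return None

def fmt(suffix):
    """Render a suffix in angle-bracket notation, parenthesizing multi-item elements."""
    return "<" + "".join(e if len(e) == 1 else f"({e})" for e in suffix) + ">"

projected = [fmt(s) for s in (project(seq, "a") for seq in database) if s]
print(projected)
```

For the second sequence, the first a sits inside the element (ad), so its suffix starts with the partial element (_d).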

**Question 5)**

**Suppose we use the CloSpan algorithm to find all closed sequential patterns from a sequence database with minimum support 15. During the mining process, we derive the following sequences along with the sizes of their projected DBs: <c>: 50, <ac>: 45, <b>: 30, <bc>: 30. Then we use the backward sub-pattern rule and the backward super-pattern rule to prune redundant search space. Which of the projected DBs will remain after the pruning?**

- <c>
- <bc>
- <b>
- <ac>

**Question 6)**

**Given a sequence database, as shown in Table 2, with support threshold minsup = 3, which of the following sequences are frequent?**

- <abc>
- < f(ab) >
- < (bd)b >
- **< (ae)c >**
- None of the above

**Question 7)**

**Given a sequence database as shown in the following table, suppose we use the SPADE algorithm to find the frequent sequential patterns. Which of the following sequences (in the format of <SID, EID>) belong to the mapped database of item a?**

- <4, 1>
- <3, 2>
- **<1, 2>**
- **<1, 1>**

**Question 8)**

**Given the sequence database shown in Table 10, suppose min_sup = 1. Which of the following does not belong to the <a>-projected database?**

- < b(bd) >
- < f(e)(cdeh)cfg(abe) >
- < d(bc)c(fg)(ch) >
- < (_d)ebf(cdfgh) >
- All of the above belong to the <a>-projected database.

**Question 9)**

**Suppose we use the CloSpan algorithm to find all closed sequential patterns from a sequence database with minimum support 15. During the mining process, we derive the following sequences along with the sizes of their projected DBs: <c>: 50, <ac>: 40, <ab>: 30, <bc>: 50. Then we use the backward sub-pattern rule and the backward super-pattern rule to prune redundant search space. Which of the projected DBs will remain after the pruning?**

- <c>
- <ab>
- <ac>
- <bc>

**Question 10)**

**Given the sequence database shown in Table 11, suppose min_sup = 1. Which of the following does not belong to the <d>-projected database?**

- < de >
- < (bc)c(fg)(ch) >
- <ebf(cdfgh) >
- **< (c_eh)cfg(abe) >**

**Question 11)**

**Suppose we use the CloSpan algorithm to find all closed sequential patterns from a sequence database with minimum support 15. During the mining process, we derive the following sequences along with the sizes of their projected DBs: <c>: 50, <ac>: 50, <ab>: 30, <bc>: 30. Then we use the backward sub-pattern rule and the backward super-pattern rule to prune redundant search space. Which of the projected DBs will remain after the pruning?**

- <c>
- <ac>
- <ab>
- <bc>

**Question 12)**

**Given a sequence database as shown in Table 1 with support threshold minimum support (minsup) = 3, which of the following sequences is frequent?**

- <abc>
- < (ab)f >
- < f(bd) >
- **< a (bf) >**

**Question 13)**

**Suppose we use Generalized Sequential Patterns (GSP) to find the frequent sequential patterns. After scanning the database once, we find the frequent singleton sequences are: a, b, d. Which of the following could be possible length-2 candidate sequences?**

- <ac>
- <ab>
- <(bc)>
- <(bd)>

**Question 14)**

**Given the sequence database shown in Table 12, suppose min_sup = 1. Which of the following does not belong to the <e>-projected database?**

- < bf(cdfgh) >
- < (_g)(adf)gh>
- < (cdeh)cfg(abe) >
- **< (_h)cfg(abe) >**

**Question 14)**

**Given a sequence database, as shown in the following table, suppose we use the SPADE algorithm to find the frequent sequential patterns. Which of the following sequences (in the format of <SID, EID>) belong to the mapped database of item c?**

- **<1, 3>**
- **<3, 2>**
- <1, 1>
- <4, 1>

**Question 15)**

**Given the sequence database shown in Table 11, suppose min_sup = 1. Which of the following does not belong to the <d>-projected database?**

- < de >
- <ebf(cdfgh) >
- < (bc)c(fg)(ch) >
- **< (c_eh)cfg(abe) >**

### Lesson 6 Quiz

**Question 1)**

**Which of the following is true about spatial association mining? Select all that apply.**

- A rule is called a spatial association as long as its confidence is no less than the given confidence threshold.
- In the progressive refinement framework, the result associations will be the refinement of the rough patterns obtained in the first step.
- There is no difference between mining spatial associations and mining classic association rules.
- A rule is called a spatial association if its support is no less than the given support threshold and its confidence is no less than the given confidence threshold.

**Question 2)**

**Consider a spatial database that consists of 1000 records. If an item A appears 200 times in the database and the rule “if A, then B” appears 100 times, what are the support and confidence for the rule “if A, then B”?**

- support: 20%; confidence: 20%
- support: 20%; confidence: 10%
- **support: 10%; confidence: 50%**
- support: 20%; confidence: 50%
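The arithmetic for this question is short enough to write out. The variable names are chosen here for illustration.

```python
n_records, n_a, n_a_and_b = 1000, 200, 100

support = n_a_and_b / n_records   # fraction of all records matching the rule
confidence = n_a_and_b / n_a      # fraction of A-records that also contain B

print(f"support = {support:.0%}, confidence = {confidence:.0%}")
```

Support is measured against the whole database, while confidence is measured only against the records that contain A.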

**Question 3)**

**For a frequent trajectory pattern, we require that the consecutive places in the trajectory pattern have a time gap no larger than the time constraint. Given a time constraint of 30 min and a support threshold of 5%, which of the following are valid frequent trajectory patterns?**

- Railway Station —15min→ Castle Square —15min→ Museum [Support: 3%]
- Railway Station —15min→ Castle Square —45min→ Museum [Support: 6%]
- Railway Station —15min→ Castle Square —2h15min→ Museum [Support: 7%]
- **Railway Station —10min→ Middle Bridge —10min→ Campus [Support: 7%]**
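Each option can be checked against the two conditions mechanically. The sketch below encodes the four options from this question as tuples; the tuple layout and the `is_valid` helper are assumptions made for illustration.

```python
# Options from the question: (places, gaps in minutes between consecutive places, support).
patterns = [
    (["Railway Station", "Castle Square", "Museum"], [15, 15], 0.03),
    (["Railway Station", "Castle Square", "Museum"], [15, 45], 0.06),
    (["Railway Station", "Castle Square", "Museum"], [15, 135], 0.07),
    (["Railway Station", "Middle Bridge", "Campus"], [10, 10], 0.07),
]

def is_valid(gaps, support, max_gap=30, min_support=0.05):
    """Valid if every consecutive gap is within max_gap and support reaches min_support."""
    return all(g <= max_gap for g in gaps) and support >= min_support

valid = [" → ".join(places) for places, gaps, sup in patterns if is_valid(gaps, sup)]
print(valid)
```

The first option fails on support, the second and third fail on the time gap, and only the Middle Bridge route passes both tests.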

**Question 4)**

**For mining semantics-rich movement patterns, which of the following statements are true about the top-down mining approach Splitter? Select all that apply.**

- The top-down mining approach can effectively reduce the search space of movement patterns.
- The final movement patterns reflect only people’s spatial transitions from one region to another.
- The coarse patterns generated by the first step mainly reflect people’s semantics-level transitions.
- When grouping the places in the first step, the places having the same semantic category should be put into the same group.

**Question 5)**

**For a frequent trajectory pattern, we require that the consecutive places in the trajectory pattern have a time gap no larger than the time constraint. Given a time constraint of 30 min and a support threshold of 8%, which of the following are valid frequent trajectory patterns?**

- Railway Station —45min→ Castle Square —15min→ Museum [Support: 15%]
- **Railway Station —20min→ Middle Bridge —10min→ Campus [Support: 8%]**
- Railway Station —35min→ Castle Square —15min→ Museum [Support: 3%]
- Railway Station —55min→ Castle Square —15min→ Museum [Support: 10%]

**Question 6)**

**For mining semantics-rich movement patterns, which of the following statements are true about the top-down mining approach Splitter? Select all that apply.**

- The top-down mining approach can effectively reduce the search space of movement patterns.
- The final movement patterns reflect only people’s spatial transitions from one region to another.
- Given a support threshold d, the support of any result movement patterns must be no less than d.
- In this approach, similar places should be put into the same group to collectively meet the support threshold.

**Question 7)**

**Which of the following is true about spatial association mining?**

- There is no difference between mining spatial associations and mining classic association rules.
- A rule is called a spatial association as long as its support is no less than the given support threshold.
- A rule is called a spatial association as long as its confidence is no less than the given confidence threshold.
- **For mining spatial associations, the hierarchy of spatial relationship can be used to speed up the mining process.**

**Question 8)**

**Which of the following is true about spatial association mining? Select all that apply.**

- The progressive refinement framework is mainly for visualization purposes.
- **The progressive refinement framework can reduce the search space of spatial associations.**
- A rule is called a spatial association as long as its support is no less than the given support threshold.
- **A rule is called a spatial association if its support is no less than the given support threshold and its confidence is no less than the given confidence threshold.**

**Extra Questions**

**Question 1)**

**Given a sequence database, as shown in Table 2, with support threshold min-sup = 3, which of the following sequences are frequent?**

**Answer:**

**Question 2)**

**Given a sequence database, as shown in Table 5, and support threshold min-sup = 4, use Generalized Sequential Patterns (GSP) to find the frequent sequential patterns. After scanning the database once, how many length-2 candidate sequences will be generated after Apriori pruning? How many length-2 candidate sequences will be generated if not using Apriori pruning?**

**Answer:**

**22; 51**
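The two counts follow from the usual way GSP builds length-2 candidates: from n items it forms n × n ordered sequences of the form <x y> (including <x x>) plus n(n − 1)/2 single-element candidates of the form <(x y)>. Table 5 is not reproduced here, so the item counts used below (4 frequent items out of 6 distinct items) are an assumption chosen because they reproduce the stated answer. The function name is illustrative.

```python
def gsp_length2_candidates(n_items):
    """Number of length-2 candidate sequences GSP generates from n_items items."""
    ordered_pairs = n_items * n_items                # <x y>, including <x x>
    same_element_pairs = n_items * (n_items - 1) // 2  # <(x y)>, unordered
    return ordered_pairs + same_element_pairs


# Assumed item counts; the source table is not available.
frequent_items = 4   # items surviving the first scan (with Apriori pruning)
all_items = 6        # every distinct item (without Apriori pruning)

print(gsp_length2_candidates(frequent_items))
print(gsp_length2_candidates(all_items))
```

Under these assumed counts the formula gives 16 + 6 candidates with pruning and 36 + 15 without.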

**Question 3)**

**Given a sequence database, as shown in Table 8, and support threshold min-sup = 4, use Generalized Sequential Patterns (GSP) to find the frequent sequential patterns. What is the minimum number of times we need to scan the database in order to find all the frequent sequential patterns?**

**Answer:**

**2**

**Question 4)**

**Given a sequence database, as shown in Table 10, and min-sup = 1, which of the following does not belong to the -projected database?**

**Answer:**

**<b(bd)>**

**Question 5)**

**Given a sequence database, as shown in Table 15, which of the following sequential patterns are closed?**

**Answer:**

**Question 6)**

**In our database, we have the following three graphs:**

**If we set the support threshold min-sup = 3, which of the following sequences is NOT a frequent graph pattern?**

**Answer:**

**Question 7)**

**When we use the Apriori-based approach to find the frequent graph pattern for a candidate graph, we need to check all of its subgraphs. Given the following graph, how many distinct subgraphs with seven vertices are there?**

**Answer:**

**1**

**Question 8)**

**In our database, we have the following three graphs:**

**What is the support of the following graph?**

**Answer:**

**1**

**Question 9)**

**Suppose we have learned two ranked rules as follows (the default is Type 2):**

{“ipad”, “iphone”} -> Type 1

{“kindle”, “iphone”} -> Type 2

{“ipad”} -> Type 1

**For the people who have {“kindle”, “iphone”}, which type will they be classified as by the CBA algorithm?**

**Answer:**

**Type 2**
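The answer follows from how a ranked rule list is applied: rules are tried in rank order, the first rule whose antecedent is contained in the person's items decides the class, and the default class is used only when no rule matches. The sketch below is a simplified illustration of that classification step, not the full CBA algorithm, and the names `classify` and `ranked_rules` are hypothetical.

```python
def classify(items, ranked_rules, default):
    """Return the label of the first rule whose antecedent is a subset of items."""
    for antecedent, label in ranked_rules:
        if antecedent <= items:   # subset test
            return label
    return default


# Rules in the rank order listed in the question.
ranked_rules = [
    (frozenset({"ipad", "iphone"}), "Type 1"),
    (frozenset({"kindle", "iphone"}), "Type 2"),
    (frozenset({"ipad"}), "Type 1"),
]

print(classify({"kindle", "iphone"}, ranked_rules, default="Type 2"))
```

For {“kindle”, “iphone”}, the first rule does not match because “ipad” is missing, so the second rule fires.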

**Week 4 Quiz Answer**

### Lesson 7 Quiz

**Question 1)**

**For mining text data, which of the following algorithms will not output phrases?**

- KERT
- SegPhrase
- **LDA**
- ToPMine

**Question 2)**

**Given a text corpus, which of the following can be used for measuring the colocation strength for a pair of words? Select all that apply.**

- Z-test
- T-test
- Edit distance
- Mutual information

**Question 3)**

**Suppose we want to use contiguous pattern mining to extract candidate phrases. Given the five statements below and a support threshold 3, which of the given phrases can be considered as candidates? Select all that apply.**

(1) Support vector machine is a classifier.

(2) Neural network performs equally well as support vector machine.

(3) We propose a method that combines support vector machine with kernel method.

(4) Neural network is harder to tune than support vector machine.

(5) Support vector machine is important for regression.

- **vector machine**
- neural network
- **support vector**
- **support vector machine**
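The candidate check can be reproduced by counting, for each phrase, how many statements contain it as a contiguous run of words. This sketch assumes support means the number of statements containing the phrase and that matching is case-insensitive. The helper names are illustrative.

```python
import re

statements = [
    "Support vector machine is a classifier.",
    "Neural network performs equally well as support vector machine.",
    "We propose a method that combines support vector machine with kernel method.",
    "Neural network is harder to tune than support vector machine.",
    "Support vector machine is important for regression.",
]


def contains_phrase(sentence, phrase):
    """True if the words of phrase appear consecutively in sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    target = phrase.lower().split()
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))


def phrase_support(phrase):
    """Number of statements that contain phrase contiguously."""
    return sum(contains_phrase(s, phrase) for s in statements)


for phrase in ["vector machine", "neural network", "support vector",
               "support vector machine", "kernel method", "support machine"]:
    count = phrase_support(phrase)
    print(f"{phrase}: support {count}, candidate: {count >= 3}")
```

Note that “support machine” never occurs contiguously, since “vector” always sits between the two words, which is why it cannot be a candidate.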

**Question 4)**

**Which of the following measures have been used for ranking phrases in KERT? Select all that apply.**

- **Completeness**
- **Popularity**
- Informativeness
- Likelihood ratio

**Question 5)**

**Suppose we want to use contiguous pattern mining to extract candidate phrases. Given the five statements below and a support threshold 3, which of the given phrases can be considered as candidates?**

(1) Support vector machine is a classifier.

(2) Neural network performs equally well as support vector machine.

(3) We propose a method that combines support vector machine with kernel method.

(4) Neural network is harder to tune than support vector machine.

(5) Support vector machine is important for regression.

- kernel method
- neural network
- support machine
- **support vector machine**

**Question 6)**

**Which of the following measures have been used for ranking phrases in KERT? Select all that apply.**

- KL divergence
- **Popularity**
- **Completeness**
- Mutual information

### Lesson 8 Quiz

**Question 1)**

**Which of the following algorithms is not designed for frequent pattern mining in stream data with approximation?**

- **FP-growth**
- Space saving algorithm
- Sticky sampling algorithm
- Lossy counting algorithm

**Question 2)**

**A data scientist is applying the lossy counting algorithm to a transactional data stream in order to obtain the counts of different items. If the bucket size is set to 1000, the total length of the transactional data stream is 10000, and the true count of an item A is 100, which of the following could be the possible outputs of item A’s count by lossy counting? Select all that apply.**

- **98**
- **90**
- 80
- 102
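The marked answers match the standard error guarantee of lossy counting: a reported count never exceeds the true count, and it falls short by at most εN, where the bucket size is 1/ε and N is the stream length. With a bucket size of 1000 and N = 10000, the shortfall is at most 10. The sketch below only evaluates that bound and does not implement the streaming algorithm. The function name is illustrative.

```python
def lossy_count_range(true_count, stream_length, bucket_size):
    """Return (lowest, highest) count lossy counting may report for an item."""
    max_undercount = stream_length // bucket_size   # epsilon * N
    lowest = max(true_count - max_undercount, 0)
    return lowest, true_count


lowest, highest = lossy_count_range(true_count=100, stream_length=10000,
                                    bucket_size=1000)
for reported in [98, 90, 80, 102]:
    print(reported, lowest <= reported <= highest)
```

Values above 100 are ruled out because the algorithm never overestimates, and values below 90 are ruled out by the εN bound.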

**Question 3)**

**In CP-Miner, we use constraint-based sequential pattern mining to obtain the frequent sequences. Let us consider a source file, which has been transformed into a sequence DB after tokenization and hashing. If we set the max gap to 2 (the index difference between two items is no larger than 2) and the support threshold to 0.6, which of the following can be the frequent sequences output by CP-Miner? Select all that apply.**

(1) <1, 2, 1, 3>

(2) <2, 3, 4, 1>

(3) <1, 2, 4, 3>

(4) <3, 2, 4, 3>

(5) <1, 2, 5, 4>

- <1, 2>
- <2, 4>
- <3, 4>
- <1, 3>
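The gap constraint in this question can be explored with a small support counter. This is a sketch of gap-constrained subsequence matching under one reading of the question, not CP-Miner itself: a pattern occurs in a sequence if its items appear in order and the index difference between consecutive matched items is at most the max gap, and support is the fraction of sequences in which it occurs. All names are illustrative.

```python
def occurs_with_gap(sequence, pattern, max_gap):
    """True if pattern occurs in order, consecutive matches <= max_gap apart."""
    def extend(prev_idx, k):
        if k == len(pattern):
            return True
        if prev_idx is None:
            positions = range(len(sequence))          # first item: anywhere
        else:
            last = min(prev_idx + max_gap, len(sequence) - 1)
            positions = range(prev_idx + 1, last + 1)  # within the gap window
        return any(sequence[i] == pattern[k] and extend(i, k + 1)
                   for i in positions)
    return extend(None, 0)


def gap_support(db, pattern, max_gap):
    """Fraction of sequences in db that contain pattern under the gap limit."""
    return sum(occurs_with_gap(s, pattern, max_gap) for s in db) / len(db)


sequence_db = [
    [1, 2, 1, 3],
    [2, 3, 4, 1],
    [1, 2, 4, 3],
    [3, 2, 4, 3],
    [1, 2, 5, 4],
]

for pattern in ([1, 2], [2, 4], [3, 4], [1, 3]):
    print(pattern, gap_support(sequence_db, pattern, max_gap=2))
```

Under this reading, a pattern would be output only if its printed support reaches the 0.6 threshold.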

**Question 4)**

**Which of the following are designed for preserving data privacy? Select all that apply.**

- σ-frequent
- **t-closeness**
- **K-anonymity**
- **Differential privacy**

**Question 5)**

**Which of the following algorithms is not designed for frequent pattern mining in stream data with approximation?**

- **CloSpan**
- Space saving algorithm
- Sticky sampling algorithm
- Lossy counting algorithm

**Question 6)**

**A data scientist is applying the lossy counting algorithm to a transactional data stream in order to obtain the counts of different items. If the bucket size is set to 1000, the total length of the transactional data stream is 10000, and the true count of an item A is 100, which of the following could be the possible outputs of item A’s count by lossy counting?**

- 85
- 110
- **95**
- 105

**Question 7)**

**In CP-Miner, we use constraint-based sequential pattern mining to obtain the frequent sequences. Let us consider a source file, which has been transformed into a sequence DB after tokenization and hashing. If we set the max gap to 2 (the index difference between two items is no larger than 2) and the support threshold to 0.6, which of the following can be the frequent sequences output by CP-Miner? Select all that apply.**

(1) <1, 2, 1, 3>

(2) <2, 3, 1, 4>

(3) <1, 2, 4, 3>

(4) <3, 2, 4, 3>

(5) <1, 2, 5, 4>

- <1, 2>
- <1, 3>
- <2, 4>
- <2, 3>

**Extra Questions**

**Question 1)**

**For the task of frequent pattern mining for text data, which of the following algorithms will NOT output phrases?**

**Answer:**

**LDA**

**Question 2)**

**Which of the following algorithms are not designed for frequent pattern mining in stream data with approximation?**

**Answer:**

**FP-growth (Han, Pei & Yin, SIGMOD’00)**

**Question 3)**

**Which of the following algorithms are not designed for spatiotemporal and trajectory pattern mining?**

**Answer:**

**CP-Miner (Li, Lu, Myagmar, Zhou, OSDI’04)**

**Question 4)**

**Which of the following algorithms is NOT designed for privacy-preserving pattern/association rule mining?**

**Answer:**

**Co-location Mining Algorithm (Huang, Shekhar & Xiong, TKDE’04)**

## Orientation Quiz Answer

In this article I am going to share the Coursera course Pattern Discovery in Data Mining Week 1 Orientation Quiz answers with you.

### Practice Exercise Orientation Quiz

**Question 1)**

**This course lasts for ___ weeks.**

- **4**
- 6
- 10
- 8

**Question 2)**

**I am required to read a textbook for this course.**

- True
- **False**

**Question 3)**

**Which of the following activities are required to pass this course in order to receive the Course Certificate? Check all that apply.**

- **Lectures**
- In-lecture questions
- **Eight graded lesson quizzes**
- **One required programming assignment**

**Question 4)**

**The following tools will help me use the discussion forums.**

- “Up-voting” posts that are thoughtful, interesting, or helpful.
- “Tagging” my posts with keywords other students might use in searching the forums.
- Subscribing to any forums that are particularly interesting to me.
- **All of the other options are correct.**

**Question 5)**

**If I have a problem in the course I should:**

- Email the instructor
- Call the instructor
- Drop the class
- **Report it to the Learner Help Center (if the problem is technical) or to the Content Issues Forum (if the problem is an error in the course materials).**

**Question 6)**

**I am required to purchase a textbook for this course.**

- True
- **False**

**Question 7)**

**Which of the following activities are required each week? Check all that apply.**

**Answer:**

- **Quiz**
- **Lectures**

**Question 8)**

**The following tools will help me use the discussion forums:**

**Answer:**

**All of the other options are correct.**

**Question 9)**

**If I have a problem in the course I should:**

**Answer:**

**Report it to the Learner Help Center (if the problem is technical) or to the Content Issues forum (if the problem is an error in the course materials).**