The MongoDB Aggregation Framework Chapter 0 – Introduction and Aggregation Concepts
The Concept of Pipelines
Which of the following is true about pipelines and the Aggregation
Framework?
Framework?
- Pipelines must consist of at least two stages.
- Stages cannot be configured to produce our desired output.
- Documents flow through the pipeline, passing from one stage to the
next - The Aggregation Framework provides us many stages to filter and
transform our data
Aggregation Structure and Syntax
Which of the following statements is true?
- Only one expression per stage can be used.
- An aggregation pipeline is an array of stages.
- Some expressions can only be used in certain stages.
The MongoDB Aggregation Framework Chapter 1 – Basic Aggregation – $match and $project
$match: Filtering documents
Which of the following is/are true of the $match stage?
- $match can only filter documents on one field.
- It uses the familiar MongoDB query language.
- It should come very early in an aggregation pipeline.
- $match can use both query operators and aggregation expressions.
Shaping documents with $project
Which of the following statements are true of the $project stage?
- $project can only be used once within an Aggregation pipeline.
- $project cannot be used to assign new values to existing fields.
- Beyond simply removing and retaining fields, $project lets us add new fields.
- Once we specify a field to retain or perform some computation in a $project stage, we must specify all fields we wish to retain. The only exception to this is the _id field.
Optional Lab – Expressions with $project
Let’s find how many movies in our movies collection are a “labor of love”, where the same person appears in cast, directors, and writers
Note that you may have a dataset that has duplicate entries for some films. Don’t worry if you count them few times, meaning you should not try to find those duplicates.
To get a count after you have defined your pipeline, there are two simple methods.
// add the $count stage to the end of your pipeline
// you will learn about this stage shortly!
db.movies.aggregate([
{$stage1},
{$stage2},
…$stageN,
{ $count: “labors of love” }
])
// or use itcount()
db.movies.aggregate([
{$stage1},
{$stage2},
{…$stageN},
]).itcount()
How many movies are “labors of love”?
- 1259
- 1595
- 1263
- 1597
The MongoDB Aggregation Framework Chapter 2 – Basic Aggregation – Utility Stages
Lab: SUsing Cursor-like Stages
MongoDB has another movie night scheduled. This time, we polled employees for their favorite actress or actor, and got these results
favorites = [
“Sandra Bullock”,
“Tom Hanks”,
“Julia Roberts”,
“Kevin Spacey”,
“George Clooney”]
For movies released in the USA with a tomatoes.viewer.rating greater than or equal to 3, calculate a new field called num_favs that represets how many favorites appear in the cast field of the movie.
Sort your results by num_favs, tomatoes.viewer.rating, and title, all in descending order.
What is the title of the 25th film in the aggregation result?
- Recount
- The Heat
- Erin Brockovich
- Wrestling Ernest Hemingway
Lab: Setting Up the Vagrant Environment
Calculate an average rating for each movie in our collection where English is an available language, the minimum imdb.rating is at least 1, the minimum imdb.votes is at least 1, and it was released in 1990 or after. You’ll be required to rescale (or normalize) imdb.votes. The formula to rescale imdb.votes and calculate normalized_rating is included as a handout.
What film has the lowest normalized_rating?
- DMZ
- The Christmas Tree
- Twilight
- Avatar: The Last Airbender
The MongoDB Aggregation Framework Chapter 3 – Core Aggregation – Combining Information
The $lookup Stage
Which of the following statements is true about the $lookup stage?
- The collection specified in from cannot be sharded
- You can specify a collection in another database to from
- $lookup matches between localField and foreignField with an equality
match - Specifying an existing field name to as will overwrite the the
existing field
$graphLookup Introduction
Which of the following statements apply to $graphLookup operator? check
all that apply
all that apply
- Provides MongoDB with graph or graph-like capabilities
- $graphLookup provides MongoDB a transitive closure
implementation - $graphLookup depends on $lookup operator. Cannot be used without
$lookup - $lookup and $graphLookup stages require the exact same fields in
their specification. - $graphLookup is a new stage of the aggregation pipeline introduced in
MongoDB 3.2
$graphLookup: Simple Lookup
Which of the following statements is/are correct? Check all that
apply.
apply.
- connectToField will be used on recursive find operations
- as determines a collection where $graphLookup will store the stage
results - startWith indicates the index that should be use to execute the
recursive match - connectFromField value will be use to match connectToField in a
recursive match
$graphLookup: maxDepth and depthField
Which of the following statements are incorrect? Check all that
apply
apply
- maxDepth only takes $long values
- maxDepth allows to specify the number of recursive lookups
- depthField determines a field, in the result document, which
specifies the number of recursive lookup needed to reach that
document
- depthField determines a field, which contains the value number of
documents matched by the recursive lookup
$graphLookup: General Considerations
Consider the following statement:
“$graphLookup“ is required to be the last element on the
pipeline.
pipeline.
Which of the following is true about the statement?
- This is correct because of the recursive nature of $graphLookup we
want to save resources for last.
- This is incorrect. graphLookup needs to be the first element of the
pipeline, regardless of other stages needed to perform the desired
query.
- This is incorrect. $graphLookup can be used in any position of the
pipeline and acts in the same way as a regular $lookup.
- This is correct because $graphLookup pipes out the results of
recursive search into a collection, similar to $out stage.
The MongoDB Aggregation Framework | Chapter 4 – Core Aggregation – Multidimensional Grouping
Facets: Single Facet Query
Which of the following aggregation pipelines are single facet
queries?
queries?
[
{“$match”: { “$text”: {“$search”: “network”}}},
{“$sortByCount”: “$offices.city”},
]
[
{“$unwind”: “$offices”},
{“$project”: { “_id”: “$name”, “hq”: “$offices.city”}},
{“$sortByCount”: “$hq”},
{“$sort”: {“_id”:-1}},
{“$limit”: 100}
]
[
{“$match”: { “$text”: {“$search”: “network”}}},
{“$unwind”: “$offices”},
{“$sort”: {“_id”:-1}}
]
Facets: Manual Buckets
Assuming that field1 is composed of double values, ranging between 0
and Infinity, and field2 is of type string, which of the following
stages are correct?
and Infinity, and field2 is of type string, which of the following
stages are correct?
- {‘$bucket’: { ‘groupBy’: ‘$field1’, ‘boundaries’: [ 0.4, Infinity
]}} - {‘$bucket’: { ‘groupBy’: ‘$field1’, ‘boundaries’: [ “a”, 3, 5.5
]}} - {‘$bucket’: { ‘groupBy’: ‘$field2’, ‘boundaries’: [ “a”, “asdas”,
“z” ], ‘default’: ‘Others’}}
Facets: Auto Buckets
Auto Bucketing will …
- adhere bucket boundaries to a numerical series set by the
granularity option. - randomly distributed documents accross arbitrarily defined bucket
boundaries. - given a number of buckets, try to distribute documents evenly
accross buckets. - count only documents that contain the groupBy field defined in the
documents.
Facets: Multiple Facets
Which of the following statement(s) apply to the $facet stage?
- The $facet stage allows several sub-pipelines to be executed to
produce multiple facets.
- The $facet stage allows the application to generate several different
facets with one single database request.
- The output of the individual facetsub −
pipelinescanbesharedusingtheexpression $FACET.$.
- We can only use facets stages ($sortByCount, $bucket and
$bucketAuto) as sub-pipelines of $facet stage.
The MongoDB Aggregation Framework | Chapter 5 – Miscellaneous Aggregation
The $out Stage
Which of the following statements is true regarding the $out stage?
- $out will overwrite an existing collection if specified.
- $out removes all indexes when it overwrites a collection.
- If a pipeline with $out errors, you must delete the collection specified to the $out stage.
- Using $out within many sub-piplines of a $facet stage is a quick way to generate many differently shaped collections.
Views
Which of the following statements are true regarding MongoDB Views?
- A view cannot be created that contains both horizontal and vertical slices.
- Inserting data into a view is slow because MongoDB must perform the pipeline in reverse.
- Views should be used cautiously because the documents they contain can grow incredibly large.
- View performance can be increased by creating the appropriate indexes on the source collection.
The MongoDB Aggregation Framework | Chapter 6 – Aggregation Performance and Pipeline Quiz Answer
Aggregation Performance
With regards to aggregation performance, which of the following are true?
- You can increase index usage by moving $match stages to the end of your pipeline
- When $limit and $sort are close together a very performant top-k sort can be performed
- Passing allowDiskUsage to your aggregation queries will seriously increase their performance
- Transforming data in a pipeline stage prevents us from using indexes in the stages that follow
Aggregation Pipeline on a Sharded Cluster
What operators will cause a merge stage on the primary shard for a database?
- $lookup
- $out
- $group
Pipeline Optimization – Part 2
Which of the following statements is/are true?
- The query in a $match stage can be entirely covered by an index
- The Aggregation Framework will automatically reorder stages in certain conditions
- The Aggregation Framework can automatically project fields if the shape of the final document is only dependent upon those fields in the input document.
- Causing a merge in a sharded deployment will cause all subsequent pipeline stages to be performed in the same location as the merge
The MongoDB Aggregation Framework Final Exam Quiz Answer
Final Exam Quiz
Question 1)
Consider the following aggregation pipelines:
Pipeline 1
db.coll.aggregate([
{“$match”: {“field_a”: {“$gt”: 1983}}},
{“$project”: { “field_a”: “$field_a.1”, “field_b”: 1, “field_c”:
1 }},
1 }},
{“$replaceRoot”:{“newRoot”: {“_id”: “$field_c”, “field_b”:
“$field_b”}}},
“$field_b”}}},
{“$out”: “coll2”},
{“$match”: {“_id.field_f”: {“$gt”: 1}}},
{“$replaceRoot”:{“newRoot”: {“_id”: “$field_b”, “field_c”:
“$_id”}}}
“$_id”}}}
])
Pipeline 2
db.coll.aggregate([
{“$match”: {“field_a”: {“$gt”: 111}}},
{“$geoNear”: {
“near”: { “type”: “Point”, “coordinates”: [ -73.99279 ,
40.719296 ] },
40.719296 ] },
“distanceField”: “distance”}},
{“$project”: { “distance”: “$distance”, “name”: 1, “_id”: 0
}}
}}
])
Pipeline 3
db.coll.aggregate([
{
“$facet”: {
“averageCount”: [
{“$unwind”: “$array_field”},
{“$group”: {“_id”: “$array_field”, “count”:
{“$sum”: 1}}}
{“$sum”: 1}}}
],
“categorized”: [{“$sortByCount”:
“$arrayField”}]
“$arrayField”}]
},
},
{
“$facet”: {
“new_shape”: [{“$project”: {“range”:
“$categorized._id”}}],
“$categorized._id”}}],
“stats”: [{“$match”: {“range”: 1}}, {“$indexStats”:
{}}]
{}}]
}
}
])
Which of the following statements are correct?
- Pipeline 3 executes correctly
- Pipeline 2 fails because we cannot project distance field
- Pipeline 3 fails since you can only have one $facet stage per
pipeline
- Pipeline 1 is incorrect because you can only have one $replaceRoot
stage in your pipeline
- Pipeline 1 fails since $out is required to be the last stage of the
pipeline
- Pipeline 2 is incorrect because $geoNear needs to be the first stage
of our pipeline
- Pipeline 3 fails because $indexStats must be the first stage in a
pipeline and may not be used within a $facet
Question 2)
Consider the following collection:
db.collection.find()
{
“a”: [1, 34, 13]
}
The following pipelines are executed on top of this collection, using a
mixed set of different expression accross the different stages:
mixed set of different expression accross the different stages:
Pipeline 1
db.collection.aggregate([
{“$match”: { “a” : {“$sum”: 1} }},
{“$project”: { “_id” : {“$addToSet”: “$a”} }},
{“$group”: { “_id” : “”, “max_a”: {“$max”: “$_id”} }}
])
Pipeline 2
db.collection.aggregate([
{“$project”: { “a_divided” : {“$divide”: [“$a”, 1]}
}}
}}
])
Pipeline 3
db.collection.aggregate([
{“$project”: {“a”: {“$max”: “$a”}}},
{“$group”: {“_id”: “$$ROOT._id”, “all_as”: {“$sum”:
“$a”}}}
“$a”}}}
])
Given these pipelines, which of the following statements are
correct?
correct?
- Pipeline 3 is correct and will execute with no error
- Pipeline 2 fails because the $divide operator only supports numeric
types - Pipeline 1 will fail because $max can not operator on _id field
- Pipeline 2 is incorrect since $divide cannot operate over field
expressions - Pipeline 1 is incorrect because you cannot use an accumulator
expression in a $match stage.
Question 3)
Consider the following collection documents:
db.people.find()
{ “_id” : 0, “name” : “Bernice Pope”, “age” : 69, “date” :
ISODate(“2017-10-04T18:35:44.011Z”) }
ISODate(“2017-10-04T18:35:44.011Z”) }
{ “_id” : 1, “name” : “Eric Malone”, “age” : 57, “date” :
ISODate(“2017-10-04T18:35:44.014Z”) }
ISODate(“2017-10-04T18:35:44.014Z”) }
{ “_id” : 2, “name” : “Blanche Miller”, “age” : 35, “date” :
ISODate(“2017-10-04T18:35:44.015Z”) }
ISODate(“2017-10-04T18:35:44.015Z”) }
{ “_id” : 3, “name” : “Sue Perez”, “age” : 64, “date” :
ISODate(“2017-10-04T18:35:44.016Z”) }
ISODate(“2017-10-04T18:35:44.016Z”) }
{ “_id” : 4, “name” : “Ryan White”, “age” : 39, “date” :
ISODate(“2017-10-04T18:35:44.019Z”) }
ISODate(“2017-10-04T18:35:44.019Z”) }
{ “_id” : 5, “name” : “Grace Payne”, “age” : 56, “date” :
ISODate(“2017-10-04T18:35:44.020Z”) }
ISODate(“2017-10-04T18:35:44.020Z”) }
{ “_id” : 6, “name” : “Jessie Yates”, “age” : 53, “date” :
ISODate(“2017-10-04T18:35:44.020Z”) }
ISODate(“2017-10-04T18:35:44.020Z”) }
{ “_id” : 7, “name” : “Herbert Mason”, “age” : 37, “date” :
ISODate(“2017-10-04T18:35:44.020Z”) }
ISODate(“2017-10-04T18:35:44.020Z”) }
{ “_id” : 8, “name” : “Jesse Jordan”, “age” : 47, “date” :
ISODate(“2017-10-04T18:35:44.020Z”) }
ISODate(“2017-10-04T18:35:44.020Z”) }
{ “_id” : 9, “name” : “Hulda Fuller”, “age” : 25, “date” :
ISODate(“2017-10-04T18:35:44.020Z”) }
ISODate(“2017-10-04T18:35:44.020Z”) }
And the aggregation pipeline execution result:
db.people.aggregate(pipeline)
{ “_id” : 8, “names” : [ “Sue Perez” ], “word” : “P” }
{ “_id” : 9, “names” : [ “Ryan White” ], “word” : “W” }
{ “_id” : 10, “names” : [ “Eric Malone”, “Grace Payne” ], “word” : “MP”
}
}
{ “_id” : 11, “names” : [ “Bernice Pope”, “Jessie Yates”, “Jesse
Jordan”, “Hulda Fuller” ], “word” : “PYJF” }
Jordan”, “Hulda Fuller” ], “word” : “PYJF” }
{ “_id” : 12, “names” : [ “Herbert Mason” ], “word” : “M” }
{ “_id” : 13, “names” : [ “Blanche Miller” ], “word” : “M” }
Which of the following pipelines generates the output result?
var pipeline = [{
“$project”: {
“surname”: { “$arrayElemAt”: [ {“$split”: [
“$name”, ” ” ] }, 1]},
“$name”, ” ” ] }, 1]},
“name_size”: { “$add” : [{“$strLenCP”:
“$name”}, -1]},
“$name”}, -1]},
“name”:1
}
},
{
“$group”: {
“_id”: “$name_size”,
“word”: { “$addToSet”: {“$substr”:
[{“$toUpper”:”$name”}, 3, 2]} },
[{“$toUpper”:”$name”}, 3, 2]} },
“names”: {“$push”: “$surname”}
}
},
{
“$sort”: {“_id”: -1}
}
]
“`
“`
[X]
var pipeline = [{
“$project”: {
“surname_capital”: { “$substr”: [{“$arrayElemAt”:
[ {“$split”: [ “$name”, ” ” ] }, 1]}, 0, 1 ] },
[ {“$split”: [ “$name”, ” ” ] }, 1]}, 0, 1 ] },
“name_size”: { “$add” : [{“$strLenCP”:
“$name”}, -1]},
“$name”}, -1]},
“name”: 1
}
},
{
“$group”: {
“_id”: “$name_size”,
“word”: { “$push”: “$surname_capital” },
“names”: {“$push”: “$name”}
}
},
{
“$project”: {
“word”: {
“$reduce”: {
“input”: “$word”,
“initialValue”: “”,
“in”: { “$concat”: [“$$value”,
“$$this”] }
“$$this”] }
}
},
“names”: 1
}
},
{
“$sort”: { “_id”: 1}
}
]
“`
“`
var pipeline = [{
“$sort”: { “date”: 1 }
},
{
“$group”: {
“_id”: { “$size”: { “$split”: [“$name”, ” “]}
},
},
“names”: {“$push”: “$name”}
}
},
{
“$project”: {
“word”: {
“$zip”: {
“inputs”: [“$names”],
“useLongestLength”: false,
}
},
“names”: 1
}
}]
“`
Question 4)
$facet is an aggregation stage that allows for sub-pipelines to be
executed.
executed.
var pipeline = [
{
$match: { a: { $type: “int” } }
},
{
$project: {
_id: 0,
a_times_b: { $multiply: [“$a”, “$b”] }
}
},
{
$facet: {
facet_1: [{ $sortByCount: “a_times_b” }],
facet_2: [{ $project: { abs_facet1: { $abs:
“$facet_1._id” } } }],
“$facet_1._id” } } }],
facet_3: [
{
$facet: {
facet_3_1: [{ $bucketAuto:
{ groupBy: “$_id”, buckets: 2 } }]
{ groupBy: “$_id”, buckets: 2 } }]
}
}
]
}
}
]
In the above pipeline, which uses $facet, there are some incorrect
stages or/and expressions being used.
stages or/and expressions being used.
Which of the following statements point out errors in the
pipeline?
pipeline?
- a $multiply expression takes a document as input, not an array.
- can not nest a $facet stage as a sub-pipeline.
- $sortByCount cannot be used within $facet stage.
- facet_2 uses the output of a parallel sub-pipeline, facet_1, to
compute an expression - a $type expression does not take a string as its value; only the
BSON numeric values can be specified to identify the types.
Question 5)
Consider a company producing solar panels and looking for the next
markets they want to target in the USA. We have a collection with all
the major cities (more than 100,000 inhabitants) from all over the
World with recorded number of sunny days for some of the last
years.
markets they want to target in the USA. We have a collection with all
the major cities (more than 100,000 inhabitants) from all over the
World with recorded number of sunny days for some of the last
years.
A sample document looks like the following:
db.cities.findOne()
{
“_id”: 10,
“city”: “San Diego”,
“region”: “CA”,
“country”: “USA”,
“sunnydays”: [220, 232, 205, 211, 242, 270]
}
The collection also has these indexes:
db.cities.getIndexes()
[
{
“v”: 2,
“key”: {
“_id”: 1
},
“name”: “_id_”,
“ns”: “test.cities”
},
{
“v”: 2,
“key”: {
“city”: 1
},
“name”: “city_1”,
“ns”: “test.cities”
},
{
“v”: 2,
“key”: {
“country”: 1
},
“name”: “country_1”,
“ns”: “test.cities”
}
]
We would like to find the cities in the USA where the minimum number of
sunny days is 200 and the average number of sunny days is at least 220.
Lastly, we’d like to have the results sorted by the city’s name. The
matching documents may or may not have a different shape than the
initial one.
sunny days is 200 and the average number of sunny days is at least 220.
Lastly, we’d like to have the results sorted by the city’s name. The
matching documents may or may not have a different shape than the
initial one.
We have the following query:
var pipeline = [
{“$addFields”: { “min”: {“$min”: “$sunnydays”}}},
{“$addFields”: { “mean”: {“$avg”: “$sunnydays” }}},
{“$sort”: {“city”: 1}},
{“$match”: { “country”: “USA”, “min”: {“$gte”: 200},
“mean”: {“$gte”: 220}}},
“mean”: {“$gte”: 220}}},
]
db.cities.aggregate(pipeline)
However, this pipeline execution can be optimized!
Which of the following choices is still going to produce the expected
results and likely improve the most the execution of this aggregation
pipeline?
results and likely improve the most the execution of this aggregation
pipeline?
var pipeline = [
{“$sort”: {“city”: 1}},
{“$addFields”: { “min”: {“$min”: “$sunnydays”}}},
{“$match”: { “country”: “USA”, “min”: {“$gte”:
200}}},
200}}},
]
var pipeline = [
{“$match”: { “country”: “USA”}},
{“$sort”: {“city”: 1}},
{“$addFields”: { “min”: {“$min”: “$sunnydays”}}},
{“$match”: { “min”: {“$gte”: 200}, “mean”: {“$gte”:
220}}},
220}}},
{“$addFields”: { “mean”: {“$avg”: “$sunnydays”
}}},
}}},
]
var pipeline = [
{“$sort”: {“city”: 1}},
{“$addFields”: { “min”: {“$min”: “$sunnydays”}}},
{“$addFields”: { “mean”: {“$avg”: “$sunnydays”
}}},
}}},
{“$match”: { “country”: “USA”, “min”: {“$gte”: 200},
“mean”: {“$gte”: 220}}},
“mean”: {“$gte”: 220}}},
]
var pipeline = [
{“$sort”: {“city”: 1}},
{“$match”: { “country”: “USA”}},
{“$addFields”: { “min”: {“$min”: “$sunnydays”}}},
{“$match”: { “min”: {“$gte”: 200}, “mean”: {“$gte”:
220}}},
220}}},
{“$addFields”: { “mean”: {“$avg”: “$sunnydays”
}}},
}}},
]
var pipeline = [
{“$match”: { “country”: “USA”}},
{“$addFields”: { “mean”: {“$avg”:
“$sunnydays”}}},
“$sunnydays”}}},
{“$match”: { “mean”: {“$gte”: 220}, “sunnydays”:
{“$not”: {“$lt”: 200 }}}},
{“$not”: {“$lt”: 200 }}}},
{“$sort”: {“city”: 1}},
]
Question 6)
Consider the following people collection:
db.people.find().limit(5)
{ “_id” : 0, “name” : “Iva Estrada”, “age” : 95, “state” : “WA”, “phone”
: “(739) 557-2576”, “ssn” : “901-34-4492” }
: “(739) 557-2576”, “ssn” : “901-34-4492” }
{ “_id” : 1, “name” : “Roger Walton”, “age” : 92, “state” : “ID”,
“phone” : “(948) 527-2370”, “ssn” : “498-61-9106” }
“phone” : “(948) 527-2370”, “ssn” : “498-61-9106” }
{ “_id” : 2, “name” : “Isaiah Norton”, “age” : 26, “state” : “FL”,
“phone” : “(344) 479-5646”, “ssn” : “052-49-6049” }
“phone” : “(344) 479-5646”, “ssn” : “052-49-6049” }
{ “_id” : 3, “name” : “Tillie Salazar”, “age” : 88, “state” : “ND”,
“phone” : “(216) 414-5981”, “ssn” : “708-26-3486” }
“phone” : “(216) 414-5981”, “ssn” : “708-26-3486” }
{ “_id” : 4, “name” : “Cecelia Wells”, “age” : 16, “state” : “SD”,
“phone” : “(669) 809-9128”, “ssn” : “977-00-7372” }
“phone” : “(669) 809-9128”, “ssn” : “977-00-7372” }
And the corresponding people_contacts view:
db.people_contacts.find().limit(5)
{ “_id” : 6585, “name” : “Aaron Alvarado”, “phone” :
“(631)*********”, “ssn” : “********8014” }
“(631)*********”, “ssn” : “********8014” }
{ “_id” : 8510, “name” : “Aaron Barnes”, “phone” : “(944)*********”,
“ssn” : “********6820” }
“ssn” : “********6820” }
{ “_id” : 6441, “name” : “Aaron Barton”, “phone” : “(234)*********”,
“ssn” : “********1937” }
“ssn” : “********1937” }
{ “_id” : 8180, “name” : “Aaron Coleman”, “phone” : “(431)*********”,
“ssn” : “********7559” }
“ssn” : “********7559” }
{ “_id” : 9738, “name” : “Aaron Fernandez”, “phone” :
“(578)*********”, “ssn” : “********0211” }
“(578)*********”, “ssn” : “********0211” }
Which of the of the following commands generates this people_contacts
view?
view?
var pipeline = [
{
“$sort”: {“name”: 1}
},
{
“$project”: {“name”:1,
“phone”: {
“$concat”: [
{“$arrayElemAt”: [{“$split”: [“$phone”, ”
“]}, 0]} ,
“]}, 0]} ,
“*********” ]
},
“ssn”: {
“$concat”: [
“********”,
{“$arrayElemAt”: [{“$split”: [“$ssn”,
“-“]}, 2]}
“-“]}, 2]}
]
}
}
}
];
db.createView(“people”, “people_contacts” pipeline);
var pipeline = [
{
“$project”: {“name”:1,
“phone”: {
“$concat”: [
{“$arrayElemAt”: [{“$split”: [“$phone”, ”
“]}, 0]} ,
“]}, 0]} ,
“*********” ]
},
“ssn”: {
“$concat”: [
“********”,
{“$arrayElemAt”: [{“$split”: [“$ssn”,
“-“]}, 2]}
“-“]}, 2]}
]
}
}
}
];
db.runCommand({
“create”: “people_contacts”,
“viewOn”:”people”,
“pipeline”: pipeline})
var pipeline = [
{
“$sort”: {“state”: 1}
},
{
“$project”: {“name”:1,
“phone”: {
“$concat”: [
{“$arrayElemAt”: [{“$split”: [“$phone”, ”
“]}, 0]} ,
“]}, 0]} ,
“*********” ]
},
“ssn”: {
“$concat”: [
“********”,
{“$arrayElemAt”: [{“$split”: [“$ssn”,
“-“]}, 2]}
“-“]}, 2]}
]
}
}
}
];
db.runCommand({
“create”: “people”,
“viewOn”:”people”,
“pipeline”: pipeline})
var pipeline = [
{
“$sort”: {“name”: 1}
},
{
“$project”: {“name”:1,
“phone”: {
“$concat”: [
{“$arrayElemAt”: [{“$split”: [“$phone”, ”
“]}, 0]} ,
“]}, 0]} ,
“*********” ]
},
“ssn”: {
“$concat”: [
“********”,
{“$arrayElemAt”: [{“$split”: [“$ssn”,
“-“]}, 2]}
“-“]}, 2]}
]
}
}
}
];
db.createView(“people_contacts”, “people”, pipeline);
Question 7)
Using the air_alliances and air_routes collections, find which
alliance has the most unique carriers(airlines) operating between the
airports JFK and LHR, in both directions.
alliance has the most unique carriers(airlines) operating between the
airports JFK and LHR, in both directions.
Names are distinct, i.e. Delta != Delta Air Lines
src_airport and dst_airport contain the originating and terminating
airport information.
airport information.
-
- SkyTeam, with 4 carriers
- OneWorld, with 8 carriers
- OneWorld, with 4 carriers
- Star Alliance, with 6 carriers