February 21st, 2016

MongoDB Query: Select and Filter Child Array

If you are using embedded document to model one to many relationships in MongoDB, sometimes you might need to get child data only depends on the report. This post explains some experiments related to select and filter child array with MongoDB queries. Consider a collection school with the following documents:

{
    "_id" : 1,
    "name" : "Class 1",
    "students" : [  
        { "rollNo" : 10001, "name" : "Ram", "score" : 65 },
		{ "rollNo" : 10002, "name" : "Shyam", "score" : 90 },
		{ "rollNo" : 10003, "name" : "Mohan", "score" : 75 }       
		]
},
{
	"_id" : 2,
    "name" : "Class 2",
    "students" : [  
        { "rollNo" : 20001, "name" : "Krishna", "score" : 88 },
		{ "rollNo" : 20002, "name" : "Sohan", "score" : 91 },
		{ "rollNo" : 20003, "name" : "Radhika", "score" : 82 },
		{ "rollNo" : 20004, "name" : "Komal", "score" : 55 }		
		]
},
{
	"_id" : 3,
    "name" : "Class 3",
    "students" : [  
        { "rollNo" : 30001, "name" : "Monika", "score" : 77 },		
		{ "rollNo" : 30002, "name" : "Rahul", "score" : 81 }       
		]
}

Note: MongoDB 3.2 is used for this post.

If you have to write a query to select students only, probably, you would be executing following:

db.school.find({}, {_id:0, students: 1})

and below is the output:

{ "students" : [ 
	{ "rollNo" : 10001, "name" : "Ram", "score" : 65 }, 
	{ "rollNo" : 10002, "name" : "Shyam", "score" : 90 }, 
	{ "rollNo" : 10003, "name" : "Mohan", "score" : 75 } 
	] 
}
{ "students" : [ 
	{ "rollNo" : 20001, "name" : "Krishna", "score" : 88 }, 
	{ "rollNo" : 20002, "name" : "Sohan", "score" : 91 }, 
	{ "rollNo" : 20003, "name" : "Radhika", "score" : 82 }, 
	{ "rollNo" : 20004, "name" : "Komal", "score" : 55 } 
	] 
}
{ "students" : [ 
	{ "rollNo" : 30001, "name" : "Monika", "score" : 77 }, 
	{ "rollNo" : 30002, "name" : "Rahul", "score" : 81 } 
	] 
}

But you might want combined students data (NOT individual document wise student array).
To get list of student data (rollNo, name and score), use following aggregate query:

db.school.aggregate([
{ $unwind :'$students'},
{ $project : { _id:0, rollNo : '$students.rollNo', name : '$students.name', score : '$students.score' } }
])

Output:

{ "rollNo" : 10001, "name" : "Ram", "score" : 65 }
{ "rollNo" : 10002, "name" : "Shyam", "score" : 90 }
{ "rollNo" : 10003, "name" : "Mohan", "score" : 75 }
{ "rollNo" : 20001, "name" : "Krishna", "score" : 88 }
{ "rollNo" : 20002, "name" : "Sohan", "score" : 91 }
{ "rollNo" : 20003, "name" : "Radhika", "score" : 82 }
{ "rollNo" : 20004, "name" : "Komal", "score" : 55 }
{ "rollNo" : 30001, "name" : "Monika", "score" : 77 }
{ "rollNo" : 30002, "name" : "Rahul", "score" : 81 }

$unwind deconstructs an array field from the input documents to output a document for each element. Each output document is the input document with the value of the array field replaced by the element. To understand this, Let’s check output of $unwind only:

db.school.aggregate([
{ $unwind :'$students'}
])

Output:

{ "_id" : 1, "name" : "Class 1", "students" : { "rollNo" : 10001, "name" : "Ram", "score" : 65 } }
{ "_id" : 1, "name" : "Class 1", "students" : { "rollNo" : 10002, "name" : "Shyam", "score" : 90 } }
{ "_id" : 1, "name" : "Class 1", "students" : { "rollNo" : 10003, "name" : "Mohan", "score" : 75 } }
{ "_id" : 2, "name" : "Class 2", "students" : { "rollNo" : 20001, "name" : "Krishna", "score" : 88 } }
{ "_id" : 2, "name" : "Class 2", "students" : { "rollNo" : 20002, "name" : "Sohan", "score" : 91 } }
{ "_id" : 2, "name" : "Class 2", "students" : { "rollNo" : 20003, "name" : "Radhika", "score" : 82 } }
{ "_id" : 2, "name" : "Class 2", "students" : { "rollNo" : 20004, "name" : "Komal", "score" : 55 } }
{ "_id" : 3, "name" : "Class 3", "students" : { "rollNo" : 30001, "name" : "Monika", "score" : 77 } }
{ "_id" : 3, "name" : "Class 3", "students" : { "rollNo" : 30002, "name" : "Rahul", "score" : 81 } }

Each output document is the input document with the value of the students array field replaced by the element. Let’s see what would happen if we select without $unwind.

db.school.aggregate([
{ $project : { _id:0, rollNo : '$students.rollNo', name : '$students.name', score : '$students.score' } }
])

Output:

{ "rollNo" : [ 10001, 10002, 10003 ], "name" : [ "Ram", "Shyam", "Mohan" ], "score" : [ 65, 90, 75 ] }
{ "rollNo" : [ 20001, 20002, 20003, 20004 ], "name" : [ "Krishna", "Sohan", "Radhika", "Komal" ], "score" : [ 88, 91, 82, 55 ] }
{ "rollNo" : [ 30001, 30002 ], "name" : [ "Monika", "Rahul" ], "score" : [ 77, 81 ] }

Which is expected. So, to get combined student list, both $unwind and $project are used.

Now, let’s do one more thing. Suppose you have to get list of students having score > 80. To filter the document, $match is used. It allows to pass only the documents that match the specified condition(s) to the next pipeline stage.

db.school.aggregate([
{ $unwind :'$students'},
{ $match : {'students.score': { $gt : 80} }},
{ $project : { _id:0, rollNo : '$students.rollNo', name : '$students.name', score : '$students.score' } }
])

Output:

{ "name" : "Shyam", "rollNo" : 10002, "score" : 90 }
{ "name" : "Krishna", "rollNo" : 20001, "score" : 88 }
{ "name" : "Sohan", "rollNo" : 20002, "score" : 91 }
{ "name" : "Radhika", "rollNo" : 20003, "score" : 82 }
{ "name" : "Rahul", "rollNo" : 30002, "score" : 81 }

We got student information who scored greater than 80.

In this post, we have seen how to select and filter child array with aggregate method and different operators. In next post, we will do some more interesting puzzles. Stay tuned and enjoy MongoDB !!