Working with MongoDB Aggregate Functions

This is the second part of the tutorial on how to use NodeJS with MongoDB. Here we switch to using the regular MongoDB shell and commands to make the study of aggregate functions simpler.

To show how to use aggregate functions, we will first explain how to do basic queries. Then we will write the WordCount program, which is the usual first exercise for people learning tools such as Apache Spark.

MongoDB gives us two ways to aggregate data: aggregate and MapReduce. Aggregation does the same job as the familiar SQL group by command:

select xxx, count(xxx) from table group by xxx
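To make the comparison with SQL concrete, here is a minimal sketch in plain JavaScript of what a group by count does. The sample documents, the field name state, and the collection name survey in the comment are assumptions made up for illustration; they are not part of the survey data.

```javascript
// Hypothetical sample documents; the field name "state" is an assumption.
const docs = [{ state: "NY" }, { state: "CA" }, { state: "NY" }];

// Plain-JavaScript model of:
//   select state, count(state) from docs group by state
// In the shell this would be roughly (needs a running MongoDB; "survey" is a made-up collection):
//   db.survey.aggregate([{ $group: { _id: "$state", count: { $sum: 1 } } }])
function groupCount(documents, field) {
  const counts = {};
  for (const doc of documents) {
    const key = doc[field];
    counts[key] = (counts[key] || 0) + 1;
  }
  // Shape the result like $group output: one document per distinct key.
  return Object.keys(counts).map((key) => ({ _id: key, count: counts[key] }));
}

const grouped = groupCount(docs, "state");
console.log(grouped);
```

The result has one entry per distinct value, which is the same shape a $group stage returns.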

The aggregate functions are count, sum, average, min, max, etc.
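As a rough sketch of what these functions compute, the plain JavaScript below calculates each one over a small list of numbers. The values, the field name amount, and the collection name sales in the comment are hypothetical.

```javascript
// Hypothetical values of one numeric field ("amount") inside a single group.
const values = [4, 10, 7];

// In the shell the same figures would come from accumulators, roughly:
//   db.sales.aggregate([{ $group: {
//     _id: null,
//     count: { $sum: 1 },
//     sum:   { $sum: "$amount" },
//     avg:   { $avg: "$amount" },
//     min:   { $min: "$amount" },
//     max:   { $max: "$amount" }
//   } }])
const stats = {
  count: values.length,
  sum: values.reduce((total, v) => total + v, 0),
  min: Math.min(...values),
  max: Math.max(...values),
};
stats.avg = stats.sum / stats.count;

console.log(stats);
```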

The data that we used in the first part was smoker survey data. It has a complex structure. Recall that the schema is this:

That schema is complicated to work with, since it has an array of embedded documents: tip:[{item, count}]. So we will first show how to query this data, then we will make simpler data to show the aggregate and MapReduce functions.
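To see why the embedded array makes things harder, here is a small sketch in plain JavaScript. The item names, counts, and the collection name survey are made up; only the tip:[{item, count}] shape comes from the schema.

```javascript
// Hypothetical documents shaped like tip:[{item, count}]; all values are made up.
const docs = [
  { _id: 1, tip: [{ item: "gum", count: 3 }, { item: "patch", count: 1 }] },
  { _id: 2, tip: [{ item: "patch", count: 5 }] },
  { _id: 3, tip: [] },
];

// Model of a query on an embedded field, in the shell roughly:
//   db.survey.find({ "tip.item": "gum" })
// A document matches when at least one array element has item === "gum".
const withGum = docs.filter((doc) => doc.tip.some((t) => t.item === "gum"));

// Model of the $unwind stage: one output document per array element,
// with the array replaced by that single element. Documents whose
// array is empty produce no output.
const unwound = docs.flatMap((doc) => doc.tip.map((t) => ({ _id: doc._id, tip: t })));

console.log(withGum.map((doc) => doc._id));
console.log(unwound);
```

Once the array is flattened this way, each record has a single item and count, which is much easier to group and sum.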

Notice that we use pretty() to indent the JSON output so that it is easier to read.

Notice too that we put quote marks around the field names. If you look at the instructions from MongoDB, they leave them off. But that caused a syntax error for us in the mongo shell, and the quotes are certainly required whenever a field name contains a dot. There seems to be some confusion about this if you look at Stack Overflow. We are using MongoDB version mongodb-linux-x86_64-debian81-3.4.9.

We use dot notation, such as cachedContents.largest, to reach fields nested inside our document.
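The sketch below shows, in plain JavaScript, how a dotted path walks into a nested document. Only the name cachedContents.largest comes from our data; the other fields, the values, and the collection name survey are hypothetical.

```javascript
// Hypothetical document; only cachedContents.largest is taken from the article.
const doc = {
  name: "example",
  cachedContents: { largest: "99", smallest: "1" },
};

// Resolve a dotted path one key at a time, the way dot notation
// walks into embedded documents. Missing steps yield undefined.
function getPath(document, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), document);
}

// Shell form, roughly: db.survey.find({ "cachedContents.largest": "99" })
// The key must be quoted: { cachedContents.largest: "99" } is not valid JavaScript.
console.log(getPath(doc, "cachedContents.largest"));
console.log(getPath(doc, "cachedContents.missing"));
```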

To output only specific fields, we pass a second argument after the query and set each field we want printed to 1. Notice here how the output looks when we leave off pretty().
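Here is a minimal sketch of that field selection in plain JavaScript, handling top-level fields only. The document, its field names, and the collection name survey are made up for illustration, and JSON.stringify only approximates how the shell prints with and without pretty().

```javascript
// Hypothetical document; the field names and values are assumptions.
const doc = { _id: 1, name: "example", dataTypeName: "text", position: 3 };

// Model of the second argument to find(), in the shell roughly:
//   db.survey.find({}, { "name": 1, "position": 1 })
function project(document, spec) {
  const out = {};
  // _id is returned by default unless the projection sets it to 0.
  if (spec._id !== 0 && "_id" in document) out._id = document._id;
  for (const field of Object.keys(spec)) {
    if (field !== "_id" && spec[field] === 1 && field in document) {
      out[field] = document[field];
    }
  }
  return out;
}

// One line per document, similar to output without pretty().
const compact = JSON.stringify(project(doc, { name: 1, position: 1 }));
// Indented over several lines, similar to output with pretty().
const indented = JSON.stringify(project(doc, { name: 1, _id: 0 }), null, 2);

console.log(compact);
console.log(indented);
```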


Walker Rowe is an American freelance tech writer and programmer living in Chile. He specializes in big data, analytics, and cloud architecture. Find him on LinkedIn or at Southern Pacific Review, where he publishes short stories, poems, and news.