This installs version 3.6 of the MongoDB client. This version of the client
will not work with servers running MongoDB version 2.6 or earlier. To install a
different version of the MongoDB client, refer to the MongoDB documentation for that
version, following the instructions for Ubuntu 16.04. You only need to install
mongodb-org-shell, rather than the entire mongodb-org package.

To connect to a MongoDB database, you need to know:

its hostname: this is usually a string like customers.mydomain.com;

its port: the default port for MongoDB is 27017;

the name of the database that you want to connect to;

your username and password for the database server (note that this is
different to your SherlockML username). If you are unsure of these, you
should ask your database administrator.
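These details can also be combined into a single MongoDB connection string, which PyMongo's MongoClient accepts in place of separate arguments. The sketch below shows one way to build such a string. The helper name build_mongo_uri is hypothetical, and the default authSource of 'admin' is an assumption that matches the authentication steps in this guide; your server may use a different authentication database.

```python
from urllib.parse import quote_plus


def build_mongo_uri(hostname, port, database, username, password,
                    auth_source='admin'):
    """Build a MongoDB connection string from its parts.

    Hypothetical helper for illustration. auth_source='admin' is an
    assumption; ask your database administrator if you are unsure.
    """
    # Usernames and passwords must be percent-encoded, since characters
    # such as '@', ':' and '/' have a special meaning in a URI.
    credentials = '{}:{}'.format(quote_plus(username), quote_plus(password))
    return 'mongodb://{}@{}:{}/{}?authSource={}'.format(
        credentials, hostname, port, database, auth_source
    )


uri = build_mongo_uri(
    'customers.mydomain.com', 27017, 'my_org_db', 'alice', 'p@ss/word'
)
```

The resulting string can then be passed as the first argument to MongoClient instead of a bare hostname.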

Once you have the MongoDB client installed, connect to the database with:

$ mongo --host HOSTNAME

This will open a MongoDB shell, which you can use to authenticate:

// switch to the admin database for your MongoDB server
// (this is probably 'admin', but if not, you should talk to your server administrator)
> use admin
// log in
> db.auth('USERNAME', 'PASSWORD');
// switch to the database you want to use
> use my_org_db
// explore your database
> db.customers.count();

The official Python package for interacting with MongoDB databases is PyMongo. Install it on a
SherlockML server with:

$ pip install pymongo

You can then connect to a MongoDB database with:

from pymongo import MongoClient

client = MongoClient('HOSTNAME', username='USERNAME', password='PASSWORD')

# You can now connect to your database and do some data science
db = client.get_database('my_org_db')
# count_documents() replaces Collection.count(), which was removed in PyMongo 4
db['customers'].count_documents({})

client.close()  # Close the connection to free up resources on the server

Note

We close the connection to allow the database server to reclaim
resources. This can be critical in a Jupyter notebook, since the
kernel remains alive for a long time.
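One way to make sure the connection is always closed, even if a query raises an exception, is to wrap the client in a context manager. The following is a minimal sketch using contextlib.closing from the standard library. The helper name count_and_close is hypothetical, and the client argument is assumed to behave like a pymongo.MongoClient (it needs get_database() and close()).

```python
from contextlib import closing


def count_and_close(client, database_name, collection_name):
    """Count the documents in a collection, then close the client.

    Hypothetical helper for illustration; not part of PyMongo.
    """
    # closing() calls client.close() when the block exits,
    # whether it exits normally or because of an exception.
    with closing(client) as open_client:
        collection = open_client.get_database(database_name)[collection_name]
        return collection.count_documents({})
```

With PyMongo installed, this could be called as count_and_close(MongoClient('HOSTNAME', username='USERNAME', password='PASSWORD'), 'my_org_db', 'customers'). MongoClient can also be used directly in a with statement, which closes the client on exit in the same way.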

Note

We recommend that you avoid pasting database passwords and other
connection details into multiple notebooks in a project. See
Factoring connection details into a package for a recommended strategy for
managing database connection details.
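The linked guide describes the recommended approach. As a rough illustration of the general idea of keeping connection details in one place, the sketch below reads them from environment variables rather than from notebook cells. The variable names (MONGO_HOSTNAME, MONGO_PORT, MONGO_USERNAME, MONGO_PASSWORD) and the helper load_mongo_settings are hypothetical and not taken from that guide.

```python
import os

# Hypothetical variable names; choose whatever suits your project.
REQUIRED_VARIABLES = ('MONGO_HOSTNAME', 'MONGO_USERNAME', 'MONGO_PASSWORD')


def load_mongo_settings(environ=os.environ):
    """Collect MongoDB connection details from environment variables."""
    missing = [name for name in REQUIRED_VARIABLES if name not in environ]
    if missing:
        raise RuntimeError(
            'Missing environment variables: ' + ', '.join(missing)
        )
    return {
        'host': environ['MONGO_HOSTNAME'],
        # Fall back to the default MongoDB port if none is set
        'port': int(environ.get('MONGO_PORT', '27017')),
        'username': environ['MONGO_USERNAME'],
        'password': environ['MONGO_PASSWORD'],
    }
```

The returned dictionary uses keyword names that MongoClient accepts, so a notebook could connect with MongoClient(**load_mongo_settings()) without containing any credentials itself.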