The takeaway of this example was simple: using the @{} operator by itself creates a System.Hashtable object, and the order of its keys is not guaranteed, so neither is the order of the properties of the PSObject. However, we could use [ordered] to make it an “ordered hashtable” (Shay’s words).

So I spent some time looking for the MSDN documentation for OrderedHashtable. I never found it so I decided to fire up my Windows 8 VM and see what its type is.

So I said to myself, “ok that’s cool, but can I cast from an OrderedDictionary to a MongoDB.Bson.BsonDocument with the MongoDB .NET driver?” I say this because a while back I submitted some patches to improve the driver’s user experience in PowerShell. My main goal was to be able to use the HashTable notation to define a BsonDocument like so:

Readers of this blog know that I’ve been using MongoDB for a while, and I’ve recently become very excited about Powershell. Well recently I’ve been able to combine the two together for pure dynamically typed, schema-less, non-relational awesomeness. Such awesomeness is begging to be shared.

Since the Csharp Driver MSI is 32-bit, it creates the registry entries in the Wow6432Node. Therefore, we have to check whether we are running the 32- or 64-bit version of Powershell. Credit to an anonymous commenter on the msgoodies blog for providing this size-of-a-pointer trick to determine whether you are running a 32- or 64-bit system.

The next thing we want to do is to create a BSON document. This is surprisingly easy.

As you can see Powershell can convert a HashTable to a BsonDocument. This is because of the public constructor BsonDocument(IDictionary hashTable). Powershell can use these one parameter constructors to cast an object. You can use the same Hashtable trick for the QueryDocument and UpdateDocument classes.

Now that we have our BsonDocument, it’s time to perform basic CRUD operations.

As you can see, it’s not very hard to use the 10gen MongoDB Csharp driver from within Powershell. Using Powershell with the MongoDB Csharp driver opens up many possibilities, starting with ad hoc MongoDB queries from inside Powershell. The code for this example is available in its entirety here.

The Short of It

The ebook weighs in at 64 pages. You can get a dead-tree copy via print on demand, but I recommend against it. Sharding in MongoDB is a moving target. For example, right around publication time, the default chunk size of shards was changed from 200MB to 64MB. Therefore, if you rushed out and bought a print on demand copy, you would be stuck with a paper copy containing that outdated piece of information. Ebook owners, however, can download an updated copy.

The writing style in the book is very matter of fact, and quite readable. It takes a certain talent to be able to use terms like cardinality and illustrate sharding with interval notation without sounding unnecessarily academic. Kristina demonstrates said talent here.

My one complaint about the book is that it does not discuss the problems with running the shard server on windows. However, as Kristina pointed out to me on twitter, the book also does not deal with writing init.d or upstart scripts for mongos on unix, so windows is not being singled out here.

Conclusion

Despite my one complaint, the book is quite comprehensive in its 64 pages. If I ever do need to use sharding in production, I will be reading the book very closely, and I’d recommend anyone else thinking about sharding mongodb do the same.

Recently I spent a few days implementing the BSON ObjectId data type used by MongoDB in javascript so I could generate ObjectIds in a web browser. I originally had a specific problem to solve. However, I ended up reworking my approach in that instance so I did not have to generate ObjectIds in the browser. Despite this, the code is still a valid approach to solving my original problem. The code is available on the justaprogrammer github org. The project is called ObjectId.js. Although the git repository contains a sample html file to illustrate its usage, all you need is the javascript file and, if you want to support IE6, json2.js.

How It Works

Originally, I wanted an implementation of ObjectId that could interact with the format that the WCF DataContractJsonSerializer serialized objects of the type MongoDB.Bson.ObjectId in the official 10gen MongoDB C# driver. This format looked like this:

Now, it was trivial to create a JSON object that looked like this. The real issue was filling those 4 values in a sensible manner. I ended up settling on the following:

timestamp: This is supposed to be seconds since the UNIX epoch. Javascript represents time as milliseconds since the UNIX epoch. Therefore all I had to do was set this to Math.floor(new Date().valueOf() / 1000).

machine: This is normally the first three bytes of the md5 hash of the hostname. Since I can’t access that from the browser, I store a random number in html5 local storage, with a cookie as a fallback. This makes the machine id probably consistent for a given combination of machine, logon and browser. Naturally, I could use something like evercookie to make the machine part of the object id more sticky. However, I felt this was “good enough.”

pid: Pid is generated every time ObjectId.js is executed, so usually once per page. This means that pid changes on each page load, but remains consistent across Ajax calls within a single page load.

increment: This was simple. increment++ every time I generate a new ObjectId.
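
The four parts above can be sketched roughly as follows. This is not the actual ObjectId.js source: the helper names (loadMachineId, createObjectIdFactory, createMemoryStorage) and the storage key are made up for illustration, the cookie fallback is left out, and the field widths (3 bytes for machine, 2 for pid, 3 for increment) are assumed from the classic BSON ObjectId layout.

```javascript
// Sketch only. `storage` is anything with getItem/setItem, for example
// window.localStorage in a browser. All names here are hypothetical.
function loadMachineId(storage) {
  var key = 'exampleMachineId'; // hypothetical storage key
  var existing = storage.getItem(key);
  if (existing !== null && existing !== undefined) {
    return parseInt(existing, 10); // reuse the id from an earlier page load
  }
  var fresh = Math.floor(Math.random() * 0x1000000); // random 3-byte value
  storage.setItem(key, String(fresh));
  return fresh;
}

function createObjectIdFactory(storage) {
  var machine = loadMachineId(storage);
  var pid = Math.floor(Math.random() * 0x10000); // 2 bytes, chosen once per page load
  var increment = Math.floor(Math.random() * 0x1000000); // 3-byte counter

  return function nextObjectId() {
    var id = {
      timestamp: Math.floor(new Date().valueOf() / 1000), // seconds since the epoch
      machine: machine,
      pid: pid,
      increment: increment
    };
    increment = (increment + 1) % 0x1000000; // wrap so the counter stays within 3 bytes
    return id;
  };
}

// In-memory stand-in for localStorage so the sketch also runs outside a browser.
function createMemoryStorage() {
  var data = {};
  return {
    getItem: function (k) {
      return Object.prototype.hasOwnProperty.call(data, k) ? data[k] : null;
    },
    setItem: function (k, v) { data[k] = String(v); }
  };
}
```

In a browser you would pass window.localStorage to createObjectIdFactory once per page and call the returned function whenever you need a new id.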

So if you need to generate ObjectIds in javascript, check out this class. If you find a bug or have an improvement, fork and send a pull request. Happy coding!

Yesterday, Chuck Reeves tweeted an article from Daniel Lemire’s blog entitled Who will need database administrators in 2020?. The thesis is that with the advent of all these NoSQL technologies, the role of DBA will become unnecessary. I disagree with this for two reasons. First of all, NoSQL will not replace SQL. Secondly, your NoSQL data store probably needs a DBA, even if he has a different title.

Just a quick note, I’ve worked with enough SQL databases to make broad generalizations about them. The only NoSQL database I have experience with is MongoDB.

NoSQL will not completely replace RDBMSes

SQL databases are the primary practical implementation of the relational model. Most of the “trade school” explanations of the relational model and normal forms use SQL syntax as an example. Relational databases are great at storing data in an organized fashion. Through constraints you can enforce most business rules. Triggers will allow you to do the rest. Relational databases also usually have fine grained access control systems, and mechanisms for auditing changes. Finally, if you have to build a report from your data in a way you never did before or planned to, it’s usually nice to be able to start out with your data normalized.

Now there are a lot of things a NoSQL database like MongoDB does better than most RDBMSes. For example, MongoDB would be better suited for hosting a simple blog than MySQL. However, MongoDB has not been around all that long, and before MongoDB, relational databases did a good enough job. There are also many things SQL is better at than MongoDB. For example, I would never use mongo for a complex inventory system. However, many technologists, like Daniel, have been focused on the things that NoSQL is good at, like blogs and simple ecommerce sites. These technologists recognize NoSQL as disruptive technology in the data management field. However, they make the mistake of assuming NoSQL will usurp the role of relational databases completely.

To put it another way, in the brave new MaybeSQL future, we will use SQL for some things and NoSQL for others. The things we will use SQL for, like complex inventory systems, will have complex schemas and need specialists to manage all that data. We already call those specialists DBAs.

Your NoSQL Database needs a DBA

Ok I lied. Your NoSQL database might not need a DBA, just like your relational database might not need one. In relational database shops without formal DBA positions, there are usually de facto DBAs, senior developers who’ve made it their business to manage the company’s databases because management would not allocate a dedicated salary to that function. Currently, I am serving as a de facto DBA for some small databases.

Now I’ve also been playing with MongoDB a lot. I’ve contributed to mongo, spoken about mongo, and been to three mongo conferences. I’ve talked to a lot of people using mongo, and I’ve made a lot of observations. My primary observation is that mongo tends to get used in startups. These startups don’t have dedicated DBAs. However, they do have well rounded senior developers that perform DBA and sysadmin functions. Many of these NoSQL programmers also know more about relational databases than I do, which is why they didn’t fight “the mongo way” tooth and nail before accepting it like I did. Now as is the nature of startups, most of the businesses these programmers work at will fail. However, a few will succeed and get big enough to have to hire technologists with more specialized roles. I expect to see a mongo specialist role that’s part sysadmin and part programmer evolving at these companies. For companies that use a combination of relational databases and MongoDB, I expect a DBA to be hired, learn MongoDB, and take ownership of managing the data stored in that company’s MongoDB instances.

Conclusion

NoSQL databases were designed for different problems than relational databases. Relational databases were not designed for things like blogs and massive sites like Facebook. They were used for this role because they were the best tool at the time for the job. MongoDB, on the other hand, was founded by a founder of DoubleClick, who wanted to build a database that scaled the way a database for websites should scale. MongoDB is taking a piece of the pie from relational databases, but not all of it. Also, just like not all relational databases have a full time DBA to maintain them, not all NoSQL databases have a full time administrator. However, that does not mean that a role similar to DBA for NoSQL databases is unnecessary.

Until recently, I could accurately claim that I’ve spent more time hacking the source code to mongod than writing code that made db calls to running instances of mongod. That was before I started my current project. For better or for worse, I’m approaching the point where I’m as comfortable with querying mongo collections as I am doing multi table joins in SQL Server.

Naturally, as I use MongoDB I find myself asking a lot of “how do I do this in mongo” questions for tasks that I am able to do easily in SQL. More often than not, my main trouble in figuring out how to do the task in question is knowing what to ask google. Recently, my “How do I du jour” was cross database queries.

To define my problem more specifically, I had one document in one collection in my staging database that I wanted deployed to production. My staging and production databases lived on the same server. I realize this is not ideal, but it is the reality of my current situation. If I were to do the equivalent task in Microsoft SQL server, that is copy one row from a table in my staging database into one row in my production database, I’d use a query similar to the following:

A simple query for a simple task. It turns out the equivalent mongo query is about as simple. It just took me a while to find the right syntax, because the docs did not refer to it as a cross database query until I updated them. The shell command is db.getSisterDB(dbName). That function returns an instance of another db on the server, which in turn contains collection objects that have the familiar methods find(), findOne(), update(), save(), remove(), etc. So I did the following:
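
A minimal sketch of that kind of copy, written as a shell helper. The function name copyDocument and its parameters are made up for illustration; only getSisterDB(), getCollection(), findOne() and save() are shell methods, and the sketch assumes a shell version that still provides getSisterDB().

```javascript
// Copy one document from the current database to a sister database on the
// same server. `db` is the shell's current database handle (for example the
// staging database). copyDocument is a hypothetical helper name.
function copyDocument(db, targetDbName, collectionName, query) {
  var doc = db.getCollection(collectionName).findOne(query);
  if (doc === null) {
    return false; // nothing matched, so there is nothing to copy
  }
  var target = db.getSisterDB(targetDbName);
  // save() inserts the document, or replaces an existing one with the same _id
  target.getCollection(collectionName).save(doc);
  return true;
}
```

From a shell connected to the staging database, a call might look like copyDocument(db, "production", "pages", { slug: "about" }), where the database, collection and field names are placeholders.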

However, there is one caveat to be aware of. Most drivers allow you the full range of bson data types. The shell does not. For example, a 32 bit int in a mongo document becomes a double in the shell. So the data is not copied perfectly. I discovered this issue while using a pre-release of the official 10gen CSharp driver for MongoDB. After some update queries in the shell, objects were not being deserialized. Luckily, the great people at 10gen made the driver more tolerant on deserialization, so this is no longer a problem with current builds of the driver. There are open tickets to add shell support for the missing data types (int32s and GUIDs), so the deficiency of the shell will be addressed. However, until then, be aware of the caveat I mentioned.
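
The root of the caveat is that the shell is a javascript environment, and javascript has a single Number type (an IEEE 754 double). The sketch below only illustrates that language-level fact; it is not shell internals, and describeNumber is a made-up helper name.

```javascript
// In javascript an "integer" and a "double" with the same value are the same
// thing, so the original int32-ness of a value cannot be recovered from it.
function describeNumber(value) {
  return {
    type: typeof value,                 // always "number"
    isInteger: Number.isInteger(value)  // true for 5 and for 5.0 alike
  };
}

var fromInt = describeNumber(5);
var fromDouble = describeNumber(5.0);
var indistinguishable =
  fromInt.type === fromDouble.type &&
  fromInt.isInteger === fromDouble.isInteger &&
  5 === 5.0;
```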

Update: an older blog article exists on Chris Conway’s blog. The directions are out of date, but it’s an interesting read to get a historical perspective of the improvements to mongod’s windows support.

Unix is an OS built around a worse is better philosophy. Part of that philosophy is defining things through convention. This has many advantages. One is that it’s really easy to write a program that can run in both the console and as a daemon.

The mongo server, mongod, is a perfect example of this. If you run mongod in the console, it spews all its output to stdout. This is great for development and testing. However, if you want to run mongod all the time, it’s very simple to run it from an init script.

On the windows side of things, it’s more complicated. In windows, daemons run as services, except apparently if you are using an Azure instance. Services operate separately from interactive processes. Actually, that’s not entirely true. You can have a service that interacts with the windows gui if you want to. As I said, it’s complicated.

In order for a windows program to operate as a service, you have to make certain API calls. It’s actually not that hard in its most basic form, and there are well established patterns for doing it.

Furthermore, there is actually a wrapper program in the windows resource kit, srvany.exe. It will allow you to turn almost any console program into a windows service. However, it is not an ideal solution. Luckily a programmer by the name of Alan Wright added proper windows service support to mongod. It was a well implemented service wrapper and I have made good use of it. I have also contributed some modifications to it that 10gen graciously accepted into their repo. The result is a really clean, but powerful service implementation built into mongod. I shall now demonstrate the power of this fully armed and operational death. . . I mean demonstrate the power of mongod’s windows service support.

Before we Begin

First, you are going to want to download the latest stable version of mongo. As of this writing that is mongo 1.6.3. Since mongo is evolving so rapidly, some of the more advanced features related to windows service support, not covered in this article, are only available in the unstable 1.7 series. Things move fast on the bleeding edge.

Second, you want to make sure you have mongod installed in a sane location. My definition of sane location is pretty much anyplace outside of C:\Documents and Settings or C:\Users. This also means on a hard drive permanently attached to your system. There’s nothing wrong with running mongod off an external hard drive if it’s always plugged in. Just keep in mind you won’t be able to unplug it while mongo is running, and the service will fail to start up if you boot your system without the drive plugged in.

Third, you want to be able to run mongod from a command prompt using the same switches as you wish the service to use. Please note that you are required to use logging when running mongod as a windows service. In my case I will run mongod like this:

mongod --auth --logpath c:\data\log\mongo.log --logappend --bind_ip localhost

What I am doing here is not overwriting the log every time I start mongo, and only listening for local connections. If mongod is running on the same machine as your web server, this is a good idea. I am also running with authentication.

Finally, you want to make sure your command prompt has administrative access to your computer. Mongod will not raise a UAC prompt and elevate its own privileges. However, there’s a ticket for that. So make sure you do all the following steps from a command prompt running as an administrator. Also, note that in a future article I will talk about running mongod as a service using an unprivileged user.

And now we install the service

So you’ve worked out your particular command line options, made your data and log folders, etc. Double check your mongo log to make sure there are no errors. Now we are ready to install mongod as a service. To do this we simply append --install to the command line. So our install command looks like:

C:\Program Files\Microsoft SDKs\Windows\v7.1>mongod --auth --logpath c:\data\log\mongo.log --logappend --bind_ip localhost --install
all output going to: c:\data\log\mongo.log
Creating service MongoDB.
Service creation successful.
Service can be started from the command line via 'net start "MongoDB"'.

Now let’s say you want to change the parameters. For example, you decide to run without authentication. If mongod is already installed as a service, and you want to change the command line parameters, then you have to use --reinstall instead of --install. So let’s try that now:

C:\Program Files\Microsoft SDKs\Windows\v7.1>mongod --logpath c:\data\log\mongo.log --logappend --bind_ip localhost --reinstall
all output going to: c:\data\log\mongo.log
Deleting service MongoDB.
Service deleted successfully.
Creating service MongoDB.
Service creation successful.
Service can be started from the command line via 'net start "MongoDB"'.

So as you can see, --reinstall removes the service and then installs it again. Pretty self explanatory.

Ok, and finally we want to clean up. To remove mongod as a service, we will use --remove. The output:

You will note that this command seems extra verbose. This is because messages that would normally be sent to the log are being sent to stdout.

Starting and Stopping the Service

There are several ways to start and stop a windows service. You can use the Service Control Manager or SCM of course. However, since we are on the command line already, we might as well use that. The command to start our mongo service is “net start mongodb.” Likewise, “net stop mongodb” stops our service. The service is configured to automatically start up at boot time, which is probably what you want. If not, you can tune this behavior in the SCM.

Further Directions

This is only the tip of the mongo as a windows service iceberg. More options are available. I will be discussing them in depth in future articles.

One of the advantages of being a programmer in New York City, the greatest city in the world, is there are so many companies and individuals contributing to open source. One of these companies is 10gen, the creators of MongoDB, and two individuals that work as developers for 10gen are Kristina Chodorow and Mike Dirolf. I met Kristina at her NYPHP talk where I was introduced to MongoDB. I eventually became a user of and contributor to MongoDB. As a result I met Mike, and other members of the 10gen team.

Due to this convenience of geography I was able to get my copy of MongoDB: The Definitive Guide signed by its co-authors, the aforementioned Kristina and Mike.

My autographed copy of MongoDB: The Definitive Guide

So right now you’re probably saying, “wait this will probably be the most biased review ever!” Well, I’ll seriously consider retitling this post MongoDB The Definitive Guide: The Definitive Fanboy Review. Until then, on with the review.

When my book first arrived from bn.com, my first reaction was, “this is kinda thin for an O’Reilly book.” However, 192 pages makes sense for a book about MongoDB. MongoDB is a young, small codebase. If you compare it to MySQL, or a more full featured database like Postgres, there are many full chapter topics, like triggers, that simply don’t have an equivalent in MongoDB. Finally, some topics are just plain simpler in MongoDB. For example, even though mongo has DBRefs (actually the drivers support this through convention), there are just fewer caveats about them than database joins. Therefore, this book is short for the same reason that The C Programming Language by K&R is short: there’s not a lot to talk about.

Now on to the content. Few people will read this book cover to cover, but it’s written in such a way that you can do so. However, if you want to go from mongo n00b to seasoned novice, reading through the book is the way to go. Chapter 1 is the introductory chapter that explains how Mongo is so different. Chapters 2-7 cover the topics most programmers would be interested in: basic and advanced CRUD (Create, Read, Update, and Delete) operations. Chapter 8, Administration, deals with the typical sysadmin things like starting, stopping, backing up, and monitoring mongo. Chapters 9 and 10 deal with system architect topics, sharding and replication respectively. Finally, chapter 11 gives you some example applications, and you have 3 appendices of reference information.

The book is really thorough about the material it covers. The syntax for CRUD operations is very thoroughly described, and performance implications of various operations are discussed. I noted the fact that the section on the “$or” operator did not mention it was new in 1.6. However, this book will be used long after “$or” or the 1.6 series is new, so my belief that this was an oversight will soon become obsolete.

The sharding and replication chapters cover real world implications and best practices. Of course one would expect this considering these are the two showcase features of MongoDB. Kristina and Mike do not disappoint here.

My only comment on the example chapter is that I wish the same example was used for all the languages. It would be much easier to compare and contrast features that way. Then again others might have found the repetition a little boring.

Update: Originally the Mongo Boston talk on Azure was supposed to be given by David Makogon (blog) (twitter). He talks about the session and some upcoming articles he will write in this post.

I was at the thoroughly awesome Mongo Boston conference at the Microsoft NERD Center this weekend. I had a great time at the conference, as well as the surrounding activities. Of all the talks during the conference, one stood out. It was given by Mark Eisenberg, who does sales for Microsoft Azure. That talk was on running MongoDB in Azure.

For those who do not know, Azure is Microsoft’s cloud offering. While they offer you virtual hosts, it’s not a traditional VM slice offering. You don’t get to run processes with administrative access, and you don’t get RDP access.

A few things stood out about this talk. First, it was a well executed “initial conversation” sales pitch. However, it was aimed perfectly at the audience in the room: programmers, architects, and technical decision makers. Mark knew his stuff. I asked some pretty deep technical questions, and got actual answers. It was also refreshing to hear, “it’s probably better to use Azure to write new apps than to port existing apps.” Having a salesman set expectations so frankly is unfortunately unusual in the IT industry.

The second thing that stood out was how you run a standalone exe like mongod.exe on Azure. Since you do not have administrative access to the machine, you cannot deploy via MSI, and you cannot run in the context of a windows service. The mongod process is basically running in a command prompt on a console you don’t have access to. Also, you cannot create different users to run different processes. The assumption, since this is the cloud, is that you spin up a new instance for each process. To be quite frank, I found this quite appalling at first. I spent a good chunk of my early career doing helpdesk and later system administration for a small ISP. Although titularly I was in charge of Unix and iSeries machines, I helped out on the windows side of the shop as well. Sometimes we had to run windows apps on the console of a server, requiring that a user was always logged into the console of that server. Knowing first hand the problems this caused, I declared a crusade against such programs. Also, this means all the windows improvements I contributed to mongo, windows service related improvements, served no purpose in the cloud.

Now, I’ve always been very dogmatic about my development and operations practices, so I’m still adjusting to what a more cool headed developer would accept instantly. However, my emotions will eventually come to accept what my intellect knows to be true. The cloud is here, I can get on it, or become the best buggy whip manufacturer there ever was.

The third thing that stood out had very little to do with the specifics of mongod and Azure. While it’s quite obvious Microsoft wants you to develop for Azure using .NET, they care about properly supporting all the third party technologies that Azure supports. I felt more like I was being sold to by Lou Gerstner era IBM than by modern day Ballmer-led Microsoft. Azure is being sold as a service. While Microsoft naturally wants to supplement that with the sale of their software products, they mainly want you to run your software on their platform. The threat of vendor lock-in is still there, but it always is on any platform.