Ross Lawley
http://rosslawley.co.uk/
Recent content on Ross Lawley
MongoDB POJO Support
Fri, 04 Aug 2017 10:10:30 +0100
http://rosslawley.co.uk/mongodb-pojo-support/
<p>I&rsquo;m really pleased to announce that version 3.5.0 of the <a href="http://mongodb.github.io/mongo-java-driver/3.5/">MongoDB Java Driver</a> has been released with POJO (Plain Old Java Object) support!</p>
<h2 id="codecs:ea33a3f7b0c8cdb3fb02a66bec2ef879">Codecs</h2>
<p>MongoDB uses <a href="http://bsonspec.org">BSON</a>, a binary superset of JSON, for its wire protocol and storage format. The
3.0 series of the MongoDB Java Driver introduced <code>Codecs</code>, an improved way of translating BSON into native Java objects, e.g. <code>Document</code> or <code>BsonDocument</code>.</p>
<p>Codecs are an abstraction that determines how BSON data is converted and into what type. Because each POJO requires its own <code>Codec</code> implementation to be registered in the <code>CodecRegistry</code>, writing custom POJO Codecs by hand can be quite verbose. The amount of code required to support an application with tens of POJOs was often seen as a barrier to entry.</p>
<p>However, the benefits of using Codecs for handling your POJOs are numerous. They can simplify your main application code, as POJOs can map directly to the domain, making the code easier to reason about. Another benefit is speed: Codecs can remove the need for an intermediate map-like object before hydrating your domain object. For these reasons, making the creation of Codecs from POJOs automatic has been a long-requested feature.</p>
<h2 id="pojocodecprovider:ea33a3f7b0c8cdb3fb02a66bec2ef879">PojoCodecProvider</h2>
<p>POJO support is provided by the <code>PojoCodecProvider</code>, which has a builder for configuring how and which POJOs to support. The builder allows classes and packages to be registered, and there is even a setting that makes the provider automatically try to handle any POJO it sees. The example below creates a <code>CodecRegistry</code> that will handle any POJO that meets the Java bean specification:</p>
<pre><code class="language-java">import com.mongodb.MongoClient;
import org.bson.codecs.configuration.CodecProvider;
import org.bson.codecs.configuration.CodecRegistry;
import org.bson.codecs.pojo.PojoCodecProvider;
import static org.bson.codecs.configuration.CodecRegistries.fromRegistries;
import static org.bson.codecs.configuration.CodecRegistries.fromProviders;
// Create a PojoCodecProvider that automatically handles any POJO it is asked for.
CodecProvider pojoCodecProvider = PojoCodecProvider.builder().automatic(true).build();
// Combine it with the driver's default registry, keeping the automatic provider last.
CodecRegistry pojoCodecRegistry = fromRegistries(MongoClient.getDefaultCodecRegistry(), fromProviders(pojoCodecProvider));
</code></pre>
<p><strong>Note:</strong> When using the automatic setting with <code>PojoCodecProvider</code>, always ensure that it is the last <code>CodecProvider</code> or <code>CodecRegistry</code> supplied. Otherwise it will try to handle any type it sees that has at least one serializable or deserializable property.</p>
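<p>To see why the ordering matters, here is a deliberately simplified model in plain Java. It is only a sketch: <code>Provider</code>, <code>STRING_ONLY</code>, <code>AUTOMATIC</code> and <code>lookup</code> are hypothetical names invented for illustration, not driver classes. It assumes nothing more than that providers are consulted in order and the first one to offer a codec wins.</p>

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of provider ordering; these types are hypothetical
// stand-ins, not the driver's CodecProvider or CodecRegistry classes.
class ProviderOrderSketch {

    // A provider returns a codec name, or null when it cannot handle the type.
    interface Provider {
        String codecFor(Class<?> type);
    }

    // Handles only String, like a provider written for one specific type.
    static final Provider STRING_ONLY =
            type -> type == String.class ? "StringCodec" : null;

    // Claims any type it is asked about, standing in for an automatic provider.
    static final Provider AUTOMATIC = type -> "AutomaticPojoCodec";

    // Providers are consulted in order and the first non-null answer wins.
    static String lookup(List<Provider> providers, Class<?> type) {
        for (Provider provider : providers) {
            String codec = provider.codecFor(type);
            if (codec != null) {
                return codec;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Automatic provider last: the specific provider still gets String.
        System.out.println(lookup(Arrays.asList(STRING_ONLY, AUTOMATIC), String.class));
        // Automatic provider first: it shadows the specific provider.
        System.out.println(lookup(Arrays.asList(AUTOMATIC, STRING_ONLY), String.class));
    }
}
```

<p>In this toy model the catch-all provider only behaves when it is listed last; listed first, it answers for types that a more specific provider should have handled.</p>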
<h2 id="fun-with-pojos:ea33a3f7b0c8cdb3fb02a66bec2ef879">Fun with POJOs</h2>
<p>Once you&rsquo;ve configured your <code>CodecRegistry</code> it can be used when creating a <code>MongoClient</code>, a <code>MongoDatabase</code> or a <code>MongoCollection</code>. The following example gets a list of all <code>members</code> from MongoDB:</p>
<pre><code class="language-java">MongoDatabase database = mongoClient.getDatabase(&quot;members&quot;).withCodecRegistry(pojoCodecRegistry);
MongoCollection&lt;Person&gt; collection = database.getCollection(&quot;members&quot;, Person.class);
List&lt;Person&gt; members = collection.find().into(new ArrayList&lt;Person&gt;());
</code></pre>
<p>As you can see, using POJOs with MongoDB is super simple! And using the <code>automatic</code> setting you can use <em>any</em> Java bean!</p>
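<p>The <code>Person</code> class itself isn&rsquo;t shown above. A minimal sketch of what such a bean might look like follows; the <code>firstName</code> and <code>lastName</code> properties are assumptions made for illustration, and the only conventions relied on are the usual Java bean ones: a no-argument constructor plus a getter and setter for each property.</p>

```java
// Hypothetical Person bean; the property names are assumptions made for
// illustration. Shown package-private so the sketch fits in one file; in an
// application it would normally be a public class in Person.java.
class Person {
    private String firstName;
    private String lastName;

    // A no-argument constructor lets an instance be created before its
    // properties are populated through the setters.
    public Person() {
    }

    // Convenience constructor for application code.
    public Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }
}
```

<p>Application code would then work with plain objects, for example <code>new Person("Ada", "Lovelace")</code>, rather than with map-like documents.</p>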
<h2 id="customising-pojos:ea33a3f7b0c8cdb3fb02a66bec2ef879">Customising POJOs</h2>
<p>There are two main ways to customise how POJOs are serialised / deserialised: <a href="http://mongodb.github.io/mongo-java-driver/3.5/bson/pojos/#conventions">Conventions</a> and ClassModels. The underlying abstractions used by the <code>PojoCodecProvider</code> are ClassModels and PropertyModels. As ClassModels are complex, it&rsquo;s not recommended that users build them from scratch, but rather modify them via the Conventions mechanism.</p>
<p>Conventions are a handy abstraction that take and modify the <code>ClassModelBuilder</code> before it&rsquo;s made into an immutable <code>ClassModel</code>. The default Conventions handle the <a href="http://mongodb.github.io/mongo-java-driver/3.5/bson/pojos/#annotations">default annotations</a> and have special handling for <code>id</code> properties. Writing a custom <code>Convention</code> is trivial, so supporting alternative annotations is easy. Conventions can be registered on the <code>PojoCodecProvider</code> and they will be run in the order they are supplied.</p>
<h2 id="available-now:ea33a3f7b0c8cdb3fb02a66bec2ef879">Available now</h2>
<p>POJO support is available now! It includes support for Java beans, as well as abstract classes, interfaces and nested generic types. So please try it out!</p>
<p>There&rsquo;s loads more information in the <a href="http://mongodb.github.io/mongo-java-driver/3.5/bson/pojos/">POJO documentation</a> as well as a <a href="http://mongodb.github.io/mongo-java-driver/3.5/driver/getting-started/quick-start-pojo/">quick-start guide</a> to get you started.</p>
<p>Enjoy!</p>
MongoDB Scala Driver 2.0 released
Fri, 31 Mar 2017 13:15:28 +0100
http://rosslawley.co.uk/mongodb-scala-driver-2.0/
<p>The 2.0.0 version of the official Scala Driver for MongoDB has been released!</p>
<p><img style="max-width: 100%;" src="http://rosslawley.co.uk/images/scala.jpg"></p>
<h2 id="case-class-support:f6e2133bab10204926d960626000d5d7">Case Class support</h2>
<p>I&rsquo;m really happy to announce the introduction of case class support, making it much easier to use your domain models with MongoDB. Internally, Codecs are used to convert datatypes to and from <a href="http://bsonspec.org">BSON</a>, the internal data format for MongoDB. The 2.0 release includes a Macro that can create codecs from case classes. Encoding and decoding the values of each field still uses the codec registry, so any users with custom codecs can still happily use these Macro-based codecs.</p>
<p>To show how simple it is we can use the following <code>Person</code> case class as an example:</p>
<pre><code class="language-scala">case class Person(_id: ObjectId, firstName: String, lastName: String)
</code></pre>
<p>Notice the <code>_id</code> field: this is a special field in MongoDB because it represents the primary key. It&rsquo;s advisable to include an <code>_id</code> field in your case classes, as it gives access to the primary key. When a BSON document is inserted into MongoDB without an <code>_id</code> field, one is added automatically. By adding a companion object, an <code>_id</code> can be generated automatically:</p>
<pre><code class="language-scala">object Person {
  def apply(firstName: String, lastName: String): Person =
    Person(new ObjectId(), firstName, lastName)
}
</code></pre>
<p>Creating a CodecProvider for <code>Person</code> is simple:</p>
<pre><code class="language-scala">val personCodecProvider = Macros.createCodecProvider[Person]()
</code></pre>
<p>There is also an implicit helper that creates codec providers for your case classes when you simply pass in the class, e.g.:</p>
<pre><code class="language-scala">import org.mongodb.scala.bson.codecs.Macros._
import org.mongodb.scala.bson.codecs.DEFAULT_CODEC_REGISTRY
import org.bson.codecs.configuration.CodecRegistries.{fromRegistries, fromProviders}
val codecRegistry = fromRegistries(fromProviders(classOf[Person], classOf[MyOtherCaseClass]), DEFAULT_CODEC_REGISTRY)
</code></pre>
<p>Inserting an instance into MongoDB is simple:</p>
<pre><code class="language-scala">val collection = database.getCollection[Person](&quot;People&quot;).withCodecRegistry(codecRegistry)
val person: Person = Person(&quot;Ada&quot;, &quot;Lovelace&quot;)
collection.insertOne(person).results() // results() is the custom blocking implicit used in the quick tour.
</code></pre>
<p>Querying and retrieving <code>Person</code> instances is also super simple:</p>
<pre><code class="language-scala">collection.find().printResults() // printResults is the helper also used in the quick tour.
</code></pre>
<h3 id="sealed-classes-and-adts:f6e2133bab10204926d960626000d5d7">Sealed classes and ADTs</h3>
<p>Hierarchical class structures are supported via sealed classes. Each subclass is handled specifically by the generated codec, so you only
need to create a <code>CodecProvider</code> for the parent sealed class. Internally, an extra field (<code>_t</code>) is stored alongside the data so that
the correct subclass can be hydrated when decoding the data. Below is an example of a tree-like structure containing branch and leaf nodes:</p>
<pre><code class="language-scala">sealed class Tree
case class Branch(b1: Tree, b2: Tree, value: Int) extends Tree
case class Leaf(value: Int) extends Tree
val codecRegistry = fromRegistries( fromProviders(classOf[Tree]), DEFAULT_CODEC_REGISTRY )
</code></pre>
<h2 id="breaking-changes:f6e2133bab10204926d960626000d5d7">Breaking changes</h2>
<p>The Scala driver follows semantic versioning, so the 2.0.0 release indicates there have been some API breaking changes. However, they really are minimal and shouldn&rsquo;t impact most users of the driver.</p>
<p>The implicit default type for various methods in the <code>MongoCollection</code> class in 1.0 was <code>Document</code>. This was a bug, as they should have been
the same type as the collection itself. For example, <code>MongoCollection[Person].find()</code> in 1.0 would have returned an <code>Observable[Document]</code>. This is obviously incorrect and has been fixed in 2.0 to return an <code>Observable[Person]</code>. Not many people have been impacted by this, as it was only an implicit type and could be declared explicitly: <code>MongoCollection[Person].find[Person]()</code>.</p>
<p>The other potentially breaking change is the introduction of a <code>SingleObservable</code>, which represents an <code>Observable</code> containing only a single item. The implicit <code>SingleObservable[T].toFuture()</code> method returns <code>Future[T]</code>, whereas <code>Observable[T].toFuture()</code> returns <code>Future[Seq[T]]</code>. This may catch some users out. However, most users of these single-result Observables used the <code>head()</code> method to get a single-item future, so won&rsquo;t be impacted.</p>
<h2 id="changable-executioncontexts:f6e2133bab10204926d960626000d5d7">Changeable ExecutionContexts</h2>
<p>The other main change is the introduction of the <code>Observable[T].observeOn(context: ExecutionContext)</code> implicit. This allows computation to take place on alternative ExecutionContexts, which is handy for some long-running or computationally heavy Observables.</p>
<h2 id="feedback-wanted:f6e2133bab10204926d960626000d5d7">Feedback wanted</h2>
<p>The 2.0 driver is available from: <code>&quot;org.mongodb.scala&quot; %% &quot;mongo-scala-driver&quot; % &quot;2.0.0&quot;</code>. For more information and examples see the <a href="http://mongodb.github.io/mongo-scala-driver/2.0/">driver documentation</a>.</p>
<p>We would love to have your feedback on the new driver, so please feel free to post to the <a href="https://groups.google.com/forum/#!forum/mongodb-user">MongoDB User</a> mailing list. For feature requests or bug reports please use the <a href="https://jira.mongodb.org/browse/SCALA/">Jira project</a>.</p>
<p>Enjoy!</p>
MongoDB Spark Connector releases!
Tue, 06 Sep 2016 17:10:30 +0100
http://rosslawley.co.uk/mongodb-spark-connector-1.1.0/
<p>Version 1.1.0 of the <a href="https://docs.mongodb.com/spark-connector/">MongoDB Spark connector</a> has been released, along with the
MongoDB Spark Connector 2.0.0-rc0, which brings Spark 2.0 support.</p>
<h2 id="1-1-0:989f2bc3ab8dd607253fcfdcaa7b362d">1.1.0</h2>
<p>This is the first release since 1.0.0 and contains some API improvements and updates based on feedback from users.
Many thanks to all those who have provided feedback, whether through the MongoDB User <a href="https://groups.google.com/forum/#!forum/mongodb-user">mailing list</a>,
via <a href="http://stackoverflow.com/questions/tagged/apache-spark+mongodb">StackOverflow</a> or via the <a href="https://jira.mongodb.org/browse/SPARK/">Spark Jira project</a>.</p>
<p>It&rsquo;s been thrilling to get such great feedback and find out about some of the real world scenarios the connector has been used for. One of my
favourites so far has been how <a href="https://www.mongodb.com/blog/post/mongodb-and-apache-spark-at-china-eastern-airlines">China Eastern Airlines</a>
use the connector to save time and money. But whether you&rsquo;re a big or small user of the connector, I&rsquo;d really appreciate your feedback and comments. It really is central to making this connector
even better and more accessible.</p>
<h3 id="improvements-in-1-1-0:989f2bc3ab8dd607253fcfdcaa7b362d">Improvements in 1.1.0</h3>
<ul>
<li>Saving DataFrames with an <code>_id</code> field will now update documents in place, rather than raise an error.</li>
<li>You can now use SQL to <code>INSERT INTO</code> a collection.</li>
<li>Added support for Spark MapTypes in schemas.</li>
<li>IsNotNull filter improved so that it also checks that the field exists.</li>
<li>Added helpers for defining the schemas and querying unsupported MongoDB datatypes.</li>
</ul>
<p>See the full <a href="https://github.com/mongodb/mongo-spark/blob/1.x/doc/7-Changelog.md">changelog</a> for detailed information and links to the Jira tickets.</p>
<p>The new version can be loaded into the Spark shell with:</p>
<pre><code class="language-shell">&gt; $SPARK_HOME/bin/spark-shell --packages org.mongodb.spark:mongo-spark-connector_2.10:1.1.0
</code></pre>
<h2 id="spark-2-0-support:989f2bc3ab8dd607253fcfdcaa7b362d">Spark 2.0 support</h2>
<p>The 2.0.0-rc0 connector is available from <a href="http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.mongodb.spark%22">Maven Central</a> and provides support for Spark 2.0, as well as
all the improvements from the 1.1.0 release of the connector.</p>
<p>There were a few minor API changes required to support Spark 2.0:</p>
<ul>
<li>DataFrame and Dataset are now unified: in Scala and Java, DataFrame is just a type alias for Dataset of Row.</li>
<li>SparkSession is the new entry point, replacing the old SQLContext and HiveContext for the DataFrame and Dataset APIs.</li>
</ul>
<p>The actual code changes to interact with MongoDB should be minimal and are designed to be as unobtrusive as possible.</p>
<h2 id="feedback-wanted:989f2bc3ab8dd607253fcfdcaa7b362d">Feedback wanted</h2>
<p>We would love to have your feedback on the connector, so please feel free to post to the <a href="https://groups.google.com/forum/#!forum/mongodb-user">MongoDB User</a> mailing list or
add feature requests to the <a href="https://jira.mongodb.org/browse/SPARK/">Jira project</a>.</p>
<p>Enjoy!</p>
MongoDB Spark Connector Released
Mon, 27 Jun 2016 22:10:30 +0100
http://rosslawley.co.uk/mongodb-spark-connector-released/
<p><img style="max-width: 100%;" src="http://rosslawley.co.uk/images/sparks.jpg"></p>
<p>The new MongoDB Spark connector has been released!</p>
<p>Last month I <a href="http://rosslawley.co.uk/introducing-a-new=mongodb-spark-connector/">announced</a>
that the new Spark connector for <a href="http://mongodb.org">MongoDB</a> was in beta. After some invaluable
testing by the community, I&rsquo;m excited to announce that the first official release is now available from
<a href="https://spark-packages.org/package/mongodb/mongo-spark">spark-packages</a>:</p>
<pre><code class="language-shell">&gt; $SPARK_HOME/bin/spark-shell --packages org.mongodb.spark:mongo-spark-connector_2.10:1.0.0
</code></pre>
<h2 id="a-clean-simple-connector:8a624a4e7030bfeca46590a8e6ef01b3">A clean, simple connector.</h2>
<p>At <a href="http://mongodb.org/">MongoDB</a> we&rsquo;ve been listening to your feedback about what you would like from a new MongoDB connector.
With that in mind we&rsquo;ve written a totally new idiomatic connector for Spark:</p>
<pre><code class="language-scala">import com.mongodb.spark._
import com.mongodb.spark.sql._
// Loading data is simple:
val rdd = sc.loadFromMongoDB() // Uses the SparkConf for configuration
println(rdd.count)
println(rdd.first.toJson)
// DataFrames and DataSets made simple:
// Infers the schema (samples the collection)
val df = sqlContext.loadFromMongoDB().toDF()
df.filter(df(&quot;age&quot;) &lt; 100).show() // Passes filter to MongoDB
// Schema provided via a Case Class
val dataframeExplicit = sqlContext.loadFromMongoDB().toDF[Character]()
val dataSet = sqlContext.loadFromMongoDB().toDS[Character]()
// Writing data to MongoDB is also easy:
val centenarians = sqlContext.sql(&quot;SELECT name, age FROM characters WHERE age &gt;= 100&quot;)
centenarians.write.option(&quot;collection&quot;, &quot;hundredClub&quot;).mongo()
</code></pre>
<p>More examples and full documentation can be found on the <a href="https://docs.mongodb.com/spark-connector/">documentation</a> site.</p>
<h2 id="feedback-wanted:8a624a4e7030bfeca46590a8e6ef01b3">Feedback wanted</h2>
<p>We would love to have your feedback on the new connector, so please feel free to post to the <a href="https://groups.google.com/forum/#!forum/mongodb-user">MongoDB User</a> mailing list or
add feature requests to the <a href="https://jira.mongodb.org/browse/SPARK/">Jira project</a>.</p>
<p>Enjoy!</p>
Introducing a new MongoDB Spark Connector
Wed, 18 May 2016 13:43:25 +0100
http://rosslawley.co.uk/introducing-a-new=mongodb-spark-connector/
<h2 id="update:b4450e2a5da78ad49e7021b6ad22ca92">Update!</h2>
<p>The MongoDB Spark connector has been <a href="http://rosslawley.co.uk/mongodb-spark-connector-released/">released</a>! See the official
<a href="https://docs.mongodb.com/spark-connector/">documentation</a> for more information on getting started!</p>
<hr />
<p>Following on from the <a href="https://www.mongodb.com/blog/post/mongodb-connector-for-apache-spark-announcing-early-access-program-and-new-spark-training">official announcement</a> yesterday, I&rsquo;m really excited to write a few words about the new <strong>MongoDB Spark Connector</strong>.</p>
<h2 id="getting-started:b4450e2a5da78ad49e7021b6ad22ca92">Getting started</h2>
<p>Before I go into detail about the hows and whys, first have a look at a quick usage example:</p>
<pre><code class="language-scala">import com.mongodb.spark._
import com.mongodb.spark.sql._
// Loading data is simple:
val rdd = sc.loadFromMongoDB() // Uses the SparkConf for configuration
println(rdd.count)
println(rdd.first.toJson)
// DataFrames and DataSets made simple:
// Infers the schema (samples the collection)
val df = sqlContext.loadFromMongoDB().toDF()
df.filter(df(&quot;age&quot;) &lt; 100).show() // Passes filter to MongoDB
// Schema provided via a Case Class
val dataframeExplicit = sqlContext.loadFromMongoDB().toDF[Character]()
val dataSet = sqlContext.loadFromMongoDB().toDS[Character]()
// Writing data to MongoDB is also easy:
val centenarians = sqlContext.sql(&quot;SELECT name, age FROM characters WHERE age &gt;= 100&quot;)
centenarians.write.option(&quot;collection&quot;, &quot;hundredClub&quot;).mongo()
</code></pre>
<p>The MongoDB Spark Connector supports Spark 1.6.1 and Scala 2.10 or 2.11. You can download it from Sonatype with these coordinates:</p>
<pre><code class="language-scala">&quot;org.mongodb.spark&quot; %% &quot;mongo-spark-connector&quot; % &quot;0.2&quot;
</code></pre>
<h2 id="backstory:b4450e2a5da78ad49e7021b6ad22ca92">Backstory</h2>
<p>Since January I&rsquo;ve been writing a shiny new Spark Connector, designed from the ground up. Having initially played with Spark during one of our Skunkworks days over a year ago, I knew we could make a great connector to combine these two wonderful technologies. Last summer we welcomed Marko Vojvodic to the JVM team and during his internship he worked on prototyping a connector in Java. Marko looked at some of the hard problems in writing a great connector: type coercion, data partitioning and data locality, to name a few.</p>
<p>We have a few JVM projects keeping us busy at <a href="http://www.mongodb.com">MongoDB</a>, but in January I got time to start focusing on building the Spark connector. I started with Scala and ported some of Marko&rsquo;s code, wrote new code and built a new API from the ground up.</p>
<p>In April we quietly released the first beta version and solicited feedback from a select group of MongoDB power users. Since then a number of kinks have been ironed out resulting in the 0.2 release. Now we&rsquo;re opening up the beta and asking the wider community for feedback, before we release a 1.0 version.</p>
<h2 id="why-build-a-new-mongodb-spark-connector:b4450e2a5da78ad49e7021b6ad22ca92">Why build a new MongoDB Spark Connector?</h2>
<p>I&rsquo;ve been asked this a few times; after all, the Hadoop connector supports Spark. The reason for a native connector is simple: Spark has quickly grown in popularity and use. Its growth reminds me of MongoDB&rsquo;s, and naturally users want to combine both products. So it&rsquo;s only logical to create a <em>first class experience</em> and let these users get the most out of combining both technologies.
I really hope we have gone a long way to achieving that with this new connector.</p>
<h2 id="language-support:b4450e2a5da78ad49e7021b6ad22ca92">Language support</h2>
<p>The MongoDB Spark Connector supports all the languages Spark supports: Scala, Java, Python and R, but under the hood it&rsquo;s written in Scala. This helped keep the API clean because we can make full use of Scala magic like implicits. To keep Java folk happy there&rsquo;s also a special Java API that hides some of the &ldquo;Scala-ness&rdquo;, such as strange method names, from Java users. Hat-tip to the Databricks <a href="https://github.com/databricks/scala-style-guide#java-interoperability">Java interoperability</a> guide; it&rsquo;s super helpful when considering how to consume a Scala API from Java.</p>
<h2 id="feedback-wanted:b4450e2a5da78ad49e7021b6ad22ca92">Feedback wanted!</h2>
<p>I hope that has got you interested. I would love to have your feedback on the new connector, good or bad, so please feel free to email me directly or post to the <a href="https://groups.google.com/forum/#!forum/mongodb-user">MongoDB User</a> mailing list. If you are currently using an alternative connector for MongoDB and Spark, I&rsquo;d be <strong>really interested</strong> in any feedback.</p>
<p>The connector is still in beta, so there may be changes to the API, but I&rsquo;m hoping it&rsquo;s stable now. If you do encounter any problems or find a bug please report it by opening an issue at <a href="https://jira.mongodb.org/browse/SPARK">jira.mongodb.org/browse/SPARK</a>.</p>
<p>The quickest way to get up and running with the new connector is via the <a href="https://github.com/mongodb/mongo-spark/blob/master/doc/0-introduction.md">introduction</a> on GitHub. There is also the <a href="https://university.mongodb.com/courses/M233/about">M233: Getting Started with Spark and MongoDB</a> course over at the MongoDB University.</p>
<p>Happy Big Data Computing.</p>
El Capitan and key_load_public: invalid format
Fri, 20 Nov 2015 12:06:46 +0000
http://rosslawley.co.uk/key_load_public/
<p>I upgraded to El Capitan on my Mac and it all went smoothly apart from one hitch. When pulling from or pushing to GitHub I would get the following error: <code>key_load_public: invalid format</code>. Everything worked, but opaque error messages are annoying.</p>
<p>I tried googling and all I found was this <a href="http://rcmdnk.github.io/blog/2015/10/09/computer-github-mac/">article</a>. It seemed to solve the issue but I didn&rsquo;t like the solution, mainly because it didn&rsquo;t help me understand the problem! So I persevered and <em>ignored the message</em>, until it became too annoying.</p>
<p>After more investigation it turned out it was caused by my public key, <code>~/.ssh/id_rsa.pub</code>, which had somehow been corrupted during the upgrade. The good news is you can regenerate it easily enough from the private key with <code>ssh-keygen</code>, like so:</p>
<pre><code> ssh-keygen -f ~/.ssh/id_rsa -y &gt; ~/.ssh/id_rsa.pub
</code></pre>
<p>And now I don&rsquo;t get <code>key_load_public: invalid format</code> messages anymore. Happy days.</p>
MongoDB Scala Driver Released
Tue, 20 Oct 2015 13:10:30 +0100
http://rosslawley.co.uk/mongodb-scala-driver-released/
<p><img style="max-width: 100%;" src="http://rosslawley.co.uk/images/starNebula.jpg"></p>
<p>The new Scala Driver for MongoDB has been released!</p>
<p>Last month I <a href="http://rosslawley.co.uk/introducing-mongodb-scala-driver/">announced</a>
the first release candidate of a new idiomatic Scala Driver for <a href="http://mongodb.org">MongoDB</a> and I&rsquo;m excited to announce that the first official release is now available on Sonatype for Scala 2.11:</p>
<pre><code class="language-scala"> &quot;org.mongodb.scala&quot; %% &quot;mongo-scala-driver&quot; % &quot;1.0.0&quot;
</code></pre>
<h2 id="a-clean-simple-scala-driver:0667d62b0610317a357b6826680f4d5a">A clean, simple Scala driver.</h2>
<p>At <a href="http://mongodb.org/">MongoDB</a> we&rsquo;ve been listening to your feedback about what you would like from a new Scala driver. With that in mind we&rsquo;ve written a totally new Scala driver. Here are some of the highlights:</p>
<p><img style="float:right;" src="http://mongodb.github.io/mongo-scala-driver/s/img/mongoScalaLogo.png"></p>
<ul>
<li>A modern idiomatic Scala driver with asynchronous and non-blocking IO.</li>
<li>A clean modern API following the latest MongoDB driver <a href="https://github.com/mongodb/specifications">specifications</a>.</li>
<li>A new namespace for Scala: <code>org.mongodb.scala</code>. No more confusion about which classes are required for Scala.</li>
<li>A new <a href="http://mongodb.github.io/mongo-scala-driver/1.0/reference/observables/"><code>Observable</code></a> type that is both composable and flexible enough to handle streams of data from MongoDB.</li>
<li>New immutable and mutable type safe <a href="http://mongodb.github.io/mongo-scala-driver/1.0/bson/documents/"><code>Document</code></a> classes with all the convenience of a <code>Map</code>.</li>
<li>Comprehensive <a href="http://mongodb.github.io/mongo-scala-driver/">documentation</a> site to help get you started.</li>
<li>Easy <a href="http://mongodb.github.io/mongo-scala-driver/1.0/integrations/">integration</a> with other Reactive libraries such as <a href="http://reactivex.io/rxscala/">RxScala</a> and <a href="http://www.reactive-streams.org/">Reactive Streams</a>.</li>
</ul>
<p>Below is a quick example to whet your appetite:</p>
<pre><code class="language-scala">// Connect to the users collection in mydb
val mongoClient: MongoClient = MongoClient()
val database: MongoDatabase = mongoClient.getDatabase(&quot;mydb&quot;)
val collection: MongoCollection[Document] = database.getCollection(&quot;users&quot;)
// The Document ADT enforces type safety and can implicitly box native scala types to BSON types
val martin = Document(&quot;user&quot; -&gt; &quot;Martin&quot;) // &quot;Martin&quot; becomes BsonString(&quot;Martin&quot;)
// Alternatively, create Documents from Json
val query = Document(&quot;&quot;&quot;{user: &quot;Martin&quot;}&quot;&quot;&quot;)
// Let's run a query for all Martins and print out the JSON representation of each document
collection.find(query).subscribe(
(user: Document) =&gt; println(user.toJson()), // onNext
(error: Throwable) =&gt; println(s&quot;Query failed: ${error.getMessage}&quot;), // onError
() =&gt; println(&quot;Done&quot;) // onComplete
)
// Want Futures? No problems!
val futureUsers: Future[Seq[Document]] = collection.find(query).toFuture()
val firstMartin: Future[Document] = collection.find(query).first().head()
</code></pre>
<p>More examples and full documentation can be found on the <a href="http://mongodb.github.io/mongo-scala-driver">documentation</a> hub, including a full <a href="http://mongodb.github.io/mongo-scala-driver/1.0/getting-started/">getting started</a> guide.</p>
<h2 id="feedback-wanted:0667d62b0610317a357b6826680f4d5a">Feedback wanted</h2>
<p>We would love to have your feedback on the new driver, so please feel free to post to the <a href="https://groups.google.com/forum/#!forum/mongodb-user">MongoDB User</a> mailing list or add feature requests to the <a href="https://jira.mongodb.org/browse/SCALA/">Jira project</a>. There are a number of items on the roadmap, such as MongoDB Server 3.2 and case class support, but all feature requests are welcome.</p>
<p>Enjoy!</p>
Introducing a new MongoDB Scala Driver
Wed, 23 Sep 2015 13:00:00 +0100
http://rosslawley.co.uk/introducing-mongodb-scala-driver/
<h2 id="update-now-released-hugoshortcode-1:30e1b2e2e4927e03c8ac09809618f25b">Update - <a href="http://rosslawley.co.uk/mongodb-scala-driver-released/">now released!</a></h2>
<p>I&rsquo;m really pleased to announce the first release candidate of a new MongoDB Scala Driver!</p>
<p><img style="float:right;" src="http://mongodb.github.io/mongo-scala-driver/s/img/mongoScalaLogo.png"></p>
<h2 id="insider-information:30e1b2e2e4927e03c8ac09809618f25b">Insider information</h2>
<p>At <a href="http://mongodb.org/">MongoDB</a> we&rsquo;ve been really busy. Back in April we <a href="https://www.mongodb.com/blog/post/introducing-30-java-driver">introduced</a> the 3.0 Java driver. It was a massive undertaking that included numerous improvements and updates. What got me most excited about the 3.0 release was the introduction of a new fully asynchronous, non-blocking driver. Using this asynchronous driver as a base we also released an <a href="http://mongodb.github.io/mongo-java-driver-rx">RxJava</a> driver and a <a href="http://mongodb.github.io/mongo-java-driver-reactivestreams">Reactive Streams</a> driver.</p>
<p>Today we are announcing a new MongoDB Scala Driver, which also builds upon the asynchronous driver, whilst still providing a first class Scala experience.</p>
<h2 id="scala-specifics:30e1b2e2e4927e03c8ac09809618f25b">Scala specifics</h2>
<p>This new Scala driver required much more than a simple wrapping of the Java driver. Here are some of the highlights:</p>
<ul>
<li>A modern idiomatic Scala driver with asynchronous and non-blocking IO.</li>
<li>A new <a href="http://mongodb.github.io/mongo-scala-driver/1.0/reference/observables/"><code>Observable</code></a> type that is both composable and flexible enough to handle streams of data from MongoDB.</li>
<li>New immutable and mutable type safe <a href="http://mongodb.github.io/mongo-scala-driver/1.0/bson/documents/"><code>Document</code></a> classes with all the convenience of a <code>Map</code>.</li>
<li>A clean modern API following the latest MongoDB driver <a href="https://github.com/mongodb/specifications">specifications</a>.</li>
<li>A new namespace for Scala: <code>org.mongodb.scala</code>. No more confusion about which classes are required for the Scala driver.</li>
<li>Comprehensive <a href="http://mongodb.github.io/mongo-scala-driver/">documentation</a> site to help get you started.</li>
<li>Easy <a href="http://mongodb.github.io/mongo-scala-driver/1.0/integrations/">integration</a> with other Reactive libraries such as <a href="http://reactivex.io/rxscala/">RxScala</a> and <a href="http://www.reactive-streams.org/">Reactive Streams</a>.</li>
</ul>
<p>Below is a quick example to whet your appetite:</p>
<pre><code class="language-scala">import org.mongodb.scala._
import scala.concurrent.Future

// Connect to the users collection in mydb
val mongoClient: MongoClient = MongoClient()
val database: MongoDatabase = mongoClient.getDatabase(&quot;mydb&quot;)
val collection: MongoCollection[Document] = database.getCollection(&quot;users&quot;)
// The Document ADT enforces type safety and can implicitly box native scala types to BSON types
val query = Document(&quot;user&quot; -&gt; &quot;Martin&quot;) // &quot;Martin&quot; becomes BsonString(&quot;Martin&quot;)
// Let's run a query for all Martins and print out the json representation of each document
collection.find(query).subscribe(
  (user: Document) =&gt; println(user.toJson()), // onNext
  (error: Throwable) =&gt; println(s&quot;Query failed: ${error.getMessage}&quot;), // onError
  () =&gt; println(&quot;Done&quot;) // onComplete
)
// Want Futures? No problems!
val futureUsers: Future[Seq[Document]] = collection.find(query).toFuture()
</code></pre>
<p>Available on sonatype for Scala 2.11:</p>
<pre><code class="language-scala">&quot;org.mongodb.scala&quot; %% &quot;mongo-scala-driver&quot; % &quot;1.0.0-rc0&quot;
</code></pre>
<h2 id="feedback-wanted:30e1b2e2e4927e03c8ac09809618f25b">Feedback wanted</h2>
<p>We would love to have your feedback on the new driver, so please feel free to email me directly or post to the <a href="https://groups.google.com/forum/#!forum/mongodb-user">MongoDB User</a> mailing list.</p>
<p>The best place to get up and running with the new driver is the <a href="http://mongodb.github.io/mongo-scala-driver/1.0/getting-started/">getting started</a> guide.</p>
How to: Handle multiple Scala versionshttp://rosslawley.co.uk/how-to-handle-multiple-scala-versions/
Wed, 23 Apr 2014 00:00:00 +0000http://rosslawley.co.uk/how-to-handle-multiple-scala-versions/
<p>I recently upgraded <a href="http://mongodb.github.io/casbah/">Casbah</a> to support the latest Scala 2.11 release and, for the first time when supporting multiple Scala versions, I hit a stumbling block. If you&rsquo;re writing a library that wants to support multiple versions of Scala in a single code base, it&rsquo;s generally easy, isn&rsquo;t it? Thankfully it is, as <a href="http://www.scala-sbt.org/">sbt</a> can do the heavy lifting for you.</p>
<h2 id="three-steps-to-success:7a5c4c34d420fcc6185eaa05487dbc11">Three steps to success</h2>
<p>The sbt documentation covers the basics nicely in their <a href="http://www.scala-sbt.org/release/docs/Detailed-Topics/Cross-Build.html">cross build</a> documentation. But what&rsquo;s the path to success?</p>
<ol>
<li><p>Choose the Scala versions you are going to support.</p>
<p>Set the <code>crossScalaVersions</code> setting in your build file to define the Scala versions you want to support. In Casbah we support: <code>crossScalaVersions := Seq(&quot;2.11.0&quot;, &quot;2.10.4&quot;, &quot;2.9.3&quot;)</code></p></li>
<li><p>Configure libraries</p>
<p>Here it can get hairy because third party libraries may not support all your favoured scala builds in their latest release. If they do, then happy days but if not you may have to pick out each library version as needed. Here is an example from the Casbah build:</p>
<pre><code>// In the build settings append to the library dependencies
libraryDependencies &lt;++= scalaVersion (sv =&gt; Seq(specs2(sv), scalatime(sv)))
// Helper method to pattern match against the scala version and return the correct specs version
def specs2(scalaVersion: String) =
  (scalaVersion match {
    case &quot;2.9.3&quot; =&gt; &quot;org.specs2&quot; %% &quot;specs2&quot; % &quot;1.12.4.1&quot;
    case _ =&gt; &quot;org.specs2&quot; %% &quot;specs2&quot; % &quot;2.3.11&quot;
  }) % &quot;test&quot;
</code></pre>
<p>As the latest Specs build <code>2.3.11</code> doesn&rsquo;t support Scala 2.9.3 we have to use the older <code>1.12.4.1</code> version for 2.9.3. The downside is we can&rsquo;t yet use some of the nicer newer features of the Specs library in our test suite.</p></li>
<li><p>Scala version specific code.</p>
<p>This is the real challenge: what do you do if you need specific code for a specific version of Scala? Hopefully you&rsquo;ll never need to, but this isn&rsquo;t always the case, as I found with Casbah. We use the <code>BeanInfo</code> annotation, which lives in <code>scala.reflect</code> in Scala 2.9.3 and in <code>scala.beans</code> in 2.11.</p>
<p>So how can we get round this? Luckily, Scala 2.10 gave me a hint, as <code>BeanInfo</code> had been deprecated in <code>scala.reflect</code> but they had mirrored it in a <a href="https://github.com/scala/scala/blob/v2.10.4/src/library/scala/reflect/package.scala#L55-L56">package object</a>. I could use the same trick in Casbah! However, I would need a version for Scala 2.9.3 pointing to <code>scala.reflect.BeanInfo</code> and a version for 2.10 &amp; 2.11, and would then have to update all the code to point to the Scala-specific package object.</p>
<p>As the code would be version specific it couldn&rsquo;t live in the <code>src/main/scala</code> directory as it wouldn&rsquo;t compile. So instead I created specific Scala directories in <code>src/main</code> like so:</p>
<pre><code>./scala
./scala_2.9.3 // Scala 2.9.3 specific code
./scala_2.10 // Scala 2.10 specific code
./scala_2.11 // Scala 2.11 specific code
</code></pre>
<p>As Scala 2.10 &amp; 2.11 point versions will be binary compatible I only need a top level directory for them. Then all that&rsquo;s needed is to add these source directories to the build:</p>
<pre><code>unmanagedSourceDirectories in Compile &lt;+= (sourceDirectory in Compile, scalaBinaryVersion){
(s, v) =&gt; s / (&quot;scala_&quot;+v)
}
</code></pre>
<p>This adds extra source directories to the compile step and using <code>scalaBinaryVersion</code> I get the binary compatible versions (so it&rsquo;s 2.11 for all 2.11 releases but would be 2.9.3, 2.9.2, 2.9.1 for the non compatible point releases from the 2.9 series).</p>
<p>The fix in Casbah was to simply add a <code>beans.scala</code> for the scala specific versions and create a <code>BeanInfo</code> type which points to the correct scala package. Problem solved.</p></li>
</ol>
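<p>To make the directory selection in step 3 concrete, here is a minimal Python sketch of the mapping described above. It is not sbt&rsquo;s implementation: the function names are hypothetical, and it assumes final releases only, with 2.10 and later treated as binary compatible across point releases.</p>

```python
def scala_binary_version(full_version: str) -> str:
    """Approximate the behaviour described in the post (assumption, not sbt's code).

    2.10+ point releases share one binary version ("2.11.0" -> "2.11");
    older releases are treated as incompatible, so the full version is kept.
    """
    major, minor, *_ = full_version.split(".")
    if (int(major), int(minor)) >= (2, 10):
        return f"{major}.{minor}"
    return full_version


def version_specific_dir(source_dir: str, full_version: str) -> str:
    """Build the extra source directory name, e.g. src/main/scala_2.11."""
    return f"{source_dir}/scala_{scala_binary_version(full_version)}"


if __name__ == "__main__":
    # The versions listed in crossScalaVersions above
    for version in ["2.11.0", "2.10.4", "2.9.3"]:
        print(version, "->", version_specific_dir("src/main", version))
```

<p>Each cross-built version therefore picks up exactly one of the version-specific directories alongside the shared <code>./scala</code> sources.</p>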
<p>sbt allows you to run any command against multiple versions of Scala. <code>./sbt +test</code> will test against all your <code>crossScalaVersions</code> of Scala and hopefully confirm the code works as expected. To test a specific version, use a double plus sign followed by the version string, eg: <code>./sbt ++2.9.3 test</code>.</p>
<h2 id="final-thoughts:7a5c4c34d420fcc6185eaa05487dbc11">Final thoughts</h2>
<p>Scala is a relatively fast moving ecosystem with new major releases every 13 months or so. When I took over Casbah it supported 6 binary incompatible versions of Scala, from 2.8.1 to 2.9.3! Happily, Scala patch versions from 2.10.0 on have become binary compatible, making it easier to support multiple versions of Scala. However, one of the key issues in maintaining a library is knowing when to end-of-life an old Scala version, so as to keep the library fresh.</p>
<p>In Casbah supporting the current and last major versions of Scala gives users support for the latest MongoDB version and allows them some flexibility when it comes to updating Scala. Generally, this has been a trouble free process and as shown above problems are solvable in just three easy steps.</p>
RxJava - understandably reactivehttp://rosslawley.co.uk/rx-java/
Wed, 06 Nov 2013 00:00:00 +0000http://rosslawley.co.uk/rx-java/<p>Reactive programming is hot stuff at the moment and the
<a href="https://www.coursera.org/course/reactive">Coursera Principles of Reactive Programming</a>
course has <strong>just</strong> started (it&rsquo;s not too late to enroll).</p>
<p>Recently, I&rsquo;ve been hearing good things about <a href="https://github.com/Netflix/RxJava">RxJava</a>
(a port of .Net&rsquo;s <a href="http://msdn.microsoft.com/en-gb/data/gg577609.aspx">Reactive extensions</a>)
so I wanted to learn some more. Then I stumbled upon a video from a recent
<a href="http://www.meetup.com/SF-Scala/">SF Scala</a> meetup
which covered what it is, how they implemented the core and how they then added
support for other JVM languages.</p>
<p>Two things immediately struck me:</p>
<ol>
<li><p>Observables are not opinionated about how the backend works. It could be
concurrent or swapped out with a thread pool, an actor
or an nio event &amp; event loop&hellip; Pretty cool, this means there is a single way
of handling the code no matter if the backend is synchronous or asynchronous.</p></li>
<li><p>The methods for manipulating multiple observers, by chaining or nesting,
are extremely powerful, yet easy to understand.</p></li>
</ol>
<p>I&rsquo;m really looking forward to using it. To me it&rsquo;s an easier abstraction to
understand for handling streams of data and seemingly less complex than using Play&rsquo;s
excellent <code>Iteratee.Concurrent</code> library.</p>
<p>Enjoy:</p>
<p><div class="embed video-player">
<iframe class="youtube-player" type="text/html"
width="640" height="385"
src="http://www.youtube.com/embed/tOMK_FYJREw"
allowfullscreen frameborder="0">
</iframe>
</div></p>
Typesafe's Activatorhttp://rosslawley.co.uk/typesafe-activator/
Tue, 24 Sep 2013 00:00:00 +0000http://rosslawley.co.uk/typesafe-activator/
<p>Yesterday the <a href="http://typesafe.com/blog/announcing-activator-10-create-reactive-apps-in-minutes">Typesafe Activator</a>
hit 1.0. If you haven&rsquo;t heard about it and use Scala or the JVM then take five
minutes and check it out - it&rsquo;s worth it.</p>
<h1 id="what-is-it:8f899c775a0faeb02adf96bdfce96cff">What is it?</h1>
<p>Is my recommendation alone not enough, and you want to know more about it before
installing it? No problem, let me convince you.</p>
<blockquote>
<p>Typesafe Activator is a local web &amp; command-line tool that helps developers
get started with the Typesafe Platform.</p>
</blockquote>
<p>What does that give you? Basically, a really nice UI in the browser for creating web
applications from templates. The <a href="http://typesafe.com/activator/templates">templates</a>
range from &ldquo;hello world&rdquo; in Scala to using <a href="http://akka.io">Akka</a>, <a href="http://playframework.com">
Play</a> and <a href="http://www.scala-lang.org/">Scala</a> to create
modern scalable reactive web sites.</p>
<p>It&rsquo;s quick and easy to get going - I chose the
<a href="http://typesafe.com/activator/template/play-mongo-knockout">Play Reactive Mongo and knockout.js</a>
template.</p>
<p class="text-center">
<img src="http://rosslawley.co.uk/images/activator-loading.png">
</p>
<p>In the background it downloads all the resources you need to create the project
from the template. Starting a new project is extremely simple and once the
project is loaded you get to use the Web UI in anger.</p>
<p class="text-center">
<img src="http://rosslawley.co.uk/images/activator-orientation.png">
</p>
<h1 id="the-tutorial:8f899c775a0faeb02adf96bdfce96cff">The tutorial</h1>
<p>Once loaded, you get the tutorial for this template:</p>
<blockquote>
<p>The world is going reactive</p>
<p>Not long ago, response times in the seconds were considered appropriate. Browser
refreshes were the norm in web applications. Systems would go down for hours of
maintenance, or even be rebooted nightly, and this was ok because people only
expected the systems to be up during business hours..</p>
</blockquote>
<p><em>Exciting stuff!</em> So not only can you code, compile and test your application
all in the browser - there&rsquo;s a tutorial to guide you through the various new
concepts as well.</p>
<p>The Reactive Mongo tutorial gets you up and running creating a reactive
Play web application with a rich front end and scalable backend. It uses
<a href="http://reactivemongo.org">Reactive Mongo</a>, an asynchronous non-blocking Scala
MongoDB driver, for the database; the Play framework for the web server; and
<a href="http://coffeescript.org">coffeescript</a> and <a href="http://knockoutjs.com">knockout.js</a>
for the front end.</p>
<p>The tutorial walks you through how the various parts of the app work together and
links through to the code. Next it sets tasks to update parts of the app, extending
it and adding functionality.</p>
<p>This quickly gets you up to speed, so if you are interested in any of the
Typesafe stack then download Activator now and you&rsquo;ll be up and running in minutes!</p>
Switching to Sublime Text 3http://rosslawley.co.uk/switching-to-sublimetext-3/
Wed, 31 Jul 2013 00:00:00 +0000http://rosslawley.co.uk/switching-to-sublimetext-3/
<p>For a long while I&rsquo;ve been a huge fan of <a href="http://www.sublimetext.com/">Sublime Text 2</a>.
It&rsquo;s easy to use and has a wealth of features that can be installed by the
awesome <a href="http://wbond.net/sublime_packages/package_control">package control</a>
system. I use it for all Python development and any ad hoc file work; in fact I
only abandon it for my Scala work - where IntelliJ wins.</p>
<p>Sublime Text 3 (ST3) has been in beta for six months, and over that time its
plugin ecosystem has matured. After experiencing some slow downs in
Sublime Text 2 (ST2) I thought it was time to upgrade to an even more
sublime experience.</p>
<h2 id="5-step-migration-to-sublime-text-3:1a9591199d94248b300fa0f7455d5c0a">5 step migration to Sublime Text 3</h2>
<p>First off there is <strong>no risk</strong> in trying to upgrade: it has a different app
name to ST2 and its configurations are in a different directory. In short you
can have both installed and living happily.</p>
<h3 id="1-can-i-upgrade:1a9591199d94248b300fa0f7455d5c0a">1. Can I upgrade?</h3>
<p>Go to the awesome <a href="http://www.caniswitchtosublimetext3.com/">Can I Switch To Sublime Text 3?</a>
website and check your packages. It&rsquo;s probably not a goer if your most used
plugin doesn&rsquo;t work in ST3, but do check out alternatives as many packages
have already been updated and do now work with ST3. My results were as
follows:</p>
<p class="text-center">
<img src="http://rosslawley.co.uk/images/ready_for_sublime.png">
</p>
<p>I had one fail but when looking at the package repository it suggested an
ST3 alternative for a Python code intel tool that would work (SublimeJEDI) -
so I was good to go.</p>
<h3 id="2-install-package-control-and-packages:1a9591199d94248b300fa0f7455d5c0a">2. Install package control and packages</h3>
<p>It&rsquo;s easy to install package control: just check out the
<a href="http://wbond.net/sublime_packages/package_control/installation#ST3">instructions</a>.
Once installed, open package control and install all the green packages
mentioned on the Can I Switch website.</p>
<p>Don&rsquo;t be tempted to use package control for the yellow packages - as it will
install the ST2 versions which won&rsquo;t work. Instead you will need to
install them manually - this is usually simple and only involves cloning the
plugin repository into your packages directory and then checking out an
ST3 compatible branch. It&rsquo;s no real hardship - so go for it.</p>
<h3 id="3-migrate-settings:1a9591199d94248b300fa0f7455d5c0a">3. Migrate settings</h3>
<p>Once all the packages are installed, it&rsquo;s time to get ST3 as comfy as you had
ST2. To do this, copy and paste your user settings from:
<code>Preferences -&gt; Settings -&gt; User</code>.</p>
<p>Then copy any language specific settings and snippets you have from the User
directory of ST2 into the User directory of ST3 eg:</p>
<pre><code>cp ~/Library/Application\ Support/Sublime\ Text\ 2/Packages/User/Python.sublime-settings ~/Library/Application\ Support/Sublime\ Text\ 3/Packages/User/Python.sublime-settings
</code></pre>
<h3 id="4-update-the-subl-helper:1a9591199d94248b300fa0f7455d5c0a">4. Update the subl helper:</h3>
<p>Remove / rename any symlinked helpers then:</p>
<pre><code>sudo ln -s &quot;/Applications/Sublime Text.app/Contents/SharedSupport/bin/subl&quot; /usr/bin/subl
</code></pre>
<h3 id="5-profit:1a9591199d94248b300fa0f7455d5c0a">5. Profit!</h3>
<p>Easy eh? You now have ST3 configured as you are used to and if anything
is buggy (<em>it is still in beta</em>) you can easily switch back to ST2.</p>
<p>My initial impressions are that it&rsquo;s <strong><em>much faster</em></strong> and I haven&rsquo;t hit any bugs yet,
so I&rsquo;m a happy bunny. But if you need more convincing - here are some extra reasons
to make the switch:</p>
<ul>
<li><p>Speed.
Startup time, file load time, and Replace All have all been
significantly improved.</p></li>
<li><p>Improved Project and Pane management.
Working with multiple panes is now more efficient, with commands to
create and destroy panes, and quickly move files between panes. Also
you can now have multiple workspaces per project.</p></li>
<li><p>Better Symbol Indexing.
Allows goto definitions and goto symbol in a project, as well as new
jump forward and jump back features.</p></li>
</ul>
<p>Need some more reasons? Check out the original
<a href="http://www.sublimetext.com/blog/articles/sublime-text-3-beta">Sublime Text 3 announcement</a>.</p>
<p>So if you are on Sublime Text 2 then download Sublime Text 3 and see if
you can make the switch today.</p>
Using MongoDB to find the most popular pub names!http://rosslawley.co.uk/the-most-popular-pub-name/
Tue, 23 Jul 2013 00:00:00 +0000http://rosslawley.co.uk/the-most-popular-pub-name/
<p>Earlier in the year I gave a talk at <a href="http://www.10gen.com/events/mongodb-london-2013">MongoDB London</a>
about the different aggregation options with MongoDB. The topic recently came up again
in conversation at a user group, so I thought it deserved a blog post.</p>
<h2 id="gathering-ideas-for-the-talk:e8cc9569ac7e5d9e066aee5378affb29">Gathering ideas for the talk</h2>
<p>I wanted to give a more interesting aggregation talk than the standard
&ldquo;counting words in text&rdquo;, and as the <a href="http://docs.mongodb.org/manual/core/aggregation/">aggregation framework</a> gained shiny <a href="http://docs.mongodb.org/manual/core/2dsphere/">2dsphere geo</a> support in
<a href="http://docs.mongodb.org/manual/release-notes/2.4/#new-geospatial-indexes-with-geojson-and-improved-spherical-geometry">2.4</a>, I figured I&rsquo;d use that. I just needed a topic&hellip;</p>
<h3 id="what-are-us-brits-focused-on:e8cc9569ac7e5d9e066aee5378affb29">What are us Brits focused on?</h3>
<p>Two things immediately sprang to mind: <strong>weather</strong> and <strong>beer</strong>.</p>
<p>I opted to focus on something close to my heart: <strong>beer</strong> :)
But what to aggregate about beer? Then I remembered an old pub quiz favourite&hellip;</p>
<p><strong>What is the most popular pub name in the UK?</strong></p>
<p>I knew there was some great open data, including a wealth of information
on pubs, available from the awesome <a href="http://www.openstreetmap.org/">OpenStreetMap</a>
project. I just needed to get at it, and happily the <a href="http://www.overpass-api.de">Overpass-api</a>
provides a simple &ldquo;xapi&rdquo; interface for OSM data.
All I needed was anything tagged with <code>amenity=pub</code> within the
bounds of the UK, and with the xapi interface this is as simple as a wget:
<code>http://www.overpass-api.de/api/xapi?*[amenity=pub][bbox=-10.5,49.78,1.78,59]</code></p>
<p>Once I had an OSM file I used the <a href="http://imposm.org/">imposm Python library</a> to
parse the XML and then convert it to the following GeoJSON format:</p>
<pre>
<code class="javascript">{
  "_id" : 451152,
  "amenity" : "pub",
  "name" : "The Dignity",
  "addr:housenumber" : "363",
  "addr:street" : "Regents Park Road",
  "addr:city" : "London",
  "addr:postcode" : "N3 1DH",
  "toilets" : "yes",
  "toilets:access" : "customers",
  "location" : {
    "type" : "Point",
    "coordinates" : [-0.1945732, 51.6008172]
  }
}
</code></pre>
<p>Then it was a case of simply inserting it as a document into MongoDB. I
quickly noticed that the data needed a little cleaning, as I was seeing duplicate
pub names, for example: &ldquo;The Red Lion&rdquo; and &ldquo;Red Lion&rdquo;. Because I wanted to make
a wordle I normalised all the pub names.</p>
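<p>As a rough illustration of that clean-up step, here is a minimal Python sketch. The function name and the exact rule (always prefix with &ldquo;The&rdquo; and capitalise the first letter of each word) are assumptions for illustration only; the cleaning actually used lives in the linked loading script and may differ. Note that forcing a &ldquo;The&rdquo; prefix will not suit every pub name.</p>

```python
def normalise_pub_name(name: str) -> str:
    """Hypothetical normaliser: fold "Red Lion" and "The Red Lion" together.

    Assumption: every name is given a leading "The" and each word has its
    first letter upper-cased; the rest of each word is left untouched.
    """
    words = name.strip().split()
    # Drop an existing leading "the" so it is not doubled up
    if words and words[0].lower() == "the":
        words = words[1:]
    if not words:
        return ""
    return "The " + " ".join(word[:1].upper() + word[1:] for word in words)


if __name__ == "__main__":
    for raw in ["The Red Lion", "Red Lion", "  the red lion "]:
        print(repr(raw), "->", normalise_pub_name(raw))
```

<p>With a rule like this, both spellings end up in the same group when the names are counted.</p>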
<p>If you want to know more about the importing process, the full loading code is
available on github: <a href="https://github.com/rozza/pubnames/blob/master/osm2mongo.py">osm2mongo.py</a></p>
<h2 id="top-pub-names:e8cc9569ac7e5d9e066aee5378affb29">Top pub names</h2>
<p>It turns out finding the most popular pub names is very simple with the
aggregation framework. Just group by the name and then sum up all the occurrences.
To get the top five most popular pub names we sort by the summed value and then
limit to 5:</p>
<div class="row">
<div class="col-md-6 col-lg-6">
<pre>
<code class="javascript">db.pubs.aggregate([
{"$group":
{"_id": "$name",
"value": {"$sum": 1}
}
},
{"$sort": {"value": -1}},
{"$limit": 5}
]);
</code></pre>
</div>
<div class="col-md-6 col-lg-6">
For the whole of the UK this returns:
<ol>
<li>The Red Lion</li>
<li>The Royal Oak</li>
<li>The Crown</li>
<li>The White Hart</li>
<li>The White Horse</li>
</ol>
</div>
</div>
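<p>If the pipeline stages are unfamiliar, the same group, sort and limit logic can be sketched in plain Python. This is only an in-memory illustration of what the aggregation computes, not how MongoDB executes it; the function name and sample documents are made up, and unlike <code>$group</code> this sketch simply skips documents that have no name.</p>

```python
from collections import Counter


def top_pub_names(pubs, limit=5):
    """Count documents per name, then return the `limit` most common names."""
    # Equivalent of {"$group": {"_id": "$name", "value": {"$sum": 1}}}
    counts = Counter(pub["name"] for pub in pubs if "name" in pub)
    # Equivalent of {"$sort": {"value": -1}} followed by {"$limit": limit}
    return counts.most_common(limit)


if __name__ == "__main__":
    # Hypothetical sample data, not the real OSM extract
    sample = (
        [{"name": "The Red Lion"}] * 3
        + [{"name": "The Crown"}] * 2
        + [{"name": "The Royal Oak"}]
    )
    for name, value in top_pub_names(sample, limit=2):
        print(name, value)
```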
<p class="text-center">
<img src="http://rosslawley.co.uk/images/pubs_wordle.png">
</p>
<h2 id="top-pub-names-near-you:e8cc9569ac7e5d9e066aee5378affb29">Top pub names near you</h2>
<p>At MongoDB London I thought that was too easy, so I filtered to find the top pub
names near the conference, showing off some of the geo functionality that became
available in MongoDB 2.4. To limit the result set, add a <code>$match</code> stage that ensures the
location is within a 2 mile radius by using <code>$centerSphere</code>. Just provide the
coordinates <code>[ &lt;long&gt;, &lt;lat&gt; ]</code> and the radius in radians
(3959 miles is approximately the radius of the earth, so divide 2 by 3959):</p>
<pre>
<code class="javascript">db.pubs.aggregate([
  {"$match": {"location":
    {"$within":
      {"$centerSphere": [[-0.12, 51.516], 2 / 3959]}}}
  },
  {"$group":
    {"_id": "$name",
     "value": {"$sum": 1}}
  },
  {"$sort": {"value": -1}},
  {"$limit": 5}
]);
</code></pre>
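<p>The radius arithmetic is easy to get backwards, so here is a small Python sketch of it. It assumes the same approximate earth radius of 3959 miles and uses the haversine formula to check whether a point falls inside the circle; the function names are made up for illustration and this is not how MongoDB evaluates the query internally.</p>

```python
import math

EARTH_RADIUS_MILES = 3959.0  # approximate, as used in the query above


def miles_to_radians(miles: float) -> float:
    """Convert a surface distance in miles to the radians a spherical query expects."""
    return miles / EARTH_RADIUS_MILES


def distance_miles(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance between two [long, lat] points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(centre, point, miles: float) -> bool:
    """True if `point` lies within `miles` of `centre`; both are (long, lat) pairs."""
    return distance_miles(*centre, *point) <= miles


if __name__ == "__main__":
    conference = (-0.12, 51.516)            # centre used in the query above
    the_dignity = (-0.1945732, 51.6008172)  # sample document shown earlier
    print("radius in radians:", miles_to_radians(2))
    print("The Dignity within 2 miles:", within_radius(conference, the_dignity, 2))
```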
<h2 id="what-about-where-i-live:e8cc9569ac7e5d9e066aee5378affb29">What about where I live?</h2>
<p>At the conference I looked at the most popular pub name near the conference. That&rsquo;s
great if you happen to live in the centre of London but what about everyone else
in the UK? So for this blog post I decided to update the demo code and make it
dynamic based on where you live.</p>
<p>See: <a href="http://pubnames.rosslawley.co.uk">pubnames.rosslawley.co.uk</a> <br>(It may take
some time for the heroku dyno to wake, so please be patient!)</p>
<p>Apologies to those outside the UK - the demo app doesn&rsquo;t have data for the
whole world - it&rsquo;s surely possible to do, I just lacked the patience to download
it all!</p>
<h2 id="cheers:e8cc9569ac7e5d9e066aee5378affb29">Cheers</h2>
<p class="text-center">
<img src="http://rosslawley.co.uk/images/almurray.jpg" title="Original image: http://www.flickr.com/photos/bradfordtheatres/3063899946" /> <br>
<small> Source: <a href="http://www.flickr.com/photos/bradfordtheatres/3063899946">http://www.flickr.com/photos/bradfordtheatres/3063899946</a></small>
</p>
<p>All the code is available in my repo on <a href="https://github.com/rozza/pubnames">github</a>
including the bson file of the pubs and the wordle code - so fork it and start
playing with MongoDB&rsquo;s great geo features!</p>
mongoengine 0.8.0http://rosslawley.co.uk/mongoengine-0-dot-8-dot-0/
Mon, 20 May 2013 00:00:00 +0000http://rosslawley.co.uk/mongoengine-0-dot-8-dot-0/
<p>I&rsquo;m really pleased to announce the release of
<a href="https://pypi.python.org/pypi/mongoengine/0.8.0">MongoEngine 0.8.0</a>.
It&rsquo;s been a <em>long process</em> due to work and life commitments but the latest version
of MongoEngine is here.</p>
<h2 id="whats-changed:21b8a771bb0f6c9b7111f39346dd1702">What&rsquo;s changed?</h2>
<p>There have been loads of fixes, improvements and changes in 0.8.
The headliners are:</p>
<ul>
<li>Minimum requirements are python 2.6+ and pymongo 2.5+</li>
<li>Inheritance now off by default</li>
<li>Inherited documents store <code>_cls</code> not <code>_types</code> (preventing auto creation of
multikey indexes)</li>
<li>Querysets are immutable and now return clones</li>
<li>New Geo Fields supporting MongoDB 2.4&rsquo;s new &ldquo;2dSphere&rdquo; indexes</li>
<li>New context managers for switching collections or databases on the fly as
well as turning off dereferencing</li>
<li>Django support improved (now supports Django 1.5.1, groups and permissions)</li>
<li>Performance improvements, back to the same performance as 0.4</li>
</ul>
<p>And much, much more!</p>
<p>See the full <a href="http://docs.mongoengine.org/en/latest/changelog.html#changes-in-0-8-0">changelog</a>
for all changes or check <a href="https://github.com/MongoEngine/mongoengine/">github</a> to see the
<a href="https://github.com/MongoEngine/mongoengine/issues?milestone=2&amp;page=1&amp;state=closed">96 tickets</a>
that have been fixed as part of this release.</p>
<h2 id="whats-next:21b8a771bb0f6c9b7111f39346dd1702">What&rsquo;s next?</h2>
<p>There are a number of ideas on <a href="https://github.com/MongoEngine/mongoengine/issues?milestone=6">github</a> for 0.9. I&rsquo;m looking for people to help with the administration and coding
of MongoEngine, so please get involved and help drive MongoEngine forward!</p>
MongoEngine 0.8 RC1 Releasedhttp://rosslawley.co.uk/mongoengine-0-dot-8-rc1-released/
Wed, 01 May 2013 00:00:00 +0000http://rosslawley.co.uk/mongoengine-0-dot-8-rc1-released/
<p><strong>Notice:</strong> 0.8.0RC3 has been released fixing a couple of small bugs and adding extra improvements.</p>
<p><strong>Notice:</strong> 0.8.0RC2 has been released fixing an obscure queryset cursor cloning issue.</p>
<p>I&rsquo;m really pleased to announce the release candidate for <a href="https://pypi.python.org/pypi/mongoengine/0.8.0RC1">MongoEngine 0.8</a>!
It&rsquo;s been a <em>long process</em> due to work and life commitments but the latest version
of MongoEngine is ready for testing and feedback.</p>
<h2 id="why-a-release-candidate:381e651998269fc0201485acd98bed22">Why a release candidate?</h2>
<p>There have been massive changes in the internals, requiring thought and
<strong>testing</strong> before upgrading and releasing to the wild - so please
read the <a href="http://docs.mongoengine.org/en/latest/upgrade.html#to-0-8">upgrade</a>
docs carefully!</p>
<p>The changes are worth it and make using <a href="http://mongoengine.org">MongoEngine</a>
even better.</p>
<p>Please test 0.8.0RC1 on your test and staging systems and email any feedback
to the <a href="https://groups.google.com/group/mongoengine-users">user group</a>
or, if you&rsquo;d prefer to message me directly, you can via
<a href="https://github.com/rozza">https://github.com/rozza</a></p>
<h2 id="whats-changed:381e651998269fc0201485acd98bed22">What&rsquo;s changed?</h2>
<p>There have been loads of fixes, improvements and changes in 0.8.
The headliners are:</p>
<ul>
<li>Minimum requirements are python 2.6+ and pymongo 2.5+</li>
<li>Inheritance now off by default</li>
<li>Inherited documents store <code>_cls</code> not <code>_types</code> (preventing auto creation of
multikey indexes)</li>
<li>Querysets are immutable and now return clones</li>
<li>New Geo Fields supporting MongoDB 2.4&rsquo;s new &ldquo;2dSphere&rdquo; indexes</li>
<li>New context managers for switching collections or databases on the fly as
well as turning off dereferencing</li>
<li>Django support improved (now supports Django 1.5.1, groups and permissions)</li>
<li>Performance improvements, back to the same performance as 0.4</li>
</ul>
<p>And much, much more! See the full <a href="http://docs.mongoengine.org/en/latest/changelog.html#changes-in-0-8-0">changelog</a>
for all changes or check <a href="https://github.com/MongoEngine/mongoengine/">github</a> to see the
<a href="https://github.com/MongoEngine/mongoengine/issues?milestone=2&amp;page=1&amp;state=closed">84 tickets</a>
that have been fixed as part of this release.</p>
<h2 id="try-it-now:381e651998269fc0201485acd98bed22">Try it now!</h2>
<p>Please test it, try it out and report any issues back!
All being well 0.8.0RC1 will be released as 0.8.0 in a week.</p>
<p><small>* Want to get involved in MongoEngine? We&rsquo;re looking for help so please ping me!</small></p>