http://blog.paulbird.co/ (Ghost 0.5, Tue, 14 Aug 2018 21:49:23 GMT)

I've never really tried to build a proper database. My biggest venture into it was in college, nearly three years ago now. It was a good thing to learn, but there is one thing I will always hate: normalization.

For me it was a rollercoaster ride of frustration. I could code! I'd started in secondary school at the age of 14, so surely I could normalize a database!

It could not have been further from the truth. I would constantly arrive at what I thought was the correct answer, only to find I was wrong, and I didn't even understand why! I sometimes think I only made it through that module because my peers helped me out a lot. In turn I helped them with the code (it wasn't all one-sided).

We never actually implemented a database in MySQL, but it is the one I've had the most contact with, either through custom PHP applications I've written or just from working with WordPress.

Introducing MongoDB

When I first saw MongoDB I thought it was awesome. I work with JavaScript quite often, so storing JSON-like documents was easily the next logical step. I could dump data into a document and then, like magic, access and query it with the simplest of APIs.

I fiddled and played with it over time, and because I knew it and was super comfortable with it, I decided it would be great to use with Braid for storing everything. Now everything I am going to be writing is either JavaScript or very JavaScript-esque. Brilliant!

Planning my database

I knew from my college days that I should plan my database. Luckily I remember enough to know that I need to eliminate as many many-to-many relationships as possible, and that I should try to encapsulate and normalize my database. I decided to plan it on paper first.

I struggled. I couldn't wrap my head round it (it's college all over again!). After a while I got some fresh eyes on it, both my own and those of a person who is much better at this stuff than me.

I'd come up with a diagram of how it should all work together, mapped out based on what I learned in college. Most of the relationships were one-to-many, which is fine. However, I ran into a problem when I had a many-to-many relationship between my 'Threads' and 'Modifiers'.

Here is how it ended up looking. You can see that they all roughly had one-to-many relationships, with no many-to-many relationships left.

After talking it through, I learned it was quite simple: each of the elements of the database should be its own collection, which is the equivalent of each having its own table in a more traditional MySQL setup.

Learning MongoDB's Concepts

The problem I was facing when trying to translate between the two philosophies was that MongoDB gets its speed from being able to read flat documents.

After struggling and really thinking it through, I decided to look on MongoDB's website, only to find explanations of exactly what I was struggling with. I wasn't sure how to model my data so that one-to-many and many-to-many relationships are easy to handle. Luckily it was easily explained through two concepts: embedded documents and referenced documents.

They explained clearly which direction I must take. Given how much data I'm going to be collecting from other services, I've decided that every element needs to become its own collection and store references to documents in any other collections. This means that when I grab the data, if I want to populate it with data from another collection, I need to perform another query to fetch that data.
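The difference between the two approaches can be sketched in plain JavaScript. This is only an illustration: the arrays stand in for MongoDB collections, and the field names (`modifierIds`, `service`, `name`) and the `populateThread` helper are assumptions made up for the example, not Braid's real schema or any MongoDB API.

```javascript
// In-memory stand-ins for two MongoDB collections (hypothetical data).
const modifierCollection = [
  { _id: "m1", type: "collection", name: "Tutorials" },
  { _id: "m2", type: "content", name: "Extra notes" },
];

const threadCollection = [
  // Referenced style: a thread only stores the ids of its modifiers.
  { _id: "t1", service: "youtube", modifierIds: ["m1", "m2"] },
  { _id: "t2", service: "soundcloud", modifierIds: ["m1"] },
];

// Embedded style, for comparison: the modifiers live inside the thread,
// so one read returns everything but shared modifiers get duplicated.
const embeddedThread = {
  _id: "t1",
  service: "youtube",
  modifiers: [{ type: "collection", name: "Tutorials" }],
};

// "Populating" a referenced thread takes a second lookup, which mirrors
// the extra query that would be needed against a real database.
function populateThread(threadId) {
  const thread = threadCollection.find((t) => t._id === threadId);
  if (!thread) return null;
  const found = modifierCollection.filter((m) =>
    thread.modifierIds.includes(m._id)
  );
  return { ...thread, modifiers: found };
}

console.log(populateThread("t1"));
```

Because both threads point at `m1` by id, the same modifier can belong to many threads without being copied into each one, which is how the many-to-many case is handled with references.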

The website also explained how to model data based on tree structures. So I decided that the following is how the database should look, based on what I learned from properly reading through the resources on MongoDB's website.

You can see above how the relationships between them work. If this were one big document, we would have all these documents nested inside one another. That isn't good, because documents have a size limit (16 MB in MongoDB), beyond which they become too large and need to be stored in a different format. By storing each document in its own collection and keeping references to it, we avoid this. The downside is that we have to perform more queries, but as a way of maintaining the database this is a much better way of doing it.

Wrapping up

Normalization sucks and I didn't escape it. Next on my list is to actually implement this database in MongoDB and try it all out to see if it really works. From a logic point of view, though, I think this is the right way to go.

I've already planned the schemas that I need to write. However, I also need a way of creating and storing models of schemas for each service I plan to support. Once I've made these service-specific schemas along with the other generalised schemas, I'm hoping I will have a system flexible enough to let me quickly and easily add new services. This is partly due to the power of MongoDB, which is schemaless by design. This means I can enforce schemas virtually, in my own code, but in reality MongoDB will accept most things I give it and store them in a document.
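As a rough idea of what enforcing a schema in application code could look like, here is a minimal sketch. The `threadSchema` fields and the `validate` helper are hypothetical and written just for this example; a real project would more likely lean on an existing validation library.

```javascript
// A hypothetical schema: field name -> expected result of `typeof`.
const threadSchema = {
  service: "string",
  title: "string",
  createdAt: "number",
};

// Returns a list of problems; an empty list means the document passes.
// Fields that are not in the schema are allowed through untouched,
// since the database itself would store them anyway.
function validate(doc, schema) {
  const errors = [];
  for (const [field, expectedType] of Object.entries(schema)) {
    if (!(field in doc)) {
      errors.push(`missing field: ${field}`);
    } else if (typeof doc[field] !== expectedType) {
      errors.push(`${field} should be a ${expectedType}`);
    }
  }
  return errors;
}

const goodDoc = { service: "youtube", title: "Intro", createdAt: 1400000000 };
const badDoc = { service: "youtube", title: 42 };

console.log(validate(goodDoc, threadSchema));
console.log(validate(badDoc, threadSchema));
```

A service-specific schema would then just be another object of the same shape, checked before the document is handed to the database.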

Before I start nattering on about how we should concern ourselves with the ability to add our own data to the content we distribute across different services, I think it would help if you, the reader, had a rough idea of what I am trying to achieve.

As a university student, I am currently tasked with making a final year project. This project can be on anything we want. I've proposed a system that allows people to pull in content from different content service providers. Our content is always distributed across lots of different places. This isn't always an issue; in fact it's good from a backup and data storage perspective.

The problem I have, as a person who sometimes has to use other people's services, is that I'm locked into the data and content that these services provide.

As an example, YouTube is very cool. It does all the hard work for me, and developers like myself can request the videos. YouTube deals with all the infrastructure, then gives me an API to pull the videos into my own applications. I once found myself in a situation where I wished I could easily categorise and pull in videos based on site-specific categories. This doesn't exist on YouTube.

Why not roll your own system?

If I'd had the time, maybe I would have. I could have pulled them all into a database and then added the categories myself. I never forgot the idea, though.

Humble Beginnings

At first I wanted to make some sort of YouTube CMS. You would give it a channel ID, and it would store references to that channel's videos and give you the ability to add extra types of data to them.

After some review and feedback, I realised that maybe I was being too boring. My idea was called 'rather pedestrian', and people on Reddit asked why I would make another CMS rather than integrate into an already existing one. Of course they were right. So with a glass of wine and a whiteboard I decided to rethink everything. This is how Braid.io was born.

Threads of content, all inter-weaved with your own data

Braid came from the thinking that we should consider all the different types of devices we have, so it would be smart to build it on the basis on which all of the web works: the HTTP protocol, or more specifically an HTTP REST API. By doing this, any device, now or in the future, can use it. Any website can integrate it. XHR requests can fetch data, iOS SDKs can be made, and Node.js modules can be written, all based around this central and fundamental technology.
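To make the REST idea a little more concrete, here is a small sketch of the routing logic such an API might have. Everything in it is an assumption for illustration: the `/braids/:id` path, the response shapes and the in-memory `braidStore` are invented, and the function is deliberately not wired to a real HTTP server so that it stays self-contained.

```javascript
// Hypothetical in-memory data; a real service would query the database.
const braidStore = new Map([
  ["b1", { id: "b1", threads: ["t1", "t2"] }],
]);

// Maps an HTTP method and path to a status code and a JSON-ready body.
function handleRequest(method, path) {
  const match = path.match(/^\/braids\/([^/]+)$/);
  if (!match) {
    return { status: 404, body: { error: "not found" } };
  }
  if (method !== "GET") {
    return { status: 405, body: { error: "method not allowed" } };
  }
  const braid = braidStore.get(match[1]);
  if (!braid) {
    return { status: 404, body: { error: "no such braid" } };
  }
  return { status: 200, body: braid };
}

console.log(handleRequest("GET", "/braids/b1"));
console.log(handleRequest("POST", "/braids/b1"));
console.log(handleRequest("GET", "/braids/unknown"));
```

Since the response is just a status code plus JSON, the same endpoint could be consumed by a browser XHR request, a mobile SDK or a Node.js module without any changes on the server side.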

We should be able to store threads of content from different services: YouTube, SoundCloud and more. Rally a community behind it. Give people the tools and see what they will make, not just what I will make. Will people combine threads from different services and then easily add their own data to them?

Who is the audience?

This is really a tough one. An API is aimed at developers. This means non-technical people who don't develop applications and websites won't really know how to use it. They might be able to learn, but they may not be interested in learning development.

First and foremost, my audience is developers. I have infrastructure to write, documentation to write and APIs to develop. After this, if I can rally a community around it and make the service community-driven, then it could be used to address the gripes that potentially lots of other developers have with certain services. Instead of me deciding which service should be made available to integrate next, it should be up to the community.

It also means a kind of recipe book can be made of common ways certain content services can be combined to create cool things. Finally, to bring the entire thing to a non-technical audience, a library of widgets can be built on the very API I am planning to make. This will allow the everyday developer, or a small-site owner with minimal knowledge of how to program for the web, to embed functionality into a website or application without extensive technical know-how.

How to add this extra data

To add these extra layers of data on top of content that is pulled from different services, I have used the term 'modifier' for a way of plugging new data into an entry from a stream of content. A thread will know what modifiers are attached to it. The system will then know that it should leave some extra fields blank, to be filled in by the user in a type of admin area for managing the streams that come into the braids and the modifiers they wish to attach to them.

The idea of a modifier is still very much in the works. The thinking right now is that there should be modifier types, such as a collection modifier that allows you to collect and group entries based on a certain term. This is similar to how categories or tags work. Another is a content modifier type that allows you to add extra text-based content to an entry. There could be many more in the future too. Supplying modifier types means you could add certain types of functionality that the services themselves normally cannot provide. It would be no exaggeration to say that modifiers are easily the most fundamental part of this entire project; if they can't be built in a flexible and easily maintained way, then this project could easily be brought to its knees.
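Since the modifier idea is still being worked out, the following is only one possible shape for it, sketched in JavaScript. The entry fields, the `values` lookup keyed by entry id, and the `applyModifiers` and `groupByCollection` helpers are all assumptions invented for this example.

```javascript
// Entries as they might arrive from a thread (hypothetical fields).
const threadEntries = [
  { id: "v1", title: "Intro to Node" },
  { id: "v2", title: "Holiday vlog" },
];

// User-supplied modifier data, keyed by entry id.
const entryModifiers = [
  {
    type: "collection",
    name: "category",
    values: { v1: "Tutorials", v2: "Personal" },
  },
  { type: "content", name: "notes", values: { v1: "Watch this first." } },
];

// Adds one extra field per modifier to every entry. Entries the user
// has not filled in yet get a blank (null) value for that field.
function applyModifiers(entries, mods) {
  return entries.map((entry) => {
    const braided = { ...entry };
    for (const mod of mods) {
      const hasValue = Object.prototype.hasOwnProperty.call(
        mod.values,
        entry.id
      );
      braided[mod.name] = hasValue ? mod.values[entry.id] : null;
    }
    return braided;
  });
}

// Groups braided entries by the term a collection modifier gave them,
// much like categories or tags would.
function groupByCollection(braidedEntries, mod) {
  if (mod.type !== "collection") {
    throw new Error(`"${mod.name}" is not a collection modifier`);
  }
  const groups = new Map();
  for (const entry of braidedEntries) {
    const term = entry[mod.name] === null ? "uncategorised" : entry[mod.name];
    if (!groups.has(term)) groups.set(term, []);
    groups.get(term).push(entry);
  }
  return groups;
}

const braidedEntries = applyModifiers(threadEntries, entryModifiers);
console.log(braidedEntries);
console.log(groupByCollection(braidedEntries, entryModifiers[0]));
```

Keeping the modifier data separate from the entries means the content pulled from the service is never altered; the extra fields are only layered on when the braid is put together.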

TL;DR

I want to make a system where you pull in content from lots of services, each stream being dubbed a 'thread', add extra data through the use of 'modifiers', and then provide API endpoints for all the threads with the new data from the modifiers attached. This is known as a 'braid'.

This is so that we can plug in the functionality we wish some services provided.