And I’ll add on the two parts that “bookend” that process — the requirements for the application and the physical statements for creating the database — normally referred to as the “Physical Model.”

Business Requirements

In the United States, whenever you look for a new job you create a “Resume.” In the simplest form of resume, you list out the jobs you’ve previously had, and your education.

In other countries, and indeed in some companies here in the U.S., There’s a larger document called a Curriculum Vitae, or CV. In my experience, a CV is longer and includes more material than a U.S. Resume. These are catching on in the U.S. because businesses and organizations use them for more than just granting a job — they are using them once you get a job to determine whether you fit for a certain promotion or other position. They can even be used by an editor to determine if you’re qualified to write a book, by a group to see if they would like to hear you present a topic to them and so on.

My “CV” is on line, and available to anyone who wants to look at it. It’s just static text and pictures right now, but I would actually like to be able to query the data and present it in multiple ways. For instance, I would like to be able to show, across all jobs, schools and so on, how much writing experience I have. Or perhaps I would like to show how long I’ve been formally teaching classes, or managing people, or working with databases, and so on. Right now you have to manually read each section to find that out.

Also, I would like to be able to hand that database over to a prospective employer to show them my database design skills and that I actually know how to do what they are looking for. In fact, in older times a “finish carpenter” (one who does the more detailed woodwork in a project) would build his own toolbox. As he showed up on the job site to apply for a job, the foreman could simply look at the result of the skill of that carpenter to see if they wanted to hire them. The carpenter quite literally “carried his resume” where everyone could see it. That’s the concept I’m going for in this project. Not only the data in my CV will represent me, but the process of building the CV itself.

What is NOT inside the scope of this project is the presentation. Right now I plan to keep that as the plain text and graphics on my website, but in the future that might go to excel, a dynamic web page, or any other construct that can read from a database. I might even make it a “web service” such that it can be polled from an API asking for all people familiar with Cloud Computing or databases.

Note

I keep my CV on the web, but that doesn’t mean you should. In fact, I have lots of things on LinkedIN, Facebook and Twitter. I definitely “live above the radar,” but even so I don’t put things like any kind of salaries or other sensitive information in public. If you’re concerned about putting what you consider confidential information on the web, then don’t. That still doesn’t preclude you from creating a database similar to this one, or even using the concepts to design a different database entirely.

Define Nouns

That’s the driving reason or “business requirements” for this database, and the next part of the process is to ask the “customer” (in this case, myself) what kind of information I want to capture. From the Resume I currently have, and from other resume’s and CV’s I find on the web I come up with the following general areas of information I would like to track:

Organizations I’ve worked with and for

The skills I’ve gained from those firms — and other locations like hobbies

My education history — both formal public education and private classes

Publications I’ve written or contributed to

The certifications I have held or currently hold

The locations of where I’ve worked — some organizations are interested in a global presence, and like to see where I’ve lived

Refine and Group Nouns

As a data professional I would begin to question the user on the details of those nouns from that broad list. I’ll start by teasing out the nouns I really want to track, and changing them to a singular tense — that’s just my convention, you can use whatever you like:

Organization

Skill

Education

Publication

Certification

Location

Reference

Notice I’ve added one more — “Reference." The reason I added this will become clearer as I explain the design, but the background is that each of these areas may have one or more resources that point to them. For instance, the organization I worked at may have a web site, a balance sheet, and so on. I may have the names of people who know my work or teaching at a location, and so on. Those references were varied, and could be queried alone, so I decided to break them out into their own entity.

Next I need to add some detail to those nouns. I added attributes that I wanted to track about each noun, and then grouped them this way:

Organization

DateStart

DateEnd

OrganizationName

OrganizationType

OrganizationSize

OrganizationIndustry

Title(s)

Skill(s)

Education

DateEarned

Institution

Publication

DatePublished

Location

Topic

Description

Certification

DateGranted

GrantingOrganization

Location

StartDate

EndDate

City

State

Country

Reference

ReferenceType

ReferenceDetail

Not too bad a start. You’ll notice in some areas I include a date field, and in others I have both a start and end date. When an item doesn’t “expire” (such as an education event like a college degree) I just need the one date. But when I want to know the duration of an event (such as the amount of time I’ve been writing programs) I’ll include both dates so that I can use some math to show how long I’ve had that skill or that I’ve lived in a certain area.

Remove Repeating Nouns

You’ll notice I have some “s’s” in my list. For instance, at one Organization I may have held several titles or titles. In fact, that happens everywhere I go. The “s” means that I need to break those entities into “child” entities pointing back to their “parent” entity:

Organization

DateStart

DateEnd

OrganizationName

OrganizationType

OrganizationSize

OrganizationIndustry

OrganizationDetail

OrganizationTitles

TitleName

TitleDescription

Skill

SkillID

StartDate

EndDate

SkillArea

SkillName

SkillDescription

Education

DateEarned

Institution

EducationDescription

Publication

DatePublished

Location

PublicationTopic

PublicationDescription

Location

StartDate

EndDate

City

State

Country

Reference

ReferenceType

ReferenceDetail

Notice the naming convention I use here. In a child entity, I include the parent name in the table. When I do that, the parent is in the singular, and the child is in the plural. Again, you don’t have to do that, it’s just the way I built my toolbox. :)

Creating Joining Entities

Now I need to join everything up. To do that, I need Primary and Foreign Keys, and I tend to use surrogate or artificial keys — numbers or sequences that don’t have anything to do with the data. In a small database like this, I could certainly find one or two fields in each table that uniquely identify a row and use those, but with a surrogate key I’m guaranteed that I have a unique number.

I also change the design slightly to have a few more attributes that are needed to handling things like putting the Certifications in with the Education entity.

I also have some “many to many” relationships — in specific, one skill or title may be present in many companies, and I may have held those titles multiple times. So I need to create some “tertiary joining tables." I’ve been working in text, but to show others what I’ve done it makes sense to include a graphical representation of my database design. For that I’ll use an Entity Relationship Diagram, or an ERD as I’ve explained previously.

Define Data Types

At this point I’ll switch to using an ERD, although in larger designs I actually start with an ERD for the entire design. It’s a pretty easy decision for most of the data types, with one exception. Some items seem to want a lot of “detail” or explanation, so a binary type field could hold an entire Microsoft Word document to a VARCHAR(MAX) could be used to hold a lot of text. But I chose against this for a couple of reasons, the most important one being that it doesn’t directly address the use of this data. This isn’t intended to be a life story or full narrative — it’s meant to show others what I have done to support what I can do.

The ERD shows the data types I’ve selected, along with the Keys, relationships, and refined columns:

Apply Constraints

My only constraints so far are the Primary and Foreign Keys, and the data types. Those enforce the fact that I can’t have a child entity without a parent.

Another constraint is that I have a data type for each value. That means you can’t put a description in a date field, or dates in number values. In addition, I have certain fields designated as NULL (meaning they don’t need a value) like the “End Date” in an Organization, and others set to not allow blank values.

Beyond that, I don’t have any other glaring needs for constraints, although I could add in CHECK or other constraints, I don’t think I need any at this time. I can always add them later if my initial tests show I need them.

The Final Script

I’ve decided to create this database in Microsoft’s SQL Azure — a SQL Server running in another location. That will help me develop a front-end for the data that can be used just about anywhere, and even become a service later on.

Because of that choice, I have a few constraints I need to consider. The size of this data is quite small, so I’ll keep it to a Web edition of only 1GB. I’ll also need a clustered index on each table, and a primary key — something I normally do anyway.

Here is the completed script — note that it will run on most any edition of SQL Server 2008, because of the DATE type it will not run on earlier versions successfully. If you change that to a DATETIME it will, however.