Friday, April 23, 2010

I’ve come to realize that I have lots of posts about startup software development scattered around this blog. I thought it would be good to capture them in one spot and also include links to related posts from other sources.

Thursday, April 22, 2010

Tony, what inputs should we use as the basis for our matching algorithm?

Every time I hear that question, it worries me a bit. eHarmony came with lots of research behind the dimensions of compatibility and how those related to marriage longevity and happiness. The people asking the above question don’t have that research and they also don’t really know how the matches should occur. For example, do you match people with similar personality traits or complementary or opposites? Do opposites really attract? Without the research, you are in the position of making educated guesses at what makes a good match.

Inputs

There are likely a lot of possible inputs. To me, the first step is to do some research on different aspects of the fit between people and projects and create a long list of possible inputs. For example, you might come up with:

Industry / Specific Knowledge

Skills / Roles

Timeframe

Geography / Travel

Personality

Team Styles

etc.

Of course, many of these items will result in more of a filtering algorithm than a matching algorithm. For example, you can have people specify experience in a particular industry, and the project can have requirements for experience in that industry. This is classic filtering. Even with scoring, it will still have little mystery (perceived value).

This gets much more interesting when you get to personality and team styles.

Personality Profiles

So, this leads us to the question:

Should I use a personality profile in my matching algorithm?

You can use something like a DISC or MBTI personality profile. These instruments exist and are fairly well documented. But do they relate to what will make someone happy on a project, happy in a job, happy with a tutor, etc.? Chances are that without a fair bit of research, you are not going to know the answer. And particularly, you won’t know if it makes sense to match people based on similarities, complements or differences.

In the case of matching people to projects, there’s quite a bit of research already out there on personality types and team effectiveness. In fact, a cursory review of some of this information suggests that there are quite a few other kinds of inputs that will make a lot of sense such as the kinds of roles that the person naturally falls into. And, in fact, there are a lot of tools that can assess a given team and tell you about likely issues around communication, leadership, etc.

Bottom line, you should do a fairly significant review of the research and the various tools to come up with models of how personality assessments, communication styles, personal preferences, availability, natural roles, etc. fit into a matching algorithm.

Input Matching

Given our list of inputs, most matching algorithms are based on a few types of rules:

Requirements – Yes/No – if these don’t match, you don’t have a match, period.

Scored – calculate a distance between matching factors, multiply these by an importance factor (Scoring coefficient). For example, how far are you willing to travel, is that an important factor?
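To make the two rule types concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the field names (skills, max_travel_km, team_style), the weights, and the way each distance is normalized to a 0..1 range. It is not how any particular production system scores matches.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    skills: set
    max_travel_km: float
    team_style: float  # hypothetical 0..1 scale


@dataclass
class Project:
    name: str
    required_skills: set
    distance_km: float
    team_style: float  # same hypothetical 0..1 scale


# Importance factors (scoring coefficients). The values are made up.
WEIGHTS = {"travel": 0.4, "team_style": 0.6}


def meets_requirements(worker, project):
    """Yes/No rule: every required skill must be present."""
    return project.required_skills <= worker.skills


def score(worker, project):
    """Return a 0..1 score, or None when a hard requirement fails."""
    if not meets_requirements(worker, project):
        return None
    # Distance per factor, normalized so 0 is a perfect fit and 1 is the worst.
    travel_gap = min(project.distance_km / max(worker.max_travel_km, 1e-9), 1.0)
    style_gap = abs(worker.team_style - project.team_style)
    # Multiply each distance by its importance factor and sum.
    total_gap = WEIGHTS["travel"] * travel_gap + WEIGHTS["team_style"] * style_gap
    return 1.0 - total_gap
```

The requirement check runs first, so a failed Yes/No rule never gets rescued by a good score on the softer factors.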

Critical Mass Problem

One thing that many startups don’t recognize going into the design of their matching algorithm is the problem of critical mass. When you first start, the number of items that you can match is often relatively small. And the value of the matching algorithm goes up as you increase the numbers.

This is probably worth its own post.

In terms of the design of the algorithm, you probably need to design your algorithm to be flexible in how it surfaces matches in the case where there are relatively few possible matches. You absolutely need to avoid returning “0 matches found” and asking the user to continue to change their criteria blindly hoping to find matches. Goodbye user.
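One way to avoid the empty result page is to rank near-misses behind the strict matches and pad the list when strict matches are scarce. The sketch below assumes a single hypothetical requirement type (a set of required skills) and a made-up min_results threshold. A real system would relax whichever criteria it judges least important.

```python
def surface_matches(candidates, required_skills, min_results=3):
    """Return strict matches first, then the closest near-misses.

    candidates maps a name to that person's set of skills.
    Each result is (name, missing_skills); an empty set means a strict match.
    """
    ranked = sorted(
        ((name, required_skills - skills) for name, skills in candidates.items()),
        key=lambda item: (len(item[1]), item[0]),
    )
    strict = [item for item in ranked if not item[1]]
    if len(strict) >= min_results:
        return strict
    # Too few strict matches: pad with near-misses instead of "0 matches found".
    return ranked[:min_results]
```

Returning the missing skills alongside each name lets the interface say "close match, lacks SQL" rather than leaving the user to guess which criterion to loosen.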

Hypothesis Algorithms

So we’ve defined our inputs and algorithm for matching based on our best understanding of what makes a good match. We should call this a Hypothesis Algorithm. It’s our best educated guess.

Of course, over time you can capture results from matches by asking for input from workers and project managers all along the way (prior to start, during, after) to assess whether the match was indeed a good match. This self-reported data can then be used to tune the algorithm over time to turn it from a Hypothesis Algorithm into an algorithm based on results.
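As a rough illustration of that tuning loop, the sketch below re-derives importance weights from self-reported ratings by checking how strongly each factor's closeness correlates with the rating. This is a deliberately crude heuristic that I am assuming for the example, not a description of how eHarmony or anyone else tunes their algorithm. The feedback record layout and factor names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def retune_weights(feedback, factors):
    """Derive weights from feedback rows.

    Each row holds a per-factor closeness value (0..1) and a 'rating'
    reported by the worker or project manager after the match.
    """
    ratings = [row["rating"] for row in feedback]
    # Keep only positive correlations; a factor that does not help gets 0.
    raw = {
        f: max(pearson([row[f] for row in feedback], ratings), 0.0)
        for f in factors
    }
    total = sum(raw.values())
    if total == 0:
        # No signal yet: fall back to equal weights (the hypothesis stage).
        return {f: 1 / len(factors) for f in factors}
    return {f: value / total for f, value in raw.items()}
```

With only a handful of matches the correlations will be noisy, so in practice you would want a reasonable volume of feedback before letting numbers like these override the original hypothesis.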

Is a Hypothesis Algorithm okay?

The answer is that there are many startups in the market today that are based on Hypothesis Algorithms. Likely they are overselling their algorithm, but the reality is that a decent Hypothesis Algorithm plus good Match Performance Support will likely yield better results than existing systems which are often quite random. Consider the examples:

Workers to Projects

Recruits to Employers

Learners to Tutors

Today, each of these is horribly inefficient, based on incomplete, random information, and performed in ways that are far from expert performance. So, the real question is whether you can outperform the existing systems, more than whether you have more than a hypothesis algorithm.

Even a Hypothesis Algorithm is Hard

When it comes to a hypothesis matching algorithm, I’ve already suggested a few requirements:

Needs to have mystery – not a filter. If it’s obvious where the results come from, people won’t ascribe much value.

Must handle the critical mass problem gracefully.

Needs to hold up to scrutiny. Why did I get matched with this person? Why didn’t I get matched with this person?
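For the scrutiny requirement, it helps if the algorithm can report, at least internally, which factors pulled a score down. Here is a small sketch, assuming a weighted-distance score where each factor has a gap between 0 (perfect fit) and 1 (worst). The factor names and weights are hypothetical.

```python
def explain_match(gaps, weights):
    """Return (score, reasons) for one candidate pairing.

    gaps maps each factor to its distance (0 = perfect fit, 1 = worst).
    reasons lists (factor, penalty) pairs, biggest penalty first.
    """
    penalties = {factor: weights[factor] * gaps[factor] for factor in weights}
    score = 1.0 - sum(penalties.values())
    reasons = sorted(penalties.items(), key=lambda kv: kv[1], reverse=True)
    return score, reasons
```

How much of this breakdown you show to users is a separate decision. You can keep the mystery in the interface and still have an answer ready when someone asks why they didn't get matched with a particular person.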

Of course, needing to hold up to scrutiny while being an untested hypothesis algorithm is a bit challenging. Once you have more experience, you can get to be like Gallup and its Q12 instrument that measures employee engagement. They have a question in there that I’m sure many people would love to get rid of: “Do you have a best friend at work?” When I read that question, I’m not quite sure what it’s asking me. Almost no one is quite sure. But when asked why Gallup includes the question despite the confusion, their answer is that it has been shown to be highly correlated with engagement. Basically, they don’t quite know what it means either, and it likely means something different to different people. But the answers to this question (as compared to hundreds of other variants) correlate more strongly with engagement levels. At first, I really didn’t appreciate that answer. C'mon Gallup, just get rid of it to avoid the question. But in a way, there’s a beauty to it.

The problem that most startups with Hypothesis Algorithms have is that they don’t have the research basis to back them up. When matches are shown and challenged, how do you defend that this is a good match? And believe me, you will get challenged.

Sample Data Sets and Testing

Of course, one of the ways to stand up to scrutiny better out of the gate is to have a good set of sample data that you can use during design and development to test the algorithm. You make sure it works first via a spreadsheet. Then in code.

Make sure this is fairly robust. If you don’t do this, then some of your early matches will be really bad. And it always seems that it’s the critical investor/blogger/reporter who tries your system and gets a bad match. Partly that’s because they aren’t using the system as a real user. But you do need to test those edge cases.
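Once the spreadsheet version looks right, the same sample rows can become a small regression check in code. The sketch below is only an assumption about how that might look: the sample cases are invented, and is_match is a stand-in for whatever rule you are actually testing.

```python
# Each case: (worker_skills, required_skills, should_match).
SAMPLE_CASES = [
    ({"python", "sql"}, {"python"}, True),
    ({"python"}, {"python", "sql"}, False),
    (set(), {"python"}, False),  # edge case: empty profile
    ({"python"}, set(), True),   # edge case: project with no requirements
]


def is_match(worker_skills, required_skills):
    """Stand-in for the real matching rule under test."""
    return required_skills <= worker_skills


def run_sample_cases(match_fn, cases=SAMPLE_CASES):
    """Return the cases where match_fn disagrees with the expected result."""
    return [case for case in cases if match_fn(case[0], case[1]) != case[2]]
```

An empty list from run_sample_cases(is_match) means the rule agrees with every sample row. The empty-profile and no-requirements rows are the kind of input a curious reporter is likely to try first.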

Wednesday, April 21, 2010

I guess it was bound to happen, I now seem to be writing a series of
posts around Matching. This is where I’ll keep my list of these posts,
related articles and posts from other sources:

Social Media Matching
– Looks at how in many ways, everything is matching. The BIG value
lies not in “deep relationships” brought online, but in leveraging the
breadth and time-and-place independent access to other people. I think
of LinkedIn as a 24x7 networking cocktail party with 60M+ people – the
challenge in that environment is how you get matched to other people.
The real innovation and true value creation is going to occur around
taking the massive network with little to no existing relationships and
helping to find matches, foster conversation, and build relationships
where they make sense.

Matching Algorithms
– Looks at what’s really required for a matching algorithm to be viewed
as matching and not just filtering – and the concept that there’s
margin in mystery.

Tuesday, April 20, 2010

I’ve had several recent conversations with startups who are building companies based on matching. One thing that commonly gets missed by these companies when I first talk to them is that they need to provide more than matching; they need performance support.

What do I mean by this? Let me go back and do my classic search for "eHarmony of" startup and find a few examples to use:

People to Projects (Managers)

People to Jobs (Hiring Managers)

Students to Tutors

Each of these involves matching people to people (I’ve stayed away from content matching in these examples).

The other common aspect is that these are things we don’t do very often and are probably not very good at, yet they are very important activities that we need to get right.

What is performance support?

The easiest way to think of performance support is to think about a couple of examples:

Wizard – steps you through the process of something. For example, getting a complex graph created in Excel. Sure, you can do it without the wizard, but it’s much easier when you are stepped through the process.

Turbo Tax – one of the greatest examples ever. It steps you through doing your taxes by asking you questions and then puts it in the forms. You can also edit the forms directly in the program, but good luck with that.

Performance support systems are designed to make a complex task simple enough that a novice can complete it effectively. They are particularly suitable when the task is:

Not common

Complex

Important to get right

So, for the examples cited above, these are things that are not done all that often, are fairly complex, and are generally important to get right.

Match Performance Support

When I get asked about eHarmony, a lot of people miss that there’s some very interesting performance support going on after the match. Here’s how eHarmony’s FAQ describes the communication stages:

The Communication Stages are designed to make it easy to ask the important questions early. There are four rounds of Guided Communication, followed by unlimited access to eHarmony’s anonymous Open Communication system.

I. Stage One: Read Your Match's "About Me" Information

II. Stage Two: Send 1st Questions

The second stage of communication lets you choose five simple but informative questions to ask your match. For example: "If you were taken by your date to a party where you knew no one, how would you respond?"

III. Stage Three: Exchange 10 "Must Haves" and 10 "Can't Stands"

The third round of communication consists of reviewing and exchanging your personal list of "Must Haves" and "Can't Stands" with your match. Examples of “Must Haves”: Chemistry - I must feel deeply in love and attracted to my partner. Communicator - I must have someone who is good at talking and listening. Examples of “Can't Stands”: Rude - I can't stand someone who is belittling or hateful to people. Grudges - I can't stand someone who has a chip on their shoulder.

IV. Stage Four: Send 2nd Questions

The fourth round of communication is the exchange of three open-ended questions. You may write your own questions or choose questions eHarmony provides, for example: "What person in your life has been most inspirational, and why?"

V. Open Communication

eHarmony’s founders knew that initial communication with matches is not something that most people are going to be good at. Therefore, they provide tools that support you through the process.

When you look at the examples I listed above:

Workers to Projects (Managers)

Recruits to Jobs (Hiring Managers)

Students to Tutors

Each of them also involves something that likely we are not particularly experienced with.

Thus, as a startup, you need to think beyond the match and towards how you will support the rest of the performance. For example, Project managers would have a list of questions that they can ask potential workers. Examples will be provided. They can also potentially ask other questions. Workers would be able to answer the question one time and provide that answer to each of the project managers. Similarly, workers should be able to ask questions of the project managers about the job.
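The "answer once, share with every project manager" idea can be sketched as a small data structure. The class and method names here are hypothetical, and the sketch ignores storage, permissions, and notifications.

```python
from collections import defaultdict


class QuestionExchange:
    """Workers answer a question once; every manager who asks it sees the answer."""

    def __init__(self):
        self._answers = {}              # (worker, question) -> answer text
        self._asked = defaultdict(set)  # (worker, question) -> managers who asked

    def ask(self, manager, worker, question):
        """Record the question; return the stored answer, or None if unanswered."""
        self._asked[(worker, question)].add(manager)
        return self._answers.get((worker, question))

    def answer(self, worker, question, text):
        """Store the answer once; return the managers who should receive it."""
        self._answers[(worker, question)] = text
        return sorted(self._asked[(worker, question)])
```

A manager who asks a question the worker has already answered gets the stored answer immediately, which is the point of having workers answer each question only one time.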

Trading emails with a startup CEO building an iPhone app, I asked him why potential customers would buy his product. In response he sent me a competitive analysis. It looked like every competitive analysis I had done for 20 years, (ok maybe better.) And it made me sad.

Tim Berry has a great post, Why I Hate Those Huge Market Numbers, that tells us he doesn’t like to see business plans with multi-billion market numbers used as the basis for projections. It’s the old – 5% of a massive market gives us a big number. I agree completely: If it makes you feel better to give me that number in passing, okay, go ahead, but don’t put any emphasis on it.

I’ll admit to having a bias towards exits and focus my review of a potential investment on the exit potential. I do sometimes get impatient when the investment proposal has little or no information on the exit path. Even where an exit strategy is proposed, the descriptions demonstrate either limited information about what it will take to structure the exit or look like a last minute addition to the plan to ensure that that box is ticked.

I’ve been in the software startup business for a long time. One thing I have found interesting is that amongst first-time software entrepreneurs, certain “patterns” of applications kept recurring. Time and time again, entrepreneurs are tempted by one of these application categories. Not that it’s always a bad thing.

We hear about the power of social media as a marketing tool, especially for those trying to bootstrap their businesses, but just how effective is it? New research from Utpal Dholakia and Emily Durham of Rice University takes a look at this question. The study is featured in the March issue of the Harvard Business Review.

My startup (RescueTime) has enjoyed some pretty ridiculously good PR (online, print, and video). It’s not a surprise that the most common questions that we get from other founders are about PR. How do you get press and the blogosphere talking about your product?

High concept pitches are great for getting your foot in the door (“It’s Friendster… for dogs!”). But once you’re in the building, pitch a bigger vision. I’ve been talking to a lot of startups that apply to AngelList and most of them don’t have a vision that would separate investors from their money.

Every startup founder knows implicitly that startup success is a long hard road. Yet we always dream that we are the exception to the rule. So once in a while it’s good to look at some facts to temper our imagination. I was reading an article written by marketing guru Seth Godin a while back where he mentions that “it takes about six years of hard work to become an overnight success”.

About Me

Dr. Tony Karrer works as a part-time CTO for startups and midsize software companies - helping them get product out the door and turn around technology issues. He is considered one of the top technologists in eLearning and is known for working with numerous startups including being the original CTO for eHarmony for its first four years. Dr. Karrer taught Computer Science for eleven years. He has also worked on projects for many Fortune 500 companies including Credit
Suisse, Royal Bank of Canada, Citibank, Lexus, Microsoft, Nissan,
Universal, IBM, Hewlett-Packard, Sun Microsystems, Fidelity
Investments, Symbol Technologies and SHL Systemhouse. Dr. Karrer was
valedictorian at Loyola Marymount University, attended the University
of Southern California as a Tau Beta Pi fellow, one of the top 30
engineers in the nation, and received an M.S. and Ph.D. in Computer
Science. He is a frequent speaker at industry and academic events.