IBM is about to make these APIs (and many others) much more accessible as part of BlueMix (https://ace.ng.bluemix.net/ - the IBM PaaS/Heroku). I lead the team in charge of developing the Watson platform. Ask me questions!

I fully admit you may not be the right person for this question, but IBM has made a lot of hay about Watson in healthcare: lots of booths and talks at industry conferences, trotting out partner medical centers, PR pieces in the Wall Street Journal, etc... Despite all the noise, I have yet to see any sort of peer-reviewed clinical study that demonstrates the application of Watson in a real healthcare setting to improve patient outcomes. Has any study like this been conducted?

We do have many actual projects with several healthcare providers. We have functioning systems (see one here https://www.youtube.com/watch?v=8lGJ0h_jAp8) and we are starting to deploy them. But the deployments are made very cautiously and in stages (and not yet broad enough for full study) because that's the nature of healthcare.

Hey there.
The startup I co-founded has been accepted by IBM as a partner for Watson. A few weeks into it, we are still doing the "form-filling" marathon and have not had access to an instance. It seems like the API, when available to the public, will be a faster way to access it. Punekale at google's mail.

Not completely. The first set of APIs won't let you provide your own content - it will let you play with the Q&A API (and also do a lot of other fun things not directly related to Q&A). The instance you will get access to through the application will let you provide your own content. We are working hard to make that available as self-service but that's going to take a little more time.

I'm curious about the "copyright" field. Do you return the original sources from where Watson learnt the information he is presenting? What are the major sources? Have you faced copyright or legal restrictions to access information and has this affected Watson's ability to answer questions in a certain area?

When you create a Watson instance, you basically load it up with what they call a "corpus" of information. They are supposed to check the source of all your info, etc., to confirm that you hold the copyright to it. I don't know what it will be like once it is opened up more, but as of now they keep a pretty tight hold on Watson and will only accept projects with monetization and with revshare agreements...

1. Watson has to be trained on a specific corpus of information. Each instance is its own Watson, and needs to be trained. They say that each Watson starts off as a kindergartner. 2. Watson needs to be trained by an expert using sample questions and it learns from that.

That's amazing news - thanks a tonne. Any chance there's someone at the mothership (maybe you!) we can reach out to, for support? Specifically I'd like to toss out some questions about data sources and whatnot - to understand if Watson is worth playing with for us.

Is BlueMix a Heroku competitor or complement? I want to develop on the Watson API but I also want to use a stack I'm familiar with. Also, do you have range estimates for the price of accessing the Watson API?

I went for an info session and a few events for Watson a couple of months ago at the IBM office in NYC. They are doing pricing with revenue shares. A lot of devs, including myself, were not happy about this. They basically were not open to projects that weren't directly making money off of Watson...

In addition, since Bluemix natively supports Java apps, you can export a Clojure app as either an uberjar or an uberwar and run it directly on Bluemix that way. So for example, if you create an uberwar of your Clojure webapp then you can push it to Bluemix by doing:

cf push your-app-name -p target/my-app-uberwar.war

... and Bluemix will run your app on a WebSphere Liberty Profile app server.

I'm happy to help if you have any questions - contact details in my HN profile.

A given instance of Watson is unique based on its corpus (documents that have been uploaded to it) and its level of training and fine tuning. So far each instance has had to be trained from scratch, although they are making some Watson instances available that will come pre-trained in specific fields (like medicine).

Just went through the application process linked above. Be prepared to give info about yourself and your company and an explanation of why you want access to the Watson API, as well as what type of information you'll be working with. I stated 'just want to play around with the API'. We'll see how they react to that.

That is great! I just signed up with my existing IBM ID. I am enjoying using IBM Watson right now on a customer project, and being able to experiment and learn on my own will both help the effort to help my customer and perhaps use IBM Watson for personal projects. BTW, I like your dev page, with starter kits for multiple languages and frameworks.

"Really failed"? According to Forbes in July of this year[0], Microsoft is second only to Amazon in the cloud market, and gaining. I'm not sure how that counts as a failure, except in the sense that Microsoft makes a popular whipping boy.

I personally have seen very few success stories that involved Azure. The messaging coming through to the public is that Azure is buggy and unreliable. I can recall seeing a couple of major downtime incidents ([1] at least) in the last six months, and a significant number of major ones over the last few years (certificate renewal issues, leap day issues, etc [2][3][4]).

I wouldn't choose Azure myself, and I would actively recommend against choosing it to others from what I have seen (unless you are building a solution on .NET/Windows, perhaps). I can't imagine that I'm alone.

I actually built my own version of Watson based on this idea - Jeopardy questions are often Google-able/searchable on Wikipedia. It was pretty easy to build out - but it only gets 80% of the way there.

The last 20% is the hardest - and it's why Watson is so impressive (even though Watson itself is probably only at 90%).
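
A minimal sketch of that "searchable" idea, assuming a tiny in-memory corpus in place of Wikipedia. The corpus, the stop-word list, and the function names are all made up for illustration; this is plain word-overlap scoring, not how Watson works and not the system described above.

```python
import re

# Hypothetical mini-corpus standing in for Wikipedia: title -> article text.
CORPUS = {
    "Zachary Taylor": "Zachary Taylor was the 12th president of the United States.",
    "Cabernet Sauvignon": "Cabernet Sauvignon is a red wine grape variety.",
    "Ferret": "The ferret is a domesticated mustelid, not a rodent.",
}

# Very small, made-up stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "is", "was", "this", "who", "what"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, and drop stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def best_title(clue, corpus=CORPUS):
    """Return the title whose text shares the most words with the clue, or None."""
    clue_words = tokenize(clue)
    scored = [(len(clue_words & tokenize(text)), title) for title, text in corpus.items()]
    score, title = max(scored)
    return title if score > 0 else None


print(best_title("Who was the 12th president?"))
print(best_title("How do I remove wine from a macbook?"))
```

The first query lands on the right article. The second matches the wine article on a single shared word, which is the kind of confident wrong answer that makes the remaining 20% hard.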

I would think Watson can replace most lawyers (and MDs, and ...) of this world. Most of them don't think and just rehash stuff they learnt, just like Watson does. Sure for the exceptions you need actual people, but that is the same time when you would go from your corner lawyer to a more prominent one and when your doctor would forward you to an expert anyway.

Watson could replace lawyers or doctors for people that equate Google searches to legal advice or medical advice. Think LegalZoom and WebMD... Absolutely seems like it could be an entertaining way for a non-lawyer or non-doctor to explore a law or medical library. The majority of my time spent with lawyers has been discussing my issue until it could be distilled down to a couple of concise legal questions; I bought a short sale house and the seller demanded that I put a clause in the contract that said his bank couldn't issue him an i9... I have no authority over tax laws, but I also didn't want any liability or an invalid contract, nor to willingly build a bogus one. There was some real language subtlety to it all and I didn't even know the questions to ask.

Same with doctors: pain is relative, strong pains turn lesser pains into mild discomfort, and people are insanely good at ignoring and normalizing pains away. Do most patients even know what to ask or describe?

Don't get me wrong, I'd love to have a lawyer and a doctor on my smartphone all day every day, but it still seems like a ways off. Watson really seems like a tool that cuts your legal fees because your lawyer's research time drops 90% or something. (Or rather, he makes 90% more profit from you...)

The medical example is called a differential diagnosis. Medical school teaches you to make this. Communicating only the highest-ranked one or two items, unless explaining why you're ordering tests that rule out lower-ranked but actionable diagnoses, is not difficult.

This is only true when there isn't enough information supplied. You can't expect the doctor to figure out what the problem is just by telling the doctor you have a pain in your side; the same goes for the lawyer scenario.

How is it misinformation? The human doctor would not tell us his diagnosis in terms of percentages, because we as humans have a hard time grasping probabilities intuitively. That doesn't mean that a probabilistic diagnosis would not be more accurate.

The doctor's job is to provide me with as much information about the objective criteria of my physical condition as possible. However, when it comes to making choices about my treatment, say in the case of accepting/rejecting an experimental drug with some potentially nasty side effects, it should be entirely my own value judgement on what to do with said information.

I've learned a bit about Watson from internal IBM information and this is something they understand and are working on. There are serious ethical concerns about what to tell someone, even if the diagnosis is quite compelling; IOW, "you have 6 months to live" needs to come from a human.
Obviously, the approach is to have it work as a tool for a doctor, not as a WebMD-type self-diagnosis service. There are all kinds of follow-up questions, which you'd need to be a doctor to even answer, because they'd be couched in medical lingo, e.g. systolic/diastolic blood pressure.

Watson can't replace most physicians. Physicians have to physically interact with a patient to gather a relevant medical history as well as subjective and objective observations about the patient's symptoms. Robots and machine vision systems are nowhere near being able to fill that role.

Watson might be able to partially replace some specialists that primary care physicians use for consultations. When PCPs are unsure about a diagnosis or proper plan of care they will often consult with a specialist for advice via phone or e-mail. So in that case the PCP has already gathered at least some preliminary data and could feed it to a computer. But even for that use case Watson won't be able to provide the same level of back-and-forth interaction that's often necessary to achieve the correct result.

The volume of medical data being created and published is rapidly increasing, and a doctor would have to read something like 160 hours per week to stay on top of their field. There are only 168 hours in a week, so there is really no way a doctor today can keep up with what is going on. The idea of Watson is to be able to take large amounts of unstructured data (e.g. an entire patient history and current symptoms) and be able to find a solution. I don't think Watson will replace a doctor or physical interaction anytime soon, but a device connected to Watson or a similar solution will most definitely be used to augment the doctor's diagnosis and treatment solution.

Well, MDs just send you away when stuff doesn't go away or they think it's serious; I think Watson can easily be instructed to do the same. If it belongs in the category 'probably needs expert' he should pass his conclusions about what it is and so on to a human expert. Similarly with the cough he prescribed medicine for that is still not gone after 1 week: expert. This is what human MDs do as well, and I have had MDs actually tell me to 'be a man, suck it up', so I'm not sure Watson could do worse.

Has anyone at HN used either IBM Watson or Wolfram Alpha to build a real (commercial) app? It feels like there should be a whole wave of apps built on either of these technologies but it doesn't seem to be materialising.

1. "Who was the 12th president?" - Zachary Taylor
2. "What color wine is cabernet sauvignon?" - Red
3. "Is a ferret a rodent?" - The ferret is the domesticated member of the Order Carnivora, Family Mustelidae and Genus Mustela. A common misconception is that ferrets are rodents.

The real challenge is answering niche questions:

1. What size are the OEM rear wheels of a Honda S2000?
2. How can I fix MySQL error 1064?
3. How do I remove wine from a macbook?

These types of questions aren't answerable by a simple mining of Wikipedia or Encyclopedic knowledge. They represent niches within our society (S2000 owners, programmers, people who spilled wine on their macbooks). Google provides excellent links to pages that contain answers to these questions, but it cannot deduce a single answer or common response. This is why sites like Answers.com, Yahoo! Answers, StackExchange, etc. can flourish, but it's also why an NLP question and answer system is very difficult.

I've been working on a system to mine existing responses to questions - http://gotoanswer.stanford.edu - I only have a small subset of programming-related questions (~10M), but you can get an idea of what I'm trying to do by searching for "How do I remove wine from a macbook?" You'll see that there are results for removing wine the liquid and WINE the Windows non-emulator.

You bring up a good point, but it seems as if Watson was designed with this in mind. If you notice in the JSON response, it lists this query as a factoid class.

It may handle different queries with different attributes differently, such as focusing on certain portions of its corpus or changing which aspects of its search results are more heavily weighted.

A query identified as a factoid might be researched and judged very differently than something a bit more nebulous, such as a comparison, or something with more specificity like the examples you listed.

Admittedly, I am basing quite a bit off of one example response given in their documentation, but it is an intriguing clue as to how Watson will handle that aspect of understanding which info to discern.
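
To make the speculation concrete, here is a minimal sketch of dispatching on a question class. Everything in it is an assumption: the JSON shape, the field names ("question", "text", "class"), the class labels other than "factoid", and the handler functions are invented for illustration and are not taken from the Watson documentation.

```python
import json

# Hypothetical response fragment; the real field names may well differ.
SAMPLE_RESPONSE = '{"question": {"text": "Who was the 12th president?", "class": "factoid"}}'


def answer_factoid(text):
    """Placeholder strategy: look for one short, well-attested fact."""
    return f"factoid strategy: {text}"


def answer_comparison(text):
    """Placeholder strategy: gather evidence on both sides before answering."""
    return f"comparison strategy: {text}"


def answer_generic(text):
    """Fallback when the class is missing or unrecognized."""
    return f"generic strategy: {text}"


# Map each (assumed) class label to the strategy that handles it.
HANDLERS = {"factoid": answer_factoid, "comparison": answer_comparison}


def route(raw_json):
    """Parse the response and pick a strategy based on the question class."""
    question = json.loads(raw_json)["question"]
    handler = HANDLERS.get(question.get("class"), answer_generic)
    return handler(question["text"])


print(route(SAMPLE_RESPONSE))
```

A lookup table with a fallback keeps unknown classes from failing outright, which seems like the safer default when the set of classes is not documented.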

I am helping a customer integrate Watson into their system so I am very happy to see the news about BlueMix (https://ace.ng.bluemix.net/) that apparently will allow me to keep experimenting with Watson after my consulting engagement is complete.

If you read the documentation, you will see that preparing training data and questions is fairly straightforward.

Each instance of Watson is unique and has to be trained as such based on its "corpus" (set of data) and actual feedback on the quality of its answers by experts. The public API sounds like it will allow access to specific flavors of pre-trained and data-filled Watsons, like the food-recipe one or some basic medical ones.