
Tuesday, January 3, 2012

There are 2 ways to connect a data source to the Model build node in Oracle Data Miner.

The typical method is to use a single data source that contains the data for both the build and testing stages of the Model Build node. Using this method you can specify what percentage of the data in the data source to use for the Build step; the remaining records will be used for testing the model. The default is a 50:50 split but you can change this to whatever percentage you think is appropriate (e.g. 60:40). The records will be split randomly into the Build and Test data sets.

The second way to specify the data sources is to use a separate data source for the Build and a separate data source for the Testing of the model.

To do this you add a new data source (containing the test data set) to the Model Build node. ODM will assign a label (Test) to the connector for the second data source.

If the label was assigned incorrectly you can swap the data sources. To do this, right-click on the Model Build node and select Swap Data Sources from the menu.
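If you need to prepare the separate Build and Test data sources yourself before connecting them to the node, one possible way is to split a single table with two views. The following is only a sketch: the table MINING_DATA, the key column CUST_ID, the view names and the 60:40 ratio are all assumptions for illustration, and this is not necessarily how ODM performs its own split.

```sql
-- Assumed: a source table MINING_DATA with a unique key column CUST_ID.
-- ORA_HASH(expr, max_bucket, seed) assigns each key to a bucket from 0 to 99.
-- Buckets 0 to 59 hold roughly 60% of the rows, buckets 60 to 99 the rest.
-- The split is repeatable: the same key always lands in the same bucket.

CREATE OR REPLACE VIEW mining_data_build_v AS
  SELECT *
  FROM   mining_data
  WHERE  ORA_HASH(cust_id, 99, 0) < 60;

CREATE OR REPLACE VIEW mining_data_test_v AS
  SELECT *
  FROM   mining_data
  WHERE  ORA_HASH(cust_id, 99, 0) >= 60;
```

Each view can then be added to the workflow as its own data source, one for Build and one for Test.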

Oracle has recently made available a very useful webpage that lists the functionality available for each version of the 11g Database. So before you decide which version of the database to purchase, check out this webpage.

Friday, December 23, 2011

I’ve been working in the BI and related fields since the mid 90s. Over the past number of years I’ve gotten a little bit confused about what Business Intelligence (BI) really means. Maybe it’s just a bit of old age kicking in way too early.

It seems to me that the term Business Intelligence has been hijacked by a large number of companies and software vendors. It seems that every “reporting tool” has been re-labelled as a Business Intelligence tool, without providing any real intelligence features. They are still just reporting tools. Yes, they do have some nice graphics that can be used instead of just listing numbers. But that is not Business Intelligence.

Business Intelligence is going beyond what these tools are capable of. Most of the skills and abilities for BI come from the people who are doing it, not the tools. In reality you will need to use a number of tools or write some custom code to help you gain the extra bit of insight into your data. The “reporting tools” can then deliver the results.

Also, Ralph Kimball said a long time ago that someone working in the DW/BI area needs to be half-DBA and half-MBA.

A quote that I heard recently from the Predictive Analytics World Conference was “You need to be able to ask the right question”. This is to ensure that you can frame your analytics projects correctly and be able to measure the results.

I think that this question was key back in the mid 90s when I started out in the BI field and I still think it applies to all areas of BI. The thing that we have lost in BI is the real intelligence part of it.

So I’m proposing a new name for real BI. It is intelligent-Business Intelligence (i-BI).

Let's differentiate between BI and the real intelligent BI work.

What do I mean by intelligent BI (i-BI)? I mean skills in Data Warehousing, Time Series Analysis, Advanced Analytics, Data Mining, Predictive Analysis, solving or addressing real business problems, etc.

Or maybe I’m just wrong and have missed some developments in BI over the past 16+ years. Or maybe I’m becoming a bit too cynical.

Wednesday, December 21, 2011

As we approach Christmas, many of us will be looking forward to a few days of holidays/vacation. During this period we may start thinking about some techniques or methods that we discovered over the past 12 months, or about things we need to find out more about over the coming months.

One thing to consider is to write an article on these techniques or methods, for Oracle Scene. The next due date for submitting articles is 13th January.

Tuesday, December 20, 2011

In my previous blog posts on creating an ODM model, I gave the details of how you can do this using the ODM PL/SQL API.

But at some point you will have a fairly stable environment. What this means is that you will know what type of algorithm and which corresponding settings work best for your data.

At this point you should be able to re-create your ODM model in the production database. How often you do this depends on the number of new cases that you have, so you might need to update your ODM model daily, weekly, monthly, etc.

To update your model you will need to:

- Create a settings table for your model
- Create a new ODM model
- Rename your new ODM model to the production name

The following examples are based on the example data, model names, etc that I’ve used in my previous post.

Creating a Settings Table

The first step is to create a settings table for your algorithm. This will contain all the parameter settings needed to create the new model. You will have worked out these settings from your previous attempts at creating your models and you will know what parameters and values work best.
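A settings table for a Decision Tree could look something like the following. This is a minimal sketch and not the code from the original post: the table name DECISION_TREE_MODEL_SETTINGS is an assumption, and the two settings shown (the algorithm name and automatic data preparation) are only examples. You would add whatever parameters you found to work best for your data.

```sql
-- Hypothetical settings table name; one row per setting.
CREATE TABLE decision_tree_model_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000)
);

BEGIN
  -- Tell ODM which algorithm to use: Decision Tree.
  INSERT INTO decision_tree_model_settings (setting_name, setting_value)
  VALUES (dbms_data_mining.algo_name, dbms_data_mining.algo_decision_tree);

  -- Turn on Automatic Data Preparation (an example setting only).
  INSERT INTO decision_tree_model_settings (setting_name, setting_value)
  VALUES (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on);

  COMMIT;
END;
/
```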

Creating a New ODM Model

To create the new model we use the DBMS_DATA_MINING.CREATE_MODEL procedure. In our example we want to create a Decision Tree based on our sample data, which contains the previously generated cases and the new cases since the last model rebuild.
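A call to CREATE_MODEL might be sketched as follows. The model name, build view, case id column, target column and settings table name are all assumptions made for illustration, so replace them with the names used in your own environment.

```sql
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'CLAS_DT_NEW',                    -- hypothetical name for the new model
    mining_function     => dbms_data_mining.classification,
    data_table_name     => 'MINING_DATA_BUILD_V',            -- assumed table/view holding old and new cases
    case_id_column_name => 'CUST_ID',                        -- assumed unique case identifier
    target_column_name  => 'AFFINITY_CARD',                  -- assumed target attribute
    settings_table_name => 'DECISION_TREE_MODEL_SETTINGS');  -- assumed settings table
END;
/
```

Building the model under a temporary name means the production model is left untouched while the new one is being created.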

Renaming the New ODM Model

The new model does not yet have the name that our production software uses, so we need to rename it to the production name.

But we need to be careful about when we do this. If you drop a model or rename a model when it is being used then you can end up with indeterminate results.

What I suggest you do is pick a time of the day when your production software is not doing any data mining. You should drop the existing model (or rename it) and then rename the new model to the production model name.
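The swap could be sketched like this, using DROP_MODEL and RENAME_MODEL from the DBMS_DATA_MINING package. Both model names are hypothetical, and the block assumes nothing is scoring against the production model while it runs.

```sql
BEGIN
  -- Remove the current production model (hypothetical name).
  -- Alternatively, rename it first if you want to keep it as a fallback.
  DBMS_DATA_MINING.DROP_MODEL('CLAS_DT_PROD');

  -- Give the newly built model the production name.
  DBMS_DATA_MINING.RENAME_MODEL(
    model_name     => 'CLAS_DT_NEW',
    new_model_name => 'CLAS_DT_PROD');
END;
/
```

Keeping the drop and the rename together in one block keeps the window in which no production model exists as short as possible.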

“… soon you'll be able to use the new Oracle R Enterprise (ORE) functionality. ORE is currently in beta and is targeted to go General Availability in the near future. ORE brings additional functionality to the ODM Option, which will then be renamed to the Oracle Advanced Analytics Option to reflect the significant adv. analytical functionality enhancements. ORE will allow R users to write R scripts and run them inside the database and eliminate and/or minimize data movement in/out of the DB. ORE will provide R to SQL transparency for SQL push-down to in-DB SQL and an expanding library of Oracle in-DB statistical functions. Packages that cannot be pushed down will be run in embedded R mode while the DB manages all data flows to the multiple R engines running inside the DB.

In January, we'll open up a new OTN discussion forum specifically for Oracle R Enterprise focused technical discussions. Stay tuned.”

I’m looking forward to getting my hands on the new Oracle R Enterprise in 2012. In particular I’m keen to see what additional functionality will be added to the Oracle Data Mining option in the DB.

So watch out for the rebranding to Oracle Advanced Analytics.

Charlie – any chance of an advance copy of ORE and related DB bits and bobs?

Tuesday, December 13, 2011

Mark Townsend, Database Product Manager at Oracle, gave a presentation on Big Data at the UKOUG conference and used the following videos to illustrate how a company can evolve their Big Data into useful and meaningful information.

Monday, December 12, 2011

On Wednesday 7th Dec I gave my presentation at the UKOUG conference in Birmingham. The main topic of the presentation was using the Oracle Data Miner PL/SQL API to implement a model in a production environment.

There was a good turnout considering it was the afternoon of the last day of the conference.

I asked the attendees about their experience of using the current and previous versions of the Oracle Data Mining tool. Only one of the attendees had used the pre-11g R2 version of the tool.

From my discussions with the attendees, it looks like they would have preferred an introduction/overview type presentation of the new ODM tool. I had submitted a presentation on this, but sadly it was not accepted. Not enough people had voted for it.

For next year, I will submit an introduction/overview presentation again, but I need more people to vote for it. So watch out for the voting stage next June and vote for it.

Here are the links to the presentation and the demo scripts (which I didn’t get time to run).

Friday, December 2, 2011

At 5:20pm today (Friday 2nd December), I received an email from the Oracle ACE program. I had been nominated for the award of Oracle ACE.

“You have been chosen based on your significant contribution and activity in the Oracle technical community. Like your fellow Oracle ACEs, you have demonstrated a proficiency in Oracle technology as well as a willingness to share your knowledge and experiences with the community.”

I am so honoured, considering the experts from around the world that are members of the Oracle ACE program.

The Oracle ACE Award is issued by the Oracle Corporation and is made to people who are known for their strong credentials in the Oracle community as enthusiasts and advocates, and for their technical knowledge.