DataWarehouse Implementations - How long do they REALLY take?

Have you ever wondered what is a realistic expectation for the time to design and implement a datawarehouse? I went to a demonstration last week of a DW solution run by a 3 person team. They went into companies, set up DW's using SQL as a backend with what looked like a really nice reporting front end. They promised they could produce a total DW solution in 3 months. Well, I almost fell out of my chair on that promise. After talking to them a little more, I realized that they were actually specialists in their particular market, not technical people. So, they were counting on there current system working "off the shelf" for any organization, and counting those 3 months as the time to load and refine the data. I then showed them our data model, which contained feeds from 9 different source system, some that still did not exist. I asked them again about there 3 month promise. They held firm to that estimation. They even promised specific implementation of business rules in the data as well as data cleansing, all done in 3 months. Well, I am sorry, I just didn't buy it. My current project is a DW for a healthcare environment, that after 3 years is just starting to develop into a useful tool. But, I don't think this is a reasonable time period either. Based on my own personal experiences (And that doesn't mean it's right) I think this is a reasonable expectation of time and tasks 1) Analysis of Data and User needs for data This process will determine what data needs to be fed into the DW, what additional systems may be involved, the reporting needs etc. I will limit this to a one person team. That way by adding additional people, you can reduce the time proportionately. 1 month - for full analysis Add one week for each additional data feeds from external sources 2 Weeks to review results of analysis and put together appropriate documentation 2) DW Design - Determining the data for the fact and dimension tables, looking at what will produce the best results from aggregate and summary tables etc. 3 months 3) ETL design and load process design For one source - 2 weeks Add an additional week for any additional data sources 4) Data Scrubbing 2-3 months 5) Table and structure design and testing 1 - 2 weeks 6) Design of testing methodology and strategy 2-4 weeks 7) Testing of ETL process 2 weeks Add 1 week for each external source 8) Test load of all data into testing environment 2 months (including testing and verification process) 9) Review of testing process, modifications to ETL 2 weeks - add another week for each external source 10) Retest loading process 1 Week (Depending on amount of data and RDBMS being used) 11) Creation of load tracking and reporting routines 1-2 weeks 12) Final testing 2-3 weeks 13) Load of data into production system 2-3 weeks 14) Review of production data (Including testing of data from user standpoint) 1 month I am sure I have missed a couple of steps, and I am not really trying to get to a detail level, but rather show the generic process. But basically, at least a year I would think is required to design, implement and test a small to mid size DW. Would appreciate your thoughts?? Till then.... Kawaii

Some name

"You know, you are correct. That is an important step. One of the reasons the project I am currently working on is taking so long is because the technical people didn't look at the output, but rather focused on the inputs. Good point, appreciate it!!

"Ok, I had to include an experience I had today which just shows= that you made a valid point=2EI was putting together the requirements for a report, and one of= the fields was ""Amount Billed"" Well, I looked at the table and= could not find a field called ""Amount Billed"" To me, in a Health= Plan environment, the amount billed would be what the provider= sent in=2E I would never think of it as an amount ""charged""= because this amount is not exactly what is charged to the= patient or the health plan=2E But there was a field called ""amount= charged""=2E Which I personally thought meant the amount actually= charged for the service to whoever was paying for it=2E But, the= field is actually the amount billed by whoever provided the= service=2E Confusion is definetly something that needs to be taken= into account in the time frame=2E=2E=2E

"as far as the DataScrubbing is considered, This is a tough job.Project should have SME (subject matter expert) in each area or source to sort out the issues and standardising the data otherwise it is a hell and repots have no meaning at all.

Im a New to Business Intelligence..I have designed a reportin cognos report studio.Could anyone pleasetell me the steps involved in Report Testing in Cognos..It would of great help to me if some one could explain me with sample test cases or scenarios for report testing