I'm moving to a new company to be their BI specialist. I'm doing some research on best practices and looking for some advice and feedback on decisions to take.

The company has no BI just now, and I believe they have a few OLTP databases that their web sales system updates. I haven't seen the DBs yet but have been told they would be hard to report against. I'm sure that, for the most part, day-old data would be sufficient for reporting purposes.

My question is: do I HAVE to create an OLAP database with dimension and fact tables (i.e. a star schema design) even if the delivery method will only be Reporting Services? I'm not sure they will require the forecasting ability that SSAS can provide at the outset.

I'm sure I will have to ETL the data from the production database nightly, but what are my options for this output if I'm only running SSRS reports off it?

If the only requirement is to run SSRS reports, the operational databases shouldn't be used and the data can be one day old, I would report off a restored backup of the OLTP servers. This seems to me the fastest way to get you going. I personally find SSRS difficult to use against MDX sources, but this isn't a problem with OLTP and standard T-SQL.
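A nightly job along these lines would do the restore described above. This is only a sketch; the database name, logical file names, and paths are all placeholders you would replace with your own:

```sql
-- Hypothetical nightly refresh of a reporting copy of the OLTP database.
-- Assumes production writes a full backup to the share each night.
USE master;

-- Drop yesterday's copy if it exists, kicking out any connected report users.
IF DB_ID('WebSales_Reporting') IS NOT NULL
BEGIN
    ALTER DATABASE WebSales_Reporting SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE WebSales_Reporting;
END

-- Restore last night's full backup under the reporting name.
RESTORE DATABASE WebSales_Reporting
FROM DISK = N'\\backupshare\WebSales_Full.bak'
WITH MOVE 'WebSales_Data' TO N'D:\Data\WebSales_Reporting.mdf',
     MOVE 'WebSales_Log'  TO N'L:\Log\WebSales_Reporting.ldf',
     REPLACE;

-- Make the copy read-only so report queries cannot modify it.
ALTER DATABASE WebSales_Reporting SET READ_ONLY;
```

As a side benefit, this also verifies every night that your production backups actually restore.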

If a true BI solution were ever desired, however, this method hasn't taken any steps in that direction.

RonKyle (12/6/2013)I personally find SSRS to use against MDX sources, but this isn't a problem with OLTP and standard T-SQL.

Hi, thanks for the reply. Is the word "difficult" missing from the above sentence? I would agree if it is.

Yes, that's what I was thinking in the SHORT term: run reports off a restore of the current DB. I see this as giving some value to the business up front, and it would give me more time to get to grips with the business needs so I can provide medium- and long-term BI solutions, whether that be a data warehouse, data marts, or OLAP cubes.

You were hired as a BI specialist, and sooner or later you will be asked for BI solutions. Therefore, I would start working on multidimensional or tabular models ASAP. At the same time you could provide reports from a copy of the OLTP database, but that will not work in the long run. I know people will pressure you for results, but a well-designed data warehouse will last for many years.

You can report off OLTP systems with a light touch as long as you're not looking for significant aggregations and the like. My guess is they're not, not yet.

What I'd recommend is the following: Work off their systems. Get a feel for the company and what they're hoping to get out of their data. Build them some reports, get them interested in the idea, let them start round tabling where they'd like to go with it. Then rein in the ones running for the horizon.

With that initial set of guidelines, you can get a much better idea of how current your data needs to be (usually there are two sets: immediate dashboards and historical reviews). From there, start your initial warehouse design, realizing it'll probably be iterative (and disposable) the first time or two while you actually settle out what the company really needs from a data perspective.

The heavier the aggregations and the more load you put against the OLTP systems, the faster you'll be advancing to the warehouse.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

One thing I'm not clear on: I have seen data warehouses being dropped, recreated, and then populated on a nightly basis. This obviously means the ETL has to be performed on all the data in the OLTP source to populate the warehouse. Is this the norm? How would this scale up?

Or would it be better to adopt a strategy of updating the warehouse with only the new data?

I don't ever recommend dropping and recreating the warehouse from scratch, although I've seen that as a solution from some consultants. It's faster to develop but ultimately (and as you correctly ask about) doesn't scale well. Incremental updates are the way to go. It will take longer to develop, but your nightly run times will be stable instead of incrementally increasing until they reach a point where it's unsustainable.
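The incremental approach described above is often driven by a watermark: record how far the last load got, and pull only rows changed since then. A rough sketch, with every table and column name made up for illustration, and assuming the source table carries a reliable ModifiedDate column:

```sql
-- Watermark-driven incremental load (all object names hypothetical).
-- dbo.EtlWatermark holds the high-water mark from the previous run.
DECLARE @LastLoaded datetime2,
        @ThisRun    datetime2 = SYSUTCDATETIME();

SELECT @LastLoaded = LastLoadedAt
FROM dbo.EtlWatermark
WHERE TableName = 'FactSales';

-- Pull only the rows that changed since the last successful run.
INSERT INTO dw.FactSales (OrderID, CustomerKey, OrderDate, Amount)
SELECT s.OrderID, c.CustomerKey, s.OrderDate, s.Amount
FROM oltp.Sales AS s
JOIN dw.DimCustomer AS c ON c.CustomerID = s.CustomerID
WHERE s.ModifiedDate >  @LastLoaded
  AND s.ModifiedDate <= @ThisRun;

-- Advance the watermark only after the load succeeds.
UPDATE dbo.EtlWatermark
SET LastLoadedAt = @ThisRun
WHERE TableName = 'FactSales';
```

Because the nightly run only touches rows changed since the watermark, its duration tracks the daily change volume rather than the total table size, which is why the run times stay stable.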

RonKyle (1/7/2014)I don't ever recommend dropping and recreating the warehouse from scratch, although I've seen that as a solution from some consultants. It's faster to develop but ultimately (and as you correctly ask about) doesn't scale well. Incremental updates are the way to go. It will take longer to develop, but your nightly run times will be stable instead of incrementally increasing until they reach a point where it's unsustainable.

I won't disagree with you outright, Ron, for the simple fact that if you want a scalable solution you're dead on target. That said, most folks aren't working against huge data, and overdevelopment is overkill.

An example: a current warehouse I have starts with a Truncate/Reload component. This has a strict limit: any table over 1,000,000 records OR longer than a 10-minute load time (the loads run concurrently) falls out to a delta component in the next sprint. That's where we decided to implement our sanity check.
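For a table under those limits, the truncate/reload component really can be this simple (a sketch only; the object names are placeholders):

```sql
-- Truncate/reload pattern for a small dimension table.
-- Wrapping both steps in one transaction keeps report queries
-- from ever seeing an empty table mid-load.
BEGIN TRANSACTION;

TRUNCATE TABLE dw.DimProduct;

INSERT INTO dw.DimProduct (ProductID, ProductName, Category)
SELECT p.ProductID, p.Name, p.Category
FROM oltp.Product AS p;

COMMIT TRANSACTION;
```

That simplicity is the whole appeal: there is no change tracking to design, test, or debug until the table's size actually demands it.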

You have to weigh the expectations of the developer against the expectations of the development. Good enough is, in most cases, good enough. Smaller shops, like the one the OP is involved in, will probably only need SCD Type 2 work in very rare instances. In most cases, I find that building a delta for every component is indeed overdeveloping against the expectations.

I personally use a 10-year rule. If someone will have to go in and modify my code within the next 10 years because data volume will outgrow what hardware upgrades can absorb, I overdevelop to allow for the expansion. Otherwise: KISS. Ten years is (usually) more than enough time for business changes to have already forced a redevelopment of any particular component twice, and it can be dealt with then, should optimization be considered more valuable than development time.
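For the rare cases where SCD Type 2 is actually warranted, the classic two-step pattern is to expire the current row and insert a new version. A hedged sketch, with all table and column names invented for illustration:

```sql
-- SCD Type 2 sketch for a customer dimension (hypothetical names).
DECLARE @Today date = CAST(SYSUTCDATETIME() AS date);

-- Step 1: expire current rows whose tracked attributes have changed.
UPDATE d
SET d.ValidTo   = @Today,
    d.IsCurrent = 0
FROM dw.DimCustomer AS d
JOIN oltp.Customer AS s ON s.CustomerID = d.CustomerID
WHERE d.IsCurrent = 1
  AND (s.City <> d.City OR s.Segment <> d.Segment);

-- Step 2: insert a fresh current row for each changed or brand-new customer.
INSERT INTO dw.DimCustomer
    (CustomerID, City, Segment, ValidFrom, ValidTo, IsCurrent)
SELECT s.CustomerID, s.City, s.Segment, @Today, '9999-12-31', 1
FROM oltp.Customer AS s
WHERE NOT EXISTS (SELECT 1 FROM dw.DimCustomer AS d
                  WHERE d.CustomerID = s.CustomerID
                    AND d.IsCurrent = 1);
```

Even this stripped-down version shows why SCD Type 2 costs more to build and test than a truncate/reload, which is exactly the trade-off the 10-year rule is meant to arbitrate.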