Presentation on theme: "Loading data into CMDB - Best practices for the entire process"— Presentation transcript:

2 Agenda
Why you should never do a CMDB-only project
Guidance on "Should this be in the CMDB?"
The Life of a CI
Various best practices
Q&A

3 Typical Failed CMDB Project
"We need to have a CMDB"
Why? … because
Let's load data into it
What data? … whatever data we have lying around
So, that took a long time!
And the CMDB is big, out of date, and isn't bringing any value
"See, I told you that CMDB thing was complex and useless hype"
Another big data store offering no value is obviously not the desired outcome
Avoid doing a CMDB-only project

4 CONSUMERS vs. Providers
Although providers supply the data for the CMDB, the important players for the CMDB are really the consumers
Consumers do interesting and useful things with the data
Providers simply load data
Without consumers, who cares what data is loaded?
In fact, if no one consumes the data, it shouldn't be loaded

5 Have an XYZ project that includes using the CMDB (for XYZ substitute Incident, Change, Problem, …)
"We need to improve our Change Management process"
The CMDB is not an end in itself; it is an enabler for other processes
You must have a goal and a focus for how you want to USE the CMDB
Change Management needs to know about servers, applications, services, and their relationships
If no one is consuming a piece of data, it should not be in the CMDB
When in doubt, DO NOT put data into the CMDB until someone asks for it
Look at the improvements in the Change Management process
Failed changes and disruption to service because of change are down
"I can see how the CMDB makes Change Management better"
Let's look at the Incident Management process; how can we improve it?
There will be many different XYZ projects that each increase both the content and the use of that content in the CMDB
The CMDB is a long journey, but there is incremental value at every step along the way

6 Choose your data sources wisely
A good data provider does the following:
Provides data for CDM classes you need to populate in the CMDB
Provides data that is not already provided by a different data source
Can populate attribute values that uniquely identify a CI
Periodically updates data
Periodically flags data as no longer present in the environment
Indicates when the data was last updated
Updates, maintains, and deletes relationships as well as CIs
Manual data entry:
Example: Asset Sandbox in ITSM
There are some classes we expect to populate manually, such as Business Service
The CMDB provides context, NOT content
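The provider checklist above can be sketched as a quick sanity check on an incoming feed. This is a minimal illustration under assumed conventions — the record fields (`serial_number`, `last_updated`, `marked_deleted`) are hypothetical names for this sketch, not actual BMC attribute names or any Atrium Integrator API:

```python
# Hypothetical sanity check for a candidate data provider's CI feed.
# Field names here are illustrative assumptions, not BMC identifiers.
from datetime import datetime, timedelta

def provider_feed_ok(records, max_age_days=7):
    """Return a list of problems found in a provider's CI feed."""
    problems = []
    now = datetime(2014, 1, 15)  # fixed reference date so the example is deterministic
    serials = [r.get("serial_number") for r in records]
    # 1. Every record should carry an attribute that can uniquely identify the CI
    if any(s is None for s in serials):
        problems.append("records missing a unique identifier")
    if len(set(serials)) != len(serials):
        problems.append("duplicate identifiers in feed")
    # 2. The provider should indicate when each record was last updated...
    stale = [r for r in records
             if now - r["last_updated"] > timedelta(days=max_age_days)]
    if stale:
        problems.append("%d record(s) not refreshed recently" % len(stale))
    # 3. ...and should flag data no longer present rather than silently drop it
    if not any("marked_deleted" in r for r in records):
        problems.append("feed never flags deleted CIs")
    return problems
```

A feed whose records are fresh, uniquely keyed, and carry deletion flags returns an empty list; anything else surfaces as a concrete objection to raise with the provider team.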

7 Automated Discovery is a Requirement
Without automated discovery processes, data accuracy CANNOT be maintained
Data is inaccurate before you can even complete loading it

9 The Life of a CI
[Pipeline diagram: Extract → Transform → Load → Cleanse and Reconcile → Consume. ADDM, MS SCCM (via Atrium Integrator), and any other data source each load CIs into their own import dataset in the Atrium CMDB.]
Only load data that you need!
Define a dataset per provider
Have a different plan for initial vs. delta loads
Run multiple copies of key steps, such as the CMDBOutput step in Spoon
Think about error handling, especially for custom jobs
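The load-step advice above — one import dataset per provider, with error handling so one bad record does not abort a custom job — can be sketched as follows. This is a toy model under assumed record shapes, not the Atrium Integrator or Spoon API:

```python
# Minimal sketch of the "load" step: each provider writes to its own import
# dataset, and malformed records are routed to an error list for later review
# instead of failing the whole job. Dataset names and fields are illustrative.

def load_into_dataset(records, dataset_id, datasets, errors):
    """Append well-formed CI records to the provider's import dataset."""
    target = datasets.setdefault(dataset_id, [])
    for rec in records:
        try:
            # A CI we cannot later identify is useless -- reject it here.
            if not rec.get("name"):
                raise ValueError("CI record has no name")
            target.append({"DatasetId": dataset_id, **rec})
        except ValueError as exc:
            errors.append((dataset_id, rec, str(exc)))

datasets, errors = {}, []
load_into_dataset([{"name": "web01"}, {"model": "orphan"}],
                  "ADDM.IMPORT", datasets, errors)
```

Here the nameless record lands in `errors` while `web01` is loaded; a real job would write such rejects somewhere reviewable rather than silently dropping them.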

10 The Life of a CI
[Pipeline diagram: Extract → Transform → Load → Cleanse and Reconcile → Consume. The Normalization Engine (driven by the Product Catalog) and the Reconciliation Engine process the ADDM, SCCM, and other import datasets into the production dataset.]
Normalize before you Identify
Don't normalize all classes
Use Batch mode for initial or large loads, Continuous mode for steady state
Use Impact Normalization for Change Management or BPPM
Use Suite Rollup / Version Rollup for SWLM
Always use Reconciliation, even for a single source
Keep your data clean, normalized, and identified
Use qualifications to filter data
Use standard Identification and Merge rules
Put your most specific identification rule first
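The last bullet — put your most specific identification rule first — follows from rules being tried in order, with the first match winning. A minimal sketch, assuming hypothetical attribute names rather than real CDM attributes:

```python
# Identification rules are evaluated in order; the first rule that matches an
# existing production CI wins. A strong serial-number rule therefore belongs
# ahead of a weaker hostname rule. Attribute names are illustrative only.

ID_RULES = [  # ordered: most specific first
    ("by_serial",   lambda a, b: bool(a.get("serial")) and a["serial"] == b.get("serial")),
    ("by_hostname", lambda a, b: bool(a.get("host")) and a["host"] == b.get("host")),
]

def identify(new_ci, production):
    """Return (rule_name, matched production CI), or (None, None) if unmatched."""
    for name, rule in ID_RULES:
        for existing in production:
            if rule(new_ci, existing):
                return name, existing
    return None, None

prod = [{"serial": "S123", "host": "web01"},
        {"serial": None,   "host": "web02"}]
```

If the hostname rule ran first, a renamed machine with a known serial would be treated as a brand-new CI — exactly the duplicate problem rule ordering is meant to prevent.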

11 The Life of a CI
[Pipeline diagram: Extract → Transform → Load → Cleanse and Reconcile → Consume. Consumers such as ITSM, SIM, ITBM, Dashboards, and BPPM read from the production dataset in the Atrium CMDB.]
Do not modify data in the production dataset directly
Always use sandbox datasets for manual changes
If no one consumes the data, it shouldn't be loaded
Periodically check for duplicates and take remediation action
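The periodic duplicate check can be sketched simply: group production CIs by the attributes your identification rules key on and flag any group with more than one member. A toy illustration with assumed attribute names, not a CMDB query:

```python
# Group production CIs by their identifying attributes; any group with more
# than one member is a duplicate candidate needing remediation (e.g. a manual
# merge, or an identification-rule fix). Attribute names are illustrative.
from collections import defaultdict

def find_duplicates(production, key_attrs=("class", "serial")):
    groups = defaultdict(list)
    for ci in production:
        key = tuple(ci.get(a) for a in key_attrs)
        if all(v is not None for v in key):  # skip CIs we cannot key
            groups[key].append(ci)
    return {k: v for k, v in groups.items() if len(v) > 1}

prod = [
    {"id": 1, "class": "ComputerSystem", "serial": "S1"},
    {"id": 2, "class": "ComputerSystem", "serial": "S1"},  # duplicate of id 1
    {"id": 3, "class": "ComputerSystem", "serial": "S2"},
]
```

Duplicates usually mean an identification rule missed a match, so the remediation is often a rule change as much as a data cleanup.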

12 The Life of a CI
[Full pipeline diagram: Extract → Transform → Load → Cleanse and Reconcile → Consume. ADDM, MS SCCM (via Atrium Integrator), and any other data source load CIs into import datasets; the Normalization Engine (with the Product Catalog) and the Reconciliation Engine move them into the production dataset, which is consumed by ITSM, SIM, ITBM, Dashboards, and BPPM.]

13 Normalization and Reconciliation example
[Slide graphic: two data sources feeding the Atrium CMDB production dataset, both discovering the host "John Smith Laptop". Data Source 1 reports the software as "MSWord" with version 2004; Data Source 2 reports it as "MSWD" with no version. The model appears both as "Apple MacBook Pro 15"" and as its catalog entry "MB134B/A". After normalization both software records read "Microsoft Word"; after reconciliation the two records are combined into one.]

How does the Normalization Engine collaborate with the Reconciliation Engine? We talked about how the objective of Reconciliation is to get clean, quality data into your production dataset. Normalization provides two key features that also work to that end: first, it improves the quality and consistency of the data; second, it reviews the data before it is Identified and Merged, allowing us to focus our reconciliation efforts only on normalized data.

In the example, Data Source 1 has discovered our software as MSWord, while Data Source 2 has discovered it as MSWD. Through normalization we make it consistent, and the normalized data now looks not only correct but also the same in both sources: Microsoft Word. This makes our data more accurate and usable.

Finally, we can see how the information from Source 1 and Source 2 is combined into a single record. The stars on the slide indicate precedence: Source 1 has higher precedence than Source 2 for both Host Name and Model, while Source 2 has higher precedence for Software and Version. Reconciliation uses those precedence values to combine the information into a single record.
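The precedence-based merge described above can be sketched as follows. This is a toy model of the idea — per-attribute precedence, with blank values never winning — and not the Reconciliation Engine's actual API; source names and precedence values mirror the slide's example:

```python
# Per-attribute precedence merge: for each attribute, the non-blank value
# from the highest-precedence source wins. Blank values never win, which is
# why Source 1's "2004" survives even where Source 2 has higher precedence.

def merge(sources, precedence):
    """sources: {source: {attr: value}}; precedence: {(source, attr): rank}."""
    merged = {}
    attrs = {a for rec in sources.values() for a in rec}
    for attr in attrs:
        candidates = [(precedence.get((src, attr), 0), rec[attr])
                      for src, rec in sources.items()
                      if rec.get(attr)]          # skip blank/missing values
        if candidates:
            merged[attr] = max(candidates)[1]   # highest precedence wins
    return merged

sources = {  # values as they stand after normalization
    "src1": {"host": "John Smith Laptop", "model": "MB134B/A",
             "software": "Microsoft Word", "version": "2004"},
    "src2": {"host": "John Smith Laptop", "model": "MB134B/A",
             "software": "Microsoft Word", "version": ""},
}
# Per the slide: src1 wins Host Name and Model; src2 wins Software and Version
precedence = {("src1", "host"): 2, ("src1", "model"): 2,
              ("src2", "software"): 2, ("src2", "version"): 2}
```

Running `merge(sources, precedence)` yields one record with version "2004": Source 2 outranks Source 1 for Version, but its blank value is skipped, so the lower-precedence non-blank value fills the gap.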

14 Performance considerations
Establish an Integration Server
In many cases when performance is an issue, poor database configuration and/or indexing is the cause
Consider indexing attributes used in Identification rules
Check query plans; review and correct them
Are DB backups happening while Reconciliation jobs are running?
Use qualifications whenever possible to filter your data
"Fine tune" thread settings and use a Private Queue

Let's talk about some other performance considerations.

DB tweaks specific to a customer environment: there is no magic potion here. DB administrators need to identify long-running queries: is an index required, or would it be beneficial to add one? Many customers complain about performance not being ideal, and in almost 90% of those cases the issue was found in bad DB configuration or bad indexing. Consider indexing attributes used in Identification rules; this will improve the performance of the Identify activity. Check query plans, review and correct them. Are DB backups happening while Reconciliation jobs are running? All of this needs to be considered, and maintenance needs to occur often.

Establish an Integration Server: if you have the possibility of a server group, dedicate one of those servers to integration activities. That means all your data-loading tools (such as AI/AIE/ADDM sync) point to this server, and normalization and reconciliation run primarily on it too. This way your users see less impact from the resources these processes use. Do not run Normalization and AIE/AI jobs at the same time: all of these processes use many resources, and resource sharing may impact the performance of all of them. They also all work directly on the CMDB data and its datasets, so if jobs work on the same data, DB locks may delay the processing of CIs and make the jobs last longer. Loading, normalizing, and reconciling are best run in sequence and during non-working hours.

Keep your data clean: unresolved errors impact subsequent jobs too, because CIs that fail to reconcile are retried on each subsequent job. I've seen jobs take 1 hour to process 2 CIs; all the rest of the processing time was spent on previously failed CIs. So ask yourself: how is my data getting dirty, and how do I prevent that? Is your data getting properly identified? Do you need to change an identification rule or add a new one? Are you running a proper normalization job, and are the Product Catalog definitions correct and up to date? Make sure Reconciliation only brings in normalized CIs. Also run purge jobs weekly to remove old deleted data from the system; that lowers the CI count and improves the performance of the whole CMDB.

Use qualifications when possible: use them in your Identification activities and in your Merge activities. We worked on one environment where the Product class had 4 million CIs, but the customer was not interested in that class at all. We excluded the class by restricting reconciliation through qualifications, and the running time of the job improved by 400%. So always look for good filtering opportunities: understand your environment and what is needed, then work to get only that data. The jobs will run faster, and at the same time you will have a much cleaner and more concise production dataset.

Merge algorithms: when performance is a concern, set the Merge Order to "By class in separate transactions", which is the fastest processing option. If the job must run during production hours, you can instead use the "Related CIs in separate transactions" option, which commits things like computer systems and their related components in one atomic transaction: a CI and all its relationships and children are moved at the same time. This is slower but safer.

"Fine tune" the thread count and/or use the Private Queue when appropriate (demo on the next slide). As mentioned before, the Reconciliation Engine uses many resources, and those resources include your Remedy server threads. If the thread count is set too low, the AR System server will have low CPU use, poor throughput, and potentially poor response times under high load. On the other hand, defining too many threads may result in unnecessary thread administration. Suggested thread counts are three times the number of CPUs for the fast queue and five times the number of CPUs for the list queue, so a two-CPU box might have six threads for fast and ten threads for list. Is there a limit? The recommended maximum for any thread pool is 30. Note that these are suggestions, and as such they serve as a good starting point. Since there are so many variables (different hardware, CPU architecture, CPU speed, etc.), we highly recommend benchmarking your environment to find the optimum settings.

Besides properly tuning the threads, we may run into the issue, especially on low-end servers, where a running Reconciliation job hurts the user perception of Remedy's responsiveness. The reason is that the Reconciliation job (or jobs) can use all available threads, causing end-user requests to wait in the queue longer than normal. To prevent this, we can set up a private queue for Reconciliation requests, freeing the Fast and List queues and making them available again for end users.

Following is a demo of how to configure the Private Queue for the Reconciliation Engine.
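The thread-count heuristic above (3× CPUs for the fast queue, 5× CPUs for the list queue, never more than 30 per pool) is simple enough to capture in a few lines. A sketch of the deck's own starting-point numbers — as the notes say, benchmark your environment rather than treating these as final:

```python
# The deck's suggested starting points for AR System thread pools:
# fast queue = 3 x CPUs, list queue = 5 x CPUs, capped at 30 per pool.
# These are heuristics to benchmark against, not fixed recommendations.

def suggested_threads(cpus, cap=30):
    return {"fast": min(3 * cpus, cap),
            "list": min(5 * cpus, cap)}
```

For the slide's two-CPU example this gives 6 fast and 10 list threads; a larger 8-CPU box would already hit the 30-thread cap on the list queue.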

15 Summary
Don't do a standalone CMDB project; the CMDB is a means to an end
Approach the CMDB project from the consumer side, not the provider side
Don't boil the ocean
Start small, prove value, and iterate; there is incremental value at every step along the way
Normalize before you reconcile
Always reconcile, and use a sandbox dataset for manual editing
Service orientation is where the real value lies; model services NOW

17 You are Allowed to Extend the CDM – BUT DON'T
Do EVERYTHING possible to design using the CMDB default data model
There is a mapping paper on the website to help with mapping decisions:
https://communities.bmc.com/docs/DOC-16471
If there is a request to extend, evaluate whether there really is no existing class into which things could appropriately be mapped
If you do extend the model, make sure you follow best practices:
Model for the CONSUMER, not the provider
Add as few extensions as possible
Consider that not all consumers can see a new class