VDM Public

A Design Methodology appropriate for dimensional models, versioned dimensions, intelligent aggregation management and resilient data organization, particularly in large, complex warehousing environments. It distills more than 20 years of accumulated know-how and experience.

A set of simple, quick-to-learn Unix tools that facilitate code generation and automation. They support a spectrum of activities, from DDL and DML processes to parallel ETL frameworks and parsers that can supplement or replace expensive ETL technologies, as well as metadata transformations that enable inter-process hand-offs using machine-readable documents.
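To illustrate the idea of code generation driven by machine-readable documents, here is a minimal sketch. The metadata layout, type map and `generate_ddl` helper are assumptions made for demonstration; they are not InfoKarta's actual tools.

```python
import json

# Hypothetical mapping from abstract metadata types to SQL column types.
TYPE_MAP = {"int": "INTEGER", "num": "DECIMAL(12,2)",
            "date": "DATE", "str": "VARCHAR(255)"}

def generate_ddl(table_meta):
    """Render a CREATE TABLE statement from a machine-readable table definition."""
    cols = ",\n  ".join(
        f"{col['name']} {TYPE_MAP[col['type']]}" for col in table_meta["columns"]
    )
    return f"CREATE TABLE {table_meta['name']} (\n  {cols}\n);"

# A machine-readable document (here JSON) describing one table.
meta = json.loads("""
{"name": "store_sales",
 "columns": [{"name": "store_id", "type": "int"},
             {"name": "sale_date", "type": "date"},
             {"name": "amount", "type": "num"}]}
""")

print(generate_ddl(meta))
```

Because the document is machine readable, the same metadata can drive DDL, DML and ETL hand-offs without re-keying the definitions by hand.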

A system development philosophy that removes ambiguity and improves both productivity and quality.

There is a spectrum of business models, from the environment InfoKarta operated in during the 2005-2013 timeframe, where the data warehouse is customer hosted and development is owned by the customer, to a data warehousing service, in which customers enjoy an EDW service without the burden of building and hosting it. Here are three distinct points on this spectrum:

Customer Hosted - Customer Delivered

The data warehouse is hosted at customer facilities, and development is managed by customer resources. The methodology used is typically a variation of waterfall. InfoKarta provides data-related services to the customer, covering data architecture, performance engineering, ETL, SQL application support and software automation.

Assess the appropriateness, architecture, capacity and cost of VDM/ETL and non-VDM ETL provisioning. The run-time environment is installed at the customer site and integrated with the development, QA and production configurations. The sourcing strategy and technology are reviewed, and a plan for data provisioning is agreed. The infrastructure and naming conventions necessary to receive, land and manage source files or continuous streams are established. Education on the interaction with and use of Mapping templates ensures effective customer/developer communication.

By "Codes" we typically refer to standardized representations of real-world properties or attributes. Dependencies among codes may be standardized and captured as well. As we try to consolidate or integrate data from disparate systems, we often find that the values chosen to denote the same meaning differ from one source system to another. Code management is a process that maintains information about the precise meaning of each code, often in more than one language, and its representations in the participating systems.
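A minimal sketch of such a code registry follows. The code set, the canonical values, the language keys and the source-system names are invented for illustration; a real code-management process would hold this in managed tables, not a literal.

```python
# Each canonical code carries its meaning in multiple languages and the
# value each participating source system uses to represent it.
CODE_REGISTRY = {
    "MARITAL_STATUS": {
        "M": {"descriptions": {"en": "Married", "de": "Verheiratet"},
              "source_values": {"crm": "MAR", "billing": "1"}},
        "S": {"descriptions": {"en": "Single", "de": "Ledig"},
              "source_values": {"crm": "SGL", "billing": "2"}},
    }
}

def to_canonical(code_set, source_system, source_value):
    """Translate a source-system value to the canonical code, or None if unknown."""
    for canonical, entry in CODE_REGISTRY[code_set].items():
        if entry["source_values"].get(source_system) == source_value:
            return canonical
    return None

# Two systems using different values for the same meaning resolve to one code.
assert to_canonical("MARITAL_STATUS", "billing", "1") == "M"
assert to_canonical("MARITAL_STATUS", "crm", "MAR") == "M"
```

During integration, every incoming value is resolved through the registry, so downstream models see a single, documented representation regardless of origin.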

Typically, facts are aggregated over regular predefined intervals, such as weeks and accounting periods. This discussion is about aggregations of facts over irregular intervals defined between explicit dates, such as those between physical inventory dates in retail. Stores take inventory on different dates, and departments within stores follow different schedules and frequencies, forming irregular inventory periods specific to each store department. Calculation and trend reporting of shrink require aggregating sales, receipts and the respective adjustments over three consecutive inventory periods.
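The mechanics can be sketched as follows. The dates, amounts and the convention that a transaction on a count date belongs to the period that the count closes are assumptions for illustration only.

```python
from bisect import bisect_left
from datetime import date

# Physical inventory dates for one store department; each pair of
# consecutive dates bounds one irregular inventory period.
inventory_dates = [date(2023, 1, 15), date(2023, 5, 2),
                   date(2023, 9, 10), date(2024, 1, 20)]

# (transaction date, amount) facts to be aggregated.
sales = [(date(2023, 2, 1), 100.0),
         (date(2023, 6, 15), 250.0),
         (date(2023, 12, 1), 80.0)]

# Assign each fact to the period whose boundary dates enclose it.
totals = {}
for d, amount in sales:
    p = bisect_left(inventory_dates, d) - 1   # period index; -1 if before first count
    if 0 <= p < len(inventory_dates) - 1:     # skip dates outside any complete period
        totals[p] = totals.get(p, 0.0) + amount

# Shrink trend reporting draws on the three most recent complete periods.
n_periods = len(inventory_dates) - 1
last_three = [totals.get(p, 0.0) for p in range(n_periods - 3, n_periods)]
```

The same pattern applies per (store, department) pair, each with its own boundary list, and extends to receipts and adjustments alongside sales.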

Technology upgrades pertain to software release upgrades of database or application infrastructure, reconfiguration of data storage, or the controlled introduction of commercial off-the-shelf capabilities into existing configurations, such as setting up replication, altering an LDAP configuration or changing security. Technology upgrades involve minimal or no functional changes to the data or applications.