Automated Statistical Data Mining

By completely automating the modeling process, our statistical data mining (KDD) tool makes sophisticated data analysis techniques accessible to the general user. With this tool, managers and analysts can maximize use of their available data using advanced statistical models that were previously available only to those with sufficient background in statistical modeling and analysis. Even for those with extensive statistical background, the work saved through automation in model development is enormous. In addition, graphical displays allow users to view their data, original or modeled, in a variety of forms and in any level of detail.

Our statistical data mining (KDD) tool provides automated statistical data mining analysis employable in a wide variety of applications, including detailed analysis of initial data mining results. This technology is directly applicable to credit scoring, marketing, behavioral studies, finance, and many other applications. The methods of categorical data analysis that we use are conceptually related to linear regression, but are more advanced and use aggressive models that must be solved and analyzed using more computationally intensive, patent-pending approaches.

Our tool automates advanced statistical and numerical tools, alleviating the burden on an in-house statistician. Using patent-pending technology, our automated statistical data mining tool will automatically construct, solve and analyze thousands of statistical models with essentially no intermediate input on the part of the operator. This functionality is increasingly crucial as the number of models to choose from grows exponentially in the number of factors that are considered to be influential within the data set.
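
To give a sense of that growth, here is a minimal sketch that assumes each candidate model is simply a choice of which factors to include. That assumption, the factor names, and the helper `candidate_models` are all hypothetical and are not taken from the tool itself; a model space that also includes interaction terms between factors would be larger still.

```python
from itertools import combinations

def candidate_models(factors):
    """Yield every subset of the given factors; each subset is one candidate model."""
    for size in range(len(factors) + 1):
        for subset in combinations(factors, size):
            yield subset

# Hypothetical factor names, used only for illustration.
factors = ["age", "income", "region", "tenure"]
models = list(candidate_models(factors))
print(len(models))  # 2 ** 4 = 16 candidate models

# The count doubles with every additional factor.
for k in (5, 10, 20, 30):
    print(k, "factors ->", 2 ** k, "candidate models")
```

Even under this simplified assumption, thirty factors already yield more than a billion candidate models, which is why constructing and comparing them by hand quickly becomes impractical.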

In conjunction with automated data agents, our statistical data mining tool completely automates the statistical data mining process: gathering data from databases, forming and solving hundreds or thousands of possible statistical models, and determining the quality of each model. The end results include an event-probability matrix, a ranking of the predictor variables that are most influential in determining the outcome events, and a description of which predictor variables are interdependent in determining the outcome events. Our tool can be used as a client-side or as a server-side application.
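
As a rough illustration of two of those end results, the sketch below builds a small event-probability table and ranks predictors by how informative they are about the outcome. It is not the tool's patent-pending method: it substitutes plain empirical conditional probabilities and mutual information as a stand-in, and the records, field names, and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from math import log2

# Hypothetical toy records: (predictor values, outcome event). Illustration only.
records = [
    ({"owns_home": "yes", "region": "north"}, "repaid"),
    ({"owns_home": "yes", "region": "south"}, "repaid"),
    ({"owns_home": "yes", "region": "north"}, "repaid"),
    ({"owns_home": "no", "region": "south"}, "default"),
    ({"owns_home": "no", "region": "north"}, "default"),
    ({"owns_home": "no", "region": "south"}, "repaid"),
    ({"owns_home": "yes", "region": "south"}, "repaid"),
    ({"owns_home": "no", "region": "north"}, "default"),
]

def event_probabilities(records, predictor):
    """Estimate P(outcome | predictor level) from raw counts."""
    counts = defaultdict(Counter)
    for features, outcome in records:
        counts[features[predictor]][outcome] += 1
    return {
        level: {event: n / sum(tally.values()) for event, n in tally.items()}
        for level, tally in counts.items()
    }

def mutual_information(records, predictor):
    """Mutual information (in bits) between one predictor and the outcome."""
    total = len(records)
    joint = Counter((f[predictor], o) for f, o in records)
    levels = Counter(f[predictor] for f, _ in records)
    events = Counter(o for _, o in records)
    # Only observed (level, event) pairs appear in `joint`, so no log of zero.
    return sum(
        (n / total) * log2(n * total / (levels[x] * events[y]))
        for (x, y), n in joint.items()
    )

def rank_predictors(records, predictors):
    """Order predictors from most to least informative about the outcome."""
    return sorted(
        predictors,
        key=lambda p: mutual_information(records, p),
        reverse=True,
    )

print(event_probabilities(records, "owns_home"))
print(rank_predictors(records, ["region", "owns_home"]))
```

In this toy data every home owner repaid while three of the four non-owners defaulted, so `owns_home` ranks above `region`. A real analysis would need far more data and a proper statistical model rather than raw counts.
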
Additional details are contained in the article "Automated Statistical Modeling for Data Mining."