Access the help you need to use our software from representatives who are knowledgeable in data mining and predictive analytics

By Phone or Online

CART® is the ultimate classification tree, one that revolutionized the field of advanced analytics and inaugurated the current era of data mining. Continually improved since its introduction, CART remains among the most important tools in modern data mining. Designed for both non-technical and technical users, CART can quickly reveal important data relationships that could remain hidden using other analytical tools.

CART is based on landmark mathematical theory introduced in 1984 by four world-renowned statisticians at Stanford University and the University of California at Berkeley. Salford Systems' implementation of CART is the only decision tree software embodying the original proprietary code. The CART creators continue to collaborate with Salford Systems to enhance CART with proprietary advances.

The short answer is YES, such plots can be generated. Historically, we concluded that such graphs would normally not be that interesting, as they would frequently be single step functions reflecting the fact that individual variables often appear only once or twice in a tree. Also, such graphs would not properly reflect the effect of a variable across most of its range of values. Thus, as of SPM® 7.0, CART® does not offer such plots. However, we can see what such plots would look like by using TreeNet® to grow a one-tree model. To do this, just set up a normal model, choose the TreeNet analysis method, and set the number of trees to be grown to 1 (see green arrow below).
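For readers without SPM at hand, the same effect can be sketched with an open-source CART-style tree. The snippet below fits a single decision tree (scikit-learn's `DecisionTreeRegressor`, standing in for a one-tree TreeNet model) and computes a partial dependence curve by hand; the `partial_dependence` helper is an illustrative function written for this example, not an SPM or TreeNet API. The resulting curve is piecewise constant, which is exactly why single-tree dependence plots tend to be uninteresting step functions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data: the response depends on the first predictor only.
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(200, 2))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=200)

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

def partial_dependence(model, X, feature, grid):
    """Average prediction as one feature sweeps a grid of values,
    holding all other columns at their observed values."""
    curve = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        curve.append(model.predict(Xv).mean())
    return np.array(curve)

grid = np.linspace(-2, 2, 50)
curve = partial_dependence(tree, X, 0, grid)

# A single tree yields a step function: far fewer distinct
# levels than grid points.
n_levels = len(np.unique(np.round(curve, 6)))
print(f"{n_levels} distinct levels across {len(grid)} grid points")
```

Plotting `grid` against `curve` would show the step-function shape described above; growing more trees (as TreeNet normally does) smooths the curve out.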

CART® is an acronym for Classification and Regression Trees, a decision-tree procedure introduced in 1984 by world-renowned UC Berkeley and Stanford statisticians Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone. Their landmark work created the modern field of sophisticated, mathematically and theoretically grounded decision trees. The CART methodology solves a number of performance, accuracy, and operational problems that still plague many other current decision-tree methods. CART's innovations include:

One of the strengths of CART® is that, for ordered predictors, the only information CART uses is the rank order of the data, not the actual values. In other words, if you replace a predictor with its rank order, the CART tree will be unchanged.
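This invariance is easy to demonstrate with a toy split search. The sketch below (pure NumPy; `gini` and `best_split_partition` are illustrative helpers written for this example, not CART code) finds the best Gini-impurity split on a single predictor, then repeats the search after replacing the predictor with its rank order. Because a rank transform preserves the ordering of the observations, every candidate split produces the same left/right partition and the same impurity, so the chosen split is identical.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a binary 0/1 label array."""
    p = labels.mean()
    return 2 * p * (1 - p)

def best_split_partition(x, y):
    """Indices of the samples sent left by the best Gini split
    on a single ordered predictor (illustrative helper)."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    n = len(x)
    best_score, best_left = None, None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # cannot split between tied values
        score = (i * gini(ys[:i]) + (n - i) * gini(ys[i:])) / n
        if best_score is None or score < best_score:
            best_score, best_left = score, frozenset(order[:i])
    return best_left

rng = np.random.default_rng(0)
x = rng.normal(size=30)
y = (x > 0.3).astype(int)
ranks = x.argsort().argsort().astype(float)  # monotone rank transform

part_raw = best_split_partition(x, y)
part_rank = best_split_partition(ranks, y)
print("same partition:", part_raw == part_rank)
```

The same argument applies to any strictly monotone transform of a predictor (logs, square roots, and so on): the tree is unchanged.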

CART® will only search over all possible subsets of a categorical predictor for a limited number of levels. Beyond a threshold set by computational feasibility, CART will simply reject the problem. You can control this limit with the BOPTION NCLASSES = m command, but be aware that for m larger than 15, computation times increase dramatically.
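The computational limit comes from simple combinatorics: a categorical predictor with k levels admits 2^(k-1) - 1 distinct ways to send its levels to the left versus right child, so the search space roughly doubles with each added level. A quick sketch (the `n_binary_splits` helper is written for this illustration, not an SPM function):

```python
def n_binary_splits(k: int) -> int:
    """Number of distinct binary partitions of k categorical
    levels into two non-empty groups: 2^(k-1) - 1."""
    return 2 ** (k - 1) - 1

# The search space roughly doubles with each extra level.
for k in (2, 5, 10, 15, 16, 20):
    print(f"{k:2d} levels -> {n_binary_splits(k):,} candidate splits")
```

At 15 levels there are already 16,383 candidate splits to evaluate at every node, which is why raising the NCLASSES limit beyond 15 makes run times grow so dramatically.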

Salford Systems' CART® is the only decision tree based on the original code of Breiman, Friedman, Olshen, and Stone. Because the code is proprietary, CART is the only true implementation of this classification-and-regression-tree methodology. In addition, the procedure has been substantially enhanced with new features and capabilities in exclusive collaboration with CART's creators. While some other decision-tree products claim to implement selected features of this technology, they are unable to reproduce genuine CART trees and lack key performance and accuracy components. Further, CART's creators continue to collaborate with Salford Systems to refine CART and to develop the next generation of data-mining tools.

Cross-validation is a method for estimating what the error rate of a sub-tree (of the maximal tree) would be if you had test data. Regardless of the value you set for V-fold cross-validation, CART grows the same maximal tree. The CART monograph provides evidence that a V of 10-20 gives better results than a smaller number, but each value of V can produce a slightly different error estimate. The optimal tree, which is derived from the maximal tree by pruning, can therefore differ from one V to another: each cross-validation run arrives at slightly different estimates of the sub-trees' error rates and thus might select a different tree as best.
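The same grow-then-prune-then-cross-validate workflow can be sketched with scikit-learn, whose decision trees implement a CART-style algorithm with minimal cost-complexity pruning (this is an open-source stand-in, not Salford's proprietary code). The maximal tree is grown once; each pruning level (indexed by `ccp_alpha`) defines a candidate sub-tree, and V-fold cross-validation estimates each candidate's accuracy:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Grow the maximal tree once; pruning only removes branches from it.
maximal = DecisionTreeClassifier(random_state=0).fit(X, y)
path = maximal.cost_complexity_pruning_path(X, y)

# Estimate each candidate sub-tree's accuracy by V-fold cross-validation.
V = 10
scores = [
    cross_val_score(
        DecisionTreeClassifier(random_state=0, ccp_alpha=alpha),
        X, y, cv=V,
    ).mean()
    for alpha in path.ccp_alphas
]
best_alpha = path.ccp_alphas[int(np.argmax(scores))]
print(f"best ccp_alpha with V={V}: {best_alpha:.5f}")
```

Re-running with a different `V` (or a different fold assignment) can shift the score estimates slightly and hence pick a different sub-tree, which is exactly the behavior described above.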

CART® automatically produces a predictor ranking (also known as variable importance) based on the contribution each predictor makes to the construction of the tree. Predictor rankings are strictly relative to a specific tree; change the tree and you might get very different rankings. A predictor earns importance by playing a role in the tree, either as a main splitter or as a surrogate. CART users have the option of fine-tuning the variable-importance algorithm.
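A simplified version of such a ranking can be sketched with scikit-learn's CART-style tree, which reports each predictor's total impurity reduction as a main splitter. Note this is only part of the story: unlike Salford's CART, scikit-learn does not credit surrogate splits, so the two rankings can differ for correlated predictors.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# Impurity-decrease importances, normalized to sum to 1,
# sorted from most to least important.
ranking = sorted(
    zip(data.feature_names, tree.feature_importances_),
    key=lambda pair: -pair[1],
)
for name, importance in ranking:
    print(f"{name:20s} {importance:.3f}")
```

As the text notes, these numbers are relative to this one tree: refitting on a different sample, or changing the depth, can reorder the ranking.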