group can be a numeric vector or categorical
column vector representing levels within a single variable, a cell
array containing one or more grouping variables, or a numeric matrix
or cell array of categorical column vectors representing levels within
multiple variables. If group is a numeric vector
or matrix, the values in each column must be positive integers in the
range from 1 to the number of levels for the corresponding
variable. In this case, dummyvar treats each column
as a separate numeric grouping variable. With multiple grouping variables,
the sets of dummy variable columns appear in the same order as the
grouping variables in group.

The order of the dummy variable columns in D matches
the order of the groups defined by group. When group is
a categorical vector, the groups and their order match the output
of getlabels(group). When group is
a numeric vector, dummyvar assumes that the groups
and their order are 1:max(group). In this respect, dummyvar treats
a numeric grouping variable differently from grp2idx, which assigns
indices only to the values that occur in the data.
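
The following is a minimal sketch of that difference, assuming Statistics
and Machine Learning Toolbox is available. The data values are made up
for illustration.

```matlab
% Numeric grouping variable that never uses levels 1 and 3.
group = [2; 4; 4];

% dummyvar assumes the levels are 1:max(group), so D has four columns.
% The columns for the unused levels 1 and 3 contain only zeros.
D = dummyvar(group);

% grp2idx numbers only the values that actually occur, in sorted order,
% so the two distinct values 2 and 4 become indices 1 and 2.
idx = grp2idx(group);
```

If the all-zero columns are unwanted, one option is to pass idx rather
than group to dummyvar, which yields one column per observed value.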

If group is n-by-p, D is n-by-S,
where S is the sum of the number of levels in each
of the columns of group. The number of levels s in
any column of group is the maximum positive integer
in the column (for a numeric column) or the number of categorical
levels (for a categorical column). Levels are considered
distinct if they appear in different columns of group,
even if they have the same value. Columns of D are,
from left to right, dummy variables created from the first column
of group, followed by dummy variables created from
the second column of group, and so on.
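
As a rough illustration of the dimensions, the sketch below uses a small
made-up numeric matrix with n = 4 rows and p = 2 columns. The variable
name S mirrors the symbol used above.

```matlab
% Column 1 has 2 levels; column 2 has 3 levels.
group = [1 1
         1 2
         2 3
         2 1];

D = dummyvar(group);

% For numeric input, the level count per column is the column maximum,
% so S = 2 + 3 = 5 and D is 4-by-5.
S = sum(max(group));

% Columns 1:2 come from group(:,1); columns 3:5 come from group(:,2).
D1 = D(:, 1:2);
D2 = D(:, 3:5);
```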

dummyvar treats NaN values
or undefined categorical levels in group as missing
data and returns NaN values in D.
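
One way to locate the affected observations is sketched below. It relies
only on the behavior described above; the variable name missingRows is
illustrative.

```matlab
% The second observation has no group assignment.
group = [1; NaN; 2];

D = dummyvar(group);

% Flag the rows of D that contain NaN so they can be handled
% explicitly, for example excluded before fitting a model.
missingRows = any(isnan(D), 2);
Dcomplete = D(~missingRows, :);
```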

Dummy variables are used in regression analysis and ANOVA to
indicate values of categorical predictors.

Note:
If a column of 1s is added to the matrix D,
the resulting matrix X = [ones(size(D,1),1) D] is
rank deficient. The matrix D itself is
rank deficient if group has multiple columns. This
is because the dummy variables produced from any one column of group
always sum to a column of 1s. Regression and ANOVA calculations often
address this issue by eliminating one dummy variable from each set of
dummy variables produced by a column of group, which implicitly sets
the coefficient for each dropped column to zero.
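
The rank deficiency can be checked directly. The sketch below uses a
made-up three-level grouping variable; dropping the first dummy column
is just one possible choice of reference level.

```matlab
group = [1; 2; 3; 1; 2; 3];
D = dummyvar(group);                 % 6-by-3; each row sums to 1

% Intercept plus all dummy columns: the column of 1s equals the sum
% of the three dummy columns, so X has 4 columns but rank 3.
X = [ones(size(D,1),1) D];
rX = rank(X);

% Drop the first dummy column so that level 1 is the reference level.
Xreduced = [ones(size(D,1),1) D(:,2:end)];
rReduced = rank(Xreduced);           % full column rank
```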

Examples

Suppose you are studying the effects of two machines and three
operators on a process. Use group to organize predictor
data on machine-operator combinations: