Extract a Subset of Data

Menu locations: Data_Organizing_Extract Subset of Data; Data_Cleaning and Encoding_Extract Subset of Data.

This function enables you to extract subsets of data from a column of numbers by using an expression that refers to either: (1) the cells in the selected block of data; or (2) one or more indicator variables that are different columns from the data you are selecting from but in corresponding rows.

For example (1), you might want to remove all of the negative values in a data set where negative values were used to indicate missing data:

Rat  Drug  Placebo
1    67    72
2    72    88
3    81    -9
4    56    66
5    -7    91
6    121   170
7    44    66

Use the Data_Grouping_Extract menu item with Drug and Placebo as the data variables to create a new variable with the expression "X<0" and a replace value of "":

Rat  Drug  Placebo
1    67    72
2    72    88
3    81
4    56    66
5          91
6    121   170
7    44    66
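The same replacement can be sketched outside the program in a few lines of Python. This is only an illustration of the logic, not the program's own implementation: the helper name blank_where is hypothetical, and None stands in for the empty replace value "".

```python
# Data from the Drug/Placebo table above; negative values mark missing data.
drug = [67, 72, 81, 56, -7, 121, 44]
placebo = [72, 88, -9, 66, 91, 170, 66]


def blank_where(values, condition, replacement=None):
    """Return a copy of values with every entry that meets condition replaced.

    Hypothetical helper: condition plays the role of the expression "X<0",
    and replacement (None here) plays the role of the empty replace value.
    """
    return [replacement if condition(x) else x for x in values]


# Apply the equivalent of the expression "X<0" to each data variable.
drug_clean = blank_where(drug, lambda x: x < 0)
placebo_clean = blank_where(placebo, lambda x: x < 0)

print(drug_clean)
print(placebo_clean)
```

Each column is treated separately, so rat 5 loses only its Drug value and rat 3 loses only its Placebo value; the rows themselves stay in place.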

For example (2), you might want to select only the rows from a score data column for which sex=2 and age>30; i.e. extract into one new variable, all data for females over 30:

Sex  Age  Score
1    23   2.3
1    33   3.1
1    19   3.2
2    21   1.9
2    43   4.2
2    39   5.0
2    26   1.1

Use the Data_Grouping_Extract menu item with Score as the data variable and Sex and Age as the indicator variables to create a new variable with the expression "X1=2 and X2>30":

Score [Sex=2 AND Age>30]
4.2
5.0
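As a rough sketch of how indicator-based selection works, the Python below picks out the Score values whose corresponding Sex and Age rows satisfy sex=2 and age>30. The function name extract is hypothetical and is not part of the program; the lambda arguments x1 and x2 mirror the X1 and X2 placeholders, taken in the order the indicator variables are listed.

```python
# Data from the Sex/Age/Score table above.
sex = [1, 1, 1, 2, 2, 2, 2]
age = [23, 33, 19, 21, 43, 39, 26]
score = [2.3, 3.1, 3.2, 1.9, 4.2, 5.0, 1.1]


def extract(data, indicators, condition):
    """Keep the data values whose corresponding indicator rows meet condition.

    Hypothetical helper: indicators is a list of columns aligned row by row
    with data, and condition receives one value from each indicator column.
    """
    return [
        value
        for value, *flags in zip(data, *indicators)
        if condition(*flags)
    ]


# x1 is Sex and x2 is Age: select females (sex=2) over 30.
females_over_30 = extract(score, [sex, age], lambda x1, x2: x1 == 2 and x2 > 30)

print(females_over_30)
```

Unlike the replacement in example (1), this produces a new, shorter variable that contains only the selected rows.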

You can compose an expression using X1, X2 etc. and any of the functions, operators and logical expressions described under user-defined function. Here are some examples of expressions for the example of the two indicators mentioned above:

X1: Sex

X2: Age

"X1=1" (not all of the selected/listed indicator variables have to be used)