Dataset

Relevant variables

The columns used in this example are:

twnr, which identifies participants

V1 to V23, which contain answers to each of the 23 RAPI items

Scoring the RAPI

Writing a scoring method for the splithalfr requires implementing two functions; a sets function that describes which sets of data should be split into halves and a score function that calculates a score.

Defining the sets function

The sets function receives data from a single participant and returns a list of datasets. In this case, we will generate a list with a single element, which contains a vector with data from each of the 23 items. As this data should be split by columns, we deliver the data as a vector.

Defining the score function

The score function receives the list from a single participant and returns the sum score.

rapi_fn_score <- function (sets) {
return (sum(sets$items))
}

Calculating a score without the splithalfr

By combining the sets and score functions, a score for a single participant can be calculated. For instance, the score of UserID 1 can be calculated via the statement below.

rapi_fn_score(rapi_fn_sets(subset(ds_rapi, twnr == 1)))

Calculating scores with the splithalfr

To calculate scores for each participant, call sh_apply with four arguments:

the dataset

the column that identifies participants in the dataset

the sets function

the score function

The sh_apply function will return a data frame with one row per participant, and two columns: one that identifies participants (“twnr” in this example) and a column “score”, that contains the output of the score function.

rapi_scores <- sh_apply(ds_rapi, "twnr", rapi_fn_sets, rapi_fn_score)

Checking scores

It is recommended to check your scoring method by calculating the score of a representative participant via a different approach. For splithalfr tests, the author has done so via Excel.

Estimating split-half reliability

Calculating split scores

To calculate split-half scores for each participant, call sh_apply with an additional split_count argument, which specifies how many splits should be calculated. For each participant and split, the splithalfr will randomly divide the dataset of each element of sets into two halves that differ at most by one in size. When called with a split_count argument that is higher than zero, sh_apply returns a data frame with the following columns:

twnr, which identifies participants

split, which counts splits

score_1, and score_2, which are the scores calculated for each of the split datasets

Estimating reliability averaged over splits

Next, the output of sh_apply can be analyzed in order to estimate reliability. By default, functions are provided that automatically calculate mean Spearman-Brown (mean_sb_by_split) and Flanagan-Rulon (mean_fr_by_split) coefficients. If any missing values were encountered in the data provided to these functions, they give a warning, and then pairwise remove the missing data before calculating reliability.