Exercise

Introducing .SDcols

.SDcols specifies which columns of DT are included in .SD. It comes in handy when you have many columns and want to perform a particular operation on only a subset of them (apart from the grouping variable columns).

Using .SDcols allows you to apply a function to all rows of a data.table, but only to some of the columns. For example, consider the dog example from the last exercise. If you wanted to compute the average weight and age (the second and third columns) for all dogs, you could assign .SDcols accordingly:

dogs[, lapply(.SD, mean), .SDcols = 2:3]
   Weight Age
1:     56 5.2
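
As a minimal sketch, assuming a small made-up dogs table (the names and values below are hypothetical and are not the data from the exercise), the same columns can be selected either by position or by name:

```r
library(data.table)

# Hypothetical data for illustration only; not the exercise's dogs table
dogs <- data.table(
  Name   = c("Rex", "Bella", "Max"),
  Weight = c(60, 50, 58),
  Age    = c(4, 6, 5)
)

# Select the second and third columns by position
by_pos <- dogs[, lapply(.SD, mean), .SDcols = 2:3]

# Select the same columns by name
by_name <- dogs[, lapply(.SD, mean), .SDcols = c("Weight", "Age")]

print(by_name)
```

Selecting by name tends to be more robust than selecting by position, because it keeps working if the column order changes.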

While learning the data.table package, you may want to occasionally refer to the documentation. Have a look at ?data.table for more info on .SDcols.

Yet another data.table, DT, has been prepared for you in your workspace. Start by printing it to the console.

Instructions

Calculate the sum of the columns that start with Q, using .SD and .SDcols. Set .SDcols equal to 2:4.

Set .SDcols to be the result of a function call. This time, calculate the sum of columns H1 and H2 using paste0() to specify the .SDcols argument.

Finally, select all but the first row of the groups named 6 and 8, returning only the grp column and the columns that start with Q. Use -1 in i of .SD and use paste0() again. Type desired_result into the console to see what your answer should look like.
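
The three patterns above can be sketched as follows. This is only an illustration: the table below is a hypothetical stand-in with assumed column names (grp, Q1 to Q3, H1, H2) and invented values, so it will not reproduce desired_result from your workspace.

```r
library(data.table)

# Hypothetical stand-in for DT; the real DT in the workspace may differ
DT <- data.table(
  grp = c(6, 6, 8, 8, 8),
  Q1  = c(1, 2, 3, 4, 5),
  Q2  = c(5, 4, 3, 2, 1),
  Q3  = c(2, 2, 2, 2, 2),
  H1  = c(1, 0, 1, 0, 1),
  H2  = c(0, 1, 0, 1, 0)
)

# Sum the columns in positions 2 to 4 (here Q1, Q2, Q3)
q_sums <- DT[, lapply(.SD, sum), .SDcols = 2:4]

# .SDcols can be the result of a function call:
# paste0("H", 1:2) evaluates to c("H1", "H2")
h_sums <- DT[, lapply(.SD, sum), .SDcols = paste0("H", 1:2)]

# Within each group, .SD[-1] drops the first row;
# the grouping column grp is added to the result automatically
rest <- DT[, .SD[-1], by = grp, .SDcols = paste0("Q", 1:3)]

print(rest)
```

Because .SD[-1] is evaluated once per group, the first row is removed from each group separately rather than from the table as a whole.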