Staff Answer

The code you have should work. I would just make sure that "ipumsi_ddi_file" corresponds to the DDI file of your IPUMS USA extract. If the issue persists, you could load the full data set and then use the following to create a subset containing only your columns of interest:

data <- subset(test, select = c(YEAR, MET2013, PERWT))

Can you check to see if the following works for reading in a subset of the example extract included in the ipumsr package?
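The snippet that accompanied this question is not preserved in the thread. As a rough sketch of the kind of call under discussion, the `vars` argument of `read_ipums_micro()` limits which columns are loaded. The path `usa_00001.xml` is a hypothetical placeholder for your own DDI file, `read_selected_columns` is an illustrative helper name, and the read is guarded so that it only runs if ipumsr is installed and the file exists:

```r
# Hypothetical path to the DDI (.xml) file of your own extract (an assumption)
ddi_path <- "usa_00001.xml"

# Illustrative helper: load only the three columns of interest
read_selected_columns <- function(path) {
  ipumsr::read_ipums_micro(path, vars = c("YEAR", "MET2013", "PERWT"))
}

if (requireNamespace("ipumsr", quietly = TRUE) && file.exists(ddi_path)) {
  test2 <- read_selected_columns(ddi_path)
  # If the selection took effect, only the requested columns are listed
  print(names(test2))
}
```

If `names(test2)` still lists every variable in the extract, that would point toward the DDI file rather than the call itself.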

Answers

Your code works fine for me, so the issue seems to be the particular DDI file that I am using (i.e., test below has 55 columns while test2 has 2 columns). I am trying to limit the number of columns on the initial load for memory reasons; otherwise a select or subset statement like the one you suggested would work fine.

The other thing that would be helpful (given memory limitations) is if I could read in only a specific selection of rows at a time (such as one metropolitan area or state), but I'm not sure how to do that with ipumsr functions.

Thanks for making a ticket and for the suggestion to use a .dat file. To clarify, I wanted to read in a selection of the data extract with ipumsr, rather than request a smaller extract, because I am analyzing ~40 metropolitan statistical areas (MSAs), and requesting each one as a separate extract would be a bit of a pain.

However, if I had the ability to download the entire ~40 MSA dataset but only read in one MSA at a time into memory with ipumsr, then I could easily iterate through this dataset and not run up against memory limitations on a laptop. For the time being I think I will just use a smaller data extract for testing and then run the entire ~40 MSA dataset on a desktop with more memory when I have my code in a good state.

Returning to this old question to mention some brand new functionality in ipumsr.

The newest version of ipumsr (released over the weekend to CRAN) has a function `read_ipums_micro_chunked()` which allows for working with data in chunks without having to store the whole thing in memory. This could allow for filtering out the unwanted MSAs as you describe. For more details, see the "big data" vignette.
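As a minimal sketch of how that might look for the one-MSA-at-a-time case described above: the callback below is applied to each chunk and keeps only the rows for a single MSA, and ipumsr row-binds the per-chunk results. The path `usa_00001.xml`, the value in `target_msa`, and the name `keep_target_msa` are hypothetical placeholders, not values from this thread, and the read is guarded so it only runs if ipumsr is installed and the file exists:

```r
# Hypothetical values (assumptions): replace with your own DDI path and MET2013 code
ddi_path   <- "usa_00001.xml"
target_msa <- 35620

# Callback applied to each chunk: keep only the rows for the target MSA.
# `pos` (the chunk's starting row) is part of the callback interface but unused here.
keep_target_msa <- function(x, pos) {
  x[as.numeric(x$MET2013) == target_msa, , drop = FALSE]
}

if (requireNamespace("ipumsr", quietly = TRUE) && file.exists(ddi_path)) {
  one_msa <- ipumsr::read_ipums_micro_chunked(
    ddi_path,
    callback   = ipumsr::IpumsDataFrameCallback$new(keep_target_msa),
    chunk_size = 10000,                        # rows held in memory per chunk
    vars       = c("YEAR", "MET2013", "PERWT") # load only the needed columns
  )
}
```

Wrapping the read in a loop over a vector of MSA codes would allow each area to be processed in turn while holding only one chunk, plus the filtered rows, in memory.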
