I would like to read really large text files (about 2 GB) into Mathematica. The first row and the first column of the file are text, i.e. strings; everything else is numbers. Is it possible to take advantage of this structure to make the import faster? Currently I use Import["file.txt","Table"], which takes a long time. I would like the output list to be the same as the one returned by that Import command.

Import is often about the slowest option you have, particularly for tabular data. BinaryRead or BinaryReadList can be much faster, and Java can give you faster reads still if you use buffered reads.
–
Leonid Shifrin, Dec 4 '12 at 20:06

The three links above, especially the third one, are related but not exactly what I need. Memory is not a problem for me; I am only concerned about speed. I was also wondering whether telling Mathematica in advance that the entries are going to be numbers would save time. In R, for example, this helps a lot. @LeonidShifrin, can you please explain your suggestion? Sorry, I was not able to understand it.
–
preeti, Dec 4 '12 at 20:39

1 Answer

ReadList and streams opened with OpenRead are your friends (also OpenAppend if you want to append to a file). Streams opened with OpenRead use very low-level native I/O, which is about as fast as you can get. ReadList is also much faster than Import, which has to load a Java package internally on its first invocation before it can do anything.
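As a minimal sketch of this approach for the file layout in the question: the header line is read once as a string, and every remaining line is read as one Word followed by Numbers, so ReadList knows the types in advance. The helper name readLabeledTable is hypothetical, and the sketch assumes the fields are separated by spaces or tabs, that the header row has as many fields as each data row, and that there are no missing or non-numeric entries in the numeric part.

```mathematica
(* Hypothetical helper: read a whitespace-delimited table whose first row
   and first column are strings and whose remaining entries are numbers. *)
readLabeledTable[file_String] := Module[{stream, header, ncols, rows},
  stream = OpenRead[file];
  (* First line: column labels, split on whitespace. *)
  header = StringSplit[Read[stream, String]];
  (* Assumption: every data row has the same number of fields as the header. *)
  ncols = Length[header];
  (* Remaining lines: one row label (Word) followed by ncols - 1 numbers. *)
  rows = ReadList[stream, Join[{Word}, ConstantArray[Number, ncols - 1]]];
  Close[stream];
  (* Put the header back on top, giving one list per line of the file. *)
  Prepend[rows, header]
]
```

It would be called as data = readLabeledTable["file.txt"]. For a file that meets the assumptions above, the result should have the same nested-list shape as Import["file.txt","Table"], but it is worth comparing the two on a small sample of your data before relying on it.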
