On Fri, Feb 26, 2010 at 4:29 PM, Warren Weckesser <
warren.weckesser@enthought.com> wrote:
> Ralf Gommers wrote:
> > Hi all,
> >
> > I'm trying to read in data from text files with genfromtxt, and have
> > some trouble figuring out the right combination of keywords. The
> > format is:
> >
> > ['0\t\t4.000000000000000e+007,0.000000000000000e+000\n',
> > '\t9.860280631554179e-001,-1.902586503306264e-002\n',
> > '\t9.860280631554179e-001,-1.902586503306264e-002']
> >
> > Note that there are two delimiters, tab and comma. Also, the first
> > line has an extra integer plus tab (this is a repeating pattern).
> >
>
> The 'delimiter' keyword does not accept a list of strings. If it is a
> list, it must be a list of integers that are the field widths. In your
> case, that won't work.
>
> You could try fromregex:
>
> -----
> In [1]: import numpy as np
>
> In [2]: cat sample.raw
> 0 4.000e+007,0.00000e+000
> 9.8602806e-001,-1.9025e-002
> 9.8602806e-001,-1.9025e-002
> 123 5.0e6,100.0
> 10.1,-2.0e-3
> 10.2,-2.1e-3
>
> In [3]: a = np.fromregex('sample.raw', '(.*?)\t+(.*),(.*)',
>             np.dtype([('extra', 'S8'), ('x', float), ('y', float)]))
>
> In [4]: a
> Out[4]:
> array([('0', 40000000.0, 0.0), ('', 0.98602805999999998, -0.019025),
> ('', 0.98602805999999998, -0.019025), ('123', 5000000.0, 100.0),
> ('', 10.1, -0.002), ('', 10.199999999999999,
> -0.0020999999999999999)],
> dtype=[('extra', '|S8'), ('x', '<f8'), ('y', '<f8')])
>
> Note that the first field of the array is a string, not an integer. The
> string will be empty in rows that did not have the initial integer. I
> don't know if that will work for you.
>

That works, thanks. I had hoped that genfromtxt could do it because it can
skip the header and is presumably faster. But I'll take what I can get.

Cheers,
Ralf
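
For reference, a minimal sketch of one way genfromtxt might still be used here, assuming it is acceptable to preprocess the lines in Python first: every run of tabs is replaced by a single comma, so genfromtxt sees only one delimiter. The sample lines and the names `cleaned`, `data` and `block_id` are made up for illustration, and the forward-fill step assumes the first row always carries the integer.

```python
import re
import numpy as np

# Illustrative lines in the format described above: tab and comma delimiters,
# with a leading integer only on the first line of each block.
lines = [
    "0\t\t4.000000000000000e+007,0.000000000000000e+000\n",
    "\t9.860280631554179e-001,-1.902586503306264e-002\n",
    "\t9.860280631554179e-001,-1.902586503306264e-002\n",
    "123\t5.0e6,100.0\n",
    "\t10.1,-2.0e-3\n",
]

# Collapse each run of tabs into one comma so every row has three fields.
cleaned = [re.sub(r"\t+", ",", row) for row in lines]

# genfromtxt accepts a list of strings; empty first fields are treated as
# missing values and come back as nan in the float array.
data = np.genfromtxt(cleaned, delimiter=",")

# Forward-fill the integer column: for each row, find the index of the most
# recent row that actually had an integer (assumes row 0 has one).
first = data[:, 0]
has_id = ~np.isnan(first)
last_seen = np.maximum.accumulate(np.where(has_id, np.arange(len(first)), 0))
block_id = first[last_seen].astype(int)

print(block_id)
print(data[:, 1:])
```

When reading from a real file, the same substitution could be applied to the open file's lines before handing them to genfromtxt, and its `skip_header` keyword would then still be available. Whether this is actually faster than fromregex has not been measured here.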