#1071: loadtxt fails if the last column contains empty value
---------------------------------+------------------------------------------
Reporter: Electrion | Owner: somebody
Type: defect | Status: needs_review
Priority: normal | Milestone: 1.6.0
Component: numpy.lib | Version: devel
Keywords: loadtxt ascii strip |
---------------------------------+------------------------------------------
Changes (by derek):
* status: new => needs_review
* milestone: Unscheduled => 1.6.0
Comment:
Note: with the default delimiter, a tab is treated as 'any whitespace', which
includes any number of blanks or tabs.
Tabs are therefore treated differently from delimiters like ',' or '&'.
I'd reckon too many people rely on this behaviour for it to be changed silently
(e.g. I know plenty of tables with columns separated by either one or several
tabs, depending on the length of the previous entry).
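A minimal sketch of the distinction, using plain Python string splitting rather than loadtxt itself; the sample row is made up for illustration:

```python
# Hypothetical sample row: columns separated by one or two tabs.
row = "1.0\t\t2.0\t3.0"

# split(None) treats any run of whitespace as a single separator,
# so the doubled tab does not produce an extra column.
whitespace_fields = row.split(None)

# split("\t") treats every tab as a separator,
# so the doubled tab yields an empty field.
tab_fields = row.split("\t")

print(whitespace_fields)  # ['1.0', '2.0', '3.0']
print(tab_fields)         # ['1.0', '', '2.0', '3.0']
```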
But a tab is apparently also treated differently when it is specified explicitly
with "delimiter='\t'". In that case a converter à la
{{{ {2: lambda s: float(s or 'Nan')} }}} works for fields in the middle of the
line, but not at the end. This is the case that should be fixed.
Patch for 1.6.0beta1:
{{{
diff --git a/numpy/lib/npyio.py b/numpy/lib/npyio.py
index 9fbacaa..a1b58aa 100644
--- a/numpy/lib/npyio.py
+++ b/numpy/lib/npyio.py
@@ -724,7 +728,7 @@ def loadtxt(fname, dtype=float, comments='#', delimiter=None,
     def split_line(line):
         """Chop off comments, strip, and split at delimiter."""
-        line = asbytes(line).split(comments)[0].strip()
+        line = asbytes(line).split(comments)[0].strip(asbytes(' \r\n'))
         if line:
             return line.split(delimiter)
         else:
}}}
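The effect of the one-line change can be sketched outside NumPy. The two helpers below are simplified stand-ins for split_line (they use str instead of bytes and return an empty list for blank lines, both assumptions made for brevity), and the sample row is hypothetical:

```python
def split_line_old(line, delimiter=None, comments="#"):
    # Bare strip() removes ALL surrounding whitespace, including a trailing tab.
    line = line.split(comments)[0].strip()
    return line.split(delimiter) if line else []


def split_line_patched(line, delimiter=None, comments="#"):
    # Strip only blanks and line endings, so a trailing tab delimiter survives.
    line = line.split(comments)[0].strip(" \r\n")
    return line.split(delimiter) if line else []


# Hypothetical row with three tab-separated columns, the last one empty.
row = "1.0\t2.0\t\n"

old_fields = split_line_old(row, "\t")
new_fields = split_line_patched(row, "\t")

# Converter in the spirit of the one quoted in the comment above.
to_float = lambda s: float(s or "nan")
values = [to_float(field) for field in new_fields]

print(old_fields)  # ['1.0', '2.0']
print(new_fields)  # ['1.0', '2.0', '']
```

With the default delimiter (None) both variants give the same result, since split(None) discards leading and trailing whitespace on its own. With an explicit '\t', the patched variant also keeps a leading empty field if the line starts with a tab.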
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/1071#comment:2>
NumPy <http://projects.scipy.org/numpy>