/ Operator not meaningful for factors

Folks,
I have a very basic question. The solution eludes me perhaps because
of my own lack of creativity. I am not attaching a fully reproducible
session because the issue may well be because of the way the data file
is, and the data file is large (and I don't know whether I can legally
distribute it). If people can suggest things that might be wrong in my
data or the way that I am reading it, I would be most grateful.

I get the following error message in the session quoted at the end of
this email:
/ not meaningful for factors in: Ops.factor(BookValuePS, Price)

As you can see in that same session, I check that the two vectors
being divided are numeric. I also check that the divisor is not 0 at
any index. I also believe that this is not because of the NA's in the
data. My question is, what are other "problems" that can cause the /
operator to not be meaningful?

Re: / Operator not meaningful for factors

The mode of a factor is numeric, so your test does not do what you think
it does.

is.numeric() is the recommended test of a vector being numeric. I have no
idea where you got the idea that mode() was a useful test (perhaps you
could give us the reference you used), but it rather rarely is (typeof is
usually more informative).

From the summary quoted, Price is clearly a factor. Test it with
is.factor.
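
To make the difference between these tests concrete, here is a minimal sketch. The vector Price below is made-up data standing in for the poster's column, not the real file:

```r
# A factor built from strings, one of which is not a number (hypothetical data).
Price <- factor(c("10.5", "12.0", "n/a", "9.75"))

mode(Price)        # "numeric": describes how the level codes are stored
typeof(Price)      # "integer": the codes themselves
class(Price)       # "factor"
is.numeric(Price)  # FALSE
is.factor(Price)   # TRUE
```

So mode() says "numeric" for any factor, whatever its labels are, while is.numeric() and is.factor() answer the question that matters before doing arithmetic.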

On Sun, 15 Jan 2006, Vivek Satsangi wrote:

> Folks,
> I have a very basic question. The solution eludes me perhaps because
> of my own lack of creativity. I am not attaching a fully reproducible
> session because the issue may well be because of the way the data file
> is, and the data file is large (and I don't know whether I can legally
> distribute it). If people can suggest things that might be wrong in my
> data or the way that I am reading it, I would be most grateful.
>
> I get the following error message in the session quoted at the end of
> this email:
> / not meaningful for factors in: Ops.factor(BookValuePS, Price)
>
> As you can see in that same session, I check that the two vectors
> being divided are numeric.

(see the request above for your reference here)

> I also check that the divisor is not 0 at any index. I also believe that
> this is not because of the NA's in the data. My question is, what are
> other "problems" that can cause the / operator to not be meaningful?

Why not test for factor, since that is what the very helpful error message
told you the problem was?

Re: / Operator not meaningful for factors

Sir,
I made the (incorrect, probably unjustified) deduction of using mode()
based on section 3.1 of "An Introduction to R". Since the write up
talks about the "mode" of an object, and using attr() did not work (it
gives some error saying that "mode of name must be character"), I
tried mode() and reached this incorrect conclusion.

I have had this confusion for a while now about the fact that
something is numeric AND it is a factor, since if it were just a
vector and not a factor, it would still be numeric, as in:
> a <- c (1, 2, 3);
> class(a);
[1] "numeric"

I'll try to think of a way to improve the explanation in "An
Introduction to R" so that the next person coming along does not fall
into the same pit.

Re: / Operator not meaningful for factors

This will sound very stupid because I just started using R but I see you had similar problems.

I just loaded a very large dataset (2950*6602) from csv into R. The format is ticker=row, date=column.
Every time I want to compute basic operations, R returns "In Ops.factor: not meaningful for factors"

I believe it is because R does not read the data as numbers but I am not sure. Can anybody help?

Re: / Operator not meaningful for factors

On May 3, 2010, at 6:22 PM, vincent.deluard wrote:

> Hi there,
>
> This will sound very stupid because I just started using R but I see
> you had similar problems.
>
> I just loaded a very large dataset (2950*6602) from csv into R. The
> format is ticker=row, date=column.

Not a particularly precise description of what is in the data.

> Every time I want to compute basic operations, R returns "In
> Ops.factor: not meaningful for factors"

Code .... we want to see code. All of it.
>
> I believe it is because R does not read the data as numbers

Probably true. Were they dates or numbers?

> but I am not
> sure. Can anybody help?

I generally read in my data with read.table( ,,,,, as.is=TRUE,
stringsAsFactors=FALSE, ....) and then convert the columns that I know
should be numeric with as.numeric. If all of your columns should have
been numeric, you might use the colClasses argument, perhaps along
these lines:
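
(The code that followed is not preserved in this copy of the thread. As a rough sketch of the colClasses idea, not the poster's original code, and using a made-up inline CSV in place of the real file, something like the following declares the column types up front:)

```r
# Hypothetical CSV: first column is a ticker label, the rest are numbers.
csv_text <- "ticker,d1,d2\nAAA,1.5,2.5\nBBB,3.0,4.0\n"

# Declare the type of every column when reading. If a column declared
# "numeric" contains something that is not a number, the read stops
# with an error instead of silently producing a factor.
dat <- read.csv(text = csv_text,
                colClasses = c("character", "numeric", "numeric"))

sapply(dat, class)   # class of each column after reading
```

For a real file, the text = csv_text argument would be replaced by the file path, and colClasses would need one entry per column (or a single value that is recycled if all columns share a type).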

Re: / Operator not meaningful for factors

I think that you are correct. R has the annoying habit of converting character data to factors when you don't want it to while it is importing data. This is because the option "stringsAsFactors" is set to TRUE for some weird historical reasons.

Try the command str(insert name of data) and see what happens. It should show you which columns of data are being treated as factors.

You can convert them back to character or to numeric. See the R FAQ, Part 7, "How do I convert factors to numeric?", or you can set the "strings as factors" option in read.table to FALSE.

Something like this should work, I think, but it's not tested
read.table("C:/rdata/trees.csv", stringsAsFactors=FALSE)
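
As a small illustration of the str() check and the FAQ conversion, here is a sketch on a made-up inline CSV (the data and column names are assumptions, not the poster's file). Note that since R 4.0.0 the default is stringsAsFactors = FALSE, so the old behaviour has to be requested explicitly to reproduce the problem:

```r
# Hypothetical CSV with one stray non-numeric entry in the price column.
csv_text <- "ticker,price\nAAA,10.5\nBBB,n/a\nCCC,9.75\n"

# stringsAsFactors = TRUE reproduces the old default behaviour.
dat <- read.csv(text = csv_text, stringsAsFactors = TRUE)
str(dat)   # price is reported as a Factor, not as num

# as.numeric() directly on a factor returns the internal level codes,
# not the values printed on screen.
codes <- as.numeric(dat$price)

# Conversion described in the R FAQ: go through the level labels.
# The "n/a" entry becomes NA (the coercion warning is silenced here).
price <- suppressWarnings(as.numeric(as.character(dat$price)))
```

The difference between codes and price is the usual trap: the first is only a numbering of the levels, the second holds the actual numbers.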

Re: / Operator not meaningful for factors

Check your input data. You have some non-numeric characters in columns where
you are expecting numerics.

The parameter is

stringsAsFactors=FALSE

You had it spelt wrong. After reading in your data, do the conversion to
numeric and then examine which locations contain NA; this will point you to
the problem line in your input. Also, if you use 'colClasses', I think it
will raise an error at the offending line.
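
A minimal sketch of that "convert, then look for NA" check, using a made-up character column (the values are assumptions for illustration only):

```r
# Hypothetical column as read with stringsAsFactors = FALSE.
raw <- c("10.5", "12.0", "1,234.5", "9.75", "-")

# Convert, silencing the "NAs introduced by coercion" warning.
num <- suppressWarnings(as.numeric(raw))

# Entries that were not missing before conversion but are NA afterwards
# are the ones that could not be parsed as numbers.
bad <- which(is.na(num) & !is.na(raw))

bad        # positions of the offending entries
raw[bad]   # the offending strings themselves
```

In a data frame the same test can be applied column by column; the positions in bad are row numbers of the data, so the line in the input file is offset by any header lines.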

Re: / Operator not meaningful for factors

At 3:50 PM -0700 5/3/10, John Kane wrote:
> I think that you are correct. R has the annoying habit of
> converting character data to factors when you don't want it to while
> it is importing data. This is because the option
> "stringsAsFactors" is set to TRUE for some weird historical reasons.

Well, "annoying" is in the eye of the beholder. The reason is not
weird at all; the original S language, upon which R is based, was
designed first for statistical analysis. When the language was
expanded to include advanced modeling capabilities (linear models,
generalized linear models, and more) it became apparent that factors
are the appropriate form for using categorical data in such models.
It is still the "R Project for Statistical Computing" (see the R home
page), so the default is unchanged.

Hence, when users get factors when they were expecting numbers, it's
virtually always because they have some non-numeric character strings
mixed in with the data. R then defaults to interpreting it as
categorical data, represented as a factor.

Re: / Operator not meaningful for factors

> I think that you are correct. R has the annoying habit of converting
> character data to factors when you don't want it to while it is
> importing data. This is because the option "stringsAsFactors" is set
> to TRUE for some weird historical reasons.

It is a matter of opinion. I consider it quite a useful feature. If I see from
str(some.data) or summary(some.data) that numeric columns are factors, I know
something is wrong with the input.

And when I want to use ggplot, xyplot or just plot my data with different
colours/sizes/pchs/..., it is quite easy to use as.numeric(my.factor) to
get a numeric representation of the levels.

Finally you can easily change labels, concatenate levels and so on.
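
A short sketch of those factor operations, on a made-up grouping variable (the names grp, relabelled and merged are only for illustration):

```r
# Hypothetical grouping variable.
grp <- factor(c("low", "high", "mid", "high", "low"))

# Integer codes of the levels (levels are sorted alphabetically by default),
# usable as colour or plotting-symbol indices, e.g. plot(x, y, col = codes).
codes <- as.numeric(grp)

# Rename the levels; the data follow automatically.
relabelled <- grp
levels(relabelled) <- c("H", "L", "M")   # same order as levels(grp)

# Merge two levels by giving them the same label.
merged <- grp
levels(merged)[levels(merged) %in% c("low", "mid")] <- "low_or_mid"
```

Here the codes are exactly what is wanted, because they are used as indices and not as data values.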

Just my 2 cents.

Regards
Petr


Re: / Operator not meaningful for factors

> another way to do it (if you are still having problems) is to use
> sapply(yourdataname, data.class) after you've read it in, which will tell
> you the data class of each of your variables (factor, numeric etc). You can then
