Data Scientist Jobs

So I am about to graduate college in May and am looking heavily into Data Science vocations. On a whim, I was just wondering if anyone else here is currently in that field of work/how their experience has been so far.

Big data analytics is definitely more what I'm thinking. I don't have nearly the specific technical knowledge to go into genomic data research. I'm thinking more along the lines of work for large corporations. Basically my dream is to take every piece of data I can for a company, and make sense out of it for whoever needs it.

I'm assuming since you're about to graduate that you aren't married or have any family responsibilities. I'm also assuming that you have all the big data buzzwords hadooping around in your brain.

Do you like to travel, especially flying? Live out of a hotel room for weeks on end? Do you mind living in a new city every 6-8 weeks?

Are you a good BS'er? Can you get a CEO/CFO/COO to believe what you're telling them more than their staff who have a lot more real world experience in the subject matter at hand?

The upshot is that if you can do these things and get the right placement you can easily bring home at least $120-150K a year, perhaps as much as $300K+ a year depending on the gigs you can get. The downside is that you can expect to be kicked to the curb at about age 30, when young hotshots can replace you or you have to stay home more because of the wife and kids. Then you get to be the staffer who has to deal with the dreaded consultants coming in to upend your life.

Thank you so much for that reply! I was definitely going more of the approach of working for a specific company. For example, one of the positions I applied to was at General Motors. But I like those companies a lot and will be trying to find some way to apply to them.

Gas-man:

"Data Scientist" can be a bit of a misleading title, and the actual responsibilities can change depending on who you work for. The way I like to think of it is as a cross between a statistician and a computer scientist (using the programs that actually deal with the data can involve a bit of programming, but not so much that it would require a full-on undergrad degree in CS). So it's like making a "Science" out of data analytics: looking for trends, finding causes and effects, etc. To make it even a little more confusing for you, most schools don't even offer an undergrad degree in "Data Science". For example, I'm getting my degree (BS) in Applied Math. Most of the other people I know who are applying for jobs similar to mine are also Applied Math people. Sometimes they're Comp Sci majors who also happen to be really talented mathematicians, and every now and then I'll run into an Economics major as well, but that's pretty rare.

Lol, is that about as clear as mud for you now? Obviously I love this stuff, so I'm happy to explain more about it.

That's a skill that's in huge demand right now. And frankly, it's not hard to learn from a syntax standpoint. It's more important to have a good understanding of the underlying statistical concepts.

So, do you have much training in statistics?

I actually did some research at Fermilab years ago where I used R. I'm really rusty right now, but I just downloaded it again last night, and I'm gonna brush up. Definitely a big deal right now, and that seems to be the biggest thing people are putting as a pre-req. I honestly feel way more comfortable using C++, but that's just because I've been using that more recently, obviously not because it's an easier language.

To 'make sense' of business data, you may need some in-depth knowledge of business principles and data uses to sell your data distillations to Division VPs, Heads of Sales/Marketing groups, CEOs, CFOs and board members. I mean, you won't personally be there to do the presentation (your boss, or their boss, will), but you need to know your sh!t so it will make sense to them.

How did you do in courses related to Macro/Micro Economics? Sales and Marketing? Statistics?

All your assumptions are correct sir! And traveling is great for me. I actually used to be on my university's track team, and would fly to a different city every 2 weeks, and I honestly loved it. I can see myself getting tired of living out of a suitcase in the future, but for now, while I don't really have roots anywhere, I'm down for seeing new places and whatnot.

I do consider persuasion to be one of my strengths. And the thing is, it's not necessarily about the money for me. I know a lot of people my age can say that, but dealing with numbers is just what I like to do. I think I was just born in a time period where that turned out to be a lucrative option for me, so I'll take it.

Right, and from a syntax standpoint, R is vastly more forgiving than C++.

The problem with R is that it has a million built-in functions for doing statistical analysis that will let you "perform the analysis" without understanding what you're really doing.

I can give you a list of random numbers, and you can plug them into a regression function no problem. And R will give you an answer. But R can't know that your numbers are purely random, and that its output is, accordingly, completely meaningless.

In the old-school phrasing, it's garbage-in, garbage-out. R just makes it a lot easier to dump the garbage in!

On the other hand, if you understand the underlying statistical assumptions of the models, and the limitations of the data, R can be an extremely powerful tool.
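A minimal sketch of that point, in Python rather than R and not part of the original post: the hypothetical helper `fit_line` below does an ordinary least-squares fit by hand, and it is fed two lists of random numbers generated independently of each other. The names, the seed and the sample size of 200 are all assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x. Also returns R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # share of variance in y "explained" by x
    return slope, intercept, r_squared


rng = random.Random(0)  # fixed seed (an arbitrary choice) so reruns match
xs = [rng.random() for _ in range(200)]
ys = [rng.random() for _ in range(200)]  # generated independently of xs

slope, intercept, r_squared = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")
```

The fit runs without complaint and prints a slope and an intercept, exactly as a regression function would. The only hint that the "model" means nothing is an R^2 that you'd expect to sit close to zero, and that only helps if you know to look at it.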

Yeah, the fact that I don't have business courses definitely doesn't help me any. I didn't take any Econ classes, but I've taken 1 semester of Math Stats, and I'm currently in the 2nd semester. Those classes are pretty easy compared to a lot of Math classes. But I have a ton of interest in the pure business side of things too.

Lol yeah I definitely get that, but to be fair, I think that's a problem with a lot of computer languages/programs in general. After all, R is a tool to help you solve the problem, not the answer itself. Otherwise, you could do like you said, and just throw numbers at it all day, feel super content that you have a number afterwards, and tell yourself that correlation always implies causation, but in the end that's just wrong and your company is gonna lose a lot of $$$ with that sort of mindset. I also have a very big interest in computer programming in general (mainly C++ like I said before) and I've always been a big advocate of the "computers are only as smart as the people that use them" mentality.

Data scientists are serious people with a serious background. Not entry level. You will need to know R, but also various flavors of SQL. But more than that, you need to know the business and how to ask the right questions.

Big data is only getting bigger. Hadoop is just the beginning. There's going to be a lot of work for some very smart people in the future.

Also, in the future things will be less dependent on R and Pandas, and more about visualizing data. It's not about the data, but the story the data tells. This story will need to be told in a way that people who are not data scientists can understand.

The biggest hurdle is the condition of data. As I said before most of the work (80%) is about cleaning dirty data. When you have a huge data set, this can be very time consuming as all transformations need to be processed and ingested.
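For a rough idea of what that cleaning looks like in practice, here is a small Python sketch. It is only an illustration under assumed inputs: the hypothetical helper `clean_rows`, the (name, amount) layout and the sample rows are all made up, not taken from any real data set or product.

```python
def clean_rows(rows):
    """Normalize (name, amount) string pairs; drop unusable and duplicate rows."""
    seen = set()
    cleaned = []
    for name, amount in rows:
        # Collapse stray whitespace and unify capitalization.
        name = " ".join(name.split()).title()
        try:
            # Strip currency formatting before parsing the number.
            value = float(amount.replace("$", "").replace(",", ""))
        except ValueError:
            continue  # amount is not a number, so the row is unusable
        if not name:
            continue  # a row without a name cannot be attributed to anyone
        key = (name, value)
        if key in seen:
            continue  # same record entered twice in different formats
        seen.add(key)
        cleaned.append(key)
    return cleaned


# Hypothetical raw rows with the usual problems: padding, mixed case,
# currency symbols, placeholder text, missing names, near-duplicates.
raw = [
    ("  acme corp ", "$1,200.50"),
    ("ACME CORP", "1200.50"),
    ("Widget  Co", "n/a"),
    ("", "15"),
    ("Widget Co", "300"),
]
print(clean_rows(raw))
```

Five raw rows go in and only two usable ones come out. Every rule in the loop is a judgment call about the data, which is a big part of why this step eats so much time on a large data set.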

Full disclosure. I work for Oracle. I was one of the designers of Oracle Big Data Discovery. Available NOW!!

I've been designing software for data scientists, and have studied the persona.

You should probably have a PhD in computer science, an MBA and a background in statistics.

Most of your time will be spent finding and cleaning data.

That is so cool man! I know I started this whole thread, but I honestly think software development is like the coolest field! If I could do college over again, it would be really, really close between Math and Comp Sci.

I really appreciate your insight, but I'm pretty damn tired of school right now, and I really want to get a job. I am strongly considering furthering my education while I work, though (hopefully while I'm working... as in I really really want a job lol).