Posted by Roblimo on Thursday March 20, 2014 @04:52PM
from the knowledge-you-might-need-someday-even-if-you-don't-need-it-now dept.

This is a wide-ranging interview with Dev Patel and Poulomi Damany of BitYota, an Analytics as a Service startup that works specifically with MongoDB. Open Source? Not yet. But hopefully soon, they say. And why should an IT person or programmer care about marketing-oriented analytics? Because the more you know about functions in your company besides IT (such as finance, investor relations, and -- yes -- marketing), the more valuable you are as an employee. Dev also mentions the two main things he looks for when recruiting for BitYota: "One is intellect, and the other is attitude." He points out that this is not true merely of BitYota, but of any strong startup.
This is all good information for any job-seeker hoping to land a spot with a startup -- and for anyone who is happy with where he or she works but hopes to earn promotions and raises, too.

Dev Patel: I am Dev Patel.

Poulomi Damany: I am Poulomi Damany.

Robin Miller: And these folks work for BitYota, which does analytics, specifically on MongoDB. Why MongoDB, if I may ask?

Poulomi Damany: Okay. I think what we’re seeing is that today’s fast-changing mobile and web apps need the flexibility of a database like MongoDB, so you can continue to add features and save that directly without having to go through schema changes, without having to do a new data model. So Mongo is being adopted by people who are interested in doing fast development, new apps, and so we want to help those people actually do analytics on that data in Mongo. So, in keeping with the theme of the Internet, insights need to be in real time, changes need to be in real time, analytics need to be in real time as well. And that’s why one of BitYota’s areas of focus is MongoDB analytics.

Robin Miller: So, you did not choose to go with MySQL, owned by Oracle. Why not?

Dev Patel: Semi-structured data is the new flexible data type. As Poulomi explained, application developers can change it as the application evolves. As applications are going through, maybe, testing, they could add new features, and as soon as those new features are added or removed, they want to understand the impact on user experience or [comas] based on those new features. All this is happening fast in the semi-structured world, and that’s where operational databases like Mongo are doing extremely well. And as Poulomi was explaining, we want to be able to provide analytics as close as possible to the time when the data is being produced, rather than have to wait till the end of day or end of week. These kinds of new requirements, where you want to provide analytics at low latency over fresh data, are the kind of problems the industry needs to be thinking of solving, and one way of doing that is to start solving them over the next-generation SQL databases and provide analytics capable of using those.

Poulomi Damany: And really, MySQL is a transactional system, so you might be a startup and you might throw your payment systems stuff into MySQL, and then your website traffic, your user profiling, and your product catalog into Mongo. And frequently you’d want to join between those two, so you want to say: who are all my best customers, and how much money do they spend? So you need a system that allows you to bring multiple structures of data together.

Robin Miller: What I keep hearing, and not just from you guys, is this emphasis on immediate analytics rather than waiting for the end of the day. Other than if you’re broadcasting the Olympics live or whatever, aside from a live broadcast, who really needs their analytics that quickly?

Dev Patel: Let me give you a couple of examples, Robin.

Robin Miller: Yes.

Dev Patel: If you’re doing a marketing campaign and you’re doing promotions, you’ve got end of season promotion and end of product line promotion, you’ve got tickets at – I’ll tease you a little, cricket match between England and Australia, or a baseball

Dev Patel: We have a shorter version too, to please the American audience; it finishes in 45 minutes. But you’ve got inventory that will expire, whether it’s tickets, whether it’s product lines, whatever that may be. You’re doing a promotion, and you want to know whether a promotion to a particular audience is working. You don’t want to wait for the end of day to understand whether it worked or not. You want to be able to answer analytical questions like: hey, if it worked for this audience, where should we be doing it next? Where should we push, by geo, by particular audience segments, by particular channel? Hey, am I doing very well if I’m advertising through Twitter versus Facebook? All of that you need to know quickly in many scenarios, and therefore you are continuously hearing that we want analytics sooner than end of day. And therefore the popularity of having business needs answered in a much shorter timeframe than they were before.

Robin Miller: So it’s good that our IT and programmer audience on Slashdot is aware of this, because they tend to be, frankly, not very marketing oriented. I’m not either. But the bosses do come down and say, do this. And what you’re saying is, if these people, the IT people, remember this, they won’t be going, ???, but will instead come across as educated to their bosses, and they will be ready to spring into action and set up real-time or close to real-time analytics, correct?

Poulomi Damany: Yeah, agreed, and there are not just revenue reasons. You could have a new version of an application and it’s crashing on certain browsers. Do you want to wait 48 hours to understand that? Because you’d have varied desktop users, right? Or you have a fraud situation, or you have spam going on. In all of these, the need for analytics is moving upstream much, much quicker.

Robin Miller: Those are very good examples, and they’re ones that our IT and programmer audience will definitely get their minds around. What else for real-time analytics, what else?

Dev Patel: The ability to join data from different sources is important. Take the example that I quoted earlier: your web viewstream or clickstream data from your mobile app can come in as a JSON document through MongoDB, but your transactional systems are still MySQL or Oracle. How do you join data between those two streams very quickly? One stream is in JSON file format, the other is in a structured file format like CSV; how do you join that? Traditional systems can’t integrate with the JSON type, so you have to convert the documents into a structured form and then do the joins, so you’re going through some translation of the JSON file to a structured CSV. If your upstream application is changing the JSON, then your ETL, which is this translation piece, needs to continuously evolve. How can we have systems where you’re integrating with the native file formats, JSON and CSV, and you are storing the data in its native form without the need for translation, and now you’re able to join across both these file formats to answer questions like: who are my big spenders in the last five minutes? Who are my big spenders in the last hour? How did my big spender influence somebody else in a social stream, whereby I got another three new customers in the last hour?
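[Editor's illustration, not from the interview.] The kind of cross-format join Dev describes can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the sample data, the field names (`user`, `channel`, `user_id`, `amount`, `paid_at`), and the helper `big_spenders` are all hypothetical, and it says nothing about how BitYota's own system works. It reads JSON documents and CSV rows in their native formats and answers "who are my big spenders in the last hour?"

```python
import csv
import io
import json
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical clickstream events, one JSON document per line.
EVENTS_JSONL = """\
{"user": {"id": "u1", "name": "Ann"}, "event": "view", "channel": "twitter"}
{"user": {"id": "u2", "name": "Bob"}, "event": "click", "channel": "facebook"}
"""

# Hypothetical payment rows as exported from a transactional system.
PAYMENTS_CSV = """\
user_id,amount,paid_at
u1,40.00,2014-03-20T16:10:00
u1,75.50,2014-03-20T16:40:00
u2,20.00,2014-03-20T12:00:00
"""


def big_spenders(events_jsonl, payments_csv, now, window, min_total):
    """Join JSON events with CSV payments on the user id.

    Returns users whose payments inside the time window sum to at
    least min_total, largest spender first.
    """
    since = now - window

    # Sum each user's payments that fall inside the window (CSV side).
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(payments_csv)):
        paid_at = datetime.fromisoformat(row["paid_at"])
        if since <= paid_at <= now:
            totals[row["user_id"]] += float(row["amount"])

    # Walk the JSON documents and attach the totals (JSON side).
    result = []
    seen = set()
    for line in events_jsonl.splitlines():
        if not line.strip():
            continue
        doc = json.loads(line)
        user_id = doc["user"]["id"]
        if user_id in seen:
            continue
        seen.add(user_id)
        total = totals.get(user_id, 0.0)
        if total >= min_total:
            result.append({
                "user_id": user_id,
                "name": doc["user"]["name"],
                "channel": doc.get("channel"),
                "total": total,
            })
    return sorted(result, key=lambda r: r["total"], reverse=True)


NOW = datetime(2014, 3, 20, 16, 52)
last_hour = big_spenders(EVENTS_JSONL, PAYMENTS_CSV, NOW,
                         timedelta(hours=1), min_total=100.0)
print(last_hour)
```

Changing `window` to `timedelta(minutes=5)` asks the "last five minutes" version of the same question. Because the JSON is read with `doc.get(...)` for optional fields, a new field added upstream does not break the join, which is the ETL concern raised above.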

Robin Miller: That’s a good one.

Dev Patel: As there is a proliferation of information transfer through social networks, through people communicating by word of mouth through social, through mobile applications, how do you bring that in to understand things like user value? How do you bring that in to understand, hey, who’s influencing my staff, as an example? Those are the kinds of examples where data from different sources needs to be joined and understood, and that data is coming in different file formats.

Robin Miller: Well, I mean, to say something that’s exactly opposite: we’re being happy and positive and optimistic here, but it sounds to me like one of the big uses of real-time analytics is to spot negatives, as when somebody just said your product causes cancer or whatever. It could be anything, but you know what I mean, as ammunition to fight against being bad-mouthed in the social media.

Dev Patel: Absolutely. I mean, look, it’s a gruesome example, if your product causes cancer. But your point is critical in that marketing people want to catch negative messages as much as catch positive sentiments. It’s the proliferation of negatives that they want to address correctly and very quickly, because news travels fast these days.

Robin Miller: And it does, and I mean, this is a real reason why people need to really just track everything anybody is saying about them. Even if 99.9% is good, that last point one can kill sales even if it’s false. A classic American example was the Tylenol scare. Branded Tylenol is perfectly safe, it had nothing, and yet the company had to pull everything that was on the shelves. They freaked out, and they did it so well that people said, geez, this is a trustworthy product. One question here: JSON, still new, what is JSON?

Poulomi Damany: It’s JavaScript Object Notation; that’s the full form of it. It’s a way to represent a data structure. And so for a Java-based application or an application written in JavaScript, all the newer technologies, newer languages, how you represent the data is how you store it. So it becomes very easy for a developer to say, oh, my structure looks like: Robin Miller is the username, job is this, work share, and that exact structure gets written into the database, as opposed to writing it in the old MySQL way, which would be one table with the user, a second with the list of professions, a third one with their habits or something, right? So JSON allows you to represent data in the way you want to consume it rather than the way it needs to be stored for efficient storage.
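[Editor's illustration, not from the interview.] To make the contrast concrete, here is a small Python sketch, assuming a made-up user record. The field values and the helper `to_tables` are hypothetical. The first half stores the structure as one JSON document, exactly as the application sees it; the second half splits the same record into the three relational-style tables described above.

```python
import json

# The application's view of a user: one nested structure.
# All values here are invented for illustration.
user_doc = {
    "username": "Robin Miller",
    "job": "editor",
    "habits": ["interviewing", "video"],
}

# Document style: what you store is what you use.
serialized = json.dumps(user_doc, sort_keys=True)
restored = json.loads(serialized)


def to_tables(doc, user_id):
    """Split one document into three normalized row lists.

    Mirrors the 'one table with the user, a second with professions,
    a third with habits' layout, each row keyed by user_id.
    """
    users = [(user_id, doc["username"])]
    professions = [(user_id, doc["job"])]
    habits = [(user_id, habit) for habit in doc["habits"]]
    return users, professions, habits


users, professions, habits = to_tables(user_doc, user_id=1)
print(serialized)
print(users, professions, habits)
```

Reading the user back from the document form is a single `json.loads`; reading it back from the table form means joining three row lists on `user_id`, which is the trade-off being described.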

Robin Miller: Okay, now, you both work as executives; we’ll give your positions at BitYota in text. How do you, as a company which does nothing else but analytics, distribute? Do you open source some of it? Are you an open source company?

Dev Patel: We’re not an open source company; our technology is built by ourselves. There are aspects of our technology that we will look to open source in the next 12 months. We want to develop things, and once we’re comfortable with the stability of certain aspects of our technology, we will open it up for more people to use, to make it easier for other people to develop on, and so on. So that’s a phase of the company we will get into next year, but not this year.

Robin Miller: Okay, so you’re not going with Eric S. Raymond's “release early and often,” but you’re doing the probably smarter corporate thing of waiting until you have comparatively clean code, is that right?

Dev Patel: Look, we do develop fast and release often, just as a development practice within the company, but we want to get a degree of understanding of our own technology’s maturity as it is used by many customers, and then we will look to open source several parts of our technology. Whether all or not, I don’t know at this juncture.

Poulomi Damany: I mean, I have these discussions about what to open source. I think the fundamental thing with open source technology is, you have to have a problem that somebody wants to help you solve, right? So it has to be a big enough pain, and people need to be challenged by it, so that you can build a community around a set of open source technology. Otherwise it goes like Yahoo!: not everything got picked up. So it’s sort of like a buyer’s market if you are an open source developer, which is, I want to put my time and my talent towards things that are worthwhile, and we want to find those before we say, here is our technology.

Robin Miller: Fair enough. How did the two of you come to this, where you are basically searching and working with databases? How did you get started, and what should somebody know if they want to follow in your footsteps?

Dev Patel: I mean, the first thing is, understand what problem you want to solve. Big data is a massive word. Probably use sometimes, but it’s definitely bringing the world to an understanding that there are opportunities and nuggets in data. So, first, figure out what problem you really want to solve. Two, have a team that has the ability to solve such a problem, a founding team that has significant experience of solving big data infrastructure within the entire company.

Our CTO, who is not on this call, not on this interview, is a database guru. He hails from the world of building database systems at Informatics and Oracle, and then he was part of Hadoop’s critical infrastructure development team for over three years before we got together and founded this company. Our other founder, co-founder is next in understanding how do we make real time infrastructure. So we have the core team supported by a lot of able engineers in building that infrastructure. So for anybody who wants to build data technologies: understand the specific problem we’re going after, and you’ve got to make sure you have a bloody damn good team out there.

Robin Miller: Following their technology and working with it, I think that’s good advice for anybody in any part of IT.

Dev Patel: Correct.

Robin Miller: So, what do you look for, and this is for our younger audience, people that are getting started, what do you look for in a potential hire? I’m assuming not just at your company, but since big data is becoming more useful, and analyzing that data is a very big thing and getting bigger, there are a lot of opportunities there. So how do you select the young engineer to come in and work with you?

Dev Patel: Well, Robin, we’re a startup, and essentially there are two things we look for all the time. One is intellect and the other is attitude. With the right amount of intellect and a huge amount of attitude, any new engineer or any new person coming into any startup will really propel the startup to the next level. Everyone’s contribution is going to be a significant contribution. Again, that’s just the nature of any startup; it’s not just ours or big data stuff, that’s any startup. So looking for just those two qualities in a person, identifying them in an interview very quickly, and, more importantly, after the person is hired, showing that you were right in those two capacities is critical.

I'm surprised BitYota chose MongoDB for real-time analytics. Several years ago we attempted to do a real-time analytics solution with MongoDB, but besides being a not so great performer when it comes to counting, its boolean operators were still in their infancy. We ended up ripping it out and replacing it with another back-end solution in a couple of months and never looked back. Has MongoDB changed much to make real-time more realistic?

Hi Dishwasha
We don't use Mongo for analytics - it is an upstream store for transactional data for us. We are a Data Warehouse Service for Analytics and we enable fast SQL-based analytics in our system for data from Mongo and other JSON (semi-structured) sources.
It's fast because
(a) you don't need to do ETL on your Mongo/JSON data before analysis - so you save that time and the temporal value of fresh data is preserved
(b) we have a scale-out MPP architecture to add compute and storage as needed
check u

Because the more you know about functions in your company besides IT (such as finance, investor relations, and -- yes -- marketing), the more valuable you are as an employee.

Right, because every place I've ever worked in IT, they've been totally transparent and forthcoming about finance, marketing, and investor relations to make the people in the trenches more valuable. Oh wait, no, that never happened....

the more you know about functions in your company besides IT (such as finance, investor relations, and -- yes -- marketing), the more valuable you are as an employee.

Call them old-fashioned, but some employers actually prefer employees to focus on their area of expertise. If there is something to know about other fields, the employer has other experts who will tell them what is needed.