Data Science: The Voice of Data

Why is the world going crazy about her?

“There’s a bit of me everywhere, you just need to stand and stare.” - Data Science

Data Science has marked its presence in almost every industry in the world.

Probably that’s why she is the talk of today’s tech-savvy world.

Here’s how our panelists use data science in their day-to-day lives.

Sahar Rahmani: An astrophysicist by education, she applied big data analysis and machine learning to astronomical data during her Ph.D. and was amazed to realize that data science is a “wonderful storyteller”.

To explore data science in greater detail, she joined the cyber fusion group of a leading bank.

She now leads a team of data scientists and helps the bank detect cyber threats that cannot be detected using traditional techniques.

She also says that banks use data science extensively to provide better and faster customer service, as well as for insurance, security, and other purposes.

Babak Samareh: A former professor at the University of Toronto with a Ph.D. in Mechanical Engineering focused on numerical simulation, he recollects how he decided to become a data scientist.

“At some point in my life, believe it or not, I got tired of engineering and I decided to try new ventures, and back at that time Data Science was a hot topic,” admits Babak.

His mathematical skills and curiosity about data science helped him join a leading financial institution as a data scientist.

He adds, “It was not the hype or the buzzword but the extensive use of data science in my own startup, NeuroBlot, that helped me realize the potential of data science, and hence I decided to become a data scientist.”

Currently, he enjoys detecting anomalies in financial transactions with advanced machine learning and graphical models, and he likes how the power of predictive modeling is being used for trading in banks.

As a founder of NeuroBlot, a health care company recognized by UTEST and MaRS as one of the top 10 companies of the 2017 cohort, he uses data science to analyze anonymized patient data, particularly from a Kentucky hospital and social media, to detect the early stages of Alzheimer’s disease.

He is currently working on an assignment which would enable patients to run tests from home, thus cutting down the time and cost of patient visits to the hospital.

Ehsan Amjadian: Data Science has always been his passion.

With more than a decade of experience, Ehsan Amjadian currently works as a senior data scientist at DNA, the heart of Artificial Intelligence, at one of Canada’s leading banks.

He researches cutting-edge deep learning and machine learning technologies to build effective business solutions.

Not only does he enjoy learning and implementing, but he also believes that transferring knowledge is his duty.

Ehsan holds a Ph.D. in deep learning, machine learning, and machine vision.

He also heads TDLS, a discussion series designed to introduce data science practitioners to highly advanced techniques in machine learning, deep learning and, at times, quantum computing.

But are there times when even she fails to perform well?

“Well! Most of the time,” was the unanimous answer.

“You begin with a hypothesis, run your experiments, and unfortunately you fail.

Each time you fail, you learn from the mistakes, update your hypothesis and iterate the process multiple times till you have good enough results to prove that either you have accomplished the mission, or the target is not achievable”, they continued.

“Once I was looking at some time series data according to which the bank should have been bankrupt five years ago,” recollects Ehsan.

“Usually there are two aspects: why is the model not performing well, or why am I seeing such erroneous results?

In either case, the agile model is preferred as it supports development in small pieces and continuous updates.

If it’s a performance issue, you prefer to fail fast so that you can update your model accordingly.

In order to tackle the second aspect, you need to have regular stand up calls to review your model and verify the results with domain experts.

Data is often recorded in different ways for different domains, and hence things which make sense in data science may not make sense to the business and vice versa.

So, the time series data that I was talking about earlier is a good example of such issues,” says Dr. Amjadian.

We can all agree that the key to boosting the performance of any industry lies in the ability to understand the data well.

But what’s the best way to do that?

Can data science help us out?

Is Data Science the voice of data?

“Traditionally, businesses operated according to the intuition and opinions of domain experts, and most of the time they were correct.

However, this takes objectivity away from what can be done.

If we can mathematically analyze the data and show the insights through visualization, then businesses would be amazed by the results, and that’s the power of data science,” says Ehsan.

What are the typical data problems that Data Science is afraid of?

“The challenge is that there is always work to do with data, and that will never go away entirely.

Often data quality itself is an issue: for example, some information is missing and there is almost no way to find it.

That is a barrier for Data Science.

Another challenge is that the nature of the data limits the application of certain processes,” says Ehsan.

Ehsan suggests that teams take a collaborative approach in which stakeholders discuss the problem rather than the final product they want, as there is often a huge gap between the two.

Babak builds on this suggestion by highlighting the problems that data scientists typically face while gathering data.

“Data collection and data wrangling in big organizations are quite difficult because there is no unified platform for doing so.

Each data source has different data custodians and stakeholders, and hence accessing the data involves a lengthy procedure of working through different rules and regulations,” says Babak.

He hopes that in the near future every industry will have an additional layer of infrastructure that does the heavy lifting of collecting and cleaning data and making it available to other teams, since many data science teams are emerging and each one has to follow the same tedious process to begin its analysis.

Data Science is flamboyant and enigmatic.

As our panelists demonstrate the different shades of Data Science, our urge to work with her becomes stronger.

However, the question remains: can we reach out to her? Do we have the required expertise?

Do we have the expertise to work with her?

According to Ehsan, a data scientist’s treasury must sparkle with jewels like:

Mathematics: A good understanding of the math behind machine learning.

Machine Learning: A good understanding of machine learning and deep learning techniques.

Big Data: Big data programming techniques.

Visualization: Visualization is a good storyteller. It helps to communicate the findings to higher management in a better way, thus adding greater value to them.

Communication skills: Good communication is a daily part of a data science job.

Continuous updating: The most important thing for data scientists is to keep updating themselves continuously, as things are changing very rapidly.

InnovationManagement.se [Digital image]. (2013). Retrieved March 15, 2019, from http://www.innovationmanagement.se/2016/04/12/innovation-knowing-where-to-begin/

Where should the quest for the “treasure hunt” begin? In academia? In industry? In MOOCs?

Ehsan believes, “Academia is constantly updating its courses as per industrial requirements, at least in Canada.

MOOCs are good, in fact, some of them are amazing, but there is a lot of material out there, which makes it difficult to choose from.

These courses are also basic in nature or are targeted to a specific product of a company.

Technical conferences are also interesting and inform us about the most recent research areas, but they usually discuss advanced topics which are difficult for newbies to grasp.”

Second checkpoint: MOOCs. Let the online material support you in your implementation.

Third checkpoint: Conference talks. Update yourselves with the new inventions.

Babak adds, “Industries and Universities have different goals for Data Science.

In the industry, the analysis and predictions focus on adding value to the business.

In schools though, it is more research-based.

It is important for an individual to explore both sides of the coin.”

He further says, “Most industries offer great learning programs and co-op opportunities, which enable students to get a taste of the industrial use of data science.”

“Academia + co-op”: Babak’s formula to expedite learning and adapt to rapidly changing business models.

Sahar agrees with this but recommends that, beyond teaching how to build models, schools should emphasize data cleaning techniques.

She has often observed that people are quite efficient at building models but struggle to find insights through data wrangling, probably because they were always provided with clean, structured data, which is a myth in the real world.
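
To make Sahar’s point concrete, here is a minimal sketch of the kind of cleaning that real-world data tends to need before any modeling starts. Everything in it is an assumption made for illustration: the records, the column names, the list of "missing" markers and the helper functions are hypothetical and do not come from the panelists. It uses only the Python standard library.

```python
from typing import Optional

# Hypothetical raw records, invented for illustration only.
RAW_ROWS = [
    {"customer": " Alice ", "amount": "120.50", "city": "Toronto"},
    {"customer": "alice", "amount": "120.50", "city": "toronto"},
    {"customer": "Bob", "amount": "N/A", "city": ""},
    {"customer": "Carol", "amount": "1,200", "city": "Ottawa "},
]

# Strings that we assume mean "no value was recorded".
MISSING_MARKERS = {"", "n/a", "na", "null", "none"}


def clean_text(value: str) -> Optional[str]:
    """Trim whitespace, normalize capitalization, map missing markers to None."""
    text = value.strip()
    if text.lower() in MISSING_MARKERS:
        return None
    return text.title()


def clean_amount(value: str) -> Optional[float]:
    """Parse an amount such as '1,200'; return None if missing or unparseable."""
    text = value.strip().replace(",", "")
    if text.lower() in MISSING_MARKERS:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def clean_rows(rows):
    """Normalize every row and drop rows that become exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        record = {
            "customer": clean_text(row["customer"]),
            "amount": clean_amount(row["amount"]),
            "city": clean_text(row["city"]),
        }
        key = tuple(record.values())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


def missing_report(rows):
    """Count how many cleaned rows have no value in each column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            if value is None:
                counts[column] = counts.get(column, 0) + 1
    return counts


cleaned = clean_rows(RAW_ROWS)
print(cleaned)
print(missing_report(cleaned))
```

Even on four toy rows, the cleaning step surfaces a hidden duplicate and two missing values, which is the sort of insight that never shows up when the data arrives already clean.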

Is that all we need to know about data science?

Most advanced machine learning models require a lot of training data to produce accurate results.

Is data privacy at stake to quench this thirst for data?

Has data science led to an increase in cybercrime?

European Business Magazine (2018, March 19). Retrieved March 15, 2019, from http://europeanbusinessmagazine.com/business/dark-side-big-data/

We have been talking about her charm so far, but does she have a dark side as well?

“It’s true! Machine learning is being weaponized for both privacy and propaganda issues,” was the unanimous reply.

“Earlier, attackers used brute force to breach security protocols, but now they can leverage the power of machine learning to generate false information or hack sensitive data.

Therefore, it is important that we also use data science to prevent such breaches in security, as it certainly gives us the power to do so,” say Sahar and Ehsan.

“The ethical way to implement machine learning is to anonymize or encrypt personal information such as name and address because machine learning never actually needs that data.

There are many apps that run a process in the background to collect user data and sell it to other companies.

Such practices are unethical but there is no harm in performing exploratory analysis on any data as long as it is anonymized.

It helps companies to do targeted marketing which is beneficial to both ends.

But companies who own the data, particularly the social networking sites with huge databases, should assert more control over how data is being used.

So, for instance, President Trump’s campaign used data analysis to manipulate people, whereas I used it in my business to enhance the customer satisfaction index,” says Babak.

He rightly says, “It is like a double-sided blade.

You can use data science to do social engineering or manipulate people’s minds.”

This brings us to the end of our story, “Data Science: The Voice of Data”.

The discussion was very insightful and we are grateful to our immensely talented panelists for enlightening us with their expert views on Data Science.

On a closing note, Ehsan had a beautiful message for all students aspiring to become data scientists.

He suggests, “Don’t get discouraged or distracted by all the noise out there.

There is a lot going on in Data Science, but don’t be demotivated by this fact.

Don’t think that you are lagging behind.

I can almost guarantee that nobody knows everything in Data Science.

Just follow your passion.

Academia helps you learn the fundamentals of Machine Learning and Data Science.

Keep learning and specialize in the area of your interest.

We live in a very noisy world but do not let the noise distract you or discourage you.”