Becoming a Successful Data Scientist: All You Need to Know

In this sequence of articles, we will find answers to the following questions.  

  • What is data science?
  • How old is it?
  • How do I become a good data scientist? (Hint: you need not be a great programmer or a mathematician.)
  • How should I plan my data science career?

You need no technical background to understand these articles.  However, they are most useful for those who have just finished some form of bachelor’s degree.

What is data science?

Data science is a field of science that uses mathematical techniques and computer programming to extract patterns from the data and visualize them effectively!

Got you!!  Well, when you say it like that, it sounds sort of crazy.  So, let me explain it differently.  
Let us say you want to make a decision about something important that calls for logic.  It could be anything from choosing a job among the three offers you got to deciding which vehicle you want to buy.  How do you go about it?  Well, that depends on your nature.

You may read about it on social media…

Research about it in the library…

Or maybe just ask a lot of people.

Essentially, you collect a lot of data relevant to that question, analyze it carefully, and do what worked best for others before.  

Congratulations!  You just practiced data science.

It turns out that, like you, companies also need to make a lot of decisions.  But, as these decisions involve much more risk, they look at a lot more data than you did.  They collect data from 10,000 documents, 100,000 people, and a million tweets.  Obviously, analyzing so much data by hand is not humanly possible.

So, they turn to mathematics to do its magic.  

Over the years, mathematicians have developed some great techniques (they call them algorithms) that can analyze data carefully and tell what worked in the past and what did not.  

Companies use this method (collecting a lot of data and analyzing it using mathematics) to make some great decisions.

  • A super specialty hospital can look at data from 100,000 doctors treating a million patients and figure out the best practices. 
  • A telecom company can look at all the service failures of the past, learn under what circumstances failures are likely, and try to prevent them.  
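
To make the telecom example concrete, here is a minimal sketch in Python of what "learning under what circumstances failures are likely" can look like.  The records, the field layout and the five-year cutoff for "old" equipment are all invented for illustration; a real company would work with far more data and far richer techniques.

```python
from collections import defaultdict

# Hypothetical service records: (weather, equipment_age_years, failed).
# These values are made up purely for illustration.
records = [
    ("storm", 9, True),
    ("storm", 2, True),
    ("storm", 8, True),
    ("clear", 9, True),
    ("clear", 1, False),
    ("clear", 3, False),
    ("clear", 7, False),
    ("storm", 1, False),
]


def failure_rate_by(records, key):
    """Return {group: share of the records in that group that failed}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for record in records:
        group = key(record)
        totals[group] += 1
        if record[2]:  # third field says whether the service failed
            failures[group] += 1
    return {group: failures[group] / totals[group] for group in totals}


# Group the same records in two different ways and compare failure rates.
by_weather = failure_rate_by(records, key=lambda r: r[0])
# Assumption: equipment of 5 years or more counts as "old".
by_age = failure_rate_by(records, key=lambda r: "old" if r[1] >= 5 else "new")

print(by_weather)
print(by_age)
```

In this toy data, failures are more common during storms and on older equipment, which is the kind of pattern that would tell the company where to focus its prevention efforts.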

It does not matter which business it is!  You can make a much better decision if you look at a lot of data and figure out what worked and what did not.  

A diversion…

We have been conducting data science awareness sessions for years.  Year after year and session after session, we get the same question without fail.  
“I work for an xxx company.  Can we use data science?  I have a yyy background.  Can we use data science?”  
We are exhausted…

Here it is! One last time.  Ask yourself: “I work for an xxx company and have a yyy background.  Do I benefit from making better decisions?”  Fill in your choice for xxx and yyy.  You see, it does not matter what your xxx and yyy are…  

Data science is not really about science.  It is about making better decisions.  
That is why it has become as important as an M.B.A. degree (perhaps even more important) for today’s employers.  

Just how old is data science?

Interestingly, data science has become popular only recently, but it has been practiced in one form or another for a long time.  

OK.  Tell me, who invented it?  Go ahead, take your time...

Well, here is a clue.  If someone asks you who invented something in physics and you know the answer, great!

Otherwise, guess one of the two best-known greats of physics.

You have a fair chance of being right.

Similarly, in mathematics, the equivalent of these two greats is Gauss (he more or less invented everything there was to invent).  

He was among the first to define mathematical ways of extracting insights from data.  However, Sir Francis Galton was the first to publish extensively in this area, and in the 1880s he proposed what goes by the name of regression.

By the way, Galton was a half-cousin of Charles Darwin.
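
For the curious, the regression mentioned above boils down to drawing the straight line that best fits a set of points.  The sketch below is a minimal illustration using the standard least-squares formulas, not a reconstruction of Galton's own calculations; the heights are hypothetical numbers chosen to keep the arithmetic simple.

```python
def fit_line(xs, ys):
    """Least-squares fit of the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical heights in inches, invented for illustration.
parent_heights = [64, 66, 68, 70, 72]
child_heights = [66, 67, 68, 69, 70]

slope, intercept = fit_line(parent_heights, child_heights)

# Use the fitted line to estimate the child's height for a 74-inch parent.
predicted = slope * 74 + intercept
print(slope, intercept, predicted)
```

Once the line is fitted to past data, it can be used to estimate an outcome for a case that has not been seen before, which is the essence of learning from what happened in the past.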

So, the first work in data science appeared about 150 years ago.  Of course, it is a confluence of philosophies, indulgences and technologies that makes it so popular today.

  1. The open source philosophy is enabling easy access to complex mathematical algorithms in a plug-and-play format.
  2. Social media is generating a lot of data.
  3. Computer hardware keeps getting cheaper and more powerful, so storing and processing data is easy.
  4. IoT is fueling the creation of even more data.

So, large volumes of data, cheap storage and compute power, coupled with powerful and yet free algorithms, are making companies lap up the “data science” way of solving problems.  They really want a lot of people who can solve problems this way.

In the next article, we will explain how you can acquire these skills meaningfully. You can read all about it here.

If you already know how to acquire these skills, then great! You can read our article on how you should plan your data science career here.
