These articles require no technical background. However, they are most useful for readers who have some form of a bachelor’s degree.
How do I become a data scientist? People ask me this question all the time. They read somewhere about the opportunities data science offers and get excited. But, as they do not know the next steps, they come to me.
Do I need to revise my 12th grade mathematics?
Do I need to learn Python or R?
What exactly do I need to do to become a successful data scientist?
Confused? Don’t worry.
In this article, I shall give you a very simple framework for thinking about this question systematically.
For the purposes of this article, I assume that you have a college degree and less than 5 years of work experience, and that you are looking for a job or a change of job.
Next week, I will cover the same question from an enterprise viewpoint.
Before anything else, figure out what you would like to become. Yes. Ask yourself the question below and try to explore the answer.
What would you like to do for the next 10 years at work?
Do you enjoy coding and mathematics? (So much that you could do them for several hours every day for 10 years?)
Do you enjoy building things? Are you one of those who always liked taking a toy apart and putting it back together again?
Do you enjoy thinking about a problem logically and explaining the solutions systematically to less informed people?
Decide where you want to see yourself. The cool thing is that data science allows you to get into any of these roles and grow.
Decided? Awesome! You have taken the most important first step in your journey to becoming a successful data scientist.
INSOFE is one of the very few institutes in the world that offers data science programs for every aspiration you have. We do not believe in one size fits all. We understand you are unique and so are your goals.
Now, let us see what you need to do to become a data scientist based on the choice you just made. But, before that, let me introduce you to some jargon.
Great scientists keep inventing cool mathematical processes that can look at data and extract useful information better than previously available tools. These mathematical processes are called “algorithms”. They are independent of any data or any problem. They are simply mathematical formulae coded as software.
I know it is a bit confusing. So, let us take the example of a chartered accountant. A CA learns, in college, the fundamental frameworks for building a balance sheet or business model. By the time she graduates, she has the ability to work with any firm and build its business plan or balance sheet.
She, in our language, is an algorithm. She has processes for doing certain tasks that are independent of any data, firm, or department.
Next, models. No, not that kind! Sorry for getting you excited. The ones I am going to talk about are rather boring, actually!
Some data scientists take existing algorithms and run them on a sample of data they have put together carefully. The algorithm then produces a specialized formula that works for that specific type of data. This formula can then look at any other part of that data and make valuable predictions and forecasts.
It does not work as effectively on the data of some other business. This specialized formula, churned out by an algorithm after it has worked on a specific set of data, is called a “model”.
Coming back to our CA analogy, when she joins a company she uses her knowledge (the algorithm) and creates an Excel sheet with all the formulae pertinent to that company’s data. She has now built a model on that data.
Any department in the business can use this model to make predictions.
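If you like to see things in code, here is a minimal sketch of the difference, assuming a toy example of my own: the function names `fit_line` and `predict` and the ad-spend and sales figures are all hypothetical, and ordinary least squares is just one simple algorithm picked for illustration. The function `fit_line` plays the algorithm: it knows nothing about any business. What it returns after seeing one company's numbers is the model.

```python
def fit_line(xs, ys):
    """The 'algorithm': least-squares fit of a straight line to any data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the co-variation of x and y divided by the variation of x.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    slope = co_variation / variation_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use a fitted 'model' (slope, intercept) to make a prediction."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data for one company: monthly ad spend and resulting sales.
ad_spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]

# Running the algorithm on this data produces a model specific to this data.
sales_model = fit_line(ad_spend, sales)
print(sales_model)              # the specialized formula (slope, intercept)
print(predict(sales_model, 6))  # a forecast for a new ad spend of 6
```

The same `fit_line` could be run on another company's numbers, but it would then return a different slope and intercept, that is, a different model.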
Algorithms and models normally give very dull-looking numbers and formulae as output.
If you don’t believe me, try taking a CA on a date!
These outputs are useful, but extremely difficult to understand.
Some data scientists learn how to build impactful stories and narratives, called “visualizations”, around these outputs. The purpose is to help non-technical people and leaders quickly understand the predictions of the model and its effect on the business.
So, visualizations present models and algorithms powerfully so that people can quickly make good decisions.
In our CA analogy, the final presentation the business person makes to the board or investors, based on the Excel sheet the CA gave, is the visualization. It has the output of the model presented in the simplest but most impactful format.
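To make this concrete, here is a rough sketch of the idea, again under my own assumptions: the function `text_bar_chart` and the quarterly forecast numbers are hypothetical, and a real visualization would normally use a charting tool rather than plain text. It simply turns a model's dull numbers into bars that a reader can compare at a glance.

```python
def text_bar_chart(forecast, width=20):
    """Return one line per label, with a bar scaled to the largest value.

    Assumes at least one value is greater than zero.
    """
    largest = max(forecast.values())
    lines = []
    for label, value in forecast.items():
        bar = "#" * round(width * value / largest)
        lines.append(f"{label:<4}{bar} {value}")
    return "\n".join(lines)


# Hypothetical model output: predicted sales for three quarters.
forecast = {"Q1": 10, "Q2": 15, "Q3": 20}
print(text_bar_chart(forecast))
```

The numbers are the same ones the model produced; only the presentation has changed.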
Great! Just take a short breath and recap the terms algorithm, model, and visualization.