Hello and welcome to a new blog post all about becoming a Data Scientist. If you have not heard of Data Science, you are in for a treat. If you have heard of it, you are in for a treat as well!
But the $64,000 question either way is what is Data Science and should you become a Data Scientist? Well, in this post I am going to attempt to answer both questions as simply as possible. Read on to find out more…
What Exactly Is Data Science?
If you are like the average Joe, you may only have a vague understanding of what being a Data Scientist is all about. To help you understand, it would probably be useful to find out where Data Science came from in the first place. I would like to cite this Quora post for help in writing this part of my post.
It all started way back in the dark ages, before the invention of electricity and jobs (the 1960s), when a statistician called John Tukey came up with the idea of data science. He treated it as a branch of science rather than mathematics, reasoning that data science would deal with real data, as opposed to the assumptions and logic of mathematical statistics.
Are you still with me? Only a couple more paragraphs, I promise.
Generally clever chaps continued to develop the field over the next half century or so, and it became known as Applied Statistics. Applied Statistics had two branches, Predictive Analytics and Data Mining. Fast forward a bit and SAS (Statistical Analysis System) Analytics was developed.
Finally, Data Science appeared as money was thrown at more clever chaps to research, teach, and develop lots of new software; for example, MATLAB, R and Python. It should be noted that development has not stopped, and the field will no doubt continue to evolve as new and better tools appear.
What Is a Data Scientist?
Okay, you’re saying. That’s all very interesting (not!) but what is a Data Scientist? That’s a good question, I’m glad you asked.
A Data Scientist tries to make predictions using statistics and machine learning. Data Scientists need to know how to deal with large amounts of data, a.k.a. big data. Below on the right is a pretty good diagram* that more or less says it all.
They need to be good at maths, statistics, and computing. But that’s not all. They also need to be Business Analysts, builders of data products and software platforms, and developers of visualisation and machine learning algorithms.
Let’s look at job roles and responsibilities of a Data Scientist in a bit more detail.
As a Data Scientist, you’ll select features, then build and optimise classifiers using machine learning techniques. You’ll also mine data using state-of-the-art methods. Additionally, you’ll enhance data collection procedures so that they capture the information that is relevant for building analytical systems.
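If "build a classifier" sounds abstract, here is a tiny sketch of the idea in Python. It is only an illustration: the data is made up, the feature names are hypothetical, and the method (a nearest-centroid classifier) is just one of the simplest possible choices, not what any particular company uses.

```python
from statistics import mean

def fit_centroids(rows, labels):
    """Average the feature vectors of each class to get one centroid per label."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {label: [mean(column) for column in zip(*members)]
            for label, members in grouped.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: distance(centroids[label]))

# Made-up training data; each row is
# [hours_active_per_week, purchases_per_month] (hypothetical features).
rows = [[1, 0], [2, 1], [9, 7], [10, 8]]
labels = ["casual", "casual", "loyal", "loyal"]

centroids = fit_centroids(rows, labels)
print(predict(centroids, [8, 6]))  # prints "loyal"
```

In real projects you would normally reach for a machine learning library rather than writing this by hand, but the principle is the same: learn a pattern from labelled examples, then use it to label new ones.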
That’s not all. You’ll also need to know how to process, cleanse, and verify the integrity of data used for analysis, do ad-hoc analysis and present results in a clear manner. Finally, you’ll need to be able to create automated anomaly detection systems and constantly track performance.
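To make "cleanse the data" and "anomaly detection" a little more concrete, here is a minimal Python sketch. The readings are invented, the function names are my own, and the rule used (a modified z-score based on the median, with a cut-off of 3.5) is one common rule of thumb rather than the only way to do it.

```python
from statistics import median

def clean(raw_values):
    """Keep only the entries that can be parsed as numbers."""
    cleaned = []
    for value in raw_values:
        try:
            cleaned.append(float(value))
        except (TypeError, ValueError):
            continue  # skip missing or unparseable entries
    return cleaned

def find_anomalies(values, threshold=3.5):
    """Flag values whose modified z-score exceeds the threshold."""
    mid = median(values)
    # Median absolute deviation: the typical distance from the median.
    mad = median([abs(v - mid) for v in values])
    if mad == 0:
        return []  # no spread to measure against, so flag nothing
    return [v for v in values if 0.6745 * abs(v - mid) / mad > threshold]

# Invented daily readings, including a missing value and a junk entry.
raw = ["102", "98", None, "101", "n/a", "99", "100", "250"]

readings = clean(raw)
print(find_anomalies(readings))  # prints [250.0]
```

The median is used instead of the average because a single wild value (like the 250 here) drags the average towards itself, which can hide the very thing you are trying to find.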
If you don’t understand all the above, I can’t say I blame you. However, it certainly sounds like a Data Scientist is a bit more exciting than the name suggests. But why else should you become a Data Scientist? Let’s have a look below.
Why Become a Data Scientist?
The problem, as you have no doubt already worked out, is that the job title ‘Data Scientist’ is a bit of a catchall term. You may end up as a Data Scientist, but there is quite a high chance that you won’t start as one. It’s more than likely that you will start as one of the following:
- Data Engineer
- Data Architect
- Data Administrator
- Data Analyst
- Business Analyst
- Data/Analytics Manager
- Business Intelligence Manager
Over the months to come, I will attempt to link all the above to job descriptions and possible courses you can take. So you may see no links, some links, or all of the above as links. For anything that isn’t linked, a quick Google search will find the information you want.
But I am sure there is another reason why you are thinking about becoming a Data Scientist, and that is because you have heard the money’s good. Well, you’ll be glad to know that seems to be true. Check out this link to find out what you could earn.
You’ll also be glad to know that the job market is only going to get better. So if you are young and thinking about what to do at university, you could do a lot worse than this career. Data Science jobs will continue to rise as the amount of data generated continues to rise.
And as long as companies both big and small continue to generate data on a massive scale, they’ll need Data Scientists. And the very best of those will be earning fortunes working for companies such as PayPal, Google, Twitter and the like. You could well be one of them.
What Are Your Next Steps?
I would strongly recommend you check out some courses to decide whether Data Science is for you. And I think you could do a lot worse than check out Coursera. Its courses will give you a strong inkling as to whether you are thinking along the correct lines.
*I have no idea where this diagram comes from originally. Please contact me if it’s yours.