Zach Doty cover photo for What is Data Science

What is Data Science?

Welcome back, campers! It’s been a minute (read: months) since I last posted, and we’re back! (For now.)

Today’s topic is data science: supposedly the latest exploding field, critical to every enterprise.

Why is it important and relevant? The rise of big data has created a treasure trove of insight, but it remains largely untapped. Tapping it requires a blended skill set that is currently in short supply in the market: that of the data scientist.

Who and what is a data scientist?

A data scientist is someone who makes new discoveries in data. They investigate hypotheses and look for meaning and knowledge within the data. They visualize the data by creating reports and looking for patterns. What distinguishes a data scientist from a traditional business analyst is the use of algorithms, which are one of the data scientist’s fundamental tools. Using them well requires mathematics knowledge, computer science savvy and domain knowledge.

What does it mean to be a data scientist?

A data scientist may handle open-ended questions such as, “Which customers are more likely to churn?” The data scientist would gather all the relevant data and run algorithms to find dependable patterns that can be used to improve the situation. Seems straightforward, yes? However, there is a range of misconceptions about data science and data scientists. For example, a data scientist is not necessarily just a developer or just a business intelligence analyst.
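To make the churn question a little more concrete, here is a minimal sketch in Python. The customer records, the field layout and the `churn_rate_by` helper are all made up for illustration, and grouping and counting is only the simplest possible stand-in for the algorithms a data scientist would actually run (for example, a statistical model fitted on many more features).

```python
from collections import defaultdict

# Hypothetical toy records: (plan, support_tickets, churned)
customers = [
    ("monthly", 5, True),
    ("monthly", 1, False),
    ("monthly", 4, True),
    ("annual", 0, False),
    ("annual", 3, False),
    ("annual", 6, True),
    ("monthly", 2, True),
    ("annual", 1, False),
]


def churn_rate_by(records, key):
    """Return {group: fraction of customers in that group who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for record in records:
        group = key(record)
        totals[group] += 1
        if record[2]:  # third field is the churn flag
            churned[group] += 1
    return {group: churned[group] / totals[group] for group in totals}


# Compare churn across two candidate explanations.
by_plan = churn_rate_by(customers, key=lambda r: r[0])
by_tickets = churn_rate_by(
    customers,
    key=lambda r: "3+ tickets" if r[1] >= 3 else "0-2 tickets",
)

print(by_plan)
print(by_tickets)
```

In this toy data, monthly customers and customers with three or more support tickets churn more often. A pattern only counts as dependable if it holds up on far more data than this, which is where the mathematics and statistics come in.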

A data scientist is able to combine technical know-how and business domain knowledge with mathematics and statistics for maximum effect. That being said, true data scientists are extremely difficult to find and train. However, it may be possible to become a data scientist without expensive and time-consuming degrees, via focused training in tools and their application.

An oversimplified Venn diagram showing the makeup and value of a data scientist

More Notes on Data Science

When considering data science from a managerial perspective, it’s important to understand how the average data scientist’s time is broadly allocated today. An estimated 60-70% of a data scientist’s time is spent assembling and cleaning data (for example, text mining and SQL queries). These tasks could be delegated to technical specialists, data integration specialists and so forth.
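For a feel of what that assembling and cleaning work looks like, here is a small sketch using Python’s built-in `sqlite3` module. The `customers` and `orders` tables and their contents are invented for this example; real source data would be far messier.

```python
import sqlite3

# In-memory database holding two hypothetical, slightly dirty tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, email TEXT);
CREATE TABLE orders (customer_id INTEGER, amount TEXT);
INSERT INTO customers VALUES
    (1, ' Ann@Example.com '),
    (2, 'bob@example.com'),
    (2, 'bob@example.com');          -- duplicate row
INSERT INTO orders VALUES
    (1, '19.99'),
    (1, ''),                         -- missing amount
    (2, '5.00'),
    (3, '7.50');                     -- no matching customer
""")

# Assemble and clean in one query: drop duplicate customers, normalize
# emails, skip blank amounts, and total spend per customer.
rows = conn.execute("""
    SELECT c.id,
           LOWER(TRIM(c.email)) AS clean_email,
           ROUND(SUM(CAST(o.amount AS REAL)), 2) AS total_spent
    FROM (SELECT DISTINCT id, email FROM customers) AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE o.amount <> ''
    GROUP BY c.id, c.email
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)

conn.close()
```

Even in this tiny example, most of the query is about fixing the data rather than analyzing it, which is the point of the time estimate above.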

If you’ve followed my site and blog lately, you’ll have noticed I’ve lapsed a bit on posting. I’m trying to get back into sharing what I’m learning, so stay tuned. Things have just been busy lately. 🙂