What is data science?

Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured.

A lot of people are trying to define the term data scientist. However, it is still a ‘fuzzy’ term. At the Joint Statistical Meetings of the American Statistical Association, applied statistician Nate Silver said, “I think data-scientist is a sexed up term for a statistician. Statistics is a branch of science. Data scientist is slightly redundant in some way and people shouldn’t berate the term statistician.” A vast number of blogs, articles and other information channels aim to define data science. This blog will use Drew Conway’s Venn diagram to further explain data science. Drew Conway is a leading expert in the application of computational methods to social and behavioral problems at large scale.

Hacking Skills:

Data is a commodity traded electronically; to take part in this market you need to speak the hacker language. This does not mean that you need a computer science background, but being able to use the command line to open, store and manipulate files, understanding vectorized operations and thinking ‘algorithmically’ are all very useful. You should be able to prepare the dataset before you start your analysis; this is called data munging. The data transformations are typically applied to distinct entities within a data set and can include actions such as extracting, parsing, joining, standardizing, augmenting, cleansing, consolidating and filtering to create outputs that can be used for your analysis.
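
To make the munging steps above a bit more concrete, here is a minimal sketch in Python using only the standard library. The two small CSV snippets, the column names and the helper functions are all hypothetical, invented for illustration; a real dataset will need its own rules for what counts as a valid row.

```python
import csv
import io

# Hypothetical raw inputs, invented for illustration only.
RAW_ORDERS = """customer_id,amount,country
 1 , 19.99 ,nl
2,,NL
3,5.00,be
1,7.50,Nl
"""

RAW_CUSTOMERS = """customer_id,name
1,Alice
2,Bob
3,Carol
"""


def parse(text):
    """Parsing: turn CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_orders(rows):
    """Cleansing, standardizing and filtering in a single pass."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            # Filtering: drop rows that have no amount at all.
            continue
        cleaned.append({
            "customer_id": int(row["customer_id"].strip()),
            "amount": float(amount),
            # Standardizing: one spelling per country code.
            "country": row["country"].strip().upper(),
        })
    return cleaned


def join_names(orders, customers):
    """Joining: attach the customer name to every order."""
    names = {int(c["customer_id"]): c["name"] for c in customers}
    return [dict(order, name=names.get(order["customer_id"])) for order in orders]


def total_per_customer(orders):
    """Consolidating: one total amount per customer name."""
    totals = {}
    for order in orders:
        running = totals.get(order["name"], 0.0) + order["amount"]
        totals[order["name"]] = round(running, 2)
    return totals


orders = join_names(clean_orders(parse(RAW_ORDERS)), parse(RAW_CUSTOMERS))
totals = total_per_customer(orders)
print(totals)
```

Each function maps to one of the actions named above, so the pipeline can be read from the inside out: parse, clean, join, consolidate. In practice a library such as pandas is often used for this kind of work, but the steps stay the same.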

Math & Statistics Knowledge:

When you have the cleaned data you can start extracting insights from it. In order to do this you need a good understanding of math and statistics. You need to be able to understand all the summary statistics (e.g., averages, standard deviations, correlations, p-values, alphas, betas, etc.) and the more complex mathematical tools. Otherwise you will get a black box effect: your mathematical algorithm spits out values, but you don’t have the slightest clue what has happened to the data. The danger with this is that results may seem good but are incorrect. (You will end up in the Danger Zone!)
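
One way to avoid the black box effect is to compute a few of these statistics by hand at least once. The sketch below does this in Python for a mean, a sample standard deviation and a Pearson correlation. The `ad_spend` and `revenue` numbers are hypothetical, invented for illustration, and the `pearson` helper is written out explicitly so that every step of the formula is visible.

```python
import math
import statistics

# Hypothetical data, invented for illustration only.
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]
revenue = [25.0, 41.0, 62.0, 79.0, 103.0]


def pearson(xs, ys):
    """Pearson correlation coefficient, written out step by step."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    # Sum of the products of the deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations of each variable.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)


mean_revenue = statistics.mean(revenue)
std_revenue = statistics.stdev(revenue)  # sample standard deviation (n - 1)
r = pearson(ad_spend, revenue)

print(f"mean revenue:       {mean_revenue:.2f}")
print(f"standard deviation: {std_revenue:.2f}")
print(f"correlation:        {r:.3f}")
```

A correlation close to 1 here only says that the two invented series move together almost linearly; it says nothing about whether one causes the other, which is exactly the kind of judgement that requires understanding what the number means.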

Substantive Expertise:

On this point Conway thought differently from most definitions of data science. With only hacking skills and math and statistics knowledge you get machine learning. Conway stated: “Science is about discovery and building knowledge, which requires some motivating questions about the world and hypotheses that can be brought to data and tested with statistical methods.”

Data science can boost your company!

So how can these combined talents boost your company? There are several simple and fast benefits you can gain if you use data science:

  1. Make well-informed decisions based on your data
    All business leaders try to make objective, well-informed decisions. However, everyone is subject to the occasional error in judgement or misguided choice. Data can back up decisions with hard evidence and provide balance in situations where opinions vary widely or emotions run high. This way you can avoid decisions based purely on ‘gut’ feelings.
  2. Data delivers a competitive advantage
    Big data is an increasingly important consideration for companies that want to get ahead in their industry and outperform their competitors. A survey by Bain & Company of more than 400 large businesses found that organizations with the most advanced analytics capabilities do tend to pull ahead of industry peers.
    David Court, a director at McKinsey & Company, stated the following about the importance of big data:
    “Big data and analytics actually have been receiving attention for a few years, but the reason is changing. A few years ago, I thought the question was ‘We have all this data. Surely there’s something we can do with it.’ Now the question is ‘I see my competitors exploiting this and I feel I’m getting behind.’ And in fact, the people who say this are right.”
  3. Data drives revenue increases
    The University of Texas conducted the following study: “Measuring the Business Impacts of Effective Data.” This study dug into data sets from Fortune 1000 companies in every major industry to analyse the impact data has on key business performance metrics. The results revealed that making very small improvements to existing data can reap big benefits. Some of the findings stated that companies could:
    – Increase revenue by more than $2 billion a year by increasing data usability by just 10%
    – Increase return on equity by 16% by increasing both the quality of data and the ability of salespeople to access it by just 10%
    – Increase return on investment by 0.7% (which equates to $2.87 million of additional income) by increasing both the intelligence and accessibility of data by just 10%

From all these results it should be clear that data science is important for your business and should not be underestimated.

Discover what data science can do for your company – dataplied.com

Sources:

http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
https://en.wikipedia.org/wiki/Data_science
http://blog.crossover.com/big-data-for-business
http://www.datascienceassn.org/sites/default/files/Measuring%20Business%20Impacts%20of%20Effective%20Data%20I.pdf
http://www.bain.com/publications/articles/the-value-of-big-data.aspx
https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
http://researchhubs.com/post/ai/introduction-to-data-science/what-is-data-science.html