Demystifying Machine Learning
Research departments are under pressure. They are expected to deliver faster, cheaper and more impactful insights than ever before.
Rather than simply doing more research faster, insight departments can also revisit existing data sources.
Often, companies have plenty of valuable data sources at their disposal without being aware of their full potential. Moreover, relevant databases are often publicly available via APIs or sold via data brokers.
Yet, the biggest hurdle lies in making sense of this abundance of data. Companies are struggling to connect different sources due to different structures, missing values and other complexities.
In recent years, enormous advances have been made in machine learning and data science, although some demystifying is needed to understand this discipline in context.
With a creative and pragmatic mind-set, the problem can be solved by borrowing techniques from this field of data science.
We show a case – in the beverage industry – where we exploited existing data sources to uncover a hidden layer of insights.
Machine Learning: A Brief Explanation
Machine learning has become a hype term and is applied to a wide range of things, not always correctly.
When you ask people to define machine learning, you usually receive the following answer:
Advanced analytical tools which have the ability to learn by so-called self-learning algorithms
But what does self-learning mean?
Within machine learning we can distinguish three different types:
- Supervised Learning.
- Unsupervised Learning.
- Reinforcement Learning.
Within supervised and unsupervised learning, no self-learning takes place. In both cases, the learning comes from the amount of data available: the more data available, the more the system ‘learns’.
The only type where self-learning takes place is reinforcement learning, which is also applied in robots.
It accounts for only roughly 1% of all machine learning applications.
Figure 1 shows an overview of the main machine learning tools, half of which are statistical techniques.
The same holds for these statistical techniques: the more data, the higher the predictive value and the better the estimates.
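The effect of sample size on a statistical estimate can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are synthetic (a straight line through the origin with a slope of 2 plus Gaussian noise), and the function names `fit_slope` and `average_slope_error` are hypothetical helpers, not part of any case described in this article.

```python
import random


def fit_slope(points):
    """Least-squares slope for a line through the origin: sum(xy) / sum(x^2)."""
    return sum(x * y for x, y in points) / sum(x * x for x, _ in points)


def average_slope_error(n, true_slope=2.0, repeats=200, seed=0):
    """Mean absolute error of the fitted slope over many simulated samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repeats):
        points = []
        for _ in range(n):
            x = rng.uniform(1, 10)  # assumed input range, chosen for illustration
            y = true_slope * x + rng.gauss(0, 3)  # assumed noise level
            points.append((x, y))
        total += abs(fit_slope(points) - true_slope)
    return total / repeats


# Compare the average estimation error for small, medium and large samples.
errors = {n: average_slope_error(n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(f"n = {n:>4}: average slope error = {err:.4f}")
```

In this simulation the average error of the estimated slope shrinks as the sample grows, which is the sense in which more data gives better estimates.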
We think that machine learning needs a proper definition.
In our view, machine learning is an algorithm that:
- Does not assume particular distributions of the variables/features (no statistical testing)
- Is applied to model the relationship between input and output, without any interest in the coefficients themselves
- Usually uses three subsets to validate the model:
  - Training: to generate the model
  - Validation: to optimise the model
  - Testing: to verify the model
This distinguishes real machine learning tools from statistical applications.
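The three subsets in the definition above can be sketched in a few lines. This is a minimal illustration, not the method used in the case. It assumes synthetic data, a 60/20/20 split, and a simple k-nearest-neighbour predictor, chosen here because it has no coefficients and makes no distributional assumptions. The names `three_way_split`, `knn_predict` and `mean_squared_error` are hypothetical helpers.

```python
import random


def three_way_split(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows and cut them into training, validation and test subsets."""
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def knn_predict(train, x, k):
    """Average the outputs of the k training rows whose input is closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_squared_error(train, rows, k):
    """Mean squared prediction error on rows, using train as the model."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in rows) / len(rows)


# Synthetic data (an assumption for illustration): y = 2x plus noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    data.append((x, 2 * x + rng.gauss(0, 1)))

train, validation, test = three_way_split(data)

# Training: for k-NN, "generating the model" is simply storing the training rows.
# Validation: optimise the model by picking the k with the lowest validation error.
candidate_ks = [1, 3, 5, 9, 15]
best_k = min(candidate_ks, key=lambda k: mean_squared_error(train, validation, k))

# Testing: verify the chosen model once, on rows it has never seen.
test_error = mean_squared_error(train, test, best_k)
print(f"chosen k = {best_k}, test MSE = {test_error:.2f}")
```

The test subset is used only once, after the model has been chosen, so that the reported error is not flattered by the tuning done on the validation subset.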
The application of machine learning within market research can be seen in the automation of research (bot applications, drop-out prediction), text analytics and mining, social media analytics, image mining and predictive modelling.
Read the full article on Insight Platforms.