


What Should I Do If My Data Is Bad For Machine Learning

By Bilal Mahmood, Bolt.


There are a number of machine learning models to choose from. We can use Linear Regression to predict a value, Logistic Regression to classify distinct outcomes, and Neural Networks to model non-linear behaviors.

When we build these models, we always use a set of historical data to help our machine learning algorithms learn the relationship between a set of input features and a predicted output. But even if this model can accurately predict a value from historical data, how do we know it will work as well on new data?

Or more plainly, how do we evaluate whether a machine learning model is actually "good"?

In this post we'll walk through some common scenarios where a seemingly good machine learning model may still be wrong. We'll show how you can evaluate these issues by assessing metrics of bias vs. variance and precision vs. recall, and present some solutions that can help when you come across such scenarios.

High Bias or High Variance



When evaluating a machine learning model, one of the first things you want to assess is whether you have "High Bias" or "High Variance".

High Bias refers to a scenario where your model is "underfitting" your example dataset. This is bad because your model is not presenting a very accurate or representative picture of the relationship between your inputs and predicted output, and is often outputting high error (e.g. the difference between the model's predicted value and actual value).

High Variance represents the opposite scenario. In cases of High Variance or "overfitting", your machine learning model is so accurate that it is perfectly fitted to your example dataset. While this may seem like a good outcome, it is also a cause for concern, as such models often fail to generalize to future datasets. So while your model works well for your existing data, you don't know how well it'll perform on other examples.

But how can you know whether your model has High Bias or High Variance?

One straightforward method is to do a Train-Test Split of your data. For instance, train your model on 70% of your data, and then measure its error rate on the remaining 30% of data. If your model has high error in both the train and test datasets, you know your model is underfitting both sets and has High Bias. If your model has low error in the training set but high error in the test set, this is indicative of High Variance, as your model has failed to generalize to the second set of data.

If you can generate a model with overall low error in both your train (past) and test (future) datasets, you'll have found a model that is "Just Right" and balances the right levels of bias and variance.
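To make this concrete, here is a minimal sketch of a train-test split diagnosis using scikit-learn. The synthetic dataset and linear model below are stand-ins for illustration, not something from the original post:

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data standing in for your historical dataset.
X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)

# Train on 70% of the data, hold out 30% for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LinearRegression().fit(X_train, y_train)

train_error = mean_squared_error(y_train, model.predict(X_train))
test_error = mean_squared_error(y_test, model.predict(X_test))

# High error on both sets suggests High Bias (underfitting).
# Low train error but much higher test error suggests High Variance (overfitting).
print(f"Train MSE: {train_error:.2f}, Test MSE: {test_error:.2f}")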

Low Precision or Low Recall



Even when you have high accuracy, it's possible that your machine learning model may be susceptible to other types of error.

Take the case of classifying email as spam (the positive class) or not spam (the negative class). 99% of the time, the email you receive is not spam, but perhaps 1% of the time it is spam. If we were to train a machine learning model and it learned to always predict an email as not spam (negative class), then it would be accurate 99% of the time despite never catching the positive class.

In scenarios like this, it's helpful to look at what percentage of the positive class we're actually predicting, given by the two metrics of Precision and Recall.


Precision is a measure of how often your predictions for the positive class are actually true. It's calculated as the number of True Positives (e.g. predicting an email is spam and it is actually spam) over the sum of the True Positives and False Positives (e.g. predicting an email is spam when it's not).

Recall is a measure of how often the actual positive class is predicted as such. It's calculated as the number of True Positives over the sum of the True Positives and False Negatives (e.g. predicting an email is not spam when it is).

Another way to interpret the difference between Precision and Recall is that Precision is measuring what fraction of your predictions for the positive class are valid, while Recall is telling you how often your predictions actually capture the positive class. Hence, a situation of Low Precision emerges when very few of your positive predictions are true, and Low Recall occurs if most of your positive values are never predicted.

The goal of a good machine learning model is to get the right balance of Precision and Recall, by trying to maximize the number of True Positives while minimizing the number of False Negatives and False Positives.
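As a quick sketch of how these metrics behave, here is the spam example with made-up labels and predictions, computed with scikit-learn (the numbers are purely illustrative):

from sklearn.metrics import precision_score, recall_score

# Toy spam example: 1 = spam (positive class), 0 = not spam (negative class).
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]  # hypothetical classifier output

# Precision = TP / (TP + FP); Recall = TP / (TP + FN).
print("Precision:", precision_score(y_true, y_pred))  # 1 TP, 1 FP -> 0.50
print("Recall:   ", recall_score(y_true, y_pred))     # 1 TP, 2 FN -> 0.33

# A model that always predicted "not spam" would still be highly accurate on
# this imbalanced data, but its recall for the spam class would be 0.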

5 Ways to Improve Your Model



If you face issues of High Bias vs. High Variance in your models, or have trouble balancing Precision vs. Recall, there are a number of strategies you can apply.

For instances of High Bias in your machine learning model, you can try increasing the number of input features. As discussed, High Bias emerges when your model is underfit to the underlying data and you have high error in both your train and test sets. Plotting model error as a function of the number of input features you are using, we find that more features lead to a better fit in the model.

It follows then that in the opposite scenario of High Variance, you can reduce the number of input features. If your model is overfit to the training data, it's possible you've used too many features, and reducing the number of inputs will make the model more flexible to test or future datasets. Similarly, increasing the number of training examples can help in cases of High Variance, helping the machine learning algorithm build a more generalizable model.
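One hedged way to see this trade-off in code is to sweep the number of input features (here, polynomial terms derived from a single raw input, a choice made only for illustration) and compare train vs. test error:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Nonlinear data that a plain linear model will underfit.
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=300)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Sweep the number of input features (polynomial terms of the raw input).
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    # degree=1 underfits (High Bias); very high degrees start to overfit (High Variance).
    print(f"degree={degree:>2}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")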

For balancing cases of Low Precision and Low Recall, you can adjust the probability threshold at which you classify the positive vs. negative class. For cases of Low Precision you can increase the probability threshold, thereby making your model more conservative in its designation of the positive class. On the flip side, if you are seeing Low Recall you may reduce the probability threshold, thereby predicting the positive class more often.
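Here is a minimal sketch of threshold tuning, assuming scikit-learn and a synthetic imbalanced dataset standing in for the spam example; the specific model and thresholds are illustrative only:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Imbalanced synthetic data standing in for spam (rare) vs. not spam (common).
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
probs = clf.predict_proba(X_test)[:, 1]  # predicted probability of the positive class

# Raising the threshold makes positive predictions more conservative (higher precision);
# lowering it catches more of the actual positives (higher recall).
for threshold in (0.3, 0.5, 0.7):
    preds = (probs >= threshold).astype(int)
    p = precision_score(y_test, preds, zero_division=0)
    r = recall_score(y_test, preds)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")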

With enough iterations, it's thus often possible to find an appropriate machine learning model with the right balance of bias vs. variance and precision vs. recall.

This blog post is based on concepts taught in Stanford's Machine Learning course by Andrew Ng on Coursera.

Bilal Mahmood is a cofounder of Bolt. He formerly led data warehousing and analytics at Optimizely, and is passionate about helping companies turn data into action.

Bolt is a predictive marketing layer that helps companies connect, predict, and personalize their user experiences. The platform automatically connects user personas across analytics and payment solutions, and leverages machine learning to predict and improve any conversion or churn event.

Original. Reposted with permission.

Related:

  • Understanding the Bias-Variance Tradeoff: An Overview
  • The Fallacy of Seeing Patterns
  • Data Science Basics: 3 Insights for Beginners


Source: https://www.kdnuggets.com/2016/12/4-reasons-machine-learning-model-wrong.html
