Recently, I had the opportunity to build a regression model for one of FTS Data & AI’s customers in the medical domain. Medical data poses an interesting challenge for machine learning experiments. In most binary classification problems of this kind, the expected result in the training set contains a large percentage of negatives. For example, the goal of an experiment might be to predict, based on a set of known clinical test results, whether a patient has a certain medical condition. If the dataset is a generic one covering a vast number of medical conditions, the percentage of positive results for any one condition will most likely be very low. As a result, a machine learning model that is initially tested using a small set of chosen features will most likely produce a high number of false negatives.
False negatives are a big problem in experiments involving clinical data: incorrectly concluding that a patient does not have a certain medical condition could have disastrous consequences. Once a confusion matrix is built, the model’s effectiveness is measured using indicators such as area under the curve (AUC), accuracy, precision, recall and F1 score. In medical datasets, recall plays a big role because it reflects how many actual positives the model misses, i.e. the false negatives. It can therefore hold significant weight in determining the most appropriate model for a given experiment.
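To make these indicators concrete, here is a minimal sketch in plain Python that derives accuracy, precision, recall and F1 score from the four cells of a confusion matrix. The function name and the patient counts are hypothetical and chosen only for illustration; they are not taken from the customer's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from the four confusion-matrix counts.

    tp/fp/fn/tn = true positives, false positives, false negatives, true negatives.
    A metric whose denominator is zero is reported as 0.0 in this sketch.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 1,000 patients, 50 of whom actually have the condition.
# The model finds only 10 of those 50 and misses the other 40.
metrics = classification_metrics(tp=10, fp=5, fn=40, tn=945)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy comes out at 0.955 while recall is only 0.2, which is why accuracy alone can be misleading on a dataset dominated by negatives.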
The definition of recall is:
Recall = (True Positives) / (True Positives + False Negatives)
In the confusion matrix, the denominator of this equation is the total number of actual positives. Recall therefore measures the correct positive predictions as a fraction of the actual positives in the dataset. If there were no false negatives, recall would be at the ideal score of 1; if a large number of actual positives were predicted as negatives (i.e. false negatives), recall would be much lower.
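The two cases described above can be sketched with a small example. The helper and the label lists below are hypothetical, with 1 standing for "has the condition" and 0 for "does not"; they are not drawn from any real dataset.

```python
def recall_from_labels(y_true, y_pred):
    """Recall = TP / (TP + FN), computed from paired actual/predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return tp / (tp + fn)


# Ten hypothetical patients, four of whom actually have the condition.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Model A: finds every actual positive (and raises one false alarm).
pred_no_misses = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

# Model B: predicts three of the four actual positives as negatives.
pred_many_misses = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

recall_no_misses = recall_from_labels(y_true, pred_no_misses)      # 4 / (4 + 0)
recall_many_misses = recall_from_labels(y_true, pred_many_misses)  # 1 / (1 + 3)

print(recall_no_misses, recall_many_misses)
```

Note that the false positive in Model A does not affect its recall at all; only missed positives pull the score down.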
As the model evolves and more relevant features are chosen for prediction, recall should generally start improving. In domains such as medicine, where false negative predictions can have dire consequences, the recall score should play a vital role in choosing the most suitable model.