Medical Imaging AI Glossary

Understanding what determines accuracy can be confusing. But it doesn’t have to be!

To help you understand the terms used most often in artificial intelligence, machine learning, and statistics, we have compiled a glossary of the ones you’ll hear most in the Medical Imaging AI space.


Accuracy

The accuracy of a machine learning classification algorithm is one way to measure how often the algorithm classifies a data point correctly. Accuracy is the number of correctly predicted data points out of all the data points. More formally, it is the number of true positives and true negatives divided by the total number of true positives, true negatives, false positives, and false negatives. A true positive or true negative is a data point that the algorithm correctly classified as positive or negative, respectively. A false positive or false negative, on the other hand, is a data point that the algorithm incorrectly classified. For example, if the algorithm classified a negative data point as positive, it would be a false positive. Accuracy is often used along with precision and recall, which are other metrics built from ratios of true/false positives/negatives. Together, these metrics provide a detailed look at how the algorithm is classifying data points.
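The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Correctly classified data points divided by all data points."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one data point is required")
    return (tp + tn) / total


def precision(tp: int, fp: int) -> float:
    """Share of positive predictions that were actually positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn)


# Hypothetical counts, for illustration only.
tp, tn, fp, fn = 40, 50, 5, 5

acc = accuracy(tp, tn, fp, fn)
prec = precision(tp, fp)
rec = recall(tp, fn)
print(acc)  # 0.9
```

Here 90 of the 100 data points are classified correctly, so the accuracy is 0.9, while precision and recall look only at the positive predictions and the actual positives, respectively.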

AUC – Area Under the ROC Curve

The ROC (receiver operating characteristic) curve is a performance measurement for a classification problem at various threshold settings. The output of an AI system typically contains a confidence level, and you can set a confidence threshold so that everything with higher confidence is classified as positive and everything with lower confidence as negative. For example, you can classify a case as positive only when you are very sure (95% confidence), or even when there is only a slight chance (20%).
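As a rough sketch of how a threshold turns confidence levels into positive or negative calls, consider the snippet below. The `classify` helper and the scores are hypothetical, and treating a confidence exactly equal to the threshold as positive is an assumption made here for simplicity.

```python
def classify(confidences, threshold):
    """Label a case positive (True) when its confidence meets the threshold."""
    # Assumption: a confidence equal to the threshold counts as positive.
    return [c >= threshold for c in confidences]


# Hypothetical confidence levels produced by an AI system.
scores = [0.10, 0.25, 0.60, 0.97]

strict = classify(scores, 0.95)   # positive only when very sure
lenient = classify(scores, 0.20)  # positive even with a slight chance

print(strict)   # [False, False, False, True]
print(lenient)  # [False, True, True, True]
```

The same four cases yield one positive call at the 95% threshold and three at the 20% threshold.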

Lowering this threshold increases sensitivity but decreases specificity; raising it does the opposite. By trying different thresholds and plotting the resulting sensitivity against the false positive rate (1 - specificity), you get a graph describing how accurate the AI is at different working points. This is the so-called ROC curve.

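AUC is the area under the ROC curve. It is a value between 0 and 1: 0.5 corresponds to random guessing and 1 to a perfect classifier, so useful classifiers typically score between 0.5 and 1.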
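To make the idea concrete, here is a minimal sketch that sweeps the threshold over a small set of scores, collects the ROC points, and measures the area under them with the trapezoidal rule. The function names, labels, and scores are hypothetical, and the trapezoidal rule is one common way to compute the area, assumed here for illustration.

```python
def roc_points(labels, scores):
    """Return (false positive rate, sensitivity) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    points = [(0.0, 0.0)]  # threshold above every score: nothing is positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC points using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: 1 = condition present, 0 = condition absent.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

curve = roc_points(labels, scores)
area = auc(curve)
print(round(area, 3))  # 0.889
```

In this example one negative case scores higher than one positive case, so the curve falls short of the perfect value of 1.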


Classifier

In most cases, a neural network is used to classify data, that is, to assign the analyzed data to (usually) predetermined classes. The trained neural network is therefore commonly referred to as a classifier.

Confusion Matrix

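A confusion matrix analyzes the performance of a classifier: it measures how well the AI technology (algorithm) performs by counting each of the four outcomes below. The examples use the detection of intracranial hemorrhage (ICH).
True Positive (TP): a correct detection (ICH classified as ICH)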
False Positive (FP): an incorrect detection (for example, healthy classified as ICH)
True Negative (TN): a correct non-detection (healthy classified as healthy)
False Negative (FN): an incorrect non-detection (for example, ICH classified as healthy)
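The four outcomes can be counted directly from a list of true labels and a list of predictions. The sketch below is illustrative only; `confusion_counts` and the sample data are hypothetical, with True standing for "ICH" and False for "healthy".

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count TP, FP, TN and FN from paired true labels and predictions."""
    counts = Counter({"TP": 0, "FP": 0, "TN": 0, "FN": 0})
    for truth, guess in zip(actual, predicted):
        if truth and guess:
            counts["TP"] += 1  # ICH classified as ICH
        elif not truth and guess:
            counts["FP"] += 1  # healthy classified as ICH
        elif not truth and not guess:
            counts["TN"] += 1  # healthy classified as healthy
        else:
            counts["FN"] += 1  # ICH classified as healthy
    return counts


# Hypothetical scans: True = ICH present, False = healthy.
actual = [True, True, False, False, False, True]
predicted = [True, False, False, True, False, True]

matrix = confusion_counts(actual, predicted)
print(dict(matrix))  # {'TP': 2, 'FP': 1, 'TN': 2, 'FN': 1}
```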

Deep Learning

Deep Learning is a field of machine learning whose methods are based on artificial neural networks.

Machine Learning

Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use in order to perform a specific task effectively without using explicit instructions.

Neural Networks

Neural Networks, or artificial neural networks (ANNs), are computing systems that are inspired by, but not necessarily identical to, biological neural networks. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain.


Sensitivity and Specificity

Sensitivity and specificity are statistical measures of the performance of a binary classification test (also known in statistics as a classification function) that are widely used in medicine:
Sensitivity (also called the true positive rate, the recall, or, in some fields, the probability of detection) measures the proportion of actual positives that are correctly identified as such (e.g., the percentage of sick people who are correctly identified as having the condition).
Specificity (also called the true negative rate) measures the proportion of actual negatives that are correctly identified as such (e.g., the percentage of healthy people who are correctly identified as not having the condition).
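Both measures follow directly from the confusion matrix counts. The snippet below is a minimal sketch with hypothetical function names and counts, not results from any real study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of actual positives that are correctly identified."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of actual negatives that are correctly identified."""
    return tn / (tn + fp)


# Hypothetical counts: 50 sick people and 100 healthy people.
sens = sensitivity(tp=45, fn=5)    # 45 of the 50 sick people are detected
spec = specificity(tn=90, fp=10)   # 90 of the 100 healthy people are cleared

print(sens, spec)  # 0.9 0.9
```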

Image credit: FeanDoe, modified version of Walber’s Precision and Recall.



Positive Predictive Value (PPV)

Positive predictive value (PPV) is the probability that a patient with a positive test really has the specified disease. It is the proportion of true positives among all positive results.


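Negative Predictive Value (NPV)

Negative predictive value (NPV) is the probability that a patient with a negative test really does not have the specified disease. It is the proportion of true negatives among all negative results.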
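As a rough illustration, both predictive values can be computed from the same four counts used elsewhere in this glossary. The function names and the numbers are hypothetical.

```python
def positive_predictive_value(tp: int, fp: int) -> float:
    """True positives divided by all positive results."""
    return tp / (tp + fp)


def negative_predictive_value(tn: int, fn: int) -> float:
    """True negatives divided by all negative results."""
    return tn / (tn + fn)


# Hypothetical counts, for illustration only.
ppv = positive_predictive_value(tp=45, fp=10)  # 45 of 55 positive results
npv = negative_predictive_value(tn=90, fn=5)   # 90 of 95 negative results

print(round(ppv, 3), round(npv, 3))  # 0.818 0.947
```

Unlike sensitivity and specificity, which divide by the number of actual positives or negatives, these values divide by the number of positive or negative test results.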
