Category Archives: Data Science

The ROC Curve

The ROC Curve is a commonly used method for and evaluating the performance of classification models. ROC curves use a combination the false positive rate (i.e. occurrences that were predicted positive, but actually negative) and true positive rate (i.e. occurrences that were correctly predicted) to build up a summary picture of the classification performance. ROC… Read More »

Phonetic Algorithms

Phonetic algorithms assist in comparing and matching words by their pronunciation, rather than just their spelling. Early phonetic algorithms were introduced to help analyse US census data. Phonetic algorithms and other ‘fuzzy matching‘ techniques have since played an essential part in many activities including spelling correction, database record linkage and search recommendations. This post looks… Read More »