Tag Archives: data

Wikipedia Data Stream

Streaming data is an important part of modern data processing. If you are just starting out, and perhaps don’t yet work somewhere with access to a big data streaming infrastructure, it can be hard to know where to start. This post talks you through a simple Wikipedia data stream example from the Wikimedia documentation. Wikipedia…

A Machine Learning Workflow

Machine learning is an essential part of data science – a field which covers a range of activities from data acquisition and cleaning, through to analytics and data visualisation. It can be helpful to think in terms of a machine learning workflow that puts some structure around some of these processes. This post looks at…

SQL Joins: Some Basics

Joining tables in SQL is an essential part of many digital services and products. Joins let you answer questions such as: Which customers bought which products? How many staff work at a particular location? Which students are in which classes at a school? Joining in SQL becomes even more important when your data are stored…

Top 10 Data Science Techniques

Data science can mean different things to different people, but we can try to define it by the techniques a data scientist tends to use. A recent online poll gave a top 10 of algorithms and methods used by data scientists. This post goes through that top 10 list of data science techniques to flesh…