An end to end implementation of a Machine Learning pipeline

October 25, 2018

As a researcher on Computer Vision, I come across new blogs and tutorials on ML (Machine Learning) every day. However, most of them are just focussing on introducing the syntax and the terminology relevant to the field. For example – a 15 minute tutorial on Tensorflow using MNIST dataset, or a 10 minute intro to Deep Learning in Keras on Imagenet.

While people are able to copy paste and run the code in these tutorials and feel that working in ML is really not that hard, it doesn’t help them at all in using ML for their own purposes. For example, they never introduce you to how you can run the same algorithm on your own dataset. Or, how do you get the dataset if you want to solve a problem.

Or, which algorithms do you use – Conventional ML, or Deep Learning? How do you evaluate your models performance? How do you write your own model, as opposed to choosing a ready made architecture?

All these form fundamental steps in any Machine Learning pipeline, and it is these steps that take most of our time as ML practitioners. This tutorial breaks down the whole pipeline, and leads the reader through it step by step in an hope to empower you to actually use ML, and not just feel that it was not too hard. Needless to say, this will take much longer than 15-30 minutes.

I believe a weekend would be a good enough estimate.

Source: github.io