It is impossible to follow everything ML. These 5, though, are worth a look.

Machine learning is growing at breakneck speed, and GitHub is the whiteboard the whole world is watching. Top-quality code is posted regularly on that infinite board of wisdom.

It is obviously impossible to track everything that goes on in the world of machine learning, but GitHub has a star rating for each project. When you star a repository, you show your appreciation for the project and also keep track of repositories that you find interesting.

Github Blog

The star count can therefore serve as a good metric for finding the most followed projects. Let’s take a look at 5 highly rated ones.
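For readers who want to check star counts themselves, here is a minimal sketch. It assumes GitHub's public REST API, which reports a repository's stars in the `stargazers_count` field. The helper names `fetch_star_count` and `rank_by_stars` are made up for illustration, and the numbers are simply the ones quoted in this article.

```python
import json
import urllib.request


def fetch_star_count(owner, repo):
    """Return the current star count of github.com/<owner>/<repo>."""
    url = f"https://api.github.com/repos/{owner}/{repo}"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return payload["stargazers_count"]


def rank_by_stars(star_counts):
    """Sort a {repository: stars} mapping from most to least starred."""
    return sorted(star_counts.items(), key=lambda item: item[1], reverse=True)


# Star counts as quoted in this article; the live figures will differ.
article_counts = {
    "face-recognition": 25858,
    "fastText": 18819,
    "awesome-tensorflow": 14424,
    "predictionio": 11852,
    "Style2Paints": 9184,
}
for name, stars in rank_by_stars(article_counts):
    print(f"{name}: {stars:,}")
```

A call such as `fetch_star_count("facebookresearch", "fastText")` should return the live figure. It needs network access, and unauthenticated requests to the API are rate-limited.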

1) face-recognition — 25,858 ★

Billed as the world’s simplest facial recognition tool, it provides an application programming interface (API) for Python and for the command line. It is useful for recognising and manipulating faces in images, and it is built on dlib’s state-of-the-art face recognition algorithm. The deep-learning model reports an accuracy of 99.38% on the Labeled Faces in the Wild dataset.

It also ships with a simple face_recognition command-line tool that runs face recognition over a folder of images, straight from the terminal.

The library can also handle real-time face recognition.
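To give a feel for the Python API described above, here is a minimal sketch. It uses the library's `load_image_file`, `face_encodings` and `compare_faces` functions. The helper name `same_person` and the image file names are made up for illustration, and the 0.6 tolerance is the library's default as far as I know.

```python
def same_person(known_path, unknown_path, tolerance=0.6):
    """Return True if any face in unknown_path matches the first face in known_path."""
    # Third-party package (pip install face_recognition); imported lazily so
    # that this module can be loaded even where the package is not installed.
    import face_recognition

    known_image = face_recognition.load_image_file(known_path)
    unknown_image = face_recognition.load_image_file(unknown_path)

    # Each detected face becomes one numeric encoding.
    known_encodings = face_recognition.face_encodings(known_image)
    unknown_encodings = face_recognition.face_encodings(unknown_image)
    if not known_encodings or not unknown_encodings:
        return False  # no face was detected in at least one of the images

    reference = known_encodings[0]
    # One True/False result per face found in the unknown image.
    matches = face_recognition.compare_faces(
        unknown_encodings, reference, tolerance=tolerance
    )
    return any(matches)


# Hypothetical usage, assuming both image files exist:
# print(same_person("alice.jpg", "group_photo.jpg"))
```

A lower tolerance makes the comparison stricter, at the cost of missing more genuine matches.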

Github — face-recognition

2) fastText by FacebookResearch — 18,819 ★

fastText is a free, open-source library from Facebook’s research team for efficient learning of word representations. It is lightweight and lets users learn text representations and sentence classifiers. It works on standard, generic hardware, and models can later be reduced in size to fit even on mobile devices.

Text classification is a core problem in many applications, such as spam detection, sentiment analysis and smart replies. The goal of text classification is to assign documents (emails, posts, text messages, product reviews and so on) to one or more categories.


It is a very useful resource for NLP enthusiasts.
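To make the text classification workflow concrete, here is a minimal sketch, assuming the `fasttext` Python package and its `train_supervised` and `predict` calls. In supervised mode fastText reads one example per line, with each label prefixed by `__label__`. The helper names and the toy spam examples are invented for illustration.

```python
def to_fasttext_line(label, text):
    """Format one example for fastText's supervised mode:
    a '__label__<name>' prefix followed by the text on a single line."""
    single_line = " ".join(text.split())  # collapse newlines and repeated spaces
    return f"__label__{label} {single_line}"


def write_training_file(examples, path):
    """Write (label, text) pairs to a fastText training file."""
    with open(path, "w", encoding="utf-8") as handle:
        for label, text in examples:
            handle.write(to_fasttext_line(label, text) + "\n")


def train_and_predict(train_path, text):
    """Train a classifier and return (labels, probabilities) for `text`."""
    # Third-party package (pip install fasttext); imported lazily so that the
    # formatting helpers above work without it.
    import fasttext

    model = fasttext.train_supervised(input=train_path)
    return model.predict(text)


# Toy, made-up examples for a spam filter.
examples = [
    ("spam", "Win a free prize now"),
    ("ham", "Are we still meeting\nat noon?"),
]
print(to_fasttext_line(*examples[1]))
# prints: __label__ham Are we still meeting at noon?
```

A real classifier needs far more than two examples. The point here is only the file format and the two calls involved.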


3) awesome-tensorflow — 14,424 ★

This is a collection of resources that help you understand and utilise TensorFlow. The GitHub repo contains a curated list of awesome TensorFlow experiments, libraries, and projects.

TensorFlow is an end-to-end open source platform for machine learning designed by Google. It has a comprehensive ecosystem of tools, libraries and community resources that lets researchers push the state of the art in ML and lets developers easily build and deploy ML-powered applications.


4) predictionio by Apache — 11,852 ★

Apache PredictionIO is an open source machine learning framework for developers, data scientists, and end users. It can be used to build, deploy and test real-world ML applications.

It also supports event collection, evaluation, and querying of predictive results. It is built on scalable open source services such as Hadoop and HBase.

In short, it takes much of the machine learning plumbing off the developer’s mind.
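The event collection and querying mentioned above happen over plain HTTP with JSON bodies. The sketch below only builds the two requests and does not send them. It assumes a local installation with the event server on port 7070 and a deployed engine on port 8000, which are the defaults as far as I know. The access key, the "view" event and the query fields are hypothetical, and the query fields in particular depend on the engine template in use.

```python
import json
import urllib.parse
import urllib.request


def build_event_request(access_key, event, entity_type, entity_id,
                        properties=None, host="http://localhost:7070"):
    """Build (but do not send) a POST that records one event."""
    body = {"event": event, "entityType": entity_type, "entityId": entity_id}
    if properties:
        body["properties"] = properties
    url = f"{host}/events.json?" + urllib.parse.urlencode({"accessKey": access_key})
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query_request(query, host="http://localhost:8000"):
    """Build (but do not send) a POST that asks a deployed engine for a prediction."""
    return urllib.request.Request(
        f"{host}/queries.json",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values; a real access key is issued when an app is created.
event_request = build_event_request("MY_ACCESS_KEY", "view", "user", "u1")
query_request = build_query_request({"user": "u1", "num": 4})
print(event_request.full_url)
print(query_request.data.decode("utf-8"))
```

Passing either request to `urllib.request.urlopen` would send it, provided the corresponding server is running.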


5) Style2Paints — 9,184 ★

This repository differs from the ones above in that it has been shut down for lack of funds. The concept is still an interesting one: AI is used to color images.

Its authors describe Style2Paints V4 as the current best AI-driven line-art colorization tool.

They say it differs from previous end-to-end image-to-image translation methods because it is the first system to colorize line art following a real-life human workflow. Most human artists are familiar with this workflow:

sketching -> color filling/flattening -> gradients/details adding -> shading

Style2Paints is designed according to this flow. Such a flow produces the middle image from the leftmost image in just 2 clicks.


And in just 4 more clicks, this is what you get.


The internet is an ocean, and machine learning is a river that flows into it. The stars on GitHub are a good metric for sifting through this river of treasure.