How to Start Learning Machine Learning
by Mahmut on December 11, 2020
Google says “Machine Learning is the future,” and the future of Machine Learning is going to be very bright.
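Machine learning algorithms power Walmart product recommendations, surge pricing at Uber, fraud detection at top financial institutions, Google Maps, and the content that Twitter, LinkedIn, Facebook and Instagram display in their social media feeds.
In this article, I am going to talk about what machine learning is, how it works, some real-life examples, the major machine learning techniques, and the Python libraries for machine learning.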
What is Machine Learning
Machine learning is the subfield of computer science that gives “computers the ability to learn without being explicitly programmed.”
Let me explain what I mean when I say “without being explicitly programmed.”
How Machine Learning Works
Assume that you have a dataset of images of animals such as cats and dogs, and you want to have software or an application that can recognize and differentiate them. The first thing that you have to do here is to describe each image as a set of features. For example, does the image show the animal’s eyes? If so, what are their sizes? Does it have ears? What about tails? How many legs? Does it have wings?
Prior to machine learning, each image would be transformed into a vector of features. Then, traditionally, we had to write down rules or methods by hand to get computers to detect the animals. That approach failed. Why? As you can guess, it needed a lot of rules, the rules depended heavily on the current dataset, and they were not general enough to handle out-of-sample cases. This is when machine learning entered the scene. Machine learning allows us to build a model that looks at all the feature sets and their corresponding types of animals, and learns the pattern of each animal.
The result is a model built by a machine learning algorithm, and it detects the animals without being explicitly programmed to do so. In essence, machine learning follows a process similar to the one a 4-year-old child uses to learn, understand, and differentiate animals. So machine learning algorithms, inspired by the human learning process, iteratively learn from data and allow computers to find hidden insights. These models help us in a variety of tasks, such as object recognition, summarization, recommendation, and so on.
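To make the idea of learning from feature vectors concrete, here is a minimal sketch using scikit-learn. It assumes each image has already been turned into three hand-made features (number of legs, whether it has wings, and weight). The feature choice and every number below are invented for illustration, and the decision tree is just one possible algorithm.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature vectors: [number_of_legs, has_wings, weight_kg].
# All values are made up for illustration; they are not real measurements.
features = [
    [4, 0, 4.0],   # cat
    [4, 0, 3.5],   # cat
    [4, 0, 5.0],   # cat
    [4, 0, 20.0],  # dog
    [4, 0, 30.0],  # dog
    [4, 0, 12.0],  # dog
    [2, 1, 0.1],   # bird
    [2, 1, 0.3],   # bird
]
labels = ["cat", "cat", "cat", "dog", "dog", "dog", "bird", "bird"]

# No rules are written by hand: the model learns them from the examples.
model = DecisionTreeClassifier(random_state=0)
model.fit(features, labels)

# Three animals the model has never seen before.
new_animals = [[4, 0, 25.0], [2, 1, 0.2], [4, 0, 4.2]]
predictions = model.predict(new_animals)
print(list(predictions))  # ['dog', 'bird', 'cat']
```

Notice that the code contains no "if it has wings, it is a bird" rule. The model works that out from the labeled examples, which is what "without being explicitly programmed" means in practice.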
Examples of Machine Learning
Machine learning already affects many parts of everyday life. Here are some real-life examples.
First, how do you think Netflix and Amazon recommend videos, movies, and TV shows to their users? They use Machine Learning to produce suggestions that you might enjoy. This is similar to how your friends might recommend a television show to you, based on their knowledge of the types of shows you like to watch.
How do you think banks make a decision when approving a loan application? They use Machine Learning to predict the probability of default for each applicant and then approve or refuse the loan application based on that probability.
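The loan example can be sketched in a few lines. This is only an illustration under stated assumptions: the two features (debt-to-income ratio and annual income in thousands), the eight training rows, and the 0.5 approval threshold are all hypothetical, and logistic regression is just one model that outputs a probability. A real bank would use far more data and a carefully chosen threshold.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical past applicants: [debt_to_income_ratio, annual_income_thousands].
# The numbers are invented for illustration only.
past_applicants = [
    [0.10, 80], [0.15, 95], [0.20, 60], [0.25, 70],   # repaid the loan
    [0.45, 40], [0.50, 35], [0.55, 45], [0.60, 30],   # defaulted
]
defaulted = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 means the applicant defaulted

# Scale the features, then fit a model that can output probabilities.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(past_applicants, defaulted)

# Two new applications: one with low debt, one with high debt.
new_applicants = [[0.12, 90], [0.58, 32]]

# Column 1 of predict_proba is the probability of class 1 (default).
p_default = model.predict_proba(new_applicants)[:, 1]

THRESHOLD = 0.5  # hypothetical cut-off chosen for this sketch
decisions = ["approve" if p < THRESHOLD else "refuse" for p in p_default]

for applicant, p, decision in zip(new_applicants, p_default, decisions):
    print(applicant, f"probability of default = {p:.2f}", decision)
```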
Telecommunication companies use their customers’ demographic data to segment them or to predict whether they will unsubscribe the next month.
There are many other applications of machine learning that we see in our daily life, such as chatbots, logging into our phones, or even computer games using face recognition. Each of these uses different machine learning techniques or algorithms.
So let’s quickly examine a few of the more popular techniques.
Major Machine Learning Techniques
The regression (or estimation) technique is used for predicting a continuous value, for example, predicting the price of a house based on its characteristics, or estimating the CO2 emission from a car’s engine.
The classification technique is used for predicting the class or category of a case, for example, whether a cell is benign or malignant, or whether or not a customer will churn.
Clustering groups similar cases together. For example, it can find similar patients, or it can be used for customer segmentation in the banking field.
The association technique is used for finding items or events that often co-occur, for example, grocery items that are usually bought together by a particular customer.
Anomaly detection is used to discover abnormal and unusual cases. For example, it is used for credit card fraud detection.
Sequence mining is used for predicting the next event, for instance, the next click in a website’s click stream.
Dimension reduction is used to reduce the size of data, typically by reducing the number of features.
And finally, recommendation systems associate people’s preferences with those of others who have similar tastes, and recommend new items to them, such as books or movies.
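To see how some of these techniques differ in code, here is a short sketch of four of them (regression, classification, clustering, and dimension reduction) using scikit-learn. The tiny datasets are made up for illustration, and the algorithms chosen (linear regression, k-nearest neighbors, k-means, PCA) are just one common option for each technique, not the only one.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# Regression: predict a continuous value (house price from size).
# Made-up data where price is exactly 3 times the size.
sizes_m2 = [[50], [80], [100], [120]]
prices_k = [150, 240, 300, 360]
regressor = LinearRegression().fit(sizes_m2, prices_k)
predicted_price = regressor.predict([[90]])[0]
print("Predicted price:", round(predicted_price, 1))  # 270.0

# Classification: predict a category (benign or malignant cell).
# Two invented cell measurements per sample.
cells = [[1, 1], [2, 1], [1, 2], [8, 9], [9, 8], [9, 9]]
diagnoses = ["benign"] * 3 + ["malignant"] * 3
classifier = KNeighborsClassifier(n_neighbors=3).fit(cells, diagnoses)
predicted_classes = classifier.predict([[2, 2], [8, 8]])
print("Predicted classes:", list(predicted_classes))

# Clustering: group similar cases without any labels.
# Invented customers: [age, yearly_spending_thousands].
customers = [[22, 5], [25, 6], [24, 4], [60, 40], [58, 42], [62, 39]]
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
segments = kmeans.labels_  # cluster ids are arbitrary (0 or 1)
print("Customer segments:", list(segments))

# Dimension reduction: go from 3 features per row down to 2.
data_3d = np.array([
    [1.0, 2.0, 3.1], [2.0, 4.1, 6.0], [3.0, 6.0, 9.2],
    [4.0, 8.2, 12.0], [5.0, 9.9, 15.1], [6.0, 12.1, 18.0],
])
data_2d = PCA(n_components=2).fit_transform(data_3d)
print("Shape before:", data_3d.shape, "after:", data_2d.shape)
```

The pattern is the same in every case: create the model object, call fit on the data, then predict or transform. Only regression and classification need labels. Clustering and dimension reduction work on the features alone.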
Python Libraries for Machine Learning
Python is a popular and powerful general-purpose programming language that has emerged as the preferred language among data scientists. You can write your own machine learning algorithms in Python, and it works very well. However, there are a lot of modules and libraries already implemented in Python that can make your life much easier.
Scikit-learn is a collection of algorithms and tools for machine learning in the Python programming language.
It has most of the classification, regression, and clustering algorithms, and it’s designed to work with the Python numerical and scientific libraries NumPy and SciPy.
It also has very good documentation. On top of that, implementing machine learning models with scikit-learn is really easy and takes only a few lines of Python code.
Most of the tasks that need to be done in a machine learning pipeline are already implemented in scikit-learn, including pre-processing of data, feature selection, feature extraction, train/test splitting, defining the algorithms, fitting models, tuning parameters, prediction, evaluation, and exporting the model.
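Here is a minimal sketch of what such a pipeline can look like. It uses the small iris dataset that ships with scikit-learn, so nothing has to be downloaded. The choice of a k-nearest neighbors classifier, the parameter grid, and the use of pickle for exporting are my own assumptions for this example. Other algorithms and tools would work just as well.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Train/test splitting: hold out 25% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Pre-processing and the algorithm, chained into one pipeline.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Tuning parameters: try a few values of k with 5-fold cross-validation.
search = GridSearchCV(pipeline, {"knn__n_neighbors": [3, 5, 7]}, cv=5)

# Fitting the model (this also picks the best k).
search.fit(X_train, y_train)

# Prediction and evaluation on data the model has not seen.
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Best parameters:", search.best_params_)
print("Test accuracy:", round(accuracy, 3))

# Exporting the model: serialize it to bytes, then load it back.
model_bytes = pickle.dumps(search.best_estimator_)
restored_model = pickle.loads(model_bytes)
print("Restored model agrees:", (restored_model.predict(X_test) == y_pred).all())
```

In a real project you would usually write the serialized model to a file so that another program can load it later. Only load pickled models from sources you trust.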
NumPy is a math library for working with N-dimensional arrays in Python. It enables you to do computation efficiently and effectively.
For numerical work it is usually faster and more convenient than plain Python lists, because operations are applied to whole arrays at once. For example, you need to know NumPy for working with arrays, data types, and images.
If you want to learn more about Numpy, then check out my “Best Way to Learn Numpy” article.
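As a quick taste, here is a small sketch of the kind of array work NumPy makes easy. The arrays are made up for illustration, and the tiny 4x4 "image" is just a grid of pixel values, which is how grayscale images are commonly represented.

```python
import numpy as np

# A 2-dimensional array with 2 rows and 3 columns.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape)  # (2, 3)
print(matrix.dtype)  # an integer type such as int64 (platform dependent)

# Arithmetic applies to every element at once, with no explicit loop.
doubled = matrix * 2

# Aggregations can run along an axis: axis=0 sums down each column.
column_sums = matrix.sum(axis=0)
print(column_sums)  # [5 7 9]

# A tiny grayscale "image": a 4x4 grid of 8-bit pixels, all black (0),
# with a white (255) 2x2 square painted in the middle using slicing.
image = np.zeros((4, 4), dtype=np.uint8)
image[1:3, 1:3] = 255
print(image)
print("Average brightness:", image.mean())  # 63.75
```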
SciPy is a collection of numerical algorithms and domain-specific toolboxes, including signal processing, optimization, statistics, and much more.
SciPy is a good library for scientific and high-performance computation.
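Here is a brief sketch that touches two of those toolboxes, optimization and statistics. The function being minimized is a made-up example whose minimum we already know (at x = 2), which makes it easy to check that the result is sensible.

```python
from scipy import optimize, stats


def cost(x):
    """A simple made-up function with its minimum at x = 2."""
    return (x - 2) ** 2 + 1


# Optimization: find the x that minimizes the function.
result = optimize.minimize_scalar(cost)
print("Minimum found near x =", round(result.x, 4))
print("Function value there:", round(result.fun, 4))

# Statistics: probability that a standard normal value is below 1.96.
probability = stats.norm.cdf(1.96)
print("P(Z < 1.96) =", round(probability, 3))  # 0.975
```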
Matplotlib is a very popular plotting package that provides 2D plotting as well as 3D plotting.
It is one of the best-known Python libraries for data visualization, and it’s excellent for making graphs and plots. The graphs are also highly customizable.
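Here is a minimal sketch of a customized 2D plot. The curves, colors, labels, and the output file name are arbitrary choices for this example. The "Agg" backend line is only there so that the script also runs on a machine without a display. You can remove it and call plt.show() if you want an interactive window instead.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; draws to files, not a window

import matplotlib.pyplot as plt
import numpy as np

# 100 evenly spaced points between 0 and 2*pi.
x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots(figsize=(6, 4))

# Two curves with different colors and line styles.
ax.plot(x, np.sin(x), color="tab:blue", linestyle="--", label="sin(x)")
ax.plot(x, np.cos(x), color="tab:orange", label="cos(x)")

# Customization: title, axis labels, legend, and grid.
ax.set_title("Sine and cosine")
ax.set_xlabel("x (radians)")
ax.set_ylabel("value")
ax.legend()
ax.grid(True)

# Save the figure as an image file in the current folder.
fig.savefig("trig_functions.png", dpi=100)
```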
If you want to learn machine learning, you can find lots of sources (books, courses) on the internet. But the best way to learn something is by doing it, so if you really want to learn machine learning, build something.