Artificial intelligence is the simulation of human intelligence in machines programmed to think and learn like humans so that they can automate tasks that are normally performed by humans. Machine learning develops and applies models and algorithms that are capable of learning from data.
Surely you know the series of films about the Terminator, by film director James Cameron. In this fiction, a “terminator” is a humanoid robot designed by Skynet as a practically indestructible soldier dedicated to exterminating poor human beings, who are subdued and at war against intelligent machines, which have rebelled and dominate the world.
The main character, a T-800 model, quite evil in the first film and turned good in the second, fights with the real bad guy, a T-1000. When one of his battles ends, the T-800 says goodbye with these words, which have remained for posterity: hasta la vista, baby.
It is not that the phrase is very intellectual or that it has a lot of substance, but there was not much to choose from in the repertoire of the T-800, indeed lethal as a soldier and protector, but truly short on words.
But the background of this story, if we think twice about it, can be disturbing. What is this about good and bad robots? Are there intelligent machines? Could machines rebel against us and come to dominate us and even kill us? I say this because, at a time when there is so much talk about artificial intelligence and machine learning, these ideas can cross our minds.
Indeed, over the last few years, terms such as artificial intelligence, machine learning and deep learning have gained popularity, which can make us think that machines can be intelligent and, even, learn.
However, and for everyone’s peace of mind, when we see what these terms really mean, our fears will begin to be lost in time, like tears in the rain (words of another robot, a replicant, much more educated than our T-800). Fortunately, and at least for the moment, artificial intelligence is more artificial than intelligent.
The beginnings of artificial intelligence (AI) date back to the 1950s, when some of the creators of another nascent science, computing, began to consider whether computers could be designed to “think”. This term has so many possible implications that, even today, we keep thinking about it.
We can define AI as the simulation of human intelligence in machines programmed to think and learn like humans, so that they can automate tasks that are normally performed by humans. However, this is a modern definition of the term, since learning did not become central to the field until the 1980s.
Today, AI systems can be trained to perform tasks like recognizing speech, understanding natural language, making decisions, and playing games. These systems can also be used to improve decision making and automate processes in various industries. And there is no doubt that they have produced surprising results, such as the virtual assistants of Google or Amazon, the transcription and translation of texts, the personalization of advertisements, etc.
But this should not make us fear a future rebellion of the machines: at present, machines and AI systems do not have the ability to revolt against humans. AI systems are designed and programmed to perform specific tasks and meet certain objectives, but they do not have goals or desires of their own, nor do they make decisions based on emotions or motivations.
The problem may be with the programmer, not with the machine. It is critical that AI development is monitored and regulated to avoid potential ethical and security issues.
We can define machine learning as a subfield of AI that involves training systems to learn from data and make predictions or decisions without being explicitly programmed to do so. Let’s see what this means.
The first known design for a general-purpose computer is the Analytical Engine, devised in the mid-19th century by Charles Babbage. His friend Ada Lovelace, considered the first programmer in history, said that a machine like this had no pretensions to originate anything new, but could simply carry out the tasks it was ordered to do. In short, it was an assistant for performing tasks well understood by human beings.
Almost two centuries later, this statement can be questioned. Could a machine “originate” something or would it always have to be limited to processes understood by humans? Could it do something new, be creative? Learn from its experience?
The possibility of machine learning implies a change in the way we use computers in our workflow. With the classical approach, such as that of Ada Lovelace, the computer is given a series of instructions (a computer program) that it must follow in a generally sequential manner to transform the data into a result, the answer to the problem. For this, the programmer must know in advance the rules that govern the behavior of the data, in order to write the program instructions for the machine.
Machine learning turns this approach on its head: the computer “looks” at the input data and the corresponding final answer, and from both, figures out the rules that govern the behavior of the data, which were unknown beforehand.
A machine learning system is not programmed, but trained: it is given samples relevant to a given task and finds the underlying statistical structure of the data, allowing it to automate the task and apply it to new data that the system had not previously seen.
Thus, machine learning involves the concept of learning from data. An algorithm is no longer designed to respond to a specific problem; instead, it is designed as a generic learning algorithm that, based on the examples and solutions provided, is capable of solving the problem in other situations.
This may seem very abstract, but in reality, we often do it when using “conventional” statistical techniques. Let’s think about a simple statistical model with which we are more familiar: simple linear regression.
We establish the model according to the following formula:
y = ax + b + e
where “e” is the error component of the model. We can devise a method to estimate the coefficients of the linear model without first knowing the “x” and “y” values, but we cannot assume the linear relationship without first looking at the actual data. When we train the linear model (trying different coefficients and seeing how well they fit), we are “learning” from the data.
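To make this concrete, here is a minimal sketch in Python of how the coefficients “a” and “b” can be learned from examples. It uses the closed-form least squares solution rather than literally trying coefficients one by one. The function name fit_line and the data points are invented for illustration.

```python
# Minimal sketch: "learning" the coefficients a and b of y = ax + b + e
# from example data with ordinary least squares (closed-form solution).
# The data points below are invented for illustration only.

def fit_line(xs, ys):
    """Return (a, b) minimizing the sum of squared errors of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Hypothetical training examples: inputs and their known answers.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

a, b = fit_line(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}")

# Apply the learned rule to an x value that was not in the training data.
new_x = 6.0
print(f"prediction for x = {new_x}: {a * new_x + b:.2f}")
```

Note that nobody told the program the slope or the intercept: it received only inputs and answers, and worked out the rule that links them.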
With all that said, we could define machine learning as a discipline that develops and applies models and algorithms that are capable of learning from data.
Types of machine learning
Although other types have been described, in general we consider two types of machine learning: supervised and unsupervised.
In supervised machine learning, the algorithm is given a data set with already known correct answers so that it learns to generalize and make accurate predictions about new data. These algorithms are used to predict the unknown value of a variable from a series of known variables.
Let’s imagine that we want to predict the price of a car based on characteristics such as size, displacement, range, type of fuel used, etc. From the cars for which the price and their features are given, the model would try to predict the price of a car that it has not seen before if we tell it what features it has.
On the other hand, in unsupervised machine learning, the algorithm does not receive labeled information about the correct answers. Instead, it relies on the structure and relationships present in the input data to discover interesting patterns and features that are unknown to the researcher.
Continuing with the example of our cars, observing all their characteristics, the algorithm could discover patterns that would allow cars to be classified into different groups: sports, SUVs, family, etc.
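As a rough illustration of this idea, the following Python sketch groups a handful of cars with k-means, which is only one of many clustering methods and is chosen here for its simplicity. The two features (engine power and weight), their values and the function names are all invented. Note that the algorithm returns numbered groups, and it is up to us to decide whether a group looks like "sports" or "SUV".

```python
# Minimal k-means sketch (pure Python). The cars and their two features
# (power in hundreds of hp, weight in tonnes) are hypothetical.

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_centroids(points, k):
    """Deterministic start: first point, then repeatedly the point
    farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids

def k_means(points, k, iterations=10):
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

cars = [
    (3.0, 1.30), (3.2, 1.40), (2.9, 1.35),  # powerful and light
    (1.0, 1.20), (1.1, 1.25), (0.9, 1.10),  # modest power, light
    (1.8, 2.20), (2.0, 2.30), (1.9, 2.10),  # heavy
]

labels, centroids = k_means(cars, k=3)
print(labels)
```

No price or category was given to the algorithm: the groups emerge only from how similar the cars are to each other.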
Supervised machine learning algorithms
Supervised learning is usually divided into two large groups, regression and classification. The first is used when the outcome variable we are trying to predict is numeric, while classification is used for the prediction of categorical variables.
As an example of regression, a model can be trained on historical home price data and then used to predict the price of a specific home based on its size, location, etc. The most popular regression algorithms include linear regression, polynomial regression, and neural networks.
As an example of classification, an algorithm can be trained on labeled images of animals and then used to classify images of previously unseen animals. The most popular classification algorithms include logistic regression, k-nearest neighbors, decision trees, random forests, support vector machines, naive Bayes, and neural networks.
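A minimal sketch of classification with one of these algorithms, k-nearest neighbors, could look like the following. Instead of images, each animal is described by two invented numeric features (weight in kg and height in cm) so that the example stays short. The data and the function name knn_classify are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Label a new example by majority vote among its k nearest
    labeled neighbors (Euclidean distance)."""
    neighbors = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical labeled examples: (weight in kg, height in cm) -> label.
training = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((3.5, 24.0), "cat"),
    ((30.0, 60.0), "dog"),
    ((25.0, 55.0), "dog"),
    ((35.0, 65.0), "dog"),
]

# Two animals the algorithm has not seen before.
print(knn_classify(training, (4.5, 26.0)))
print(knn_classify(training, (28.0, 58.0)))
```

The outcome here is a category, not a number, which is what separates classification from regression.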
Unsupervised machine learning algorithms
As we have already said, unsupervised learning does not aim to predict variables from other variables, but focuses on discovering hidden patterns in the data that is fed to the algorithm. To do this, it uses techniques such as dimensionality reduction.
Dimensionality reduction is the process of reducing the number of features or dimensions in a data set, while trying to preserve important information. This is done to solve problems that learning algorithms may encounter when handling data sets with many features, and also to simplify the visualization of the data.
As an example, these techniques are very useful in genetic studies, where thousands or millions of polymorphisms can be studied in each participant. It is useful to find a reduced set of variables that explains most of the genetic variability of each individual and to analyze the data only with that reduced set.
Two of the most widely used unsupervised techniques are principal component analysis, which reduces dimensionality, and clustering methods.
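To give an idea of what dimensionality reduction does, here is a toy sketch of principal component analysis in Python. It finds only the first principal component, using power iteration on the covariance matrix, which is one simple way of doing it and not the way statistical packages usually do. The data set (two highly correlated variables) and the function name are invented.

```python
import math

def pca_first_component(rows, iterations=100):
    """Return the first principal component (unit vector), the share of
    total variance it explains, and the projection of each row onto it."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v, compared with the total variance.
    variance_along_v = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                           for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, variance_along_v / total_variance, scores

# Hypothetical data: two variables that move almost together.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]

component, explained, scores = pca_first_component(data)
print("component:", [round(x, 3) for x in component])
print("share of variance explained:", round(explained, 3))
```

In this toy case a single new variable (the scores) captures almost all the variability of the two original ones, which is the same idea that lets us summarize thousands of genetic variables in a few components.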
All the algorithms we have seen so far do their job in a more or less transparent way. One example is simple neural networks, which, as we have already seen, can be used for both regression and classification.
By scaling up the complexity of these algorithms, we can create networks by successively combining layers. Each layer is responsible for taking the data from the previous layer, doing some transformation on the data, and passing it on to the next layer. Thus, we use these neural networks with several hidden layers to model complex patterns and relationships in the data. It is what is known as deep learning.
These inner layers can learn nonlinear features and combine them to perform complex tasks such as image classification, natural language processing, and image and audio generation.
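The idea of layers passing transformed data to one another can be sketched in a few lines of Python. In this toy example the weights are chosen by hand, not learned, and the network has a single hidden layer, although the same loop works for any number of layers. It computes the XOR function, a classic case that a purely linear model cannot represent. All names and values are assumptions made for illustration.

```python
def relu(values):
    """Nonlinear activation: negative values become zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of the inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Pass the data through each layer in turn, applying ReLU after
    every layer except the last one."""
    values = inputs
    for index, (weights, biases) in enumerate(layers):
        values = dense(values, weights, biases)
        if index < len(layers) - 1:
            values = relu(values)
    return values

# Hand-picked (not trained) weights that make the network compute XOR.
xor_layers = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer with 2 units
    ([[1.0, -2.0]], [0.0]),                   # output layer with 1 unit
]

for x1 in (0.0, 1.0):
    for x2 in (0.0, 1.0):
        print(x1, x2, forward([x1, x2], xor_layers)[0])
```

In a real deep learning system the weights would be found by training on data, and there would be many more layers and units.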
We are not going to go into the inner workings of neural networks, since they deserve a post (or more) of their own.
What is Big Data?
We can already imagine that all of these machine learning techniques use larger and more complex data sets than are typically used in more conventional statistical techniques. This is often taken to unsuspected limits when the multitude of data that is generated daily and the computing power that has been developed over the last decades are combined.
When this gets totally out of hand, we have to invent a new concept: Big Data. Put simply, we can talk about Big Data when the data we are using cannot be handled on conventional computers. And I’m not talking about gigabytes or terabytes, I’m talking about petabytes (1,000 TB) and even exabytes (1,000 PB) of data.
This means that computer systems capable of working with Big Data are not available to everyone. In practice, only a few large corporations like Google, Facebook or Amazon actually work with Big Data.
A paradigm shift
Everything we have talked about in today’s post can be included under the term Data Science, which can be defined in a simplified way as the discipline in charge of the study of methods, techniques and tools for the storage, retrieval, analysis and visualization of data.
Data Science has developed in parallel with our computing capacity and the generation of increasing volumes of data. In recent years it has become increasingly popular, and I also believe that it will change the way we approach our statistical studies.
In the past, with not very large volumes of data and with a reduced computing capacity (imagine before having computers), the effort was directed towards developing statistical inference techniques capable of working with the available means.
However, today we have a much larger volume of data and vastly greater storage and computing capacity. There is no doubt that machine learning algorithms will gradually take the place that the most classical statistical techniques have held for so long.
I believe that, without our realizing it, the Methods section of our scientific papers will be invaded by neural networks, random forests or support vector machines, to name a few. Follow my advice: study Data Science techniques.
We are leaving…
And here we are going to end this long (although I hope useful) post for today.
We have said practically nothing about what the different algorithms we have mentioned consist of. We could also have talked a bit about how a neural network works, how it is trained and how it learns. Don’t worry, we’ll come back to these issues. But that is another story…