Category Machine learning

Probability and linear regression

Machine learning, Uncategorized, Statistics

The sympathy of pendulums

This post examines why linear regression minimizes the sum of squared errors, a choice that is often presented as a mere convenience. From a probabilistic perspective, the least squares criterion arises naturally when we assume that the model's residuals follow a normal distribution.

Manuel Molina
06/10/2025
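
As a rough illustration of the idea summarized in "The sympathy of pendulums", the sketch below fits a line by least squares and then evaluates the Gaussian negative log-likelihood at and around that fit. The data, the function names and the fixed `sigma` are assumptions made up for this example, not taken from the post. For a fixed `sigma`, the negative log-likelihood is a constant plus the sum of squared errors divided by `2 * sigma**2`, so both criteria should be minimized by the same line.

```python
import math

def ols_fit(xs, ys):
    """Closed-form least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared residuals of the line on the given points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def gaussian_nll(xs, ys, intercept, slope, sigma=1.0):
    """Negative log-likelihood assuming independent N(0, sigma^2) residuals."""
    n = len(xs)
    sse = sum_squared_errors(xs, ys, intercept, slope)
    return 0.5 * n * math.log(2 * math.pi * sigma ** 2) + sse / (2 * sigma ** 2)

# Hypothetical data, invented for the illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a_hat, b_hat = ols_fit(xs, ys)
best_nll = gaussian_nll(xs, ys, a_hat, b_hat)

# Moving away from the least squares solution raises the negative log-likelihood.
shifts = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]
shifted_nlls = [gaussian_nll(xs, ys, a_hat + da, b_hat + db) for da, db in shifts]
for (da, db), value in zip(shifts, shifted_nlls):
    print(f"shift ({da:+.2f}, {db:+.2f}): NLL rises by {value - best_nll:.4f}")
```

Changing `sigma` rescales the likelihood but does not move the minimizing intercept and slope, which is why the least squares line can be found without knowing the noise variance.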

Machine learning, Statistics

The three musketeers

Three components are central to training a machine learning algorithm: the loss function, the performance metric, and the validation strategy. The post emphasizes the need to balance accuracy on the training data against predictive capacity in order to obtain robust and effective models.

Manuel Molina
03/18/2025

Machine learning, Uncategorized, Statistics

Apophenia

Overfitting occurs when an algorithm over-learns the details of the training data, capturing not only the essence of the relationship between the variables but also the random noise that is always present. This degrades its performance and its ability to generalize when it is given new data not seen during training.

Manuel Molina
11/27/2024
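
To make the overfitting described in "Apophenia" concrete, here is a minimal sketch comparing a straight line with a polynomial that passes exactly through every training point. Everything in it is an assumption made for the illustration: the true relationship `y = 2x`, the fixed alternating perturbation that stands in for random noise (so that the result is reproducible), and the function names.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the polynomial that interpolates every (xs, ys) point at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total

def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mse(predict, xs, ys):
    """Mean squared error of a prediction function on the given points."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Training data: y = 2x plus a fixed alternating "noise" term (assumed).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
train_y = [2.0 * x + e for x, e in zip(train_x, noise)]

# New points, not seen during training, taken from the noise-free relationship.
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]
test_y = [2.0 * x for x in test_x]

intercept, slope = fit_line(train_x, train_y)

def line(x):
    return intercept + slope * x

def wiggly(x):
    return lagrange_predict(train_x, train_y, x)

for name, model in [("line", line), ("interpolating polynomial", wiggly)]:
    print(f"{name}: train MSE = {mse(model, train_x, train_y):.4f}, "
          f"test MSE = {mse(model, test_x, test_y):.4f}")
```

The interpolating polynomial reproduces the training data perfectly because it has also memorized the perturbation, and it pays for that on the new points, where the simpler line stays much closer to the underlying relationship.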

Machine learning, Statistics

The wisdom of the weirdwoods

Single decision trees tend to be less accurate than other regression or classification algorithms, and less robust to small changes in the data used to build them. The post describes some techniques for building ensembles of decision trees, such as bootstrap aggregation (bagging) and random forests, which aim to improve the accuracy of predictions and reduce overfitting.

Manuel Molina
11/26/2024
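
The bagging idea mentioned in "The wisdom of the weirdwoods" can be sketched in a few lines. This is only an illustration under stated assumptions: the base model is a one-split regression tree (a "stump") rather than a full tree, the data are invented, and names such as `fit_stump` and `bagged_predict` are hypothetical. A random forest would additionally sample a random subset of input variables at each split, which this single-variable example cannot show.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml = sum(left) / len(left)
        mr = sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:
        # All x values in the sample are equal: fall back to a constant model.
        m = sum(ys) / len(ys)
        return (float("inf"), m, m)
    return best[1:]

def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagged_predict(models, x):
    """Average the predictions of all the stumps."""
    return sum(stump_predict(m, x) for m in models) / len(models)

# Hypothetical data with a jump between x = 4 and x = 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

models = fit_bagged_stumps(xs, ys)
for x in (1.0, 4.5, 8.0):
    print(f"x = {x}: bagged prediction = {bagged_predict(models, x):.3f}")
```

Because each stump sees a slightly different resample, their thresholds differ, and averaging them smooths out the dependence on any single sample.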

Machine learning, Statistics

The tree and the labyrinth

A decision tree is a machine learning model that is used to estimate a target variable based on several input variables. This target variable can be either numerical (regression trees) or nominal (classification trees). The methodology for constructing decision trees for regression and classification is described, as well as their interpretation.

Manuel Molina
05/21/2024
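
As a small companion to "The tree and the labyrinth", the sketch below shows one step of building a classification tree: choosing the threshold on a single input variable that best separates the classes. It assumes the Gini impurity as the splitting criterion, which is one common choice and not necessarily the one used in the post; the data and function names are likewise invented for the example. A regression tree would follow the same search but score each split by the squared error around the mean of each side.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, larger for mixed nodes."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(xs, labels):
    """Return (threshold, weighted impurity) of the best single split on xs."""
    best_threshold, best_score = None, float("inf")
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [lab for x, lab in zip(xs, labels) if x <= t]
        right = [lab for x, lab in zip(xs, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score

# Hypothetical data: class "a" for small x, class "b" for large x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = ["a", "a", "a", "b", "b", "b"]

threshold, impurity = best_split(xs, labels)
print(f"impurity before splitting: {gini(labels):.2f}")
print(f"best threshold: {threshold}, weighted impurity after splitting: {impurity:.2f}")
```

A full tree repeats this search recursively on each side of the chosen threshold, over all input variables, until a stopping rule is met.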