Category: Uncategorized

Probability and linear regression

The sympathy of pendulums

This post discusses the rationale for minimizing the sum of squared errors in linear regression, a choice often presented as mere convenience. From a probabilistic perspective, the least squares criterion arises naturally from assuming that the model's residuals follow a normal distribution: maximizing the likelihood under that assumption is equivalent to minimizing the squared errors.
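The link between the normal assumption and least squares can be checked numerically. The sketch below is only an illustration, not the post's own code: the data points, the fixed `sigma=1.0`, and all function names are assumptions made for the example. It fits a simple regression by the closed-form least squares formulas and then confirms that the Gaussian negative log-likelihood is lower at that fit than at nearby parameter values.

```python
import math

def ols_fit(x, y):
    """Closed-form simple least squares; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared residuals for the line intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

def gaussian_nll(x, y, intercept, slope, sigma=1.0):
    """Negative log-likelihood assuming residuals ~ Normal(0, sigma^2).

    The first term does not depend on the coefficients, so minimizing
    this quantity over (intercept, slope) is the same as minimizing
    the sum of squared errors.
    """
    n = len(x)
    sse = sum_squared_errors(x, y, intercept, slope)
    return 0.5 * n * math.log(2 * math.pi * sigma ** 2) + sse / (2 * sigma ** 2)

# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(x, y)
nll_at_ols = gaussian_nll(x, y, b0, b1)

# Evaluate the likelihood on a small grid of perturbed coefficients.
nll_nearby = [
    gaussian_nll(x, y, b0 + d0, b1 + d1)
    for d0 in (-0.1, 0.0, 0.1)
    for d1 in (-0.1, 0.0, 0.1)
    if (d0, d1) != (0.0, 0.0)
]
print(all(nll_at_ols < value for value in nll_nearby))
```

Changing `sigma` shifts and rescales the likelihood but does not move its minimum, which is why the least squares coefficients do not depend on the residual variance.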

The tribulations of an astronaut

Binary logistic regression uses the sigmoid function to estimate the probability of a binary target variable. However, the sigmoid cannot directly estimate probabilities for a nominal variable with more than two categories. In these cases we use multinomial logistic regression, which applies the softmax function to estimate the probability of each category relative to a reference category.
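A minimal sketch of the softmax step, assuming a single predictor and made-up coefficients (the `coefs` values and the function names are hypothetical, chosen only for the example). The reference category is given a fixed linear score of 0, and every other category gets its own intercept and slope.

```python
import math

def sigmoid(z):
    """Logistic function used by binary logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn a list of linear scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multinomial_probs(x, coefs):
    """Category probabilities for one predictor value x.

    coefs holds one (intercept, slope) pair per non-reference category.
    The reference category comes first and has its score fixed at 0.
    """
    scores = [0.0] + [b0 + b1 * x for b0, b1 in coefs]
    return softmax(scores)

# Hypothetical coefficients for a three-category outcome.
coefs = [(0.5, 1.0), (-1.0, 2.0)]
probs = multinomial_probs(0.3, coefs)
print(probs, sum(probs))

# With a single non-reference category, softmax reduces to the sigmoid.
binary = multinomial_probs(0.3, [(0.5, 1.0)])
print(binary[1], sigmoid(0.5 + 1.0 * 0.3))
```

The last two printed values agree, which shows that the binary model is the two-category special case of the multinomial one.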

The mystery of the imperfect crime

Tau-squared (τ²) represents the variability of effects between the different populations from which the primary studies of a systematic review are drawn, under the assumption of the random-effects model of meta-analysis. This post describes its usefulness for weighting studies and for calculating prediction intervals, showing that its significance goes beyond being a mere indicator of heterogeneity.
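To make the two uses concrete, here is a hedged sketch. The summary does not say which estimator is used, so the DerSimonian-Laird method is an assumption here, as are the study effects, the variances, and the function names. The critical value `t_crit` is passed in by the caller; 3.182 is the two-sided 95% value of Student's t with k - 2 = 3 degrees of freedom for five studies.

```python
import math

def dersimonian_laird_tau2(effects, variances):
    """DerSimonian-Laird estimate of the between-study variance."""
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    return max(0.0, (q - (len(effects) - 1)) / c)

def random_effects_summary(effects, variances, t_crit):
    """Pooled effect, weight shares, and prediction interval."""
    tau2 = dersimonian_laird_tau2(effects, variances)
    w_fixed = [1.0 / v for v in variances]
    w_random = [1.0 / (v + tau2) for v in variances]  # tau2 enters every weight
    pooled = sum(w * y for w, y in zip(w_random, effects)) / sum(w_random)
    se = math.sqrt(1.0 / sum(w_random))
    # Prediction interval: pooled +/- t * sqrt(tau2 + se^2).
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return {
        "tau2": tau2,
        "pooled": pooled,
        "se": se,
        "fixed_shares": [w / sum(w_fixed) for w in w_fixed],
        "random_shares": [w / sum(w_random) for w in w_random],
        "prediction_interval": (pooled - half_width, pooled + half_width),
    }

# Hypothetical effect sizes and within-study variances for five studies.
effects = [0.10, 0.30, 0.35, 0.65, 0.45]
variances = [0.03, 0.03, 0.05, 0.01, 0.05]

summary = random_effects_summary(effects, variances, t_crit=3.182)
print(summary["tau2"], summary["prediction_interval"])
print(summary["fixed_shares"])
print(summary["random_shares"])
```

Because τ² is added to every study's variance, the random-effects weight shares are more even than the fixed-effect ones, and the prediction interval is wider than an interval built from the standard error of the pooled effect alone.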

Between preferences and coincidences

Cramér's V quantifies the strength of the association between two categorical (nominal, not ordinal) variables. It is especially useful when the variables have multiple categories, since it condenses the strength of the association into a single figure. Its values range from 0 (no association) to 1 (perfect association).
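As an illustration, Cramér's V can be computed from a contingency table as the square root of chi-squared divided by n times the smaller of (rows - 1) and (columns - 1). The sketch below uses the uncorrected chi-squared statistic and assumes no empty rows or columns; the tables and the function name are invented for the example.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows.

    Uses the plain Pearson chi-squared statistic, with no continuity
    or bias correction, and assumes every row and column total is
    greater than zero.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Hypothetical tables illustrating the two extremes and a case in between.
perfect = [[10, 0], [0, 10]]        # each row maps to exactly one column
independent = [[10, 20], [20, 40]]  # rows are proportional to each other
mixed = [[30, 10, 5], [10, 25, 10], [5, 10, 30]]

v_perfect = cramers_v(perfect)
v_independent = cramers_v(independent)
v_mixed = cramers_v(mixed)
print(v_perfect, v_independent, v_mixed)
```

The first two tables land on the ends of the scale, and the 3x3 table falls somewhere between them.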

Too many paths, no final destination

Contrary to what one might suppose, including a large number of variables in a linear regression model can be counterproductive: it can lead to overfitting and reduce the model's capacity to generalize. This is known as the curse of dimensionality.
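The effect can be reproduced with a small simulation. Everything in this sketch is an assumption made for illustration: the data are simulated with one informative predictor plus pure-noise predictors, the sample sizes and the seed are arbitrary, and the least squares fit is solved through the normal equations with a hand-written Gaussian elimination so that only the standard library is needed.

```python
import random

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least squares through the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def mse(rows, y, beta):
    """Mean squared prediction error of coefficients beta."""
    errors = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    return sum(e ** 2 for e in errors) / len(y)

def make_data(n, n_noise, rng):
    """Rows are [1, signal, noise_1, ...]; only signal drives y."""
    rows, y = [], []
    for _ in range(n):
        signal = rng.gauss(0, 1)
        noise = [rng.gauss(0, 1) for _ in range(n_noise)]
        rows.append([1.0, signal] + noise)
        y.append(2.0 + 3.0 * signal + rng.gauss(0, 1))
    return rows, y

rng = random.Random(0)
train_rows, train_y = make_data(30, 20, rng)
test_rows, test_y = make_data(200, 20, rng)

results = []
for n_noise in (0, 5, 10, 20):
    p = 2 + n_noise  # intercept + signal + the first n_noise noise columns
    beta = fit_ols([r[:p] for r in train_rows], train_y)
    train_mse = mse([r[:p] for r in train_rows], train_y, beta)
    test_mse = mse([r[:p] for r in test_rows], test_y, beta)
    results.append((n_noise, train_mse, test_mse))
    print(n_noise, round(train_mse, 3), round(test_mse, 3))
```

The training error cannot increase as predictors are added, because each model contains the previous one. The error on the held-out data typically moves the other way once the extra predictors carry no information, which is the loss of generalization the summary describes.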
