The Logit and Sigmoid Functions

If you mess around with machine learning for long enough, you'll eventually run into the logit and sigmoid functions. These are useful functions when you are working with probabilities or trying to classify data.

Given a probability p, the corresponding odds are calculated as p / (1 – p). For example, if p = 0.75, the odds are 3 to 1: 0.75 / 0.25 = 3.
The logit function is simply the logarithm of the odds: logit(x) = log(x / (1 – x)). Here is a plot of the logit function:

[plot of the logit function on (0, 1)]
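As a quick numerical sketch of the odds and logit calculations above (plain Python with numpy; the function names odds and logit are just illustrative):

    import numpy as np

    def odds(p):
        # Convert a probability in (0, 1) to odds: p / (1 - p).
        return p / (1.0 - p)

    def logit(p):
        # Log-odds of a probability in (0, 1).
        return np.log(p / (1.0 - p))

    print(odds(0.75))   # 3.0 -- "3 to 1", matching the example above
    print(logit(0.5))   # 0.0 -- even odds map to zero
    print(logit(0.99))  # ~4.6 -- grows without bound as p approaches 1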
The value of the logit function heads towards infinity as p approaches 1 and towards negative infinity as it approaches 0.

The logit function is useful in analytics because it maps probabilities (which are values in the range [0, 1]) to the full range of real numbers. In particular, if you are working with “yes-no” (binary) inputs it can be useful to transform them into real-valued quantities prior to modeling. This is essentially what happens in logistic regression.
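As a sketch of that transformation (the data here is hypothetical, just to show the idea): compute empirical proportions of “yes” outcomes per group, then map them through the logit so they can be modeled as ordinary real-valued quantities.

    import numpy as np

    # Hypothetical data: number of "yes" responses out of n trials per group.
    yes = np.array([12, 30, 45, 58])
    n = np.array([60, 60, 60, 60])

    p_hat = yes / n                    # empirical probabilities in (0, 1)
    z = np.log(p_hat / (1.0 - p_hat))  # logit transform: now real-valued

    # z can now be fit with ordinary linear techniques; logistic
    # regression effectively builds this transform into the model itself.
    print(z)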
The inverse of the logit function is the sigmoid function. That is, if you have a probability p, sigmoid(logit(p)) = p. The sigmoid function maps arbitrary real values back to the range [0, 1]. The larger the value, the closer to 1 you'll get.
The formula for the sigmoid function is σ(x) = 1/(1 + exp(-x)). Here is a plot of the function:

[plot of the sigmoid function]
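A minimal sketch of the sigmoid and the round-trip identity sigmoid(logit(p)) = p (again plain numpy; names are illustrative):

    import numpy as np

    def logit(p):
        return np.log(p / (1.0 - p))

    def sigmoid(x):
        # Map any real number into (0, 1).
        return 1.0 / (1.0 + np.exp(-x))

    p = np.array([0.1, 0.5, 0.75, 0.99])
    print(sigmoid(logit(p)))  # recovers p: the two functions are inverses
    print(sigmoid(10.0))      # ~0.99995 -- large inputs approach 1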
The sigmoid might be useful if you want to transform a real-valued variable into something that represents a probability. This sometimes happens at the end of a classification process.

(As Wikipedia and other sources note, the term “sigmoid function” is used to refer to a class of functions with S-shaped curves. In most machine learning contexts, “sigmoid” usually refers specifically to the function described above.)
There are other functions that map probabilities to reals (and vice-versa), so what's so special about the logit and sigmoid? One reason is that the logit function has the nice connection to odds described at the beginning of the article. A second is that the gradients of the logit and sigmoid are simple to calculate (try it and see). The reason why this is important is that many optimization and machine learning techniques make use of gradients, for example when estimating parameters for a neural network.
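To make “simple to calculate” concrete: the derivative of the sigmoid can be written in terms of the sigmoid itself, σ'(x) = σ(x)(1 − σ(x)), and the derivative of the logit is 1 / (p(1 − p)). A quick numerical check (the finite-difference comparison is just for illustration):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_grad(x):
        s = sigmoid(x)
        return s * (1.0 - s)  # derivative expressed via the sigmoid itself

    def logit_grad(p):
        return 1.0 / (p * (1.0 - p))

    # Finite-difference check that the closed forms are right.
    h = 1e-6
    x = 0.3
    print(sigmoid_grad(x))
    print((sigmoid(x + h) - sigmoid(x - h)) / (2 * h))  # should match

    p = 0.75
    print(logit_grad(p))  # 1 / (0.75 * 0.25) = 5.333...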
The biggest drawback of the sigmoid function for many analytics practitioners is the so-called “vanishing gradient” problem. You can read more about this problem here (and here), but the point is that this problem pertains not only to the sigmoid function, but any function that squeezes real values to the [0, 1] range. In neural networks, where the vanishing gradient problem is particularly annoying, it is often a good idea to seek alternatives as suggested here.
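A small illustration of why the gradient vanishes: σ'(x) = σ(x)(1 − σ(x)) peaks at 0.25 and decays rapidly as |x| grows, so inputs far from zero contribute almost nothing to a gradient-based update (a sketch, not a full training example):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    for x in [0.0, 2.0, 5.0, 10.0]:
        s = sigmoid(x)
        print(x, s * (1.0 - s))
    # 0.0   0.25      -- the maximum possible gradient
    # 2.0   ~0.105
    # 5.0   ~0.0066
    # 10.0  ~0.000045 -- effectively zero: learning stalls here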

Source: https://nathanbrixius.wordpress.com/2016/06/04/functions-i-have-known-logit-and-sigmoid/