The Logit and Sigmoid Functions
If you mess around with machine learning for long enough, you'll eventually run into the logit and sigmoid functions. These are useful functions when you are working with probabilities or trying to classify data.
Given a probability p, the corresponding odds are calculated as p / (1 - p). For example, if p = 0.75, the odds are 3 to 1: 0.75/0.25 = 3.

The logit function is simply the logarithm of the odds: logit(x) = log(x / (1 - x)). Here is a plot of the logit function:

[Figure: plot of the logit function]
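To make the odds and log-odds calculations concrete, here is a minimal Python sketch (the function names `odds` and `logit` are my own illustrative choices, not from the article):

```python
import math

def odds(p):
    # Odds corresponding to a probability p in (0, 1).
    return p / (1 - p)

def logit(p):
    # Log-odds (logit) of a probability p in (0, 1).
    return math.log(p / (1 - p))

print(odds(0.75))   # 3.0 -- odds of 3 to 1
print(logit(0.75))  # ~1.0986, i.e. log(3)
print(logit(0.5))   # 0.0 -- even odds map to zero
```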
The value of the logit function heads towards infinity as p approaches 1 and towards negative infinity as it approaches 0.

The logit function is useful in analytics because it maps probabilities (which are values in the range [0, 1]) to the full range of real numbers. In particular, if you are working with "yes-no" (binary) inputs it can be useful to transform them into real-valued quantities prior to modeling. This is essentially what happens in logistic regression.

The inverse of the logit function is the sigmoid function. That is, if you have a probability p, sigmoid(logit(p)) = p. The sigmoid function maps arbitrary real values back to the range [0, 1]. The larger the value, the closer to 1 you'll get.

The formula for the sigmoid function is σ(x) = 1 / (1 + exp(-x)). Here is a plot of the function:

[Figure: plot of the sigmoid function]
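As a quick sanity check of that inverse relationship, here is a small Python sketch pairing the sigmoid with the logit defined above (again, an illustrative implementation rather than anything from the article):

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# The sigmoid undoes the logit (up to floating-point error).
for p in (0.1, 0.5, 0.75, 0.99):
    print(p, sigmoid(logit(p)))

print(sigmoid(10))   # ~0.99995  -- large inputs approach 1
print(sigmoid(-10))  # ~0.000045 -- large negative inputs approach 0
```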
The sigmoid might be useful if you want to transform a real-valued variable into something that represents a probability. This sometimes happens at the end of a classification process. (As Wikipedia and other sources note, the term "sigmoid function" is used to refer to a class of functions with S-shaped curves. In most machine learning contexts, "sigmoid" usually refers specifically to the function described above.)

There are other functions that map probabilities to reals (and vice versa), so what's so special about the logit and sigmoid? One reason is that the logit function has the nice connection to odds described at the beginning of the article. A second is that the gradients of the logit and sigmoid are simple to calculate (try it and see). The reason why this is important is that many optimization and machine learning techniques make use of gradients, for example when estimating parameters for a neural network.
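Taking up the "try it and see" suggestion: the standard closed forms are σ'(x) = σ(x)(1 - σ(x)) and logit'(p) = 1 / (p(1 - p)). Here is a sketch that checks both against a finite-difference approximation (the helper names are mine):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_grad(x):
    # Closed form: sigma'(x) = sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1 - s)

def logit_grad(p):
    # d/dp log(p / (1 - p)) = 1/p + 1/(1 - p) = 1 / (p * (1 - p))
    return 1 / (p * (1 - p))

def numeric_grad(f, x, h=1e-6):
    # Central finite difference, for a quick sanity check.
    return (f(x + h) - f(x - h)) / (2 * h)

print(sigmoid_grad(0.3), numeric_grad(sigmoid, 0.3))
print(logit_grad(0.75), numeric_grad(lambda p: math.log(p / (1 - p)), 0.75))
```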
The biggest drawback of the sigmoid function for many analytics practitioners is the so-called "vanishing gradient" problem. You can read more about this problem here (and here), but the point is that this problem pertains not only to the sigmoid function, but to any function that squeezes real values to the [0, 1] range. In neural networks, where the vanishing gradient problem is particularly annoying, it is often a good idea to seek alternatives as suggested here.

Source: https://nathanbrixius.wordpress.com/2016/06/04/functions-i-have-known-logit-and-sigmoid/
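To see where the vanishing gradient comes from, note that the sigmoid's gradient peaks at 0.25 (at x = 0) and decays rapidly for large |x|, and backpropagation multiplies one such factor per layer. A rough numerical sketch of that effect:

```python
import math

def sigmoid_grad(x):
    s = 1 / (1 + math.exp(-x))
    return s * (1 - s)

# The gradient is at most 0.25 and nearly zero for large |x|.
for x in (0, 2, 5, 10):
    print(x, sigmoid_grad(x))
# 0 -> 0.25, 2 -> ~0.105, 5 -> ~0.0066, 10 -> ~0.000045

# Even at the peak value 0.25, chaining one factor per layer
# shrinks the product geometrically with network depth:
depth = 10
print(0.25 ** depth)  # ~9.5e-07
```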