What's a mean field variational family?
I'm working through variational Bayesian methods at the moment, and I think I have a grasp of the bigger picture. Where I sometimes have trouble is with the exact details of how they can be implemented. Right now, this centres on the idea of a mean-field variational family. Specifically, Blei et al. say the following:
In this review we focus on the mean-field variational family, where
the latent variables are mutually independent and each governed by a
distinct factor in the variational density. A generic member of the
mean-field variational family is
$$q(z) = \prod_{j=1}^m q_j(z_j)$$
I'm afraid that I can't see how a distribution can be expressed as a product in this way without being reduced to a constant. Clearly, I'm missing something fundamental, but I seem to be going around in circles trying to google the answer.
Can anyone supply some intuition? Thanks!
probability computational-statistics variational-bayes
asked 9 hours ago
Lodore66
1 Answer
Loosely speaking, the mean-field family defines a specific class of joint distributions. Here $z$ is a vector of $m$ latent variables, so $q(z)$ describes a joint distribution over all of the individual $z_i$ and can be written as
$$q(z) = q(z_1, z_2, \ldots, z_m)$$
We can use the chain rule to factorize this:
$$q(z) = q(z_1)\,q(z_2 \mid z_1) \cdots q(z_m \mid z_1, z_2, \ldots, z_{m-1})$$
Now, for this joint distribution to be in the mean-field family, we make a simplifying assumption: all of the $z_i$ are mutually independent. Note that this independence holds only under the variational distribution $q$; the true joint $p(z_1, \ldots, z_m)$ is almost certainly going to have some dependence among the variables. In this sense, we are trading off accuracy (throwing away all dependence between the variables, including the covariances) for computational benefits.
Under that independence assumption, each conditional collapses to its marginal, so the joint reduces to
$$q(z) = q(z_1)\,q(z_2) \cdots q(z_m) = \prod_{i=1}^m q(z_i)$$
which is the form that the mean-field family takes. (Each factor is its own density; Blei et al. make this explicit by writing $q_j(z_j)$.) As for your question about how this won't reduce to a constant, I'm not entirely sure what you mean. All of the $z_i$ are random variables, and the product is still a function of $z$, so I don't see how this could become a constant.
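As a rough numerical illustration of the factorized form, here is a minimal sketch. It assumes each factor is a univariate Gaussian, which is only one possible choice and is not something the quoted definition requires. The helper names (`gaussian_pdf`, `mean_field_density`) and the parameter values are hypothetical and chosen purely for the example. It shows that the product of factors is a density that varies with $z$, and that samples drawn from it have uncorrelated coordinates.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal N(mean, std^2), evaluated elementwise at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def mean_field_density(z, means, stds):
    """q(z) = prod_j q_j(z_j), assuming every factor q_j is a univariate Gaussian."""
    z = np.asarray(z, dtype=float)
    factors = gaussian_pdf(z, np.asarray(means), np.asarray(stds))
    return float(np.prod(factors))


# Hypothetical factor parameters for m = 2 latent variables.
means = [0.0, 1.0]
stds = [1.0, 0.5]

# q(z) is a function of z: evaluating it at two points gives two different values.
q_at_mode = mean_field_density([0.0, 1.0], means, stds)
q_off_mode = mean_field_density([1.0, 2.0], means, stds)

# Sampling from q: each coordinate is drawn independently from its own factor.
rng = np.random.default_rng(0)
samples = rng.normal(loc=means, scale=stds, size=(100_000, 2))
sample_corr = np.corrcoef(samples, rowvar=False)[0, 1]

print("q at the mode:      ", q_at_mode)
print("q away from the mode:", q_off_mode)
print("sample correlation: ", sample_corr)
```

The two density values differ, so the product is not a constant, and the sample correlation is close to zero, which is the dependence the mean-field family gives up.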
This is really helpful and has clarified things immensely. What was catching me out was where all the marginal probabilities went; explaining that this is an approximation that trades off accuracy for computability over the joint distribution makes it much more intuitive. Thanks indeed!
– Lodore66, 8 hours ago
answered 9 hours ago
snickerdoodles777