What's a mean field variational family?


























I'm working through variational Bayesian methods at the moment, and I think I have a grasp of the bigger picture. Where I sometimes have trouble is with the exact details of how it can be implemented. Right now, this centres on the idea of a mean-field variational family. Specifically, Blei et al. say the following:




In this review we focus on the mean-field variational family, where
the latent variables are mutually independent and each governed by a
distinct factor in the variational density. A generic member of the
mean-field variational family is




$$q(z) = \prod_{j=1}^m q_j(z_j)$$



I'm afraid that I can't see how a distribution can be expressed as a product in this way without being reduced to a constant. Clearly, I'm missing something fundamental, but I seem to be going around in circles trying to google the answer.



Can anyone supply some intuition? Thanks!










      probability computational-statistics variational-bayes






      asked 9 hours ago









Lodore66


1 Answer


















Loosely speaking, the mean-field family defines a specific class of joint distributions. So $z$ here is actually a vector of latent variables of length $m$. That means that $q(z)$ describes a joint distribution over all of the individual $z_j$'s, and can be written as



$$q(z) = q(z_1, z_2, \ldots, z_m)$$



          We can use the chain rule to factorize this:



$$q(z) = q(z_1)\,q(z_2 \mid z_1) \cdots q(z_m \mid z_1, z_2, \ldots, z_{m-1})$$



Now, for this joint distribution to be in the mean-field family, we make a simplifying assumption: that all of the $z_i$'s are independent of each other. I'll note here that this assumes the $z_i$'s are independent under the variational distribution; the true joint $p(z_1, \ldots, z_m)$ is almost certainly going to have some dependence among the variables. In this sense, we are trading off accuracy (throwing away all covariances) for some computational benefits.
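The "throwing away all covariances" point can be made concrete with a toy example (my own sketch, not part of the original answer; the Gaussian choice is just for illustration): fit a mean-field approximation to samples from a correlated joint, and the off-diagonal structure vanishes by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# A "true" joint p(z1, z2) with strong correlation between the variables.
true_cov = np.array([[1.0, 0.8],
                     [0.8, 1.0]])
samples = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=200_000)

# A mean-field fit keeps only the per-variable (marginal) variances;
# the off-diagonal covariance (0.8 here) is discarded by construction.
mean_field_cov = np.diag(samples.var(axis=0))
print(mean_field_cov)  # diagonal ~1.0, off-diagonal exactly 0
```

The marginals are matched well, but the correlation of 0.8 is simply unrepresentable in the factorized family.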



          Now, if we make that independence assumption, we can see that the joint reduces down to



$$q(z) = q(z_1)q(z_2)\cdots q(z_m) = \prod_{i=1}^m q(z_i)$$



This is the form that the mean-field family takes. As for your question about how this won't reduce to a constant, I'm not entirely sure what you mean: all of the $z_i$'s are random variables, so the product is still a density over $z$, not a constant.
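As a quick numerical check of the factorization (my own sketch, assuming Gaussian factors, which nothing in the question requires): evaluating the product of univariate factors at a point $z$ gives exactly the joint density of independent coordinates, and the result changes as $z$ changes — it is a function of $z$, not a constant.

```python
import numpy as np

def gaussian_pdf(z, mu, sigma):
    # Univariate normal density; the Gaussian choice is just for illustration.
    return np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

mus = np.array([0.0, 1.0, -2.0])
sigmas = np.array([1.0, 0.5, 2.0])
z = np.array([0.3, 0.9, -1.5])

# Mean-field density: the product of the univariate factors q_i(z_i).
q_mean_field = np.prod(gaussian_pdf(z, mus, sigmas))

# The same value computed directly as a joint over independent coordinates
# (i.e. a diagonal-covariance Gaussian evaluated at z).
diff = z - mus
q_joint = (np.exp(-0.5 * np.sum((diff / sigmas) ** 2))
           / np.prod(sigmas * np.sqrt(2.0 * np.pi)))

assert np.isclose(q_mean_field, q_joint)
```

Evaluating `q_mean_field` at a different `z` gives a different positive number, which is what makes $q$ a density rather than a constant.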






• This is really helpful and has clarified things immensely. What was catching me out was where all the marginal probabilities went; explaining that this is an approximation that trades off accuracy for computability over the joint distribution makes it much more intuitive. Thanks indeed! – Lodore66, 8 hours ago












          answered 9 hours ago









snickerdoodles777 (new contributor)
