Weka API: How to obtain a joint probability, e.g., Pr(A=x, B=y), from a BayesNet object?
I am using the Weka Java API. I trained a BayesNet on an Instances object (the data set) without specifying a class (label) attribute.
/**
* Initialization
*/
Instances data = ...;
BayesNet bn = new EditableBayesNet(data);
SearchAlgorithm learner = new TAN();
SimpleEstimator estimator = new SimpleEstimator();
/**
* Training
*/
bn.initStructure();
learner.buildStructure(bn, data);
estimator.estimateCPTs(bn);
Suppose the Instances object data has three attributes, A, B and C, and the dependencies discovered are B->A and C->B.
The trained BayesNet object bn is not meant for classification (I did not specify a class attribute for data); I just want to calculate the joint probability Pr(A=x, B=y). How do I get this probability from bn?
As far as I know, the distributionForInstance function of BayesNet may be the closest thing to use. It returns the probability distribution for a given instance (in our case, the instance is (A=x, B=y)). To use it, I could create a new Instance object testDataInstance, set the values A=x and B=y, and call distributionForInstance with testDataInstance.
/**
* Obtain Pr(A="x", B="y")
*/
Instance testDataInstance = new SparseInstance(3);
Instances testDataSet = new Instances(
bn.m_Instances);
testDataSet.clear();
testDataInstance.setValue(testDataSet.attribute("A"), "x");
testDataInstance.setValue(testDataSet.attribute("B"), "y");
testDataSet.add(testDataInstance);
bn.distributionForInstance(testDataSet.firstInstance());
However, to my knowledge, the returned distribution gives the probabilities of all possible values of the class attribute in the Bayes net. As I did not specify a class attribute for data, it is unclear to me what the returned probability distribution means.
java machine-learning weka bayesian bayesian-networks
asked Nov 20 '18 at 22:40 by Zhongjun 'Mark' Jin
edited Nov 27 '18 at 14:38
1 Answer
The javadoc page for distributionForInstance says that it calculates the class membership probabilities: http://weka.sourceforge.net/doc.dev/weka/classifiers/bayes/BayesNet.html#distributionForInstance-weka.core.Instance-
So that is probably not what you want. I think you can use getDistribution(int nTargetNode) or getDistribution(java.lang.String sName) to get what you need.
P(A=x, B=y) can be calculated as follows:
P(A=x|B=y) = P(A=x, B=y)/P(B=y), which implies
P(A=x, B=y) = P(A=x|B=y)*P(B=y)
Here is some pseudocode that illustrates my approach:
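// Each table is indexed as [parent configuration][value of the node].
double[][] AP = bn.getDistribution("A"); // P(A|B): AP[b][a]
double[][] BP = bn.getDistribution("B"); // P(B|C): BP[c][b]
double[][] CP = bn.getDistribution("C"); // P(C):   CP[0][c], since C has no parents
// x and y are the integer indices of the values of A and B; for nominal
// labels they can be looked up with data.attribute("A").indexOfValue("x").
double BPy = 0;
// B depends on C, so P(B=y) is obtained by summing C out:
// P(B=y) = sum over c of P(B=y|C=c) * P(C=c)
for (int c = 0; c < BP.length; c++) {
    BPy += BP[c][y] * CP[0][c];
}
// BPy now contains P(B=y), and AP[y][x] is P(A=x|B=y)
System.out.println(AP[y][x] * BPy);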
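The same calculation can be tried without Weka. The following is only a minimal sketch: the class name JointFromCpts, its method names and all the numbers are made up for illustration, and the arrays are assumed to use the [parent configuration][value] layout for the structure C->B->A from the question.

```java
public class JointFromCpts {

    /** P(B=b) = sum over c of P(B=b | C=c) * P(C=c); C is assumed to be B's only parent. */
    public static double marginalB(double[][] pBgivenC, double[] pC, int b) {
        double sum = 0.0;
        for (int c = 0; c < pC.length; c++) {
            sum += pBgivenC[c][b] * pC[c];
        }
        return sum;
    }

    /** P(A=a, B=b) = P(A=a | B=b) * P(B=b); B is assumed to be A's only parent. */
    public static double jointAB(double[][] pAgivenB, double[][] pBgivenC,
                                 double[] pC, int a, int b) {
        return pAgivenB[b][a] * marginalB(pBgivenC, pC, b);
    }

    public static void main(String[] args) {
        // Hypothetical CPTs for three binary attributes (illustrative numbers only).
        double[] pC = {0.3, 0.7};                          // P(C)
        double[][] pBgivenC = {{0.9, 0.1}, {0.2, 0.8}};    // pBgivenC[c][b]
        double[][] pAgivenB = {{0.6, 0.4}, {0.25, 0.75}};  // pAgivenB[b][a]

        // Print the full joint table over (A, B); its entries should sum to 1.
        double total = 0.0;
        for (int a = 0; a < 2; a++) {
            for (int b = 0; b < 2; b++) {
                double p = jointAB(pAgivenB, pBgivenC, pC, a, b);
                total += p;
                System.out.printf("P(A=%d, B=%d) = %.4f%n", a, b, p);
            }
        }
        System.out.printf("sum = %.4f%n", total);
    }
}
```

Checking that the joint table sums to 1 is a cheap sanity test before plugging in tables obtained from a trained network.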
answered Nov 26 '18 at 18:00 by mettleap
Thanks @mettleap. This is exactly what I thought. We need the conditional probability P(A=x|B=y) and the marginal probability P(B=y) to get the joint probability P(A=x, B=y). I found that BayesNet has a function "getMargin" that is supposed to return the marginal probability distribution of a given node, which seems to be an alternative way to get BPy. However, "getMargin" returns all zeros for all nodes. Do you know why that is?
– Zhongjun 'Mark' Jin
Nov 27 '18 at 2:58
1
@Zhongjun'Mark'Jin, I think you have to use the estimateCPTs(BayesNet bayesNet) in the SimpleEstimator class first so that it fills the cpts, then maybe it will give the correct values ... also, there is an alpha parameter which you have to set for the SimpleEstimator which will decide the actual values in the CPTs
– mettleap
Nov 27 '18 at 5:39
Thanks. I did run estimateCPTs (I forgot to add it to the post previously). I did not explicitly set alpha for the estimator, so it was left at its default of 0.5.
– Zhongjun 'Mark' Jin
Nov 27 '18 at 7:11
Btw, I created a new thread at stackoverflow.com/questions/53494595/… for this question.
– Zhongjun 'Mark' Jin
Nov 27 '18 at 7:22
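On the alpha parameter mentioned in the comments: a rough sketch of how such a smoothing parameter can shape the CPT values, assuming the estimator uses additive smoothing in which alpha acts as an initial count for every cell of a CPT row. That behaviour is an assumption here, not something confirmed in this thread, and the class SmoothedCptRow and its method are hypothetical names, not Weka API.

```java
public class SmoothedCptRow {

    /**
     * Turns raw counts for one parent configuration into probabilities,
     * assuming additive smoothing: p[i] = (counts[i] + alpha) / (N + alpha * K),
     * where N is the total count and K the number of values.
     */
    public static double[] estimate(int[] counts, double alpha) {
        double total = 0.0;
        for (int n : counts) {
            total += n;
        }
        double denominator = total + alpha * counts.length;
        double[] probabilities = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            probabilities[i] = (counts[i] + alpha) / denominator;
        }
        return probabilities;
    }

    public static void main(String[] args) {
        int[] counts = {3, 0, 1};  // illustrative counts for a three-valued attribute
        double[] p = estimate(counts, 0.5);
        for (int i = 0; i < p.length; i++) {
            System.out.printf("p[%d] = %.4f%n", i, p[i]);
        }
    }
}
```

Under this assumption a value that never occurs in the data still receives a small non-zero probability, so smoothing alone would not explain a table of all zeros.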