Weka API: How to obtain a joint probability, e.g., Pr(A=x, B=y), from a BayesNet object?

I am using the Weka Java API. I trained a BayesNet on an Instances object (data set) without specifying a class (label) attribute.

// Initialization
Instances data = ...;
BayesNet bn = new EditableBayesNet(data);
SearchAlgorithm learner = new TAN();
SimpleEstimator estimator = new SimpleEstimator();

// Training
bn.initStructure();
learner.buildStructure(bn, data);
estimator.estimateCPTs(bn);

Suppose the Instances object data has three attributes, A, B and C, and the dependency discovered is B->A, C->B.

The trained BayesNet object bn is not meant for classification (I did not specify a class attribute for data); I just want to calculate the joint probability Pr(A=x, B=y). How do I get this probability from bn?

As far as I know, the distributionForInstance function of BayesNet may be the closest thing to use. It returns the probability distribution for a given instance (in our case, the instance is (A=x, B=y)). To use it, I could create a new Instance object testDataInstance, set the values A=x and B=y, and call distributionForInstance with testDataInstance.

// Obtain Pr(A="x", B="y")
Instance testDataInstance = new SparseInstance(3);
Instances testDataSet = new Instances(bn.m_Instances);
testDataSet.clear();
testDataInstance.setValue(testDataSet.attribute("A"), "x");
testDataInstance.setValue(testDataSet.attribute("B"), "y");
testDataSet.add(testDataInstance);
bn.distributionForInstance(testDataSet.firstInstance());

However, to my knowledge, the probability distribution indicates probabilities of all possible values for the class attribute in the bayesnet. As I did not specify a class attribute for data, it is unclear to me what the returned probability distribution means.

edited Nov 27 '18 at 14:38

asked Nov 20 '18 at 22:40

Zhongjun 'Mark' Jin

java machine-learning weka bayesian bayesian-networks

1 Answer

The javadoc page for distributionForInstance says that it calculates the class membership probabilities: http://weka.sourceforge.net/doc.dev/weka/classifiers/bayes/BayesNet.html#distributionForInstance-weka.core.Instance-

So that is probably not what you want. I think you can use getDistribution(int nTargetNode) or getDistribution(java.lang.String sName) to get what you need.

P(A=x, B=y) can be calculated as follows:

P(A=x|B=y) = P(A=x, B=y) / P(B=y), which implies

P(A=x, B=y) = P(A=x|B=y) * P(B=y)

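Here is some pseudocode that illustrates my approach:

double[][] AP = bn.getDistribution("A"); // gives P(A|B) table
double[][] BP = bn.getDistribution("B"); // gives P(B|C) table
double[][] CP = bn.getDistribution("C"); // gives P(C) table; a single row, since C has no parents
double BPy = 0;

// I am assuming x and y are int value indices and that the tables are indexed
// as [parent value][node value], so AP[y][x] represents P(A=x|B=y) and
// BP[c][y] represents P(B=y|C=c). If x and y are not ints, there should be
// some way of mapping them to indices.
// Marginalize C out: P(B=y) = sum over c of P(B=y|C=c) * P(C=c)
for (int c = 0; c < BP.length; c++) {
    BPy += BP[c][y] * CP[0][c];
}
// BPy now contains P(B=y)
System.out.println(AP[y][x] * BPy);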
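To make the chain-rule computation concrete without depending on Weka, here is a minimal, self-contained sketch. The CPT values are made up for illustration, and the class and method names (JointFromCpts, marginalB, jointAB) are hypothetical. The table layout table[parentValue][nodeValue] is an assumption that mirrors how the answer indexes AP[y][x]; with a real network you would read the tables from bn instead. Because B has the parent C in this example, P(B=y) is obtained by summing P(B=y|C=c) * P(C=c) over all values c.

```java
final class JointFromCpts {
    // Hypothetical CPTs for the chain C -> B -> A, with every attribute binary.
    // Assumed layout: table[parentValue][nodeValue]; a root node has one row.
    static final double[][] P_C = {{0.6, 0.4}};                      // P(C)
    static final double[][] P_B_GIVEN_C = {{0.7, 0.3}, {0.2, 0.8}};  // P(B|C)
    static final double[][] P_A_GIVEN_B = {{0.9, 0.1}, {0.5, 0.5}};  // P(A|B)

    // P(B=y) = sum over c of P(B=y|C=c) * P(C=c)
    static double marginalB(int y) {
        double sum = 0.0;
        for (int c = 0; c < P_B_GIVEN_C.length; c++) {
            sum += P_B_GIVEN_C[c][y] * P_C[0][c];
        }
        return sum;
    }

    // P(A=x, B=y) = P(A=x|B=y) * P(B=y)
    static double jointAB(int x, int y) {
        return P_A_GIVEN_B[y][x] * marginalB(y);
    }

    public static void main(String[] args) {
        // x and y are value indices, not value labels.
        System.out.println("P(A=0, B=1) = " + jointAB(0, 1));
    }
}
```

A quick sanity check on any such computation is that the joint probabilities over all (x, y) combinations sum to 1.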

answered Nov 26 '18 at 18:00

mettleap

Thanks @mettleap. This is exactly what I thought. We need the conditional probability P(A=x|B=y) and the marginal probability P(B=y) to get the joint probability P(A=x, B=y). I found that BayesNet has a function "getMargin" that is supposed to return the marginal probability distribution of a given node, which seems to be an alternative way to get BPy. However, "getMargin" returns all zeros for all nodes. Do you know why that is?

– Zhongjun 'Mark' Jin
Nov 27 '18 at 2:58

@Zhongjun'Mark'Jin, I think you have to call estimateCPTs(BayesNet bayesNet) in the SimpleEstimator class first so that it fills the CPTs; then maybe it will give the correct values. Also, there is an alpha parameter on the SimpleEstimator that affects the actual values in the CPTs.

– mettleap
Nov 27 '18 at 5:39

Thanks. I did run estimateCPTs (I forgot to add it to the post previously). I did not set alpha explicitly for the estimator, so it was left at its default of 0.5.

– Zhongjun 'Mark' Jin
Nov 27 '18 at 7:11

Btw, I created a new thread at stackoverflow.com/questions/53494595/… for this question.

– Zhongjun 'Mark' Jin
Nov 27 '18 at 7:22
