How can I calculate the probability for each row of a NumPy array at once?

I have a function that calculates the probability (density) of a multivariate Gaussian distribution:

import math
import numpy as np

def multinormpdf(x, mu, var):  # probability density of a multivariate Gaussian at x
    k = len(x)
    det = np.linalg.det(var)
    inv = np.linalg.inv(var)
    denominator = math.sqrt(((2*math.pi)**k)*det)
    numerator = np.dot((x - mu).transpose(), inv)
    numerator = np.dot(numerator, (x - mu))
    numerator = math.exp(-0.5 * numerator)
    return numerator/denominator

I also have a mean vector, a covariance matrix, and a 2D NumPy array of test data:

mu = np.array([100, 105, 42])  # mean vector
var = np.array([[100, 124, 11],  # covariance matrix
                [124, 150, 44],
                [11, 44, 130]])

arr = np.array([[42, 234, 124],  # arr is a 43923794 x 3 matrix
                [123, 222, 112],
                [42, 213, 11],
                ...(so many values about 40,000,000 rows),
                [23, 55, 251]])

I need the probability for each row, so I used this loop:

for i in arr:
    print(multinormpdf(i, mu, var))  # mu and var are already known

But it is very slow.

Is there a faster way to calculate the probabilities?
Or is there a way to calculate them for the whole test array at once, as a batch?

edited Nov 29 '18 at 14:05

asked Nov 21 '18 at 16:28

YeongHwa Jin

386

Where does normpdf come from?

– Nils Werner
Nov 21 '18 at 21:54

@Nils Werner I think that is not important, but I updated the code for normpdf.

– YeongHwa Jin
Nov 22 '18 at 2:36

Can you give a complete example with input and output?

– Nils Werner
Nov 22 '18 at 7:17

@Nils Werner I added more explanation.

– YeongHwa Jin
Nov 22 '18 at 12:04

Your code is not valid Python code, and does not work, even after fixing the syntax issues. Please post a proper MVCE!

– Nils Werner
Nov 22 '18 at 12:09

python python-3.x probability


2 Answers

You can vectorize your function easily:

import numpy as np

def fast_multinormpdf(x, mu, var):
    mu = np.asarray(mu)
    var = np.asarray(var)
    k = x.shape[-1]
    det = np.linalg.det(var)
    inv = np.linalg.inv(var)
    denominator = np.sqrt(((2*np.pi)**k)*det)
    # (x - mu) times the inverse covariance, one row per sample
    numerator = np.dot((x - mu), inv)
    # row-wise dot product gives the quadratic form for every sample
    numerator = np.sum((x - mu) * numerator, axis=-1)
    numerator = np.exp(-0.5 * numerator)
    return numerator/denominator

arr = np.array([[42, 234, 124],
                [123, 222, 112],
                [42, 213, 11],
                [42, 213, 11]])

mu = [0, 0, 1]
var = [[1, 100, 100],
       [100, 1, 100],
       [100, 100, 1]]

slow_out = np.array([multinormpdf(i, mu, var) for i in arr])
fast_out = fast_multinormpdf(arr, mu, var)

np.allclose(slow_out, fast_out) # True
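The reason the row-wise sum matches the loop can be written out as a short derivation. This is only a sketch in my own notation, which does not appear in the answer: X is the n x k data matrix with rows x_i, D = X - 1 mu^T is the centered data, and Sigma is the covariance matrix (var in the code).

```latex
q_i \;=\; (x_i - \mu)^\top \Sigma^{-1} (x_i - \mu)
    \;=\; \sum_{j=1}^{k} \bigl[ D\,\Sigma^{-1} \bigr]_{ij}\, D_{ij},
\qquad
p(x_i) \;=\; \frac{\exp\!\bigl(-\tfrac{1}{2}\, q_i\bigr)}{\sqrt{(2\pi)^{k}\,\det\Sigma}}
```

Here np.dot((x - mu), inv) computes D Sigma^{-1} for all rows at once, and np.sum(..., axis=-1) carries out the sum over j, so every q_i comes from two array operations instead of one Python-level call per row.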

In this benchmark on 40,000 rows, fast_multinormpdf is roughly 800 times faster than your unvectorized function:

long_arr = np.tile(arr, (10000, 1))

%timeit np.array([multinormpdf(i, mu, var) for i in long_arr])
# 2.12 s ± 93.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit fast_multinormpdf(long_arr, mu, var)
# 2.56 ms ± 76.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
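For an array as large as the one in the question (about 44 million rows), each float64 temporary of shape n x 3 takes about 1 GB, so it may help to process the rows in chunks. The following is a minimal sketch of that idea, not part of the answer above: the names multinorm_logpdf, multinorm_pdf_chunked and chunk_size, and the demo data, are hypothetical. It also swaps the explicit inverse and determinant for a Cholesky factorization and works in log space, which is a different technique from the one in fast_multinormpdf.

```python
import numpy as np


def multinorm_logpdf(x, mu, cov):
    """Log-density of N(mu, cov) for every row of x (shape (n, k))."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    k = x.shape[-1]
    # Lower-triangular factor with cov = chol @ chol.T.
    # np.linalg.cholesky raises LinAlgError if cov is not positive definite.
    chol = np.linalg.cholesky(cov)
    # Solve chol @ z = (x - mu).T; the squared column norms of z are the
    # quadratic forms (x_i - mu)^T cov^{-1} (x_i - mu).
    z = np.linalg.solve(chol, (x - mu).T)
    maha = np.sum(z * z, axis=0)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (k * np.log(2.0 * np.pi) + log_det + maha)


def multinorm_pdf_chunked(x, mu, cov, chunk_size=1_000_000):
    """Evaluate the density chunk by chunk to bound temporary memory."""
    x = np.asarray(x)
    out = np.empty(x.shape[0], dtype=float)
    for start in range(0, x.shape[0], chunk_size):
        stop = start + chunk_size
        out[start:stop] = np.exp(multinorm_logpdf(x[start:stop], mu, cov))
    return out


# Hypothetical demo data (not from the question): a positive definite covariance.
rng = np.random.default_rng(0)
mu_demo = np.array([100.0, 105.0, 42.0])
cov_demo = np.array([[100.0, 20.0, 11.0],
                     [20.0, 150.0, 44.0],
                     [11.0, 44.0, 130.0]])
arr_demo = rng.normal(loc=mu_demo, scale=10.0, size=(10, 3))
probs = multinorm_pdf_chunked(arr_demo, mu_demo, cov_demo, chunk_size=4)
```

The demo uses its own covariance because the sample var in the question does not appear to be positive definite (its leading 2 x 2 minor, 100*150 - 124**2, is negative), so the Cholesky step would raise an error on it. If very small densities underflow to zero, returning the log-densities from multinorm_logpdf directly may be preferable.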

edited Nov 22 '18 at 9:16

answered Nov 21 '18 at 21:58

Nils Werner

17.7k14161

Thanks Nils! I appreciate your advice!

– YeongHwa Jin
Nov 22 '18 at 13:24


You can try numba. Just decorate your function with @numba.vectorize.

import numba

@numba.vectorize
def multinormpdf(x, mu, var):
    # ...
    return calculated_probability

new_arr = multinormpdf(arr)

If your multinormpdf doesn't contain any unsupported functions, it can be accelerated. See the list of supported NumPy features here: https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html

Moreover, you can use the experimental feature target='parallel' like this:

@numba.vectorize(target='parallel')

answered Nov 21 '18 at 20:28

anch2150

113

The input to my multinormpdf is already a NumPy array (like [42, 234, 124] or [123, 222, 112]), not a scalar, so a call looks like multinormpdf([51, 23, 251], mu_vector, cov_matrix). Can I still use @numba.vectorize?

– YeongHwa Jin
Nov 22 '18 at 3:41


