How can I calculate the probability for every row of a NumPy array at once?
I have a function for calculating the probability, shown below:
import math
import numpy as np

def multinormpdf(x, mu, var):  # probability density of a multivariate Gaussian distribution
    k = len(x)
    det = np.linalg.det(var)
    inv = np.linalg.inv(var)
    denominator = math.sqrt(((2*math.pi)**k)*det)
    numerator = np.dot((x - mu).transpose(), inv)
    numerator = np.dot(numerator, (x - mu))
    numerator = math.exp(-0.5 * numerator)
    return numerator/denominator
and I have a mean vector, a covariance matrix, and a 2D NumPy array to test:
mu = np.array([100, 105, 42])  # mean vector
var = np.array([[100, 124, 11],  # covariance matrix
                [124, 150, 44],
                [11, 44, 130]])
arr = np.array([[42, 234, 124],  # arr is a 43923794 x 3 matrix
                [123, 222, 112],
                [42, 213, 11],
                # ... (so many values, about 40,000,000 rows)
                [23, 55, 251]])
I have to calculate the probability for each row, so I used this code:
for i in arr:
    print(multinormpdf(i, mu, var))  # I already know the mean vector and covariance matrix
But it is so slow...
Is there any faster way to calculate the probability?
Or is there any way to calculate the probability for the whole test arr at once, like a 'batch'?
python python-3.x probability
edited Nov 29 '18 at 14:05
YeongHwa Jin
asked Nov 21 '18 at 16:28
YeongHwa Jin
Where does normpdf come from?
– Nils Werner
Nov 21 '18 at 21:54
@Nils Werner I think that is not important. But I updated the code for normpdf.
– YeongHwa Jin
Nov 22 '18 at 2:36
Can you give a complete example with input and output?
– Nils Werner
Nov 22 '18 at 7:17
@Nils Werner I added more explanation.
– YeongHwa Jin
Nov 22 '18 at 12:04
Your code is not valid Python code, and does not work, even after fixing the syntax issues. Please post a proper MVCE!
– Nils Werner
Nov 22 '18 at 12:09
2 Answers
You can vectorize your function easily:
import numpy as np

def fast_multinormpdf(x, mu, var):
    mu = np.asarray(mu)
    var = np.asarray(var)
    k = x.shape[-1]  # dimensionality of each sample
    det = np.linalg.det(var)
    inv = np.linalg.inv(var)
    denominator = np.sqrt(((2*np.pi)**k)*det)
    numerator = np.dot((x - mu), inv)  # (x - mu) times inv for all rows at once
    numerator = np.sum((x - mu) * numerator, axis=-1)  # row-wise quadratic form
    numerator = np.exp(-0.5 * numerator)
    return numerator/denominator

arr = np.array([[42, 234, 124],
                [123, 222, 112],
                [42, 213, 11],
                [42, 213, 11]])
mu = [0, 0, 1]
var = [[1, 100, 100],
       [100, 1, 100],
       [100, 100, 1]]

slow_out = np.array([multinormpdf(i, mu, var) for i in arr])
fast_out = fast_multinormpdf(arr, mu, var)
np.allclose(slow_out, fast_out) # True
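The line `np.sum((x - mu) * numerator, axis=-1)` is what replaces the per-row double `np.dot`: it evaluates the quadratic form for every row without a Python loop. As a minimal sketch (the helper name `quadratic_form_rows` and the sample values are illustrative assumptions, not part of the code above), the same quantity can also be written with `np.einsum` and checked against an explicit loop:

```python
import numpy as np

def quadratic_form_rows(x, mu, cov):
    """Return (x_i - mu)^T cov^{-1} (x_i - mu) for every row x_i of x."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    inv = np.linalg.inv(np.asarray(cov, dtype=float))
    # 'ij,jk,ik->i' sums over j and k separately for each row i.
    return np.einsum("ij,jk,ik->i", diff, inv, diff)

# Illustrative values (assumed): a symmetric positive-definite covariance.
mu = np.array([1.0, 2.0, 3.0])
cov = np.array([[4.0, 1.0, 0.0],
                [1.0, 3.0, 1.0],
                [0.0, 1.0, 2.0]])
x = np.array([[1.0, 2.0, 3.0],
              [2.0, 0.0, 4.0]])

per_row = quadratic_form_rows(x, mu, cov)

# Reference: the same quadratic form computed one row at a time.
inv = np.linalg.inv(cov)
looped = np.array([(row - mu) @ inv @ (row - mu) for row in x])
print(np.allclose(per_row, looped))
```

Both forms avoid building the full n x n product that `diff @ inv @ diff.T` would create, which matters once n is in the millions.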
fast_multinormpdf is roughly 800 times faster than your unvectorized function in this benchmark:
long_arr = np.tile(arr, (10000, 1))
%timeit np.array([multinormpdf(i, mu, var) for i in long_arr])
# 2.12 s ± 93.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit fast_multinormpdf(long_arr, mu, var)
# 2.56 ms ± 76.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
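For an input of roughly 40 million rows, each float64 temporary of shape (n, 3), such as `x - mu`, takes about 1 GB, and very small densities may underflow to zero. One possible way around both, sketched here under assumptions (the function name `chunked_multinorm_logpdf`, the `chunk_rows` parameter, and the sample data are hypothetical; a Cholesky factorization and the log-density are swapped in for the determinant and inverse used above), is to evaluate the log-density in chunks:

```python
import numpy as np

def chunked_multinorm_logpdf(x, mu, cov, chunk_rows=1_000_000):
    """Log-density of N(mu, cov) for each row of x, computed chunk by chunk."""
    x = np.asarray(x)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    k = x.shape[-1]
    # Raises numpy.linalg.LinAlgError if cov is not positive definite.
    chol = np.linalg.cholesky(cov)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    log_norm = -0.5 * (k * np.log(2.0 * np.pi) + log_det)
    out = np.empty(x.shape[0], dtype=float)
    for start in range(0, x.shape[0], chunk_rows):
        stop = start + chunk_rows
        diff = x[start:stop] - mu            # temporary bounded by chunk_rows
        z = np.linalg.solve(chol, diff.T)    # z = L^{-1} (x - mu), shape (k, m)
        out[start:stop] = log_norm - 0.5 * np.sum(z * z, axis=0)
    return out

# Illustrative data (assumed, not the question's values).
mu = np.array([1.0, 2.0, 3.0])
cov = np.array([[4.0, 1.0, 0.0],
                [1.0, 3.0, 1.0],
                [0.0, 1.0, 2.0]])
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 3)) * 2.0 + mu

log_p = chunked_multinorm_logpdf(x, mu, cov, chunk_rows=4)
p = np.exp(log_p)  # back to densities when they are not too small
```

`np.linalg.cholesky` fails on a matrix that is not positive definite, which doubles as a sanity check on the covariance matrix. If SciPy is available, `scipy.stats.multivariate_normal.logpdf` should give the same values.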
edited Nov 22 '18 at 9:16
answered Nov 21 '18 at 21:58
Nils Werner
Thanks Nils! I appreciate your advice!
– YeongHwa Jin
Nov 22 '18 at 13:24
You can try numba. Just decorate your function with @numba.vectorize.
import numba

@numba.vectorize
def multinormpdf(x, mu, var):
    # ...
    return calculated_probability
new_arr = multinormpdf(arr, mu, var)
If your multinormpdf doesn't contain any unsupported functions, it can be accelerated. See here: https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html
Moreover, you can use the experimental feature target='parallel' like this.
@numba.vectorize(target='parallel')
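As far as I know, `@numba.vectorize` compiles functions that take scalar arguments, so a kernel that receives a whole row does not fit it directly. A hedged alternative sketch, assuming Numba's `njit` decorator and an explicit loop over rows (the names `_pdf_rows` and `numba_multinormpdf` and the sample data are illustrative; the inverse and determinant are computed once with NumPy outside the compiled loop; the `try/except` fallback is only there so the example also runs, slowly, without Numba installed):

```python
import math
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: fall back to plain Python if Numba is missing
    def njit(func):
        return func

@njit
def _pdf_rows(x, mu, inv, norm, out):
    # Explicit loops: each row's quadratic form is accumulated in a scalar.
    n, k = x.shape
    for i in range(n):
        q = 0.0
        for a in range(k):
            da = x[i, a] - mu[a]
            for b in range(k):
                q += da * inv[a, b] * (x[i, b] - mu[b])
        out[i] = math.exp(-0.5 * q) / norm

def numba_multinormpdf(x, mu, cov):
    x = np.ascontiguousarray(x, dtype=np.float64)
    mu = np.asarray(mu, dtype=np.float64)
    cov = np.asarray(cov, dtype=np.float64)
    inv = np.linalg.inv(cov)  # computed once, outside the compiled loop
    norm = math.sqrt((2.0 * math.pi) ** x.shape[1] * np.linalg.det(cov))
    out = np.empty(x.shape[0], dtype=np.float64)
    _pdf_rows(x, mu, inv, norm, out)
    return out

# Illustrative data (assumed): a symmetric positive-definite covariance.
mu = np.array([1.0, 2.0, 3.0])
cov = np.array([[4.0, 1.0, 0.0],
                [1.0, 3.0, 1.0],
                [0.0, 1.0, 2.0]])
x = np.array([[1.0, 2.0, 3.0],
              [2.0, 0.0, 4.0],
              [0.5, 2.5, 3.5]])

pdf_vals = numba_multinormpdf(x, mu, cov)
```

The first call includes compilation time, so any timing comparison should be taken from a second call.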
answered Nov 21 '18 at 20:28
anch2150
My input for multinormpdf is already a NumPy array (like [42, 234, 124] or [123, 222, 112]), not a scalar (so the call would look like multinormpdf([51, 23, 251], mu_vector, cov_matrix)). Can I still use @numba.vectorize?
– YeongHwa Jin
Nov 22 '18 at 3:41