find and replace closest values in a numpy array with respect to second array
I have a large 2D np.array (vec). I would like to replace each value in vec with the closest value from a shorter array vals.

I have tried the following

    replaced_vals = vals[np.argmin(np.abs(vec[:, np.newaxis] - vals), axis=0)]

but it does not work because vec and vals have different sizes.

Example input:

    vec = np.array([10.1, 10.7, 11.4, 102, 1100])
    vals = np.array([10.0, 11.0, 100.0])

Desired output:

    replaced_vals = [10.0, 11.0, 11.0, 100.0, 100.0]

python numpy
asked Nov 19 at 9:24 by 00__00__00, edited Nov 19 at 9:26 by jpp
Memory efficient solution - vals[closest_argmin(vec,vals)]. – Divakar, Nov 19 at 9:36
3 Answers
If your vals array is sorted, a more memory-efficient, and possibly faster, solution is possible via np.searchsorted:

    import numpy as np

    def jpp(vec, vals):
        # Index at which each element of vec would be inserted into sorted vals
        ss = np.searchsorted(vals, vec)
        # Neighbour below; for ss == 0 this wraps around to vals[-1],
        # which is never closer than vals[0], so b is chosen in that case
        a = vals[ss - 1]
        # Neighbour at or above, clamped to the last valid index
        b = vals[np.minimum(len(vals) - 1, ss)]
        return np.where(np.fabs(vec - a) < np.fabs(vec - b), a, b)

    vec = np.array([10.1, 10.7, 11.4, 102, 1100])
    vals = np.array([10.0, 11.0, 100.0])

    print(jpp(vec, vals))

    [ 10. 11. 11. 100. 100.]

Performance benchmarking

    # Python 3.6.0, NumPy 1.11.3

    n = 10**6
    vec = np.array([10.1, 10.7, 11.4, 102, 1100] * n)
    vals = np.array([10.0, 11.0, 100.0])

    # @ThomasPinetz's solution, memory inefficient
    def tho(vec, vals):
        return vals[np.argmin(np.abs(vec[:, np.newaxis] - vals), axis=1)]

    def jpp(vec, vals):
        ss = np.searchsorted(vals, vec)
        a = vals[ss - 1]
        b = vals[np.minimum(len(vals) - 1, ss)]
        return np.where(np.fabs(vec - a) < np.fabs(vec - b), a, b)

    # @Divakar's solution, adapted from first related Q&A link
    def diva(A, B):
        L = B.size
        sorted_idx = np.searchsorted(B, A)
        sorted_idx[sorted_idx == L] = L - 1
        # Outer parentheses let the condition span two lines
        mask = ((sorted_idx > 0) &
                (np.abs(A - B[sorted_idx - 1]) < np.abs(A - B[sorted_idx])))
        return B[sorted_idx - mask]

    assert np.array_equal(tho(vec, vals), jpp(vec, vals))
    assert np.array_equal(tho(vec, vals), diva(vec, vals))

    %timeit tho(vec, vals)   # 366 ms per loop
    %timeit jpp(vec, vals)   # 295 ms per loop
    %timeit diva(vec, vals)  # 334 ms per loop

Related Q&A
- Find nearest indices for one array against all values in another array - Python / NumPy
- Find nearest value in numpy array
answered Nov 19 at 9:46 by jpp (accepted answer), edited Nov 19 at 10:09
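A hedged sketch, not part of the answer above: the np.searchsorted approach assumes vals is sorted, and the question mentions a 2D vec. The helper name `nearest_in`, the up-front sort, and the clipping of indices are assumptions made for illustration. It relies on np.searchsorted returning an index array with the same shape as its second argument.

```python
import numpy as np

def nearest_in(vec, vals):
    """Replace every element of `vec` with the closest element of `vals`."""
    # Assumption: vals may arrive unsorted, so sort a float copy first.
    vals = np.sort(np.asarray(vals, dtype=float))
    vec = np.asarray(vec, dtype=float)
    ss = np.searchsorted(vals, vec)                # insertion index, 0..len(vals)
    last = len(vals) - 1
    lo = vals[np.clip(ss - 1, 0, last)]            # candidate below (clamped at the ends)
    hi = vals[np.clip(ss, 0, last)]                # candidate at or above (clamped)
    # Strict comparison: on an exact tie the upper candidate is returned.
    return np.where(np.abs(vec - lo) < np.abs(vec - hi), lo, hi)

vec2d = np.array([[10.1, 10.7, 11.4],
                  [102.0, 1100.0, 3.0]])
vals = np.array([100.0, 10.0, 11.0])               # deliberately unsorted

result = nearest_in(vec2d, vals)
print(result)
```

The result keeps the shape of vec2d, and values outside the range of vals (here 1100.0 and 3.0) map to the largest and smallest entries respectively.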
You have to look along the other axis to get the desired values, like this:

    replaced_vals = vals[np.argmin(np.abs(vec[:, np.newaxis] - vals), axis=1)]

Output for your problem:

    array([ 10., 11., 11., 100., 100.])
answered Nov 19 at 9:27 by Thomas Pinetz
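To see why the axis matters, the sketch below prints the shapes involved for the example input from the question. The variable names `dist`, `idx_axis0`, `idx_axis1`, `vec2d` and `replaced_2d` are illustrative assumptions. The last lines show one possible way to extend the same idea to a 2D vec by adding the new axis at the end.

```python
import numpy as np

vec = np.array([10.1, 10.7, 11.4, 102, 1100])
vals = np.array([10.0, 11.0, 100.0])

# (5, 1) - (3,) broadcasts to (5, 3): row k holds |vec[k] - vals[j]| for every j.
dist = np.abs(vec[:, np.newaxis] - vals)
print(dist.shape)

# axis=0 reduces over vec: one index into vec per entry of vals (shape (3,)).
idx_axis0 = np.argmin(dist, axis=0)
# axis=1 reduces over vals: one index into vals per entry of vec (shape (5,)).
idx_axis1 = np.argmin(dist, axis=1)
print(idx_axis0, idx_axis1)

replaced_vals = vals[idx_axis1]
print(replaced_vals)

# Assumed extension to a 2D vec: put the new axis last and reduce over it.
vec2d = vec[:4].reshape(2, 2)
replaced_2d = vals[np.argmin(np.abs(vec2d[..., np.newaxis] - vals), axis=-1)]
print(replaced_2d)
```

With axis=0 the indices refer to positions in vec, so for this input one of them (3) is out of range for the three-element vals. Note that the intermediate array has vec.size * vals.size elements, which is the memory cost the other answers avoid.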
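If vals is sorted, x_k from vec must be rounded to y_i from vals if:

    (y_(i-1) + y_i)/2 < x_k <= (y_i + y_(i+1))/2

(the bounds are written to match the default side='left' of np.searchsorted, so an exact tie goes to the lower value).

So, yet another solution using np.searchsorted, but minimizing operations and at least twice as fast:

    import numpy as np

    def bm(vec, vals):
        # Midpoints between consecutive vals act as bin edges
        half = vals.copy() / 2
        half[:-1] += half[1:]
        half[-1] = np.inf  # everything above the last midpoint maps to vals[-1]
        ss = np.searchsorted(half, vec)
        return vals[ss]

    %timeit bm(vec, vals)  # 84 ms per loop

If vec is also sorted, you can finish the job with numba for a further speedup:

    import numpy as np
    from numba import njit

    @njit
    def bmm(vec, vals):
        half = vals.copy() / 2
        half[:-1] += half[1:]
        half[-1] = np.inf
        res = np.empty_like(vec)
        i = 0
        # Single pass: i only moves forward, so vec must be sorted ascending
        for k in range(vec.size):
            while half[i] < vec[k]:
                i += 1
            res[k] = vals[i]
        return res

    %timeit bmm(vec, vals)  # 31 ms per loop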
answered Nov 19 at 18:09 by B. M., edited Nov 19 at 20:40
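The midpoint idea can be traced on the example input from the question. This is a sketch under stated assumptions: `bounds`, `merge_nearest` and `tiled` are illustrative names, and `merge_nearest` is a plain-Python stand-in for the single-pass loop (no numba), used only to show that the loop depends on the input being sorted in ascending order.

```python
import numpy as np

vec = np.array([10.1, 10.7, 11.4, 102, 1100])   # already ascending
vals = np.array([10.0, 11.0, 100.0])            # sorted, float dtype

# Midpoints between consecutive vals (10.5 and 55.5), then +inf as a catch-all edge.
bounds = np.append((vals[:-1] + vals[1:]) / 2, np.inf)

# The bin index of each element is also the index of its nearest value.
idx = np.searchsorted(bounds, vec)
print(idx, vals[idx])

def merge_nearest(sorted_vec, vals, bounds):
    """Single forward pass; correct only when sorted_vec is ascending."""
    out = np.empty_like(sorted_vec)
    i = 0
    for k in range(sorted_vec.size):
        while bounds[i] < sorted_vec[k]:
            i += 1                               # i never moves back
        out[k] = vals[i]
    return out

merged = merge_nearest(vec, vals, bounds)
print(merged)

# On unsorted input the forward-only index gets stuck at the last bin.
tiled = np.tile(vec, 2)
stuck = merge_nearest(tiled, vals, bounds)
correct = vals[np.searchsorted(bounds, tiled)]
print(np.array_equal(stuck, correct))
```

For the sorted example both routes agree. On the tiled (unsorted) array the loop keeps returning 100.0 after the first 1100, so sorting vec first, or falling back to np.searchsorted, would be needed in that case.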