For example, I have two ndarrays: train_dataset with shape (10000, 28, 28) and val_dateset with shape (2000, 28, 28).

Apart from iterating, is there an efficient way to use NumPy array functions to find the overlap between the two ndarrays?
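
Memory permitting, you can use broadcasting. Note that for the shapes above the intermediate boolean array has shape (10000, 2000, 28, 28), which is about 15.7 GB:

    val_dateset[(train_dataset[:,None] == val_dateset).all(axis=(2,3)).any(0)]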

Sample run:

    In [55]: train_dataset
    Out[55]:
    array([[[1, 1],
            [1, 1]],

           [[1, 0],
            [0, 0]],

           [[0, 0],
            [0, 1]],

           [[0, 1],
            [0, 0]],

           [[1, 1],
            [1, 0]]])

    In [56]: val_dateset
    Out[56]:
    array([[[0, 1],
            [1, 0]],

           [[1, 1],
            [1, 1]],

           [[0, 0],
            [0, 1]]])

    In [57]: val_dateset[(train_dataset[:,None] == val_dateset).all(axis=(2,3)).any(0)]
    Out[57]:
    array([[[1, 1],
            [1, 1]],

           [[0, 0],
            [0, 1]]])
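
If the full broadcasted comparison does not fit in memory, one option is to run the same comparison over slices of train_dataset and accumulate the result. The following is only a sketch under that assumption; the function name `overlap_chunked` and the `chunk` parameter are made up for illustration and are not part of the original answer.

```python
import numpy as np

def overlap_chunked(train, val, chunk=64):
    """Return the blocks of `val` that also occur in `train`.

    Hypothetical helper: applies the broadcasted comparison to `chunk`
    training blocks at a time, so the boolean intermediate has shape
    (chunk, len(val), H, W) instead of (len(train), len(val), H, W).
    """
    found = np.zeros(len(val), dtype=bool)
    for start in range(0, len(train), chunk):
        block = train[start:start + chunk]
        # (c, 1, H, W) == (m, H, W) broadcasts to (c, m, H, W)
        matches = (block[:, None] == val).all(axis=(2, 3))
        # A validation block counts if it equals any training block in this slice
        found |= matches.any(axis=0)
    return val[found]

# Small random demo with binary 2x2 blocks
rng = np.random.default_rng(0)
train_demo = rng.integers(0, 2, size=(50, 2, 2))
val_demo = rng.integers(0, 2, size=(20, 2, 2))
chunked_result = overlap_chunked(train_demo, val_demo, chunk=7)
```

With chunk=64 and the shapes from the question, each intermediate is 64 x 2000 x 28 x 28 booleans, roughly 100 MB. The chunk size trades memory for the number of Python-level loop iterations.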

If the elements are integers, you could collapse every block along axis=(1,2) of the input arrays into a scalar, treating each block as a linearly indexable number, and then efficiently use np.in1d or np.intersect1d to find the matches.
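
A minimal sketch of the collapsing idea, assuming non-negative integer elements and a value range small enough that each block's linear index fits in int64. The helper name `overlap_by_ids` is hypothetical. The sketch uses np.isin, the newer name for np.in1d, which is deprecated as of NumPy 2.0.

```python
import numpy as np

def overlap_by_ids(train, val):
    """Return the blocks of `val` that also occur in `train`.

    Hypothetical helper: each (H, W) block is read as the digits of a
    number in base `base`, giving one scalar id per block.
    Assumes non-negative integers whose ids fit in int64.
    """
    if train.min() < 0 or val.min() < 0:
        raise ValueError("elements must be non-negative")
    base = int(max(train.max(), val.max())) + 1
    n = train.shape[1] * train.shape[2]
    # Python ints do not overflow, so this check is exact
    if base ** n > np.iinfo(np.int64).max:
        raise OverflowError("block ids do not fit in int64")
    weights = base ** np.arange(n, dtype=np.int64)
    train_ids = train.reshape(len(train), -1).astype(np.int64) @ weights
    val_ids = val.reshape(len(val), -1).astype(np.int64) @ weights
    # One-dimensional membership test on the scalar ids
    return val[np.isin(val_ids, train_ids)]

# Data from the sample run
train_dataset = np.array([[[1, 1], [1, 1]], [[1, 0], [0, 0]], [[0, 0], [0, 1]],
                          [[0, 1], [0, 0]], [[1, 1], [1, 0]]])
val_dateset = np.array([[[0, 1], [1, 0]], [[1, 1], [1, 1]], [[0, 0], [0, 1]]])
common = overlap_by_ids(train_dataset, val_dateset)
```

The overflow check matters for the shapes in the question: a 28 x 28 block has 784 elements, so even binary data would need ids up to 2**784, far beyond int64. For such data the broadcasting approach, or some other way of reducing each block to a comparable key, would be needed instead.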