TensorFlow dense tensor to sparse binarized hash trick tensor


I have a dataset of dense tensors of integer feature indices. I want to transform it so that each tensor has a given size n, and the feature at index i of the new tensor is set to 1 if and only if the value i appears in the original tensor (modulo n).

I hope the following example makes this clearer.

Let's suppose I have a dataset like:

t = tf.constant([
  [0, 3, 4],
  [12, 2, 4]])
ds = tf.data.Dataset.from_tensors(t)

With n = 9, I want to get the sparse equivalent of:

t = tf.constant([
  [1, 0, 0, 1, 1, 0, 0, 0, 0],  # indices set to 1 are 0, 3 and 4
  [0, 0, 1, 1, 1, 0, 0, 0, 0]]) # indices set to 1 are 2, 4 and 12 % 9 = 3

I already know how to obtain a non-sparse representation (Tensorflow: tensor binarization), but since I will end up with n > 1 million, I do not want to go through the dense tensor to get the sparse one.

Thanks.

asked Nov 23 '18 at 16:34 by taktak004

So the input is still dense in this case, right? – jdehesa, Nov 23 '18 at 16:45

Yes, the input is still dense. – taktak004, Dec 1 '18 at 23:15

python tensorflow sparse-matrix


1 Answer

Here is a possible implementation for that:

import tensorflow as tf

def binarization_sparse(t, n):
    # Input size
    t_shape = tf.shape(t)
    t_rows = t_shape[0]
    t_cols = t_shape[1]
    # Make sparse row indices for each value
    row_idx = tf.tile(tf.range(t_rows)[:, tf.newaxis], [1, t_cols])
    # Sparse column indices
    col_idx = t % n
    # "Flat" indices - needed to discard repetitions
    total_idx = row_idx * n + col_idx
    # Remove repeated elements
    out_idx, _ = tf.unique(tf.reshape(total_idx, [-1]))
    # Back to row and column indices
    sparse_idx = tf.stack([out_idx // n, out_idx % n], axis=-1)
    # Sparse values
    sparse_values = tf.ones([tf.shape(sparse_idx)[0]], dtype=t.dtype)
    # Make sparse tensor
    out = tf.sparse.SparseTensor(tf.cast(sparse_idx, tf.int64),
                                 sparse_values,
                                 [t_rows, n])
    # Reorder indices
    out = tf.sparse.reorder(out)
    return out

# Test
with tf.Graph().as_default(), tf.Session() as sess:
    t = tf.constant([
        [ 0,  3,  4],
        [12,  2,  4]
    ])
    # Sparse result
    t_m1h_sp = binarization_sparse(t, 9)
    # Convert to dense to check output
    t_m1h = tf.sparse.to_dense(t_m1h_sp)
    print(sess.run(t_m1h))

Output:

[[1 0 0 1 1 0 0 0 0]
 [0 0 1 1 1 0 0 0 0]]

I added the logic to remove repeated elements because, in principle, repetitions could occur; if you have a guarantee that there are none (including after the modulo), you may skip that step. I also reorder the sparse tensor at the end. That is not strictly necessary here, but (I think) sparse operations sometimes expect the indices to be ordered, and sparse_idx may not be ordered.
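
To make the de-duplication and ordering steps concrete, here is a minimal plain-Python sketch of the same index arithmetic, without TensorFlow. The function name flat_unique_indices and the second input row are made up for illustration; the sketch only mirrors the logic of total_idx, tf.unique and the final reorder, and is not a replacement for the TensorFlow code.

```python
def flat_unique_indices(rows, n):
    """Return sorted, de-duplicated (row, col) pairs for rows of feature ids."""
    flat = []
    for r, row in enumerate(rows):
        for value in row:
            # Same flat encoding as total_idx in the answer: row * n + column
            flat.append(r * n + value % n)
    # Dropping repeats and sorting plays the role of tf.unique plus the reorder
    unique_sorted = sorted(set(flat))
    # Decode back to (row, col), as with out_idx // n and out_idx % n
    return [(f // n, f % n) for f in unique_sorted]


# 12 % 9 == 3, so the row [12, 3, 4] has only two distinct columns
pairs = flat_unique_indices([[0, 3, 4], [12, 3, 4]], 9)
print(pairs)  # [(0, 0), (0, 3), (0, 4), (1, 3), (1, 4)]
```

Because each flat index encodes the row in its high part, sorting the flat indices gives row-major order of the (row, col) pairs.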

Also, this solution is specific to 2D inputs. For 1D inputs it would be simpler, and it can be written for higher-dimensional inputs as well if necessary. I think a completely general solution is possible, but it would be more complicated (especially if you want to handle tensors with an unknown number of dimensions).
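
As a rough illustration of why the 1D case is simpler, here is a plain-Python sketch of the index logic only: there is no row index, so no flat encoding is needed. The name sparse_indices_1d and the sample input are hypothetical. In TensorFlow, these positions would presumably become the indices of a sparse tensor with dense shape [n], but that mapping is an assumption and is not shown here.

```python
def sparse_indices_1d(values, n):
    """Sorted, de-duplicated positions that would be set to 1 in a length-n vector."""
    # A set removes repetitions (including those created by the modulo)
    return sorted({v % n for v in values})


# 12 % 9 == 3 collides with the literal 3, so only three positions remain
cols = sparse_indices_1d([12, 2, 4, 3], 9)
print(cols)  # [2, 3, 4]
```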

answered Nov 23 '18 at 17:25 by jdehesa
