Time-efficient matrix elements grouping and summing












5












$begingroup$


I'm interested in finding the quickest way of grouping the elements of a large matrix in sub-groups of NxM elements and them summing them together.
To be completely clear, I'm actually not interested in the "regrouped" matrice, but only in the final results where the elements are summed.



I'll show you an example below:



Say I have the following matrix 9x8:



test = Array[Subscript[a, ##] &, {8, 9}]



enter image description here




I regroup it in sub-matrices NxM, in this example 3x2:



subtest = Partition[test, {2, 3}]



enter image description here




and then I sum them together (as suggested in the comment by @:



out = MapAt[Total[#, -1] &, subtest, {All, All}];



enter image description here




I could use other ways of summing the subgroups, as for example:



    out = Total /@ Flatten /@ # & /@ subtest;


Or using two nested tables, or for loops, etc.



My question is what is the fastest method for doing this? I need to do it on a 48k x 48k matrix, so I'd really need something reasonably quick.
Should I look into compiling nested for loops in C (not sure, I haven't ever tried)?



Something worth mentioning is that the entries of the matrix are all integers larger or equal to 0.



I'll add a (redundant) example with numeric values, thac can be however modified to larger matrices:



test = RandomInteger[1, {8, 9}];



{{0, 0, 1, 0, 1, 1, 0, 0, 0}, {1, 1, 0, 1, 0, 0, 1, 0, 0}, {0, 0, 1,
1, 1, 1, 0, 0, 0}, {0, 0, 1, 0, 0, 1, 1, 0, 1}, {0, 0, 1, 1, 0, 0,
1, 1, 1}, {1, 0, 0, 1, 0, 1, 1, 1, 1}, {1, 0, 1, 1, 1, 0, 1, 1,
0}, {0, 0, 1, 1, 1, 0, 1, 0, 1}}




m = 3
n = 2
out = MapAt[Total[#, -1] &, Partition[test, {n, m}], {All, All}]



{{3, 3, 1}, {2, 4, 2}, {2, 3, 6}, {3, 4, 4}}











share|improve this question











$endgroup$








  • 3




    $begingroup$
    No need to map if you use the second argument of Total: Total[Partition[test, {2, 3}], {3, 4}].
    $endgroup$
    – J. M. is computer-less
    8 hours ago










  • $begingroup$
    oh yes, you are absolutely right, I will change my post accordingly, thanks! Still, I'm not 100% sure this is the fastest solution
    $endgroup$
    – Fraccalo
    7 hours ago












  • $begingroup$
    Converting entries to machine precision numbers will speed up the process. Integers in Mathematica are of arbitrary precision and often slows down the calculation.
    $endgroup$
    – ukar
    6 hours ago










  • $begingroup$
    @ukar True, but I think that converting a 48k x 48k matrix with N with take a lot of time, worth doing some testing though
    $endgroup$
    – Fraccalo
    6 hours ago






  • 1




    $begingroup$
    @ukar: Wrong. When the integers do not exceed the bounda for machine integera (64 bit integers), Mathematica has a chance to use packed arrays and machine integer computations. And indeed, the matrices generated by RandomInteger[1, {n, n}] are packed which can be checked with Developer`PackedArrayQ[test].
    $endgroup$
    – Henrik Schumacher
    6 hours ago


















5












$begingroup$


I'm interested in finding the quickest way of grouping the elements of a large matrix in sub-groups of NxM elements and them summing them together.
To be completely clear, I'm actually not interested in the "regrouped" matrice, but only in the final results where the elements are summed.



I'll show you an example below:



Say I have the following matrix 9x8:



test = Array[Subscript[a, ##] &, {8, 9}]



enter image description here




I regroup it in sub-matrices NxM, in this example 3x2:



subtest = Partition[test, {2, 3}]



enter image description here




and then I sum them together (as suggested in the comment by @:



out = MapAt[Total[#, -1] &, subtest, {All, All}];



enter image description here




I could use other ways of summing the subgroups, as for example:



    out = Total /@ Flatten /@ # & /@ subtest;


Or using two nested tables, or for loops, etc.



My question is what is the fastest method for doing this? I need to do it on a 48k x 48k matrix, so I'd really need something reasonably quick.
Should I look into compiling nested for loops in C (not sure, I haven't ever tried)?



Something worth mentioning is that the entries of the matrix are all integers larger or equal to 0.



I'll add a (redundant) example with numeric values, thac can be however modified to larger matrices:



test = RandomInteger[1, {8, 9}];



{{0, 0, 1, 0, 1, 1, 0, 0, 0}, {1, 1, 0, 1, 0, 0, 1, 0, 0}, {0, 0, 1,
1, 1, 1, 0, 0, 0}, {0, 0, 1, 0, 0, 1, 1, 0, 1}, {0, 0, 1, 1, 0, 0,
1, 1, 1}, {1, 0, 0, 1, 0, 1, 1, 1, 1}, {1, 0, 1, 1, 1, 0, 1, 1,
0}, {0, 0, 1, 1, 1, 0, 1, 0, 1}}




m = 3
n = 2
out = MapAt[Total[#, -1] &, Partition[test, {n, m}], {All, All}]



{{3, 3, 1}, {2, 4, 2}, {2, 3, 6}, {3, 4, 4}}











share|improve this question











$endgroup$








  • 3




    $begingroup$
    No need to map if you use the second argument of Total: Total[Partition[test, {2, 3}], {3, 4}].
    $endgroup$
    – J. M. is computer-less
    8 hours ago










  • $begingroup$
    oh yes, you are absolutely right, I will change my post accordingly, thanks! Still, I'm not 100% sure this is the fastest solution
    $endgroup$
    – Fraccalo
    7 hours ago












  • $begingroup$
    Converting entries to machine precision numbers will speed up the process. Integers in Mathematica are of arbitrary precision and often slows down the calculation.
    $endgroup$
    – ukar
    6 hours ago










  • $begingroup$
    @ukar True, but I think that converting a 48k x 48k matrix with N with take a lot of time, worth doing some testing though
    $endgroup$
    – Fraccalo
    6 hours ago






  • 1




    $begingroup$
    @ukar: Wrong. When the integers do not exceed the bounda for machine integera (64 bit integers), Mathematica has a chance to use packed arrays and machine integer computations. And indeed, the matrices generated by RandomInteger[1, {n, n}] are packed which can be checked with Developer`PackedArrayQ[test].
    $endgroup$
    – Henrik Schumacher
    6 hours ago
















5












5








5


2



$begingroup$


I'm interested in finding the quickest way of grouping the elements of a large matrix in sub-groups of NxM elements and them summing them together.
To be completely clear, I'm actually not interested in the "regrouped" matrice, but only in the final results where the elements are summed.



I'll show you an example below:



Say I have the following matrix 9x8:



test = Array[Subscript[a, ##] &, {8, 9}]



enter image description here




I regroup it in sub-matrices NxM, in this example 3x2:



subtest = Partition[test, {2, 3}]



enter image description here




and then I sum them together (as suggested in the comment by @:



out = MapAt[Total[#, -1] &, subtest, {All, All}];



enter image description here




I could use other ways of summing the subgroups, as for example:



    out = Total /@ Flatten /@ # & /@ subtest;


Or using two nested tables, or for loops, etc.



My question is what is the fastest method for doing this? I need to do it on a 48k x 48k matrix, so I'd really need something reasonably quick.
Should I look into compiling nested for loops in C (not sure, I haven't ever tried)?



Something worth mentioning is that the entries of the matrix are all integers larger or equal to 0.



I'll add a (redundant) example with numeric values, thac can be however modified to larger matrices:



test = RandomInteger[1, {8, 9}];



{{0, 0, 1, 0, 1, 1, 0, 0, 0}, {1, 1, 0, 1, 0, 0, 1, 0, 0}, {0, 0, 1,
1, 1, 1, 0, 0, 0}, {0, 0, 1, 0, 0, 1, 1, 0, 1}, {0, 0, 1, 1, 0, 0,
1, 1, 1}, {1, 0, 0, 1, 0, 1, 1, 1, 1}, {1, 0, 1, 1, 1, 0, 1, 1,
0}, {0, 0, 1, 1, 1, 0, 1, 0, 1}}




m = 3
n = 2
out = MapAt[Total[#, -1] &, Partition[test, {n, m}], {All, All}]



{{3, 3, 1}, {2, 4, 2}, {2, 3, 6}, {3, 4, 4}}











share|improve this question











$endgroup$




I'm interested in finding the quickest way of grouping the elements of a large matrix in sub-groups of NxM elements and them summing them together.
To be completely clear, I'm actually not interested in the "regrouped" matrice, but only in the final results where the elements are summed.



I'll show you an example below:



Say I have the following matrix 9x8:



test = Array[Subscript[a, ##] &, {8, 9}]



enter image description here




I regroup it in sub-matrices NxM, in this example 3x2:



subtest = Partition[test, {2, 3}]



enter image description here




and then I sum them together (as suggested in the comment by @:



out = MapAt[Total[#, -1] &, subtest, {All, All}];



enter image description here




I could use other ways of summing the subgroups, as for example:



    out = Total /@ Flatten /@ # & /@ subtest;


Or using two nested tables, or for loops, etc.



My question is what is the fastest method for doing this? I need to do it on a 48k x 48k matrix, so I'd really need something reasonably quick.
Should I look into compiling nested for loops in C (not sure, I haven't ever tried)?



Something worth mentioning is that the entries of the matrix are all integers larger or equal to 0.



I'll add a (redundant) example with numeric values, thac can be however modified to larger matrices:



test = RandomInteger[1, {8, 9}];



{{0, 0, 1, 0, 1, 1, 0, 0, 0}, {1, 1, 0, 1, 0, 0, 1, 0, 0}, {0, 0, 1,
1, 1, 1, 0, 0, 0}, {0, 0, 1, 0, 0, 1, 1, 0, 1}, {0, 0, 1, 1, 0, 0,
1, 1, 1}, {1, 0, 0, 1, 0, 1, 1, 1, 1}, {1, 0, 1, 1, 1, 0, 1, 1,
0}, {0, 0, 1, 1, 1, 0, 1, 0, 1}}




m = 3
n = 2
out = MapAt[Total[#, -1] &, Partition[test, {n, m}], {All, All}]



{{3, 3, 1}, {2, 4, 2}, {2, 3, 6}, {3, 4, 4}}








matrix






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 6 hours ago







Fraccalo

















asked 8 hours ago









FraccaloFraccalo

2,481518




2,481518








  • 3




    $begingroup$
    No need to map if you use the second argument of Total: Total[Partition[test, {2, 3}], {3, 4}].
    $endgroup$
    – J. M. is computer-less
    8 hours ago










  • $begingroup$
    oh yes, you are absolutely right, I will change my post accordingly, thanks! Still, I'm not 100% sure this is the fastest solution
    $endgroup$
    – Fraccalo
    7 hours ago












  • $begingroup$
    Converting entries to machine precision numbers will speed up the process. Integers in Mathematica are of arbitrary precision and often slows down the calculation.
    $endgroup$
    – ukar
    6 hours ago










  • $begingroup$
    @ukar True, but I think that converting a 48k x 48k matrix with N with take a lot of time, worth doing some testing though
    $endgroup$
    – Fraccalo
    6 hours ago






  • 1




    $begingroup$
    @ukar: Wrong. When the integers do not exceed the bounda for machine integera (64 bit integers), Mathematica has a chance to use packed arrays and machine integer computations. And indeed, the matrices generated by RandomInteger[1, {n, n}] are packed which can be checked with Developer`PackedArrayQ[test].
    $endgroup$
    – Henrik Schumacher
    6 hours ago
















  • 3




    $begingroup$
    No need to map if you use the second argument of Total: Total[Partition[test, {2, 3}], {3, 4}].
    $endgroup$
    – J. M. is computer-less
    8 hours ago










  • $begingroup$
    oh yes, you are absolutely right, I will change my post accordingly, thanks! Still, I'm not 100% sure this is the fastest solution
    $endgroup$
    – Fraccalo
    7 hours ago












  • $begingroup$
    Converting entries to machine precision numbers will speed up the process. Integers in Mathematica are of arbitrary precision and often slows down the calculation.
    $endgroup$
    – ukar
    6 hours ago










  • $begingroup$
    @ukar True, but I think that converting a 48k x 48k matrix with N with take a lot of time, worth doing some testing though
    $endgroup$
    – Fraccalo
    6 hours ago






  • 1




    $begingroup$
    @ukar: Wrong. When the integers do not exceed the bounda for machine integera (64 bit integers), Mathematica has a chance to use packed arrays and machine integer computations. And indeed, the matrices generated by RandomInteger[1, {n, n}] are packed which can be checked with Developer`PackedArrayQ[test].
    $endgroup$
    – Henrik Schumacher
    6 hours ago










3




3




$begingroup$
No need to map if you use the second argument of Total: Total[Partition[test, {2, 3}], {3, 4}].
$endgroup$
– J. M. is computer-less
8 hours ago




$begingroup$
No need to map if you use the second argument of Total: Total[Partition[test, {2, 3}], {3, 4}].
$endgroup$
– J. M. is computer-less
8 hours ago












$begingroup$
oh yes, you are absolutely right, I will change my post accordingly, thanks! Still, I'm not 100% sure this is the fastest solution
$endgroup$
– Fraccalo
7 hours ago






$begingroup$
oh yes, you are absolutely right, I will change my post accordingly, thanks! Still, I'm not 100% sure this is the fastest solution
$endgroup$
– Fraccalo
7 hours ago














$begingroup$
Converting entries to machine precision numbers will speed up the process. Integers in Mathematica are of arbitrary precision and often slows down the calculation.
$endgroup$
– ukar
6 hours ago




$begingroup$
Converting entries to machine precision numbers will speed up the process. Integers in Mathematica are of arbitrary precision and often slows down the calculation.
$endgroup$
– ukar
6 hours ago












$begingroup$
@ukar True, but I think that converting a 48k x 48k matrix with N with take a lot of time, worth doing some testing though
$endgroup$
– Fraccalo
6 hours ago




$begingroup$
@ukar True, but I think that converting a 48k x 48k matrix with N with take a lot of time, worth doing some testing though
$endgroup$
– Fraccalo
6 hours ago




1




1




$begingroup$
@ukar: Wrong. When the integers do not exceed the bounda for machine integera (64 bit integers), Mathematica has a chance to use packed arrays and machine integer computations. And indeed, the matrices generated by RandomInteger[1, {n, n}] are packed which can be checked with Developer`PackedArrayQ[test].
$endgroup$
– Henrik Schumacher
6 hours ago






$begingroup$
@ukar: Wrong. When the integers do not exceed the bounda for machine integera (64 bit integers), Mathematica has a chance to use packed arrays and machine integer computations. And indeed, the matrices generated by RandomInteger[1, {n, n}] are packed which can be checked with Developer`PackedArrayQ[test].
$endgroup$
– Henrik Schumacher
6 hours ago












2 Answers
2






active

oldest

votes


















4












$begingroup$

Partition[test, {2,3}] is quite slow in this case because it has to rearrange the elements in the data vector that represents the entries of a packed array in the backend:



Flatten[test] == Flatten[Partition[test, {2, 3}]]



False




Using Span (;;) as follows employs 6 monotonically increasing read operations; in this specific case, these operations are faster than using Partition:



n = 24000;
test = RandomInteger[1, {n, n}];

a = Total[Partition[test, {2, 3}], {3, 4}]; //
AbsoluteTiming // First
b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; //
AbsoluteTiming // First
a == b



245.89



117.943



True




However, this performance advantage seems to decay when the matrix test becomes bigger (so swapping is required). E.g., for $n = 4800$, method b is aboutten times faster thana`, but for $n = 24000$, it's only a factor of 4.6 and here it has degraded to a factor of 2 or so...



SparseArray method



Have I said already that I love SparseArrays?



AbsoluteTiming[
c = Dot[
KroneckerProduct[
IdentityMatrix[n/2, SparseArray],
ConstantArray[1, {1, 2}]
],
Dot[
test,
KroneckerProduct[
IdentityMatrix[n/3, SparseArray],
ConstantArray[1, {3, 1}]
]
]
]
][[1]]
a == c



76.3822



True




The story goes on...



A combination of the SparseArray method from above with a CompiledFunction):



cf = Compile[{{x, _Integer, 1}, {k, _Integer}},
Table[
Sum[Compile`GetElement[x, i + j], {j, 1, k}],
{i, 0, Length[x] - 1, k}],
CompilationTarget -> "C",
RuntimeAttributes -> {Listable},
Parallelization -> True,
RuntimeOptions -> "Speed"
];
d = KroneckerProduct[
IdentityMatrix[n/2, SparseArray],
ConstantArray[1, {1, 2}]
].cf[test, 3]; // AbsoluteTiming // First
a == d



33.5677



True







share|improve this answer











$endgroup$













  • $begingroup$
    Thanks! The sparse array will need some studying on my side for fully understand what's going on there, but it look quite promising in terms of speed up!
    $endgroup$
    – Fraccalo
    6 hours ago












  • $begingroup$
    @Fraccalo Have look my latest edits. I seem to have found a method that is twice as fast.
    $endgroup$
    – Henrik Schumacher
    5 hours ago






  • 1




    $begingroup$
    This looks amazing, thank you so much @Henrik Schumacher! I'll go through the details first thing tomorrow morning, tons of stuff to learn (I'm not very familiar with the compile function, and haven't used sparse arrays more then a couple of times so far, so I really have lot of new things to learn here :) )
    $endgroup$
    – Fraccalo
    3 hours ago



















0












$begingroup$

These two aren't faster, but I found them of interest in that they pose the problem in a different way. The Downsample command is easy to describe and use, but slower than then the direct ;; command as I am building the matrices.



From above, for comparison:



n = 6000;
test = RandomInteger[100, {n, n}];

a = Total[Partition[test, {2, 3}], {3, 4}]; // AbsoluteTiming // First
b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; // AbsoluteTiming // First
a == b



1.72742



0.402294



True




New methods



c = Sum[Downsample[test, {2, 3}, {i, j}], {i, 2}, {j, 3}]) //  AbsoluteTiming // First
a == c



2.12463



True




An experiment with ListConvolve. If I could get it to "bound" through the target matrix, it could be pretty fast, as I am throwing out 5/6 of the effort below. I know ListConvolve does take advantage of sparse matrices. Not sure how to exploit that.



kernel = {{1, 1, 1}, {1, 1, 1}};
d = Downsample[ListCorrelate[kernel, test], {2, 3}]; // AbsoluteTiming // First
a == d



3.21



True







share|improve this answer









$endgroup$













    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "387"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f192187%2ftime-efficient-matrix-elements-grouping-and-summing%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    4












    $begingroup$

    Partition[test, {2,3}] is quite slow in this case because it has to rearrange the elements in the data vector that represents the entries of a packed array in the backend:



    Flatten[test] == Flatten[Partition[test, {2, 3}]]



    False




    Using Span (;;) as follows employs 6 monotonically increasing read operations; in this specific case, these operations are faster than using Partition:



    n = 24000;
    test = RandomInteger[1, {n, n}];

    a = Total[Partition[test, {2, 3}], {3, 4}]; //
    AbsoluteTiming // First
    b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; //
    AbsoluteTiming // First
    a == b



    245.89



    117.943



    True




    However, this performance advantage seems to decay when the matrix test becomes bigger (so swapping is required). E.g., for $n = 4800$, method b is aboutten times faster thana`, but for $n = 24000$, it's only a factor of 4.6 and here it has degraded to a factor of 2 or so...



    SparseArray method



    Have I said already that I love SparseArrays?



    AbsoluteTiming[
    c = Dot[
    KroneckerProduct[
    IdentityMatrix[n/2, SparseArray],
    ConstantArray[1, {1, 2}]
    ],
    Dot[
    test,
    KroneckerProduct[
    IdentityMatrix[n/3, SparseArray],
    ConstantArray[1, {3, 1}]
    ]
    ]
    ]
    ][[1]]
    a == c



    76.3822



    True




    The story goes on...



    A combination of the SparseArray method from above with a CompiledFunction):



    cf = Compile[{{x, _Integer, 1}, {k, _Integer}},
    Table[
    Sum[Compile`GetElement[x, i + j], {j, 1, k}],
    {i, 0, Length[x] - 1, k}],
    CompilationTarget -> "C",
    RuntimeAttributes -> {Listable},
    Parallelization -> True,
    RuntimeOptions -> "Speed"
    ];
    d = KroneckerProduct[
    IdentityMatrix[n/2, SparseArray],
    ConstantArray[1, {1, 2}]
    ].cf[test, 3]; // AbsoluteTiming // First
    a == d



    33.5677



    True







    share|improve this answer











    $endgroup$













    • $begingroup$
      Thanks! The sparse array will need some studying on my side for fully understand what's going on there, but it look quite promising in terms of speed up!
      $endgroup$
      – Fraccalo
      6 hours ago












    • $begingroup$
      @Fraccalo Have look my latest edits. I seem to have found a method that is twice as fast.
      $endgroup$
      – Henrik Schumacher
      5 hours ago






    • 1




      $begingroup$
      This looks amazing, thank you so much @Henrik Schumacher! I'll go through the details first thing tomorrow morning, tons of stuff to learn (I'm not very familiar with the compile function, and haven't used sparse arrays more then a couple of times so far, so I really have lot of new things to learn here :) )
      $endgroup$
      – Fraccalo
      3 hours ago
















    4












    $begingroup$

    Partition[test, {2,3}] is quite slow in this case because it has to rearrange the elements in the data vector that represents the entries of a packed array in the backend:



    Flatten[test] == Flatten[Partition[test, {2, 3}]]



    False




    Using Span (;;) as follows employs 6 monotonically increasing read operations; in this specific case, these operations are faster than using Partition:



    n = 24000;
    test = RandomInteger[1, {n, n}];

    a = Total[Partition[test, {2, 3}], {3, 4}]; //
    AbsoluteTiming // First
    b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; //
    AbsoluteTiming // First
    a == b



    245.89



    117.943



    True




    However, this performance advantage seems to decay when the matrix test becomes bigger (so swapping is required). E.g., for $n = 4800$, method b is aboutten times faster thana`, but for $n = 24000$, it's only a factor of 4.6 and here it has degraded to a factor of 2 or so...



    SparseArray method



    Have I said already that I love SparseArrays?



    AbsoluteTiming[
    c = Dot[
    KroneckerProduct[
    IdentityMatrix[n/2, SparseArray],
    ConstantArray[1, {1, 2}]
    ],
    Dot[
    test,
    KroneckerProduct[
    IdentityMatrix[n/3, SparseArray],
    ConstantArray[1, {3, 1}]
    ]
    ]
    ]
    ][[1]]
    a == c



    76.3822



    True




    The story goes on...



    A combination of the SparseArray method from above with a CompiledFunction):



    cf = Compile[{{x, _Integer, 1}, {k, _Integer}},
    Table[
    Sum[Compile`GetElement[x, i + j], {j, 1, k}],
    {i, 0, Length[x] - 1, k}],
    CompilationTarget -> "C",
    RuntimeAttributes -> {Listable},
    Parallelization -> True,
    RuntimeOptions -> "Speed"
    ];
    d = KroneckerProduct[
    IdentityMatrix[n/2, SparseArray],
    ConstantArray[1, {1, 2}]
    ].cf[test, 3]; // AbsoluteTiming // First
    a == d



    33.5677



    True







    share|improve this answer











    $endgroup$













    • $begingroup$
      Thanks! The sparse array will need some studying on my side for fully understand what's going on there, but it look quite promising in terms of speed up!
      $endgroup$
      – Fraccalo
      6 hours ago












    • $begingroup$
      @Fraccalo Have look my latest edits. I seem to have found a method that is twice as fast.
      $endgroup$
      – Henrik Schumacher
      5 hours ago






    • 1




      $begingroup$
      This looks amazing, thank you so much @Henrik Schumacher! I'll go through the details first thing tomorrow morning, tons of stuff to learn (I'm not very familiar with the compile function, and haven't used sparse arrays more then a couple of times so far, so I really have lot of new things to learn here :) )
      $endgroup$
      – Fraccalo
      3 hours ago














    4












    4








    4





    $begingroup$

    Partition[test, {2,3}] is quite slow in this case because it has to rearrange the elements in the data vector that represents the entries of a packed array in the backend:



    Flatten[test] == Flatten[Partition[test, {2, 3}]]



    False




    Using Span (;;) as follows employs 6 monotonically increasing read operations; in this specific case, these operations are faster than using Partition:



    n = 24000;
    test = RandomInteger[1, {n, n}];

    a = Total[Partition[test, {2, 3}], {3, 4}]; //
    AbsoluteTiming // First
    b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; //
    AbsoluteTiming // First
    a == b



    245.89



    117.943



    True




    However, this performance advantage seems to decay when the matrix test becomes bigger (so swapping is required). E.g., for $n = 4800$, method b is aboutten times faster thana`, but for $n = 24000$, it's only a factor of 4.6 and here it has degraded to a factor of 2 or so...



    SparseArray method



    Have I said already that I love SparseArrays?



    AbsoluteTiming[
    c = Dot[
    KroneckerProduct[
    IdentityMatrix[n/2, SparseArray],
    ConstantArray[1, {1, 2}]
    ],
    Dot[
    test,
    KroneckerProduct[
    IdentityMatrix[n/3, SparseArray],
    ConstantArray[1, {3, 1}]
    ]
    ]
    ]
    ][[1]]
    a == c



    76.3822



    True




    The story goes on...



    A combination of the SparseArray method from above with a CompiledFunction):



    cf = Compile[{{x, _Integer, 1}, {k, _Integer}},
    Table[
    Sum[Compile`GetElement[x, i + j], {j, 1, k}],
    {i, 0, Length[x] - 1, k}],
    CompilationTarget -> "C",
    RuntimeAttributes -> {Listable},
    Parallelization -> True,
    RuntimeOptions -> "Speed"
    ];
    d = KroneckerProduct[
    IdentityMatrix[n/2, SparseArray],
    ConstantArray[1, {1, 2}]
    ].cf[test, 3]; // AbsoluteTiming // First
    a == d



    33.5677



    True







    share|improve this answer











    $endgroup$



    Partition[test, {2,3}] is quite slow in this case because it has to rearrange the elements in the data vector that represents the entries of a packed array in the backend:



    Flatten[test] == Flatten[Partition[test, {2, 3}]]



    False




    Using Span (;;) as follows employs 6 monotonically increasing read operations; in this specific case, these operations are faster than using Partition:



    n = 24000;
    test = RandomInteger[1, {n, n}];

    a = Total[Partition[test, {2, 3}], {3, 4}]; //
    AbsoluteTiming // First
    b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; //
    AbsoluteTiming // First
    a == b



    245.89



    117.943



    True




    However, this performance advantage seems to decay when the matrix test becomes bigger (so swapping is required). E.g., for $n = 4800$, method b is aboutten times faster thana`, but for $n = 24000$, it's only a factor of 4.6 and here it has degraded to a factor of 2 or so...



    SparseArray method



    Have I said already that I love SparseArrays?



    AbsoluteTiming[
    c = Dot[
    KroneckerProduct[
    IdentityMatrix[n/2, SparseArray],
    ConstantArray[1, {1, 2}]
    ],
    Dot[
    test,
    KroneckerProduct[
    IdentityMatrix[n/3, SparseArray],
    ConstantArray[1, {3, 1}]
    ]
    ]
    ]
    ][[1]]
    a == c



    76.3822



    True




    The story goes on...



    A combination of the SparseArray method from above with a CompiledFunction):



    cf = Compile[{{x, _Integer, 1}, {k, _Integer}},
    Table[
    Sum[Compile`GetElement[x, i + j], {j, 1, k}],
    {i, 0, Length[x] - 1, k}],
    CompilationTarget -> "C",
    RuntimeAttributes -> {Listable},
    Parallelization -> True,
    RuntimeOptions -> "Speed"
    ];
    d = KroneckerProduct[
    IdentityMatrix[n/2, SparseArray],
    ConstantArray[1, {1, 2}]
    ].cf[test, 3]; // AbsoluteTiming // First
    a == d



    33.5677



    True








    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 5 hours ago

























    answered 6 hours ago









    Henrik SchumacherHenrik Schumacher

    54.9k474153




    54.9k474153












    • $begingroup$
      Thanks! The sparse array will need some studying on my side for fully understand what's going on there, but it look quite promising in terms of speed up!
      $endgroup$
      – Fraccalo
      6 hours ago












    • $begingroup$
      @Fraccalo Have look my latest edits. I seem to have found a method that is twice as fast.
      $endgroup$
      – Henrik Schumacher
      5 hours ago






    • 1




      $begingroup$
      This looks amazing, thank you so much @Henrik Schumacher! I'll go through the details first thing tomorrow morning, tons of stuff to learn (I'm not very familiar with the compile function, and haven't used sparse arrays more then a couple of times so far, so I really have lot of new things to learn here :) )
      $endgroup$
      – Fraccalo
      3 hours ago


















    • $begingroup$
      Thanks! The sparse array will need some studying on my side for fully understand what's going on there, but it look quite promising in terms of speed up!
      $endgroup$
      – Fraccalo
      6 hours ago












    • $begingroup$
      @Fraccalo Have look my latest edits. I seem to have found a method that is twice as fast.
      $endgroup$
      – Henrik Schumacher
      5 hours ago






    • 1




      $begingroup$
      This looks amazing, thank you so much @Henrik Schumacher! I'll go through the details first thing tomorrow morning, tons of stuff to learn (I'm not very familiar with the compile function, and haven't used sparse arrays more then a couple of times so far, so I really have lot of new things to learn here :) )
      $endgroup$
      – Fraccalo
      3 hours ago
















    $begingroup$
    Thanks! The sparse array will need some studying on my side for fully understand what's going on there, but it look quite promising in terms of speed up!
    $endgroup$
    – Fraccalo
    6 hours ago






    $begingroup$
    Thanks! The sparse array will need some studying on my side for fully understand what's going on there, but it look quite promising in terms of speed up!
    $endgroup$
    – Fraccalo
    6 hours ago














    $begingroup$
    @Fraccalo Have look my latest edits. I seem to have found a method that is twice as fast.
    $endgroup$
    – Henrik Schumacher
    5 hours ago




    $begingroup$
    @Fraccalo Have look my latest edits. I seem to have found a method that is twice as fast.
    $endgroup$
    – Henrik Schumacher
    5 hours ago




    1




    1




    $begingroup$
    This looks amazing, thank you so much @Henrik Schumacher! I'll go through the details first thing tomorrow morning, tons of stuff to learn (I'm not very familiar with the compile function, and haven't used sparse arrays more then a couple of times so far, so I really have lot of new things to learn here :) )
    $endgroup$
    – Fraccalo
    3 hours ago




    $begingroup$
    This looks amazing, thank you so much @Henrik Schumacher! I'll go through the details first thing tomorrow morning, tons of stuff to learn (I'm not very familiar with the compile function, and haven't used sparse arrays more then a couple of times so far, so I really have lot of new things to learn here :) )
    $endgroup$
    – Fraccalo
    3 hours ago











    0












    $begingroup$

    These two aren't faster, but I found them of interest in that they pose the problem in a different way. The Downsample command is easy to describe and use, but slower than then the direct ;; command as I am building the matrices.



    From above, for comparison:



    n = 6000;
    test = RandomInteger[100, {n, n}];

    a = Total[Partition[test, {2, 3}], {3, 4}]; // AbsoluteTiming // First
    b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; // AbsoluteTiming // First
    a == b



    1.72742



    0.402294



    True




    New methods



    c = Sum[Downsample[test, {2, 3}, {i, j}], {i, 2}, {j, 3}]) //  AbsoluteTiming // First
    a == c



    2.12463



    True




    An experiment with ListConvolve. If I could get it to "bound" through the target matrix, it could be pretty fast, as I am throwing out 5/6 of the effort below. I know ListConvolve does take advantage of sparse matrices. Not sure how to exploit that.



    kernel = {{1, 1, 1}, {1, 1, 1}};
    d = Downsample[ListCorrelate[kernel, test], {2, 3}]; // AbsoluteTiming // First
    a == d



    3.21



    True







    share|improve this answer









    $endgroup$


















      0












      $begingroup$

      These two aren't faster, but I found them of interest in that they pose the problem in a different way. The Downsample command is easy to describe and use, but slower than then the direct ;; command as I am building the matrices.



      From above, for comparison:



      n = 6000;
      test = RandomInteger[100, {n, n}];

      a = Total[Partition[test, {2, 3}], {3, 4}]; // AbsoluteTiming // First
      b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; // AbsoluteTiming // First
      a == b



      1.72742



      0.402294



      True




      New methods



      c = Sum[Downsample[test, {2, 3}, {i, j}], {i, 2}, {j, 3}]) //  AbsoluteTiming // First
      a == c



      2.12463



      True




      An experiment with ListConvolve. If I could get it to "bound" through the target matrix, it could be pretty fast, as I am throwing out 5/6 of the effort below. I know ListConvolve does take advantage of sparse matrices. Not sure how to exploit that.



      kernel = {{1, 1, 1}, {1, 1, 1}};
      d = Downsample[ListCorrelate[kernel, test], {2, 3}]; // AbsoluteTiming // First
      a == d



      3.21



      True







      share|improve this answer









      $endgroup$
















        0












        0








        0





        $begingroup$

        These two aren't faster, but I found them of interest in that they pose the problem in a different way. The Downsample command is easy to describe and use, but slower than then the direct ;; command as I am building the matrices.



        From above, for comparison:



        n = 6000;
        test = RandomInteger[100, {n, n}];

        a = Total[Partition[test, {2, 3}], {3, 4}]; // AbsoluteTiming // First
        b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; // AbsoluteTiming // First
        a == b



        1.72742



        0.402294



        True




        New methods



        c = Sum[Downsample[test, {2, 3}, {i, j}], {i, 2}, {j, 3}]) //  AbsoluteTiming // First
        a == c



        2.12463



        True




        An experiment with ListConvolve. If I could get it to "bound" through the target matrix, it could be pretty fast, as I am throwing out 5/6 of the effort below. I know ListConvolve does take advantage of sparse matrices. Not sure how to exploit that.



        kernel = {{1, 1, 1}, {1, 1, 1}};
        d = Downsample[ListCorrelate[kernel, test], {2, 3}]; // AbsoluteTiming // First
        a == d



        3.21



        True







        share|improve this answer









        $endgroup$



        These two aren't faster, but I found them of interest in that they pose the problem in a different way. The Downsample command is easy to describe and use, but slower than then the direct ;; command as I am building the matrices.



        From above, for comparison:



        n = 6000;
        test = RandomInteger[100, {n, n}];

        a = Total[Partition[test, {2, 3}], {3, 4}]; // AbsoluteTiming // First
        b = Sum[test[[i ;; ;; 2, j ;; ;; 3]], {i, 1, 2}, {j, 1, 3}]; // AbsoluteTiming // First
        a == b



        1.72742



        0.402294



        True




        New methods



        c = Sum[Downsample[test, {2, 3}, {i, j}], {i, 2}, {j, 3}]) //  AbsoluteTiming // First
        a == c



        2.12463



        True




        An experiment with ListConvolve. If I could get it to "bound" through the target matrix, it could be pretty fast, as I am throwing out 5/6 of the effort below. I know ListConvolve does take advantage of sparse matrices. Not sure how to exploit that.



        kernel = {{1, 1, 1}, {1, 1, 1}};
        d = Downsample[ListCorrelate[kernel, test], {2, 3}]; // AbsoluteTiming // First
        a == d



        3.21



        True








        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered 3 hours ago









        MikeYMikeY

        3,068413




        3,068413






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Mathematica Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            Use MathJax to format equations. MathJax reference.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f192187%2ftime-efficient-matrix-elements-grouping-and-summing%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Paul Cézanne

            UIScrollView CustomStickyHeader Resize height generates problems when scroll is too fast

            Angular material date-picker (MatDatepicker) auto completes the date on focus out