Generating and rendering extremely large graphs











up vote
15
down vote

favorite
5












I'm interested in generating and rendering extremely large graphs, i.e., ones having $10^6 to 10^8$ vertices. (These graphs will ultimately be rendered in PDF or PNG and printed on large, poster-sized paper.) Although the code may work for small graphs, there are inevitably limitations for large graphs due to computer memory (especially when embedding or laying-out the graph, as we'll see).



Consider the small graph generated by the Collatz conjecture with 400000 nodes (see code below, based on the cited problem):



400000 node Collatz graph



where the GraphLayout -> {"PackingLayout" -> "ClosestPacking"}. This can be computed directly on a Mac laptop. But for larger numbers of vertices, the graph cannot be rendered.



Note especially that this is just a representative graph... I need a solution for many different large graphs, ones without simple geometric structure.



The problem arises when one wants to generate and lay out (or "embed") $10^7$ vertices. The problem is not so much the computation of the vertices and edges (which can be done by writing to a large external file), but instead laying out the graph. The Layout nominally needs all the vertexes in active memory in order to perform the layout, and that is impossible on most machines.



[I've closed all other applications to provide Mathematica with the maximum memory, but even this is not enough for the large graphs I wish to render. Perhaps this entire problem is solved by moving the enormous computation to the Wolfram Cloud... and if this is the case, I guess I'll merely purchase an account and run my code there. But I'm hoping there is a better way... one that can be used on laptops and desktop machines (with large hard memories).]



One might try to layout portions of the overall graph and then "glue" them together, but the local optimization of layout of each portion is likely not appropriate for the full graph: Different branches will ultimately overlap, breaking the rule of the global optimization routine.



My idea is to generate the enormous graph $G$ (but not lay it out). Then select some small random subset of the vertexes, $V_1$, say 1% of those in $G$, to form a smaller, representative graph $G_1$ where the edges in $G_1$ are estimates of the "distance" between each vertex based on their step separation in $G$. Then, layout this representative $G_1$ (which should be no problem because it is small). Then, anchor the vertexes of $G_1$ in place in the final layout and then render all the vertexes of $G$. In this way, the full, final graph $G$ is rendered appropriately, and the memory constraints are not exceeded.



Perhaps there is a way to extract VertexCoordinates and iteratively layout vertices in regions, hopping through the overall graph. (This seems extremely inefficient.)



Or use smart memory management... or another technique.





CollatzSequence[list_] := 
Module[{memory, tmp, chain, result = Internal`Bag},
memory[1] = False;
memory[n_] := (memory[n] = False; True);
Do[chain = Internal`Bag;
tmp = l;
While[memory[tmp], Internal`StuffBag[chain, tmp];
tmp = If[EvenQ[tmp], tmp/2, 3 tmp + 1];];
Internal`StuffBag[chain, tmp];
Internal`StuffBag[result, chain], {l, list}];
Internal`BagPart[#, All] & /@ Internal`BagPart[result, All]];

myCollatzGraph = Rasterize[
Graph[Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence[Range[400000]]],
GraphLayout -> {"PackingLayout" -> "ClosestPacking"},
VertexStyle -> Opacity[0.2, RGBColor[.9, .1, .1]],
EdgeStyle -> RGBColor[.1, .1, .9],
ImageSize -> 2000],
ImageResolution -> 72];









share|improve this question




















  • 1




    Isn't this graph a tree? That is what the Collatz conjecture states, right? I mean, this is a structure that one can try to exploit for rendering. Towards rendering: I am afraid that the ultimate constraint will be the poor 32-bit frontend that can render only graphics that are not too complex (or will crash)... One may circumvent the frontend by directly generating code for a vector graphics, though.
    – Henrik Schumacher
    Dec 3 at 8:29










  • One additional potential trick would be to use IGSmoothen from IGraph/M, which replaced chains of edges as in 1->2->3->4->5 ... with a single weighted edge 1->5. This would reduce the size of the graph by 5-6 fold, and the weights can be used to maintain the layout. However, I noticed that IGSmoothen is currently not fast enough. I will look into optimizing it. It should work much faster on graphs of this size.
    – Szabolcs
    Dec 3 at 9:18










  • Unbelievably, the IGSmoothen is slow for one single reason: Delete is slow. I will fix this for the next release (due tonight).
    – Szabolcs
    Dec 3 at 9:33















up vote
15
down vote

favorite
5












I'm interested in generating and rendering extremely large graphs, i.e., ones having $10^6 to 10^8$ vertices. (These graphs will ultimately be rendered in PDF or PNG and printed on large, poster-sized paper.) Although the code may work for small graphs, there are inevitably limitations for large graphs due to computer memory (especially when embedding or laying-out the graph, as we'll see).



Consider the small graph generated by the Collatz conjecture with 400000 nodes (see code below, based on the cited problem):



400000 node Collatz graph



where the GraphLayout -> {"PackingLayout" -> "ClosestPacking"}. This can be computed directly on a Mac laptop. But for larger numbers of vertices, the graph cannot be rendered.



Note especially that this is just a representative graph... I need a solution for many different large graphs, ones without simple geometric structure.



The problem arises when one wants to generate and lay out (or "embed") $10^7$ vertices. The problem is not so much the computation of the vertices and edges (which can be done by writing to a large external file), but instead laying out the graph. The Layout nominally needs all the vertexes in active memory in order to perform the layout, and that is impossible on most machines.



[I've closed all other applications to provide Mathematica with the maximum memory, but even this is not enough for the large graphs I wish to render. Perhaps this entire problem is solved by moving the enormous computation to the Wolfram Cloud... and if this is the case, I guess I'll merely purchase an account and run my code there. But I'm hoping there is a better way... one that can be used on laptops and desktop machines (with large hard memories).]



One might try to layout portions of the overall graph and then "glue" them together, but the local optimization of layout of each portion is likely not appropriate for the full graph: Different branches will ultimately overlap, breaking the rule of the global optimization routine.



My idea is to generate the enormous graph $G$ (but not lay it out). Then select some small random subset of the vertexes, $V_1$, say 1% of those in $G$, to form a smaller, representative graph $G_1$ where the edges in $G_1$ are estimates of the "distance" between each vertex based on their step separation in $G$. Then, layout this representative $G_1$ (which should be no problem because it is small). Then, anchor the vertexes of $G_1$ in place in the final layout and then render all the vertexes of $G$. In this way, the full, final graph $G$ is rendered appropriately, and the memory constraints are not exceeded.



Perhaps there is a way to extract VertexCoordinates and iteratively layout vertices in regions, hopping through the overall graph. (This seems extremely inefficient.)



Or use smart memory management... or another technique.





CollatzSequence[list_] := 
Module[{memory, tmp, chain, result = Internal`Bag},
memory[1] = False;
memory[n_] := (memory[n] = False; True);
Do[chain = Internal`Bag;
tmp = l;
While[memory[tmp], Internal`StuffBag[chain, tmp];
tmp = If[EvenQ[tmp], tmp/2, 3 tmp + 1];];
Internal`StuffBag[chain, tmp];
Internal`StuffBag[result, chain], {l, list}];
Internal`BagPart[#, All] & /@ Internal`BagPart[result, All]];

myCollatzGraph = Rasterize[
Graph[Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence[Range[400000]]],
GraphLayout -> {"PackingLayout" -> "ClosestPacking"},
VertexStyle -> Opacity[0.2, RGBColor[.9, .1, .1]],
EdgeStyle -> RGBColor[.1, .1, .9],
ImageSize -> 2000],
ImageResolution -> 72];









share|improve this question




















  • 1




    Isn't this graph a tree? That is what the Collatz conjecture states, right? I mean, this is a structure that one can try to exploit for rendering. Towards rendering: I am afraid that the ultimate constraint will be the poor 32-bit frontend that can render only graphics that are not too complex (or will crash)... One may circumvent the frontend by directly generating code for a vector graphics, though.
    – Henrik Schumacher
    Dec 3 at 8:29










  • One additional potential trick would be to use IGSmoothen from IGraph/M, which replaced chains of edges as in 1->2->3->4->5 ... with a single weighted edge 1->5. This would reduce the size of the graph by 5-6 fold, and the weights can be used to maintain the layout. However, I noticed that IGSmoothen is currently not fast enough. I will look into optimizing it. It should work much faster on graphs of this size.
    – Szabolcs
    Dec 3 at 9:18










  • Unbelievably, the IGSmoothen is slow for one single reason: Delete is slow. I will fix this for the next release (due tonight).
    – Szabolcs
    Dec 3 at 9:33













up vote
15
down vote

favorite
5









up vote
15
down vote

favorite
5






5





I'm interested in generating and rendering extremely large graphs, i.e., ones having $10^6 to 10^8$ vertices. (These graphs will ultimately be rendered in PDF or PNG and printed on large, poster-sized paper.) Although the code may work for small graphs, there are inevitably limitations for large graphs due to computer memory (especially when embedding or laying-out the graph, as we'll see).



Consider the small graph generated by the Collatz conjecture with 400000 nodes (see code below, based on the cited problem):



400000 node Collatz graph



where the GraphLayout -> {"PackingLayout" -> "ClosestPacking"}. This can be computed directly on a Mac laptop. But for larger numbers of vertices, the graph cannot be rendered.



Note especially that this is just a representative graph... I need a solution for many different large graphs, ones without simple geometric structure.



The problem arises when one wants to generate and lay out (or "embed") $10^7$ vertices. The problem is not so much the computation of the vertices and edges (which can be done by writing to a large external file), but instead laying out the graph. The Layout nominally needs all the vertexes in active memory in order to perform the layout, and that is impossible on most machines.



[I've closed all other applications to provide Mathematica with the maximum memory, but even this is not enough for the large graphs I wish to render. Perhaps this entire problem is solved by moving the enormous computation to the Wolfram Cloud... and if this is the case, I guess I'll merely purchase an account and run my code there. But I'm hoping there is a better way... one that can be used on laptops and desktop machines (with large hard memories).]



One might try to layout portions of the overall graph and then "glue" them together, but the local optimization of layout of each portion is likely not appropriate for the full graph: Different branches will ultimately overlap, breaking the rule of the global optimization routine.



My idea is to generate the enormous graph $G$ (but not lay it out). Then select some small random subset of the vertexes, $V_1$, say 1% of those in $G$, to form a smaller, representative graph $G_1$ where the edges in $G_1$ are estimates of the "distance" between each vertex based on their step separation in $G$. Then, layout this representative $G_1$ (which should be no problem because it is small). Then, anchor the vertexes of $G_1$ in place in the final layout and then render all the vertexes of $G$. In this way, the full, final graph $G$ is rendered appropriately, and the memory constraints are not exceeded.



Perhaps there is a way to extract VertexCoordinates and iteratively layout vertices in regions, hopping through the overall graph. (This seems extremely inefficient.)



Or use smart memory management... or another technique.





CollatzSequence[list_] := 
Module[{memory, tmp, chain, result = Internal`Bag},
memory[1] = False;
memory[n_] := (memory[n] = False; True);
Do[chain = Internal`Bag;
tmp = l;
While[memory[tmp], Internal`StuffBag[chain, tmp];
tmp = If[EvenQ[tmp], tmp/2, 3 tmp + 1];];
Internal`StuffBag[chain, tmp];
Internal`StuffBag[result, chain], {l, list}];
Internal`BagPart[#, All] & /@ Internal`BagPart[result, All]];

myCollatzGraph = Rasterize[
Graph[Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence[Range[400000]]],
GraphLayout -> {"PackingLayout" -> "ClosestPacking"},
VertexStyle -> Opacity[0.2, RGBColor[.9, .1, .1]],
EdgeStyle -> RGBColor[.1, .1, .9],
ImageSize -> 2000],
ImageResolution -> 72];









share|improve this question















I'm interested in generating and rendering extremely large graphs, i.e., ones having $10^6 to 10^8$ vertices. (These graphs will ultimately be rendered in PDF or PNG and printed on large, poster-sized paper.) Although the code may work for small graphs, there are inevitably limitations for large graphs due to computer memory (especially when embedding or laying-out the graph, as we'll see).



Consider the small graph generated by the Collatz conjecture with 400000 nodes (see code below, based on the cited problem):



400000 node Collatz graph



where the GraphLayout -> {"PackingLayout" -> "ClosestPacking"}. This can be computed directly on a Mac laptop. But for larger numbers of vertices, the graph cannot be rendered.



Note especially that this is just a representative graph... I need a solution for many different large graphs, ones without simple geometric structure.



The problem arises when one wants to generate and lay out (or "embed") $10^7$ vertices. The problem is not so much the computation of the vertices and edges (which can be done by writing to a large external file), but instead laying out the graph. The Layout nominally needs all the vertexes in active memory in order to perform the layout, and that is impossible on most machines.



[I've closed all other applications to provide Mathematica with the maximum memory, but even this is not enough for the large graphs I wish to render. Perhaps this entire problem is solved by moving the enormous computation to the Wolfram Cloud... and if this is the case, I guess I'll merely purchase an account and run my code there. But I'm hoping there is a better way... one that can be used on laptops and desktop machines (with large hard memories).]



One might try to layout portions of the overall graph and then "glue" them together, but the local optimization of layout of each portion is likely not appropriate for the full graph: Different branches will ultimately overlap, breaking the rule of the global optimization routine.



My idea is to generate the enormous graph $G$ (but not lay it out). Then select some small random subset of the vertexes, $V_1$, say 1% of those in $G$, to form a smaller, representative graph $G_1$ where the edges in $G_1$ are estimates of the "distance" between each vertex based on their step separation in $G$. Then, layout this representative $G_1$ (which should be no problem because it is small). Then, anchor the vertexes of $G_1$ in place in the final layout and then render all the vertexes of $G$. In this way, the full, final graph $G$ is rendered appropriately, and the memory constraints are not exceeded.



Perhaps there is a way to extract VertexCoordinates and iteratively layout vertices in regions, hopping through the overall graph. (This seems extremely inefficient.)



Or use smart memory management... or another technique.





CollatzSequence[list_] := 
Module[{memory, tmp, chain, result = Internal`Bag},
memory[1] = False;
memory[n_] := (memory[n] = False; True);
Do[chain = Internal`Bag;
tmp = l;
While[memory[tmp], Internal`StuffBag[chain, tmp];
tmp = If[EvenQ[tmp], tmp/2, 3 tmp + 1];];
Internal`StuffBag[chain, tmp];
Internal`StuffBag[result, chain], {l, list}];
Internal`BagPart[#, All] & /@ Internal`BagPart[result, All]];

myCollatzGraph = Rasterize[
Graph[Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence[Range[400000]]],
GraphLayout -> {"PackingLayout" -> "ClosestPacking"},
VertexStyle -> Opacity[0.2, RGBColor[.9, .1, .1]],
EdgeStyle -> RGBColor[.1, .1, .9],
ImageSize -> 2000],
ImageResolution -> 72];






graphs-and-networks memory






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Dec 3 at 6:26

























asked Dec 3 at 3:59









David G. Stork

22.9k22051




22.9k22051








  • 1




    Isn't this graph a tree? That is what the Collatz conjecture states, right? I mean, this is a structure that one can try to exploit for rendering. Towards rendering: I am afraid that the ultimate constraint will be the poor 32-bit frontend that can render only graphics that are not too complex (or will crash)... One may circumvent the frontend by directly generating code for a vector graphics, though.
    – Henrik Schumacher
    Dec 3 at 8:29










  • One additional potential trick would be to use IGSmoothen from IGraph/M, which replaced chains of edges as in 1->2->3->4->5 ... with a single weighted edge 1->5. This would reduce the size of the graph by 5-6 fold, and the weights can be used to maintain the layout. However, I noticed that IGSmoothen is currently not fast enough. I will look into optimizing it. It should work much faster on graphs of this size.
    – Szabolcs
    Dec 3 at 9:18










  • Unbelievably, the IGSmoothen is slow for one single reason: Delete is slow. I will fix this for the next release (due tonight).
    – Szabolcs
    Dec 3 at 9:33














  • 1




    Isn't this graph a tree? That is what the Collatz conjecture states, right? I mean, this is a structure that one can try to exploit for rendering. Towards rendering: I am afraid that the ultimate constraint will be the poor 32-bit frontend that can render only graphics that are not too complex (or will crash)... One may circumvent the frontend by directly generating code for a vector graphics, though.
    – Henrik Schumacher
    Dec 3 at 8:29










  • One additional potential trick would be to use IGSmoothen from IGraph/M, which replaced chains of edges as in 1->2->3->4->5 ... with a single weighted edge 1->5. This would reduce the size of the graph by 5-6 fold, and the weights can be used to maintain the layout. However, I noticed that IGSmoothen is currently not fast enough. I will look into optimizing it. It should work much faster on graphs of this size.
    – Szabolcs
    Dec 3 at 9:18










  • Unbelievably, the IGSmoothen is slow for one single reason: Delete is slow. I will fix this for the next release (due tonight).
    – Szabolcs
    Dec 3 at 9:33








1




1




Isn't this graph a tree? That is what the Collatz conjecture states, right? I mean, this is a structure that one can try to exploit for rendering. Towards rendering: I am afraid that the ultimate constraint will be the poor 32-bit frontend that can render only graphics that are not too complex (or will crash)... One may circumvent the frontend by directly generating code for a vector graphics, though.
– Henrik Schumacher
Dec 3 at 8:29




Isn't this graph a tree? That is what the Collatz conjecture states, right? I mean, this is a structure that one can try to exploit for rendering. Towards rendering: I am afraid that the ultimate constraint will be the poor 32-bit frontend that can render only graphics that are not too complex (or will crash)... One may circumvent the frontend by directly generating code for a vector graphics, though.
– Henrik Schumacher
Dec 3 at 8:29












One additional potential trick would be to use IGSmoothen from IGraph/M, which replaced chains of edges as in 1->2->3->4->5 ... with a single weighted edge 1->5. This would reduce the size of the graph by 5-6 fold, and the weights can be used to maintain the layout. However, I noticed that IGSmoothen is currently not fast enough. I will look into optimizing it. It should work much faster on graphs of this size.
– Szabolcs
Dec 3 at 9:18




One additional potential trick would be to use IGSmoothen from IGraph/M, which replaced chains of edges as in 1->2->3->4->5 ... with a single weighted edge 1->5. This would reduce the size of the graph by 5-6 fold, and the weights can be used to maintain the layout. However, I noticed that IGSmoothen is currently not fast enough. I will look into optimizing it. It should work much faster on graphs of this size.
– Szabolcs
Dec 3 at 9:18












Unbelievably, the IGSmoothen is slow for one single reason: Delete is slow. I will fix this for the next release (due tonight).
– Szabolcs
Dec 3 at 9:33




Unbelievably, the IGSmoothen is slow for one single reason: Delete is slow. I will fix this for the next release (due tonight).
– Szabolcs
Dec 3 at 9:33










1 Answer
1






active

oldest

votes

















up vote
13
down vote













I suggest not to use Graph to render, as it is slow. Instead, compute the vertex coordinates, then render manually.



The rough steps are as follows:



Create an "index graph", i.e. a graph where the vertex names coincide with the vertex indices:



n = 10000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];


Compute the vertex coordinates:



coord = GraphEmbedding[g, "SpringElectricalEmbedding"];


This is going to be the most expensive step (computationally), but it will likely not take too much memory. Here, you will need to experiment to find a layout that produces a good enough result and finishes in time. "SpringElectricalLayout" will be slow, but it has many sub-options which can be used to find a good compromise between computation time and layout quality. I would start experimenting with Tolerance and MaxIteration.



Finally, render the graphics:



Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


Be sure to use GraphicsComplex and merge all lines and all points into a single Line and Point expression. Finally, we rasterize the result so the front end does not need to keep the vector graphics representation in memory.





Example:



n = 400000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];

VertexCount[g]
(* 869322 *)

coord = GraphEmbedding[
g, {"SpringElectricalEmbedding", "Tolerance" -> 1,
"MaxIteration" -> 5}]; // AbsoluteTiming
(* {48.8368, Null} *)

Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


enter image description here



Image@Graphics[{GraphicsComplex[
coord, {Line[List @@@ EdgeList[g]], Red, AbsolutePointSize[1],
Point[coord]}]}, ImageSize -> 1000]


enter image description here



The 32-bit macOS front end with absolutely not run out of memory while rendering graphics of this size, if specified efficiently, as here.



The bottleneck is the computation of coordinates, where you need to find a good compromise between computation time and output quality. This example took just under a minute on my machine. The time would likely stay under 5 minutes if MaxIteration is increased 5-fold (along with a reduction of Tolerance, which also put a limit on how many steps are computed).





Note:



I notice you used GraphLayout -> {"PackingLayout" -> "ClosestPacking"}. This makes no sense for a connected graph like this. The PackingLayout controls how connected components are assembled after each one has been laid out. This is a connected graph. It has only one component.






share|improve this answer























  • Thanks so very much (+1, at the very least). Let me play around with your solution to see just how large a final graph I can create... whether I can reach the needed $10^8$ vertices. I'm really not worried about time... just size. I'd be happy indeed if this approach can render $10^8$ vertices, even if it took several hours on my hardware. (Alas, I'm off to China in a few hours and won't get back to this for a few days...)
    – David G. Stork
    Dec 3 at 9:50










  • @DavidG.Stork I just fixed IGSmoothen and will try to expand on the answer later. It will be useful if you are fine with not explicitly rendering vertices (only edges). For such a large graph it often makes sense to omit vertices.
    – Szabolcs
    Dec 3 at 9:53










  • Your code didn't complete for $n=10^6$ after five hours. Let me know if IGSmoothen works. (I may be out of contact for a few days... but am very interested in a working solution to this problem.
    – David G. Stork
    Dec 3 at 14:31










  • @DavidG.Stork You need to adjust the MaxIteration to get a good tradeoff between timing and quality. I suggest you also experiment with IGLayoutFruchtermanReingold from IGraph/M. It has an option to continue the layout from the existing coordinates. So you can run it a bit, take a look, and run more if needed. I did not have time to do that, but here's one with IGSmoothen, today's IGraph/M release needed! dropbox.com/s/lwclsetrm89lkib/collatz-graph.nb?dl=0
    – Szabolcs
    Dec 3 at 14:45











Your Answer





StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "387"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f187195%2fgenerating-and-rendering-extremely-large-graphs%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
13
down vote













I suggest not to use Graph to render, as it is slow. Instead, compute the vertex coordinates, then render manually.



The rough steps are as follows:



Create an "index graph", i.e. a graph where the vertex names coincide with the vertex indices:



n = 10000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];


Compute the vertex coordinates:



coord = GraphEmbedding[g, "SpringElectricalEmbedding"];


This is going to be the most expensive step (computationally), but it will likely not take too much memory. Here, you will need to experiment to find a layout that produces a good enough result and finishes in time. "SpringElectricalLayout" will be slow, but it has many sub-options which can be used to find a good compromise between computation time and layout quality. I would start experimenting with Tolerance and MaxIteration.



Finally, render the graphics:



Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


Be sure to use GraphicsComplex and merge all lines and all points into a single Line and Point expression. Finally, we rasterize the result so the front end does not need to keep the vector graphics representation in memory.





Example:



n = 400000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];

VertexCount[g]
(* 869322 *)

coord = GraphEmbedding[
g, {"SpringElectricalEmbedding", "Tolerance" -> 1,
"MaxIteration" -> 5}]; // AbsoluteTiming
(* {48.8368, Null} *)

Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


enter image description here



Image@Graphics[{GraphicsComplex[
coord, {Line[List @@@ EdgeList[g]], Red, AbsolutePointSize[1],
Point[coord]}]}, ImageSize -> 1000]


enter image description here



The 32-bit macOS front end with absolutely not run out of memory while rendering graphics of this size, if specified efficiently, as here.



The bottleneck is the computation of coordinates, where you need to find a good compromise between computation time and output quality. This example took just under a minute on my machine. The time would likely stay under 5 minutes if MaxIteration is increased 5-fold (along with a reduction of Tolerance, which also put a limit on how many steps are computed).





Note:



I notice you used GraphLayout -> {"PackingLayout" -> "ClosestPacking"}. This makes no sense for a connected graph like this. The PackingLayout controls how connected components are assembled after each one has been laid out. This is a connected graph. It has only one component.






share|improve this answer























  • Thanks so very much (+1, at the very least). Let me play around with your solution to see just how large a final graph I can create... whether I can reach the needed $10^8$ vertices. I'm really not worried about time... just size. I'd be happy indeed if this approach can render $10^8$ vertices, even if it took several hours on my hardware. (Alas, I'm off to China in a few hours and won't get back to this for a few days...)
    – David G. Stork
    Dec 3 at 9:50










  • @DavidG.Stork I just fixed IGSmoothen and will try to expand on the answer later. It will be useful if you are fine with not explicitly rendering vertices (only edges). For such a large graph it often makes sense to omit vertices.
    – Szabolcs
    Dec 3 at 9:53










  • Your code didn't complete for $n=10^6$ after five hours. Let me know if IGSmoothen works. (I may be out of contact for a few days... but am very interested in a working solution to this problem.
    – David G. Stork
    Dec 3 at 14:31










  • @DavidG.Stork You need to adjust the MaxIteration to get a good tradeoff between timing and quality. I suggest you also experiment with IGLayoutFruchtermanReingold from IGraph/M. It has an option to continue the layout from the existing coordinates. So you can run it a bit, take a look, and run more if needed. I did not have time to do that, but here's one with IGSmoothen, today's IGraph/M release needed! dropbox.com/s/lwclsetrm89lkib/collatz-graph.nb?dl=0
    – Szabolcs
    Dec 3 at 14:45















up vote
13
down vote













I suggest not to use Graph to render, as it is slow. Instead, compute the vertex coordinates, then render manually.



The rough steps are as follows:



Create an "index graph", i.e. a graph where the vertex names coincide with the vertex indices:



n = 10000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];


Compute the vertex coordinates:



coord = GraphEmbedding[g, "SpringElectricalEmbedding"];


This is going to be the most expensive step (computationally), but it will likely not take too much memory. Here, you will need to experiment to find a layout that produces a good enough result and finishes in time. "SpringElectricalLayout" will be slow, but it has many sub-options which can be used to find a good compromise between computation time and layout quality. I would start experimenting with Tolerance and MaxIteration.



Finally, render the graphics:



Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


Be sure to use GraphicsComplex and merge all lines and all points into a single Line and Point expression. Finally, we rasterize the result so the front end does not need to keep the vector graphics representation in memory.





Example:



n = 400000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];

VertexCount[g]
(* 869322 *)

coord = GraphEmbedding[
g, {"SpringElectricalEmbedding", "Tolerance" -> 1,
"MaxIteration" -> 5}]; // AbsoluteTiming
(* {48.8368, Null} *)

Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


enter image description here



Image@Graphics[{GraphicsComplex[
coord, {Line[List @@@ EdgeList[g]], Red, AbsolutePointSize[1],
Point[coord]}]}, ImageSize -> 1000]


enter image description here



The 32-bit macOS front end with absolutely not run out of memory while rendering graphics of this size, if specified efficiently, as here.



The bottleneck is the computation of coordinates, where you need to find a good compromise between computation time and output quality. This example took just under a minute on my machine. The time would likely stay under 5 minutes if MaxIteration is increased 5-fold (along with a reduction of Tolerance, which also put a limit on how many steps are computed).





Note:



I notice you used GraphLayout -> {"PackingLayout" -> "ClosestPacking"}. This makes no sense for a connected graph like this. The PackingLayout controls how connected components are assembled after each one has been laid out. This is a connected graph. It has only one component.






share|improve this answer























  • Thanks so very much (+1, at the very least). Let me play around with your solution to see just how large a final graph I can create... whether I can reach the needed $10^8$ vertices. I'm really not worried about time... just size. I'd be happy indeed if this approach can render $10^8$ vertices, even if it took several hours on my hardware. (Alas, I'm off to China in a few hours and won't get back to this for a few days...)
    – David G. Stork
    Dec 3 at 9:50










  • @DavidG.Stork I just fixed IGSmoothen and will try to expand on the answer later. It will be useful if you are fine with not explicitly rendering vertices (only edges). For such a large graph it often makes sense to omit vertices.
    – Szabolcs
    Dec 3 at 9:53










  • Your code didn't complete for $n=10^6$ after five hours. Let me know if IGSmoothen works. (I may be out of contact for a few days... but am very interested in a working solution to this problem.
    – David G. Stork
    Dec 3 at 14:31










  • @DavidG.Stork You need to adjust the MaxIteration to get a good tradeoff between timing and quality. I suggest you also experiment with IGLayoutFruchtermanReingold from IGraph/M. It has an option to continue the layout from the existing coordinates. So you can run it a bit, take a look, and run more if needed. I did not have time to do that, but here's one with IGSmoothen, today's IGraph/M release needed! dropbox.com/s/lwclsetrm89lkib/collatz-graph.nb?dl=0
    – Szabolcs
    Dec 3 at 14:45













up vote
13
down vote










up vote
13
down vote









I suggest not to use Graph to render, as it is slow. Instead, compute the vertex coordinates, then render manually.



The rough steps are as follows:



Create an "index graph", i.e. a graph where the vertex names coincide with the vertex indices:



n = 10000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];


Compute the vertex coordinates:



coord = GraphEmbedding[g, "SpringElectricalEmbedding"];


This is going to be the most expensive step (computationally), but it will likely not take too much memory. Here, you will need to experiment to find a layout that produces a good enough result and finishes in time. "SpringElectricalLayout" will be slow, but it has many sub-options which can be used to find a good compromise between computation time and layout quality. I would start experimenting with Tolerance and MaxIteration.



Finally, render the graphics:



Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


Be sure to use GraphicsComplex and merge all lines and all points into a single Line and Point expression. Finally, we rasterize the result so the front end does not need to keep the vector graphics representation in memory.





Example:



n = 400000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];

VertexCount[g]
(* 869322 *)

coord = GraphEmbedding[
g, {"SpringElectricalEmbedding", "Tolerance" -> 1,
"MaxIteration" -> 5}]; // AbsoluteTiming
(* {48.8368, Null} *)

Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


enter image description here



Image@Graphics[{GraphicsComplex[
coord, {Line[List @@@ EdgeList[g]], Red, AbsolutePointSize[1],
Point[coord]}]}, ImageSize -> 1000]


enter image description here



The 32-bit macOS front end with absolutely not run out of memory while rendering graphics of this size, if specified efficiently, as here.



The bottleneck is the computation of coordinates, where you need to find a good compromise between computation time and output quality. This example took just under a minute on my machine. The time would likely stay under 5 minutes if MaxIteration is increased 5-fold (along with a reduction of Tolerance, which also put a limit on how many steps are computed).





Note:



I notice you used GraphLayout -> {"PackingLayout" -> "ClosestPacking"}. This makes no sense for a connected graph like this. The PackingLayout controls how connected components are assembled after each one has been laid out. This is a connected graph. It has only one component.






share|improve this answer














I suggest not to use Graph to render, as it is slow. Instead, compute the vertex coordinates, then render manually.



The rough steps are as follows:



Create an "index graph", i.e. a graph where the vertex names coincide with the vertex indices:



n = 10000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];


Compute the vertex coordinates:



coord = GraphEmbedding[g, "SpringElectricalEmbedding"];


This is going to be the most expensive step (computationally), but it will likely not take too much memory. Here, you will need to experiment to find a layout that produces a good enough result and finishes in time. "SpringElectricalLayout" will be slow, but it has many sub-options which can be used to find a good compromise between computation time and layout quality. I would start experimenting with Tolerance and MaxIteration.



Finally, render the graphics:



Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


Be sure to use GraphicsComplex and merge all lines and all points into a single Line and Point expression. Finally, we rasterize the result so the front end does not need to keep the vector graphics representation in memory.





Example:



n = 400000;
g = IndexGraph@Graph[
Flatten[(Rule @@@ Partition[#, 2, 1]) & /@
CollatzSequence@Range[n]]
];

VertexCount[g]
(* 869322 *)

coord = GraphEmbedding[
g, {"SpringElectricalEmbedding", "Tolerance" -> 1,
"MaxIteration" -> 5}]; // AbsoluteTiming
(* {48.8368, Null} *)

Image@Graphics@GraphicsComplex[coord, Line[List @@@ EdgeList[g]]]


enter image description here



Image@Graphics[{GraphicsComplex[
coord, {Line[List @@@ EdgeList[g]], Red, AbsolutePointSize[1],
Point[coord]}]}, ImageSize -> 1000]


enter image description here



The 32-bit macOS front end with absolutely not run out of memory while rendering graphics of this size, if specified efficiently, as here.



The bottleneck is the computation of coordinates, where you need to find a good compromise between computation time and output quality. This example took just under a minute on my machine. The time would likely stay under 5 minutes if MaxIteration is increased 5-fold (along with a reduction of Tolerance, which also put a limit on how many steps are computed).





Note:



I notice you used GraphLayout -> {"PackingLayout" -> "ClosestPacking"}. This makes no sense for a connected graph like this. The PackingLayout controls how connected components are assembled after each one has been laid out. This is a connected graph. It has only one component.







share|improve this answer














share|improve this answer



share|improve this answer








edited Dec 3 at 9:27









b3m2a1

26.1k256152




26.1k256152










answered Dec 3 at 9:04









Szabolcs

158k13432926




158k13432926












  • Thanks so very much (+1, at the very least). Let me play around with your solution to see just how large a final graph I can create... whether I can reach the needed $10^8$ vertices. I'm really not worried about time... just size. I'd be happy indeed if this approach can render $10^8$ vertices, even if it took several hours on my hardware. (Alas, I'm off to China in a few hours and won't get back to this for a few days...)
    – David G. Stork
    Dec 3 at 9:50










  • @DavidG.Stork I just fixed IGSmoothen and will try to expand on the answer later. It will be useful if you are fine with not explicitly rendering vertices (only edges). For such a large graph it often makes sense to omit vertices.
    – Szabolcs
    Dec 3 at 9:53










  • Your code didn't complete for $n=10^6$ after five hours. Let me know if IGSmoothen works. (I may be out of contact for a few days... but am very interested in a working solution to this problem.
    – David G. Stork
    Dec 3 at 14:31










  • @DavidG.Stork You need to adjust the MaxIteration to get a good tradeoff between timing and quality. I suggest you also experiment with IGLayoutFruchtermanReingold from IGraph/M. It has an option to continue the layout from the existing coordinates. So you can run it a bit, take a look, and run more if needed. I did not have time to do that, but here's one with IGSmoothen, today's IGraph/M release needed! dropbox.com/s/lwclsetrm89lkib/collatz-graph.nb?dl=0
    – Szabolcs
    Dec 3 at 14:45


















  • Thanks so very much (+1, at the very least). Let me play around with your solution to see just how large a final graph I can create... whether I can reach the needed $10^8$ vertices. I'm really not worried about time... just size. I'd be happy indeed if this approach can render $10^8$ vertices, even if it took several hours on my hardware. (Alas, I'm off to China in a few hours and won't get back to this for a few days...)
    – David G. Stork
    Dec 3 at 9:50










  • @DavidG.Stork I just fixed IGSmoothen and will try to expand on the answer later. It will be useful if you are fine with not explicitly rendering vertices (only edges). For such a large graph it often makes sense to omit vertices.
    – Szabolcs
    Dec 3 at 9:53










  • Your code didn't complete for $n=10^6$ after five hours. Let me know if IGSmoothen works. (I may be out of contact for a few days... but am very interested in a working solution to this problem.
    – David G. Stork
    Dec 3 at 14:31










  • @DavidG.Stork You need to adjust the MaxIteration to get a good tradeoff between timing and quality. I suggest you also experiment with IGLayoutFruchtermanReingold from IGraph/M. It has an option to continue the layout from the existing coordinates. So you can run it a bit, take a look, and run more if needed. I did not have time to do that, but here's one with IGSmoothen, today's IGraph/M release needed! dropbox.com/s/lwclsetrm89lkib/collatz-graph.nb?dl=0
    – Szabolcs
    Dec 3 at 14:45
















Thanks so very much (+1, at the very least). Let me play around with your solution to see just how large a final graph I can create... whether I can reach the needed $10^8$ vertices. I'm really not worried about time... just size. I'd be happy indeed if this approach can render $10^8$ vertices, even if it took several hours on my hardware. (Alas, I'm off to China in a few hours and won't get back to this for a few days...)
– David G. Stork
Dec 3 at 9:50




Thanks so very much (+1, at the very least). Let me play around with your solution to see just how large a final graph I can create... whether I can reach the needed $10^8$ vertices. I'm really not worried about time... just size. I'd be happy indeed if this approach can render $10^8$ vertices, even if it took several hours on my hardware. (Alas, I'm off to China in a few hours and won't get back to this for a few days...)
– David G. Stork
Dec 3 at 9:50












@DavidG.Stork I just fixed IGSmoothen and will try to expand on the answer later. It will be useful if you are fine with not explicitly rendering vertices (only edges). For such a large graph it often makes sense to omit vertices.
– Szabolcs
Dec 3 at 9:53




@DavidG.Stork I just fixed IGSmoothen and will try to expand on the answer later. It will be useful if you are fine with not explicitly rendering vertices (only edges). For such a large graph it often makes sense to omit vertices.
– Szabolcs
Dec 3 at 9:53












Your code didn't complete for $n=10^6$ after five hours. Let me know if IGSmoothen works. (I may be out of contact for a few days... but am very interested in a working solution to this problem.
– David G. Stork
Dec 3 at 14:31




Your code didn't complete for $n=10^6$ after five hours. Let me know if IGSmoothen works. (I may be out of contact for a few days... but am very interested in a working solution to this problem.
– David G. Stork
Dec 3 at 14:31












@DavidG.Stork You need to adjust the MaxIteration to get a good tradeoff between timing and quality. I suggest you also experiment with IGLayoutFruchtermanReingold from IGraph/M. It has an option to continue the layout from the existing coordinates. So you can run it a bit, take a look, and run more if needed. I did not have time to do that, but here's one with IGSmoothen, today's IGraph/M release needed! dropbox.com/s/lwclsetrm89lkib/collatz-graph.nb?dl=0
– Szabolcs
Dec 3 at 14:45




@DavidG.Stork You need to adjust the MaxIteration to get a good tradeoff between timing and quality. I suggest you also experiment with IGLayoutFruchtermanReingold from IGraph/M. It has an option to continue the layout from the existing coordinates. So you can run it a bit, take a look, and run more if needed. I did not have time to do that, but here's one with IGSmoothen, today's IGraph/M release needed! dropbox.com/s/lwclsetrm89lkib/collatz-graph.nb?dl=0
– Szabolcs
Dec 3 at 14:45


















draft saved

draft discarded




















































Thanks for contributing an answer to Mathematica Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


Use MathJax to format equations. MathJax reference.


To learn more, see our tips on writing great answers.





Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


Please pay close attention to the following guidance:


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f187195%2fgenerating-and-rendering-extremely-large-graphs%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

"Incorrect syntax near the keyword 'ON'. (on update cascade, on delete cascade,)

Alcedinidae

RAC Tourist Trophy