Testing a Collatz Conjecture Conjecture (Python)
$begingroup$
I had an intuition that the maximal trajectory lengths for collatz conjecture numbers would be produced by numbers of the form (2**n)-1
so I wrote a program to test it (I was wrong). I thought I would put it up for review to catch any coding sins I committed, and see how others would improve my code.
Orginal Code
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def conjecture(text):
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return pprint.pformat(result)
print(conjecture(text))
"CollatzConjecture.txt" is a text file containing number trajectoryLength
on separate lines. I wanted to find the number corresponding to the maximum trajectory length between successive powers of 2 in order to test my hypothesis.
python beginner collatz-sequence
$endgroup$
|
show 5 more comments
$begingroup$
I had an intuition that the maximal trajectory lengths for collatz conjecture numbers would be produced by numbers of the form (2**n)-1
so I wrote a program to test it (I was wrong). I thought I would put it up for review to catch any coding sins I committed, and see how others would improve my code.
Orginal Code
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def conjecture(text):
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return pprint.pformat(result)
print(conjecture(text))
"CollatzConjecture.txt" is a text file containing number trajectoryLength
on separate lines. I wanted to find the number corresponding to the maximum trajectory length between successive powers of 2 in order to test my hypothesis.
python beginner collatz-sequence
$endgroup$
$begingroup$
Can you add an example input textfile?
$endgroup$
– Ludisposed
yesterday
1
$begingroup$
Where is thekey
function actually used?
$endgroup$
– Graipher
yesterday
$begingroup$
@Ludisposed I'll link the actual text file I used: oeis.org/A006577/b006577.txt
$endgroup$
– Tobi Alafin
yesterday
1
$begingroup$
Now that you have improved code it might make sense to ask a new question with the updated code, linking back to this one, as stated in the link provided by Mast.
$endgroup$
– Graipher
yesterday
3
$begingroup$
@TobiAlafin: Hm, what a nice edit mess :) I'll let you figure it out, but if I were you, I would just roll it back to your first first revision (again) and then forget about this question. Instead spend the effort on making the follow-up question look good.
$endgroup$
– Graipher
yesterday
|
show 5 more comments
$begingroup$
I had an intuition that the maximal trajectory lengths for collatz conjecture numbers would be produced by numbers of the form (2**n)-1
so I wrote a program to test it (I was wrong). I thought I would put it up for review to catch any coding sins I committed, and see how others would improve my code.
Orginal Code
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def conjecture(text):
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return pprint.pformat(result)
print(conjecture(text))
"CollatzConjecture.txt" is a text file containing number trajectoryLength
on separate lines. I wanted to find the number corresponding to the maximum trajectory length between successive powers of 2 in order to test my hypothesis.
python beginner collatz-sequence
$endgroup$
I had an intuition that the maximal trajectory lengths for collatz conjecture numbers would be produced by numbers of the form (2**n)-1
so I wrote a program to test it (I was wrong). I thought I would put it up for review to catch any coding sins I committed, and see how others would improve my code.
Orginal Code
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def conjecture(text):
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return pprint.pformat(result)
print(conjecture(text))
"CollatzConjecture.txt" is a text file containing number trajectoryLength
on separate lines. I wanted to find the number corresponding to the maximum trajectory length between successive powers of 2 in order to test my hypothesis.
python beginner collatz-sequence
python beginner collatz-sequence
edited yesterday
Tobi Alafin
asked yesterday
Tobi AlafinTobi Alafin
23017
23017
$begingroup$
Can you add an example input textfile?
$endgroup$
– Ludisposed
yesterday
1
$begingroup$
Where is thekey
function actually used?
$endgroup$
– Graipher
yesterday
$begingroup$
@Ludisposed I'll link the actual text file I used: oeis.org/A006577/b006577.txt
$endgroup$
– Tobi Alafin
yesterday
1
$begingroup$
Now that you have improved code it might make sense to ask a new question with the updated code, linking back to this one, as stated in the link provided by Mast.
$endgroup$
– Graipher
yesterday
3
$begingroup$
@TobiAlafin: Hm, what a nice edit mess :) I'll let you figure it out, but if I were you, I would just roll it back to your first first revision (again) and then forget about this question. Instead spend the effort on making the follow-up question look good.
$endgroup$
– Graipher
yesterday
|
show 5 more comments
$begingroup$
Can you add an example input textfile?
$endgroup$
– Ludisposed
yesterday
1
$begingroup$
Where is thekey
function actually used?
$endgroup$
– Graipher
yesterday
$begingroup$
@Ludisposed I'll link the actual text file I used: oeis.org/A006577/b006577.txt
$endgroup$
– Tobi Alafin
yesterday
1
$begingroup$
Now that you have improved code it might make sense to ask a new question with the updated code, linking back to this one, as stated in the link provided by Mast.
$endgroup$
– Graipher
yesterday
3
$begingroup$
@TobiAlafin: Hm, what a nice edit mess :) I'll let you figure it out, but if I were you, I would just roll it back to your first first revision (again) and then forget about this question. Instead spend the effort on making the follow-up question look good.
$endgroup$
– Graipher
yesterday
$begingroup$
Can you add an example input textfile?
$endgroup$
– Ludisposed
yesterday
$begingroup$
Can you add an example input textfile?
$endgroup$
– Ludisposed
yesterday
1
1
$begingroup$
Where is the
key
function actually used?$endgroup$
– Graipher
yesterday
$begingroup$
Where is the
key
function actually used?$endgroup$
– Graipher
yesterday
$begingroup$
@Ludisposed I'll link the actual text file I used: oeis.org/A006577/b006577.txt
$endgroup$
– Tobi Alafin
yesterday
$begingroup$
@Ludisposed I'll link the actual text file I used: oeis.org/A006577/b006577.txt
$endgroup$
– Tobi Alafin
yesterday
1
1
$begingroup$
Now that you have improved code it might make sense to ask a new question with the updated code, linking back to this one, as stated in the link provided by Mast.
$endgroup$
– Graipher
yesterday
$begingroup$
Now that you have improved code it might make sense to ask a new question with the updated code, linking back to this one, as stated in the link provided by Mast.
$endgroup$
– Graipher
yesterday
3
3
$begingroup$
@TobiAlafin: Hm, what a nice edit mess :) I'll let you figure it out, but if I were you, I would just roll it back to your first first revision (again) and then forget about this question. Instead spend the effort on making the follow-up question look good.
$endgroup$
– Graipher
yesterday
$begingroup$
@TobiAlafin: Hm, what a nice edit mess :) I'll let you figure it out, but if I were you, I would just roll it back to your first first revision (again) and then forget about this question. Instead spend the effort on making the follow-up question look good.
$endgroup$
– Graipher
yesterday
|
show 5 more comments
4 Answers
4
active
oldest
votes
$begingroup$
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
This is really inefficient and not how dictionaries should be used.
- The point of a
dict
is O(1) lookup, this is essentially O(n) and it doesn't make sense - You should remove your blockcomments, and create a nice docstring. That will make the purpose of the code clearer
- You don't close the file, a nice addition is using the
with
statement to automatically open/close files PEP343
$endgroup$
$begingroup$
key(value, dct)
gets the key corresponding to a value.dict.get()
gets the value corresponding to a key. I'll edit my code to make it clearer.
$endgroup$
– Tobi Alafin
yesterday
4
$begingroup$
@TobiAlafin: If you need to map both ways (A -> B and B -> A), use two dictionaries to get O(1) lookups in each direction. A linear search O(n) in one direction vs. an O(1) in the other, doesn't make sense, unless that direction is almost never used, and you can't afford to double the storage cost by keeping a reverse dictionary. Or there are probably custom data structures that can index on both sets of data without duplicating the storage... like what a relational database would use for a table with 2 columns, and an index on each column. For short keys two dicts is totally fine.
$endgroup$
– Peter Cordes
yesterday
$begingroup$
Peter Cordes thank you. I had not thought of that, I'll apply that in the future.
$endgroup$
– Tobi Alafin
yesterday
add a comment |
$begingroup$
First of all, thumbs up on including a documentation block at the beginning.
When trying to run the segment, I get
Traceback (most recent call last):
File "conjecture.py", line 28, in <module>
print(conjecture(text))
File "conjecture.py", line 19, in conjecture
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
File "conjecture.py", line 19, in <dictcomp>
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
IndexError: list index out of range
Turns out that when I saved the output of the link, there were empty trailing lines. After fixing that, I could get it to run.
Furthermore, I see several issues with this implementation:
- The
conjecture
function is doing a lot of the work, with several lines that are quite filled with operations, meaning it's hard to find out what it's doing. - A dictionary can contain a value multiple times, and in fact does. For instance, both 5 and 32 have a length of 5 (5 -> 16 -> ...) and (32 -> 16 -> ...). For longer lengths, there will be more ways in which it can be obtained.
- Floating point arithmetic in a fully integral problem seems like an issue.
- The file is opened, and then not immediately closed. For this it's not really an issue, but in general one should use a contextmanager to clean up used resources as quickly as possible.
- The semantics of the output isn't really clear to me, but I'll probably figure out a bit during picking it apart.
- It doesn't use a
__main__
block, which would be beneficial for re-usable code. - The
conjecture
function returns a string, instead of a dictionary.
So let's start with your implementation:
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def conjecture(text):
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return pprint.pformat(result)
print(conjecture(text))
Use a if __name__ == '__main__'
segment.
One of the best places to start with such a script (especially if you want to make it reusable!) is to place the "main" code (file-mangling) inside a main
block:
... function definitions ...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
print(conjecture(text))
Move parsing the trajectories out of conjecture
.
Your conjecture has nothing to do with the piece of text, but about the length of the trajectories. We shouldn't care how these lengths are generated, calculated or parsed inside the conjecture
function. But the current implementation does.
By moving out the text
to traj
transformation, we make conjecture
more general.
...
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
print(conjecture(traj))
And we can then split the parsing into a separate function, where we use a context manager.
...
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
...
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
print(conjecture(traj))
And similarly, move the pformat out of conjecture:
def conjecture(text):
result = {}
...
return result
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
pprint.pprint(conjecture(traj))
Simplify conjecture
.
Now that we're left with a conjecture
method that does only things related to the conjecture, we can look at it a bit more:
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Starting with this expression: int((math.log(st,2)-1)**2)
, what does it even do?! First it takes the 2-logarithm, substracts 1, and then squares it. At first I suspected that you wanted the previous power of two (and tested it). But, what actually happens is that 64 (26) becomes 25 (52), and also 2048 (211) becomes 100 (102). My basic guess is that this is incorrect, and you mean 64 => 32, 128 => 64, and so forth. Assuming that's what you want, you can just divide by two.
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[st // 2:st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Surprisingly, though, this does not alter the output.
Next up is the fact that I see slices being taken of dictionaries (after casting to a list). In older Python versions (before 3.6) that might cause surprising results, as dictionaries are not necessarily ordered by value, though in this case it does turn out to be that way it seems. It's better to explicitly filter out the values that you want.
Final:
In the end, I found myself settling on this.
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import pprint
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
def conjecture(traj):
result = {}
traj_size = len(traj)
lower, higher = 1, 2
while True:
if higher > traj_size:
# We don't have all the data for this range available.
break
max_in_slice = max(length for num, length in traj.items() if lower <= num <= higher)
result[higher] = key(max_in_slice, traj)
# Prepare data for next iteration.
lower, higher = lower * 2, higher * 2
return result
if __name__ == '__main__':
traj = parse_lengths('CollatzConjecture.txt')
pprint.pprint(conjecture(traj))
I'm also not quite happy with this yet, as the "key" function is still bothering me as well (we could iterate over .items()
instead of .keys()
to simplify).
I hope this at least helps a bit.
$endgroup$
add a comment |
$begingroup$
It's desirable to try to break one's code into small, more reusable segments. For example, it would be preferable (in my opinion) to create a small function to open the text file, extract its contents, and return the dict
of trajectories you desire:
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
Note here that I used the context manager with open(file_name, 'r') as f:
, since in Python we need to first open a file, do things with it, and then close it. The context manager handles the opening and closing of the file for us, and within it we can access the file.
From my searching there doesn't seem to be a universally accepted way to find the "inverse" of a value in a dictionary, but I like yours for the most part. As one change, I would iterate over the key:value
pairs to simplify things a bit:
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
With this, it really only remains to clean up the conjecture
function. Here, we can pass the traj
dictionary as an argument, as well as the (un-exponentiated) bound.
In the parlance of your code, note that st
is always of the form 2**x
, so when we compute math.log(st, 2)
, it always evaluates to x
, so one line of your code reads (equivalently)
slce = list(traj2.items())[(x-1)**2:st]
which isn't the 'slicing into power of 2' that you desire. Moreover, it's not (necessarily) guaranteed that traj2.items()
will be turned into a list in the way you desire, so it'll be better to be more explicit:
slce = [key, val for key, val in traj.items() if key in range(2**(x-1), 2**x)]
Together, along with splitting our imports onto different lines, adding a if __name__ == '__main__':
guard, and some other minor reorganization yields the following code:
import math
import pprint
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
def conjecture(traj):
"""Checks the conjecture that the maximum 'Collatz length' among the integers in [2**n, 2**(n+1)) is 2**n - 1.
The conjecture is false.
"""
bound = math.floor(math.log(len(traj),2))
exp_bound = 2**bound
traj = {key:val for key, val in traj.items() if key < exp_bound} # Only include numbers of interest.
result = {}
for n in range(1,bound+1):
exp_n = 2**n
slce = [key, val for key, val in traj.items() if key in range(2**(n-1), exp_n)]
result[exp_n] = find_key(max([val for key, val in slce]), traj)
return pprint(pformat(result))
if __name__ == "__main__":
file_name = "CollatzConjecture.txt"
print(conjecture(file_name))
$endgroup$
$begingroup$
"I would iterate over thekey:value
pairs to simplify things a bit": it's not just a question of simplifying. It's also more efficient, because advancing the iterator doesn't involve any search so is as cheap as possible, whereas the lookup potentially has to resolve hash collisions.
$endgroup$
– Peter Taylor
21 hours ago
add a comment |
$begingroup$
The existing answers cover most of the issues that I spotted. There's just one thing that I think it's useful to add:
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
This converts the dict
into a list every time through the loop. The suggestions, which don't rely on undocumented behaviour, instead filter the dict
each time through the loop. But you can instead invert the loop:
best = [0] * (bound + 1)
for key, value in traj2.items():
x = int(math.log(key, 2)) + 1
if value > best[x] or (value == best[x] and key < result[2**x]):
best[x] = value
result[2**x] = key
This gives straightforward linear running time and allows you explicit control over tie-breaking. (I'm not sure whether I've got the tie-breaking correct for your conjecture, but you can easily fix that).
$endgroup$
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f212377%2ftesting-a-collatz-conjecture-conjecture-python%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
This is really inefficient and not how dictionaries should be used.
- The point of a
dict
is O(1) lookup, this is essentially O(n) and it doesn't make sense - You should remove your blockcomments, and create a nice docstring. That will make the purpose of the code clearer
- You don't close the file, a nice addition is using the
with
statement to automatically open/close files PEP343
$endgroup$
$begingroup$
key(value, dct)
gets the key corresponding to a value.dict.get()
gets the value corresponding to a key. I'll edit my code to make it clearer.
$endgroup$
– Tobi Alafin
yesterday
4
$begingroup$
@TobiAlafin: If you need to map both ways (A -> B and B -> A), use two dictionaries to get O(1) lookups in each direction. A linear search O(n) in one direction vs. an O(1) in the other, doesn't make sense, unless that direction is almost never used, and you can't afford to double the storage cost by keeping a reverse dictionary. Or there are probably custom data structures that can index on both sets of data without duplicating the storage... like what a relational database would use for a table with 2 columns, and an index on each column. For short keys two dicts is totally fine.
$endgroup$
– Peter Cordes
yesterday
$begingroup$
Peter Cordes thank you. I had not thought of that, I'll apply that in the future.
$endgroup$
– Tobi Alafin
yesterday
add a comment |
$begingroup$
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
This is really inefficient and not how dictionaries should be used.
- The point of a
dict
is O(1) lookup, this is essentially O(n) and it doesn't make sense - You should remove your blockcomments, and create a nice docstring. That will make the purpose of the code clearer
- You don't close the file, a nice addition is using the
with
statement to automatically open/close files PEP343
$endgroup$
$begingroup$
key(value, dct)
gets the key corresponding to a value.dict.get()
gets the value corresponding to a key. I'll edit my code to make it clearer.
$endgroup$
– Tobi Alafin
yesterday
4
$begingroup$
@TobiAlafin: If you need to map both ways (A -> B and B -> A), use two dictionaries to get O(1) lookups in each direction. A linear search O(n) in one direction vs. an O(1) in the other, doesn't make sense, unless that direction is almost never used, and you can't afford to double the storage cost by keeping a reverse dictionary. Or there are probably custom data structures that can index on both sets of data without duplicating the storage... like what a relational database would use for a table with 2 columns, and an index on each column. For short keys two dicts is totally fine.
$endgroup$
– Peter Cordes
yesterday
$begingroup$
Peter Cordes thank you. I had not thought of that, I'll apply that in the future.
$endgroup$
– Tobi Alafin
yesterday
add a comment |
$begingroup$
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
This is really inefficient and not how dictionaries should be used.
- The point of a
dict
is O(1) lookup, this is essentially O(n) and it doesn't make sense - You should remove your blockcomments, and create a nice docstring. That will make the purpose of the code clearer
- You don't close the file, a nice addition is using the
with
statement to automatically open/close files PEP343
$endgroup$
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
This is really inefficient and not how dictionaries should be used.
- The point of a
dict
is O(1) lookup, this is essentially O(n) and it doesn't make sense - You should remove your blockcomments, and create a nice docstring. That will make the purpose of the code clearer
- You don't close the file, a nice addition is using the
with
statement to automatically open/close files PEP343
edited yesterday
answered yesterday
LudisposedLudisposed
7,74221959
7,74221959
$begingroup$
key(value, dct)
gets the key corresponding to a value.dict.get()
gets the value corresponding to a key. I'll edit my code to make it clearer.
$endgroup$
– Tobi Alafin
yesterday
4
$begingroup$
@TobiAlafin: If you need to map both ways (A -> B and B -> A), use two dictionaries to get O(1) lookups in each direction. A linear search O(n) in one direction vs. an O(1) in the other, doesn't make sense, unless that direction is almost never used, and you can't afford to double the storage cost by keeping a reverse dictionary. Or there are probably custom data structures that can index on both sets of data without duplicating the storage... like what a relational database would use for a table with 2 columns, and an index on each column. For short keys two dicts is totally fine.
$endgroup$
– Peter Cordes
yesterday
$begingroup$
Peter Cordes thank you. I had not thought of that, I'll apply that in the future.
$endgroup$
– Tobi Alafin
yesterday
add a comment |
$begingroup$
key(value, dct)
gets the key corresponding to a value.dict.get()
gets the value corresponding to a key. I'll edit my code to make it clearer.
$endgroup$
– Tobi Alafin
yesterday
4
$begingroup$
@TobiAlafin: If you need to map both ways (A -> B and B -> A), use two dictionaries to get O(1) lookups in each direction. A linear search O(n) in one direction vs. an O(1) in the other, doesn't make sense, unless that direction is almost never used, and you can't afford to double the storage cost by keeping a reverse dictionary. Or there are probably custom data structures that can index on both sets of data without duplicating the storage... like what a relational database would use for a table with 2 columns, and an index on each column. For short keys two dicts is totally fine.
$endgroup$
– Peter Cordes
yesterday
$begingroup$
Peter Cordes thank you. I had not thought of that, I'll apply that in the future.
$endgroup$
– Tobi Alafin
yesterday
$begingroup$
key(value, dct)
gets the key corresponding to a value. dict.get()
gets the value corresponding to a key. I'll edit my code to make it clearer.$endgroup$
– Tobi Alafin
yesterday
$begingroup$
key(value, dct)
gets the key corresponding to a value. dict.get()
gets the value corresponding to a key. I'll edit my code to make it clearer.$endgroup$
– Tobi Alafin
yesterday
4
4
$begingroup$
@TobiAlafin: If you need to map both ways (A -> B and B -> A), use two dictionaries to get O(1) lookups in each direction. A linear search O(n) in one direction vs. an O(1) in the other, doesn't make sense, unless that direction is almost never used, and you can't afford to double the storage cost by keeping a reverse dictionary. Or there are probably custom data structures that can index on both sets of data without duplicating the storage... like what a relational database would use for a table with 2 columns, and an index on each column. For short keys two dicts is totally fine.
$endgroup$
– Peter Cordes
yesterday
$begingroup$
@TobiAlafin: If you need to map both ways (A -> B and B -> A), use two dictionaries to get O(1) lookups in each direction. A linear search O(n) in one direction vs. an O(1) in the other, doesn't make sense, unless that direction is almost never used, and you can't afford to double the storage cost by keeping a reverse dictionary. Or there are probably custom data structures that can index on both sets of data without duplicating the storage... like what a relational database would use for a table with 2 columns, and an index on each column. For short keys two dicts is totally fine.
$endgroup$
– Peter Cordes
yesterday
$begingroup$
Peter Cordes thank you. I had not thought of that, I'll apply that in the future.
$endgroup$
– Tobi Alafin
yesterday
$begingroup$
Peter Cordes thank you. I had not thought of that, I'll apply that in the future.
$endgroup$
– Tobi Alafin
yesterday
add a comment |
$begingroup$
First of all, thumbs up on including a documentation block at the beginning.
When trying to run the segment, I get
Traceback (most recent call last):
File "conjecture.py", line 28, in <module>
print(conjecture(text))
File "conjecture.py", line 19, in conjecture
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
File "conjecture.py", line 19, in <dictcomp>
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
IndexError: list index out of range
Turns out that when I saved the output of the link, there were empty trailing lines. After fixing that, I could get it to run.
Furthermore, I see several issues with this implementation:
- The
conjecture
function is doing a lot of the work, with several lines that are quite filled with operations, meaning it's hard to find out what it's doing. - A dictionary can contain a value multiple times, and in fact does. For instance, both 5 and 32 have a length of 5 (5 -> 16 -> ...) and (32 -> 16 -> ...). For longer lengths, there will be more ways in which it can be obtained.
- Floating point arithmetic in a fully integral problem seems like an issue.
- The file is opened, and then not immediately closed. For this it's not really an issue, but in general one should use a contextmanager to clean up used resources as quickly as possible.
- The semantics of the output isn't really clear to me, but I'll probably figure out a bit during picking it apart.
- It doesn't use a
__main__
block, which would be beneficial for re-usable code. - The
conjecture
function returns a string, instead of a dictionary.
So let's start with your implementation:
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def conjecture(text):
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return pprint.pformat(result)
print(conjecture(text))
Use a if __name__ == '__main__'
segment.
One of the best places to start with such a script (especially if you want to make it reusable!) is to place the "main" code (file-mangling) inside a main
block:
... function definitions ...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
print(conjecture(text))
Move parsing the trajectories out of conjecture
.
Your conjecture has nothing to do with the piece of text, but about the length of the trajectories. We shouldn't care how these lengths are generated, calculated or parsed inside the conjecture
function. But the current implementation does.
By moving out the text
to traj
transformation, we make conjecture
more general.
...
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
print(conjecture(traj))
And we can then split the parsing into a separate function, where we use a context manager.
...
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
...
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
print(conjecture(traj))
And similarly, move the pformat out of conjecture:
def conjecture(text):
result = {}
...
return result
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
pprint.pprint(conjecture(traj))
Simplify conjecture
.
Now that we're left with a conjecture
method that does only things related to the conjecture, we can look at it a bit more:
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Starting with this expression: int((math.log(st,2)-1)**2)
, what does it even do?! First it takes the 2-logarithm, substracts 1, and then squares it. At first I suspected that you wanted the previous power of two (and tested it). But, what actually happens is that 64 (26) becomes 25 (52), and also 2048 (211) becomes 100 (102). My basic guess is that this is incorrect, and you mean 64 => 32, 128 => 64, and so forth. Assuming that's what you want, you can just divide by two.
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[st // 2:st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Surprisingly, though, this does not alter the output.
Next up is the fact that I see slices being taken of dictionaries (after casting to a list). In older Python versions (before 3.6) that might cause surprising results, as dictionaries are not necessarily ordered by value, though in this case it does turn out to be that way it seems. It's better to explicitly filter out the values that you want.
Final:
In the end, I found myself settling on this.
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import pprint
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
def conjecture(traj):
result = {}
traj_size = len(traj)
lower, higher = 1, 2
while True:
if higher > traj_size:
# We don't have all the data for this range available.
break
max_in_slice = max(length for num, length in traj.items() if lower <= num <= higher)
result[higher] = key(max_in_slice, traj)
# Prepare data for next iteration.
lower, higher = lower * 2, higher * 2
return result
if __name__ == '__main__':
traj = parse_lengths('CollatzConjecture.txt')
pprint.pprint(conjecture(traj))
I'm also not quite happy with this yet, as the "key" function is still bothering me as well (we could iterate over .items()
instead of .keys()
to simplify).
I hope this at least helps a bit.
$endgroup$
add a comment |
$begingroup$
First of all, thumbs up on including a documentation block at the beginning.
When trying to run the segment, I get
Traceback (most recent call last):
File "conjecture.py", line 28, in <module>
print(conjecture(text))
File "conjecture.py", line 19, in conjecture
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
File "conjecture.py", line 19, in <dictcomp>
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
IndexError: list index out of range
Turns out that when I saved the output of the link, there were empty trailing lines. After fixing that, I could get it to run.
Furthermore, I see several issues with this implementation:
- The
conjecture
function is doing a lot of the work, with several lines that are quite filled with operations, meaning it's hard to find out what it's doing. - A dictionary can contain a value multiple times, and in fact does. For instance, both 5 and 32 have a length of 5 (5 -> 16 -> ...) and (32 -> 16 -> ...). For longer lengths, there will be more ways in which it can be obtained.
- Floating point arithmetic in a fully integral problem seems like an issue.
- The file is opened, and then not immediately closed. For this it's not really an issue, but in general one should use a contextmanager to clean up used resources as quickly as possible.
- The semantics of the output isn't really clear to me, but I'll probably figure out a bit during picking it apart.
- It doesn't use a
__main__
block, which would be beneficial for re-usable code. - The
conjecture
function returns a string, instead of a dictionary.
So let's start with your implementation:
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def conjecture(text):
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return pprint.pformat(result)
print(conjecture(text))
Use a if __name__ == '__main__'
segment.
One of the best places to start with such a script (especially if you want to make it reusable!) is to place the "main" code (file-mangling) inside a main
block:
... function definitions ...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
print(conjecture(text))
Move parsing the trajectories out of conjecture
.
Your conjecture has nothing to do with the piece of text, but about the length of the trajectories. We shouldn't care how these lengths are generated, calculated or parsed inside the conjecture
function. But the current implementation does.
By moving out the text
to traj
transformation, we make conjecture
more general.
...
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
print(conjecture(traj))
And we can then split the parsing into a separate function, where we use a context manager.
...
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
...
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
print(conjecture(traj))
And similarly, move the pformat out of conjecture:
def conjecture(text):
result = {}
...
return result
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
pprint.pprint(conjecture(traj))
Simplify conjecture
.
Now that we're left with a conjecture
method that does only things related to the conjecture, we can look at it a bit more:
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Starting with this expression: int((math.log(st,2)-1)**2)
, what does it even do?! First it takes the 2-logarithm, substracts 1, and then squares it. At first I suspected that you wanted the previous power of two (and tested it). But, what actually happens is that 64 (26) becomes 25 (52), and also 2048 (211) becomes 100 (102). My basic guess is that this is incorrect, and you mean 64 => 32, 128 => 64, and so forth. Assuming that's what you want, you can just divide by two.
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[st // 2:st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Surprisingly, though, this does not alter the output.
Next up is the fact that I see slices being taken of dictionaries (after casting to a list). In older Python versions (before 3.6) that might cause surprising results, as dictionaries are not necessarily ordered by value, though in this case it does turn out to be that way it seems. It's better to explicitly filter out the values that you want.
Final:
In the end, I found myself settling on this.
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import pprint
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
def conjecture(traj):
result = {}
traj_size = len(traj)
lower, higher = 1, 2
while True:
if higher > traj_size:
# We don't have all the data for this range available.
break
max_in_slice = max(length for num, length in traj.items() if lower <= num <= higher)
result[higher] = key(max_in_slice, traj)
# Prepare data for next iteration.
lower, higher = lower * 2, higher * 2
return result
if __name__ == '__main__':
traj = parse_lengths('CollatzConjecture.txt')
pprint.pprint(conjecture(traj))
I'm also not quite happy with this yet, as the "key" function is still bothering me as well (we could iterate over .items()
instead of .keys()
to simplify).
I hope this at least helps a bit.
$endgroup$
add a comment |
$begingroup$
First of all, thumbs up on including a documentation block at the beginning.
When trying to run the segment, I get
Traceback (most recent call last):
File "conjecture.py", line 28, in <module>
print(conjecture(text))
File "conjecture.py", line 19, in conjecture
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
File "conjecture.py", line 19, in <dictcomp>
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
IndexError: list index out of range
Turns out that when I saved the output of the link, there were empty trailing lines. After fixing that, I could get it to run.
Furthermore, I see several issues with this implementation:
- The
conjecture
function is doing a lot of the work, with several lines that are quite filled with operations, meaning it's hard to find out what it's doing. - A dictionary can contain a value multiple times, and in fact does. For instance, both 5 and 32 have a length of 5 (5 -> 16 -> ...) and (32 -> 16 -> ...). For longer lengths, there will be more ways in which it can be obtained.
- Floating point arithmetic in a fully integral problem seems like an issue.
- The file is opened, and then not immediately closed. For this it's not really an issue, but in general one should use a contextmanager to clean up used resources as quickly as possible.
- The semantics of the output isn't really clear to me, but I'll probably figure out a bit during picking it apart.
- It doesn't use a
__main__
block, which would be beneficial for re-usable code. - The
conjecture
function returns a string, instead of a dictionary.
So let's start with your implementation:
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def conjecture(text):
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return pprint.pformat(result)
print(conjecture(text))
Use a if __name__ == '__main__'
segment.
One of the best places to start with such a script (especially if you want to make it reusable!) is to place the "main" code (file-mangling) inside a main
block:
... function definitions ...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
print(conjecture(text))
Move parsing the trajectories out of conjecture
.
Your conjecture has nothing to do with the piece of text, but about the length of the trajectories. We shouldn't care how these lengths are generated, calculated or parsed inside the conjecture
function. But the current implementation does.
By moving out the text
to traj
transformation, we make conjecture
more general.
...
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
print(conjecture(traj))
And we can then split the parsing into a separate function, where we use a context manager.
...
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
...
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
print(conjecture(traj))
And similarly, move the pformat out of conjecture:
def conjecture(text):
result = {}
...
return result
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
pprint.pprint(conjecture(traj))
Simplify conjecture
.
Now that we're left with a conjecture
method that does only things related to the conjecture, we can look at it a bit more:
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Starting with this expression: int((math.log(st,2)-1)**2)
, what does it even do?! First it takes the 2-logarithm, substracts 1, and then squares it. At first I suspected that you wanted the previous power of two (and tested it). But, what actually happens is that 64 (26) becomes 25 (52), and also 2048 (211) becomes 100 (102). My basic guess is that this is incorrect, and you mean 64 => 32, 128 => 64, and so forth. Assuming that's what you want, you can just divide by two.
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[st // 2:st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Surprisingly, though, this does not alter the output.
Next up is the fact that I see slices being taken of dictionaries (after casting to a list). In older Python versions (before 3.6) that might cause surprising results, as dictionaries are not necessarily ordered by value, though in this case it does turn out to be that way it seems. It's better to explicitly filter out the values that you want.
Final:
In the end, I found myself settling on this.
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import pprint
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
def conjecture(traj):
result = {}
traj_size = len(traj)
lower, higher = 1, 2
while True:
if higher > traj_size:
# We don't have all the data for this range available.
break
max_in_slice = max(length for num, length in traj.items() if lower <= num <= higher)
result[higher] = key(max_in_slice, traj)
# Prepare data for next iteration.
lower, higher = lower * 2, higher * 2
return result
if __name__ == '__main__':
traj = parse_lengths('CollatzConjecture.txt')
pprint.pprint(conjecture(traj))
I'm also not quite happy with this yet, as the "key" function is still bothering me as well (we could iterate over .items()
instead of .keys()
to simplify).
I hope this at least helps a bit.
$endgroup$
First of all, thumbs up on including a documentation block at the beginning.
When trying to run the segment, I get
Traceback (most recent call last):
File "conjecture.py", line 28, in <module>
print(conjecture(text))
File "conjecture.py", line 19, in conjecture
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
File "conjecture.py", line 19, in <dictcomp>
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
IndexError: list index out of range
Turns out that when I saved the output of the link, there were empty trailing lines. After fixing that, I could get it to run.
Furthermore, I see several issues with this implementation:
- The
conjecture
function is doing a lot of the work, with several lines that are quite filled with operations, meaning it's hard to find out what it's doing. - A dictionary can contain a value multiple times, and in fact does. For instance, both 5 and 32 have a length of 5 (5 -> 16 -> ...) and (32 -> 16 -> ...). For longer lengths, there will be more ways in which it can be obtained.
- Floating point arithmetic in a fully integral problem seems like an issue.
- The file is opened, and then not immediately closed. For this it's not really an issue, but in general one should use a contextmanager to clean up used resources as quickly as possible.
- The semantics of the output isn't really clear to me, but I'll probably figure out a bit during picking it apart.
- It doesn't use a
__main__
block, which would be beneficial for re-usable code. - The
conjecture
function returns a string, instead of a dictionary.
So let's start with your implementation:
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def conjecture(text):
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return pprint.pformat(result)
print(conjecture(text))
Use a if __name__ == '__main__'
segment.
One of the best places to start with such a script (especially if you want to make it reusable!) is to place the "main" code (file-mangling) inside a main
block:
... function definitions ...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
print(conjecture(text))
Move parsing the trajectories out of conjecture
.
Your conjecture has nothing to do with the piece of text, but about the length of the trajectories. We shouldn't care how these lengths are generated, calculated or parsed inside the conjecture
function. But the current implementation does.
By moving out the text
to traj
transformation, we make conjecture
more general.
...
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
...
if __name__ == '__main__':
f = open("CollatzConjecture.txt")
text = f.read()
traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("n")} #Processes the text file into a dict mapping numbers to their trajectory length.
print(conjecture(traj))
And we can then split the parsing into a separate function, where we use a context manager.
...
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
...
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
print(conjecture(traj))
And similarly, move the pformat out of conjecture:
def conjecture(text):
result = {}
...
return result
if __name__ == '__main__':
traj = parse_lengths("CollatzConjecture.txt")
pprint.pprint(conjecture(traj))
Simplify conjecture
.
Now that we're left with a conjecture
method that does only things related to the conjecture, we can look at it a bit more:
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Starting with this expression: int((math.log(st,2)-1)**2)
, what does it even do?! First it takes the 2-logarithm, substracts 1, and then squares it. At first I suspected that you wanted the previous power of two (and tested it). But, what actually happens is that 64 (26) becomes 25 (52), and also 2048 (211) becomes 100 (102). My basic guess is that this is incorrect, and you mean 64 => 32, 128 => 64, and so forth. Assuming that's what you want, you can just divide by two.
def conjecture(traj):
result = {}
bound = math.floor(math.log(len(traj),2))
traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[st // 2:st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
return result
Surprisingly, though, this does not alter the output.
Next up is the fact that I see slices being taken of dictionaries (after casting to a list). In older Python versions (before 3.6) that might cause surprising results, as dictionaries are not necessarily ordered by value, though in this case it does turn out to be that way it seems. It's better to explicitly filter out the values that you want.
Final:
In the end, I found myself settling on this.
'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import pprint
def key(value, dct):
for key in dct.keys():
if dct[key] == value:
return key
return None
def parse_lengths(filename):
with open(filename) as fp:
#Processes the text file into a dict mapping numbers to their trajectory length.
return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
def conjecture(traj):
result = {}
traj_size = len(traj)
lower, higher = 1, 2
while True:
if higher > traj_size:
# We don't have all the data for this range available.
break
max_in_slice = max(length for num, length in traj.items() if lower <= num <= higher)
result[higher] = key(max_in_slice, traj)
# Prepare data for next iteration.
lower, higher = lower * 2, higher * 2
return result
if __name__ == '__main__':
traj = parse_lengths('CollatzConjecture.txt')
pprint.pprint(conjecture(traj))
I'm also not quite happy with this yet, as the "key" function is still bothering me as well (we could iterate over .items()
instead of .keys()
to simplify).
I hope this at least helps a bit.
edited 22 hours ago
Peter Taylor
16.1k2759
16.1k2759
answered yesterday
Sjoerd Job PostmusSjoerd Job Postmus
3,2941016
3,2941016
add a comment |
add a comment |
$begingroup$
It's desirable to try to break one's code into small, more reusable segments. For example, it would be preferable (in my opinion) to create a small function to open the text file, extract its contents, and return the dict
of trajectories you desire:
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
Note here that I used the context manager with open(file_name, 'r') as f:
, since in Python we need to first open a file, do things with it, and then close it. The context manager handles the opening and closing of the file for us, and within it we can access the file.
From my searching there doesn't seem to be a universally accepted way to find the "inverse" of a value in a dictionary, but I like yours for the most part. As one change, I would iterate over the key:value
pairs to simplify things a bit:
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
With this, it really only remains to clean up the conjecture
function. Here, we can pass the traj
dictionary as an argument, as well as the (un-exponentiated) bound.
In the parlance of your code, note that st
is always of the form 2**x
, so when we compute math.log(st, 2)
, it always evaluates to x
, so one line of your code reads (equivalently)
slce = list(traj2.items())[(x-1)**2:st]
which isn't the 'slicing into power of 2' that you desire. Moreover, it's not (necessarily) guaranteed that traj2.items()
will be turned into a list in the way you desire, so it'll be better to be more explicit:
slce = [key, val for key, val in traj.items() if key in range(2**(x-1), 2**x)]
Together, along with splitting our imports onto different lines, adding a if __name__ == '__main__':
guard, and some other minor reorganization yields the following code:
import math
import pprint
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
def conjecture(traj):
"""Checks the conjecture that the maximum 'Collatz length' among the integers in [2**n, 2**(n+1)) is 2**n - 1.
The conjecture is false.
"""
bound = math.floor(math.log(len(traj),2))
exp_bound = 2**bound
traj = {key:val for key, val in traj.items() if key < exp_bound} # Only include numbers of interest.
result = {}
for n in range(1,bound+1):
exp_n = 2**n
slce = [key, val for key, val in traj.items() if key in range(2**(n-1), exp_n)]
result[exp_n] = find_key(max([val for key, val in slce]), traj)
return pprint(pformat(result))
if __name__ == "__main__":
file_name = "CollatzConjecture.txt"
print(conjecture(file_name))
$endgroup$
$begingroup$
"I would iterate over thekey:value
pairs to simplify things a bit": it's not just a question of simplifying. It's also more efficient, because advancing the iterator doesn't involve any search so is as cheap as possible, whereas the lookup potentially has to resolve hash collisions.
$endgroup$
– Peter Taylor
21 hours ago
add a comment |
$begingroup$
It's desirable to try to break one's code into small, more reusable segments. For example, it would be preferable (in my opinion) to create a small function to open the text file, extract its contents, and return the dict
of trajectories you desire:
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
Note here that I used the context manager with open(file_name, 'r') as f:
, since in Python we need to first open a file, do things with it, and then close it. The context manager handles the opening and closing of the file for us, and within it we can access the file.
From my searching there doesn't seem to be a universally accepted way to find the "inverse" of a value in a dictionary, but I like yours for the most part. As one change, I would iterate over the key:value
pairs to simplify things a bit:
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
With this, it really only remains to clean up the conjecture
function. Here, we can pass the traj
dictionary as an argument, as well as the (un-exponentiated) bound.
In the parlance of your code, note that st
is always of the form 2**x
, so when we compute math.log(st, 2)
, it always evaluates to x
, so one line of your code reads (equivalently)
slce = list(traj2.items())[(x-1)**2:st]
which isn't the 'slicing into power of 2' that you desire. Moreover, it's not (necessarily) guaranteed that traj2.items()
will be turned into a list in the way you desire, so it'll be better to be more explicit:
slce = [key, val for key, val in traj.items() if key in range(2**(x-1), 2**x)]
Together, along with splitting our imports onto different lines, adding a if __name__ == '__main__':
guard, and some other minor reorganization yields the following code:
import math
import pprint
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
def conjecture(traj):
"""Checks the conjecture that the maximum 'Collatz length' among the integers in [2**n, 2**(n+1)) is 2**n - 1.
The conjecture is false.
"""
bound = math.floor(math.log(len(traj),2))
exp_bound = 2**bound
traj = {key:val for key, val in traj.items() if key < exp_bound} # Only include numbers of interest.
result = {}
for n in range(1,bound+1):
exp_n = 2**n
slce = [key, val for key, val in traj.items() if key in range(2**(n-1), exp_n)]
result[exp_n] = find_key(max([val for key, val in slce]), traj)
return pprint(pformat(result))
if __name__ == "__main__":
file_name = "CollatzConjecture.txt"
print(conjecture(file_name))
$endgroup$
$begingroup$
"I would iterate over thekey:value
pairs to simplify things a bit": it's not just a question of simplifying. It's also more efficient, because advancing the iterator doesn't involve any search so is as cheap as possible, whereas the lookup potentially has to resolve hash collisions.
$endgroup$
– Peter Taylor
21 hours ago
add a comment |
$begingroup$
It's desirable to try to break one's code into small, more reusable segments. For example, it would be preferable (in my opinion) to create a small function to open the text file, extract its contents, and return the dict
of trajectories you desire:
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
Note here that I used the context manager with open(file_name, 'r') as f:
, since in Python we need to first open a file, do things with it, and then close it. The context manager handles the opening and closing of the file for us, and within it we can access the file.
From my searching there doesn't seem to be a universally accepted way to find the "inverse" of a value in a dictionary, but I like yours for the most part. As one change, I would iterate over the key:value
pairs to simplify things a bit:
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
With this, it really only remains to clean up the conjecture
function. Here, we can pass the traj
dictionary as an argument, as well as the (un-exponentiated) bound.
In the parlance of your code, note that st
is always of the form 2**x
, so when we compute math.log(st, 2)
, it always evaluates to x
, so one line of your code reads (equivalently)
slce = list(traj2.items())[(x-1)**2:st]
which isn't the 'slicing into power of 2' that you desire. Moreover, it's not (necessarily) guaranteed that traj2.items()
will be turned into a list in the way you desire, so it'll be better to be more explicit:
slce = [key, val for key, val in traj.items() if key in range(2**(x-1), 2**x)]
Together, along with splitting our imports onto different lines, adding a if __name__ == '__main__':
guard, and some other minor reorganization yields the following code:
import math
import pprint
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
def conjecture(traj):
"""Checks the conjecture that the maximum 'Collatz length' among the integers in [2**n, 2**(n+1)) is 2**n - 1.
The conjecture is false.
"""
bound = math.floor(math.log(len(traj),2))
exp_bound = 2**bound
traj = {key:val for key, val in traj.items() if key < exp_bound} # Only include numbers of interest.
result = {}
for n in range(1,bound+1):
exp_n = 2**n
slce = [key, val for key, val in traj.items() if key in range(2**(n-1), exp_n)]
result[exp_n] = find_key(max([val for key, val in slce]), traj)
return pprint(pformat(result))
if __name__ == "__main__":
file_name = "CollatzConjecture.txt"
print(conjecture(file_name))
$endgroup$
It's desirable to try to break one's code into small, more reusable segments. For example, it would be preferable (in my opinion) to create a small function to open the text file, extract its contents, and return the dict
of trajectories you desire:
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
Note here that I used the context manager with open(file_name, 'r') as f:
, since in Python we need to first open a file, do things with it, and then close it. The context manager handles the opening and closing of the file for us, and within it we can access the file.
From my searching there doesn't seem to be a universally accepted way to find the "inverse" of a value in a dictionary, but I like yours for the most part. As one change, I would iterate over the key:value
pairs to simplify things a bit:
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
With this, it really only remains to clean up the conjecture
function. Here, we can pass the traj
dictionary as an argument, as well as the (un-exponentiated) bound.
In the parlance of your code, note that st
is always of the form 2**x
, so when we compute math.log(st, 2)
, it always evaluates to x
, so one line of your code reads (equivalently)
slce = list(traj2.items())[(x-1)**2:st]
which isn't the 'slicing into power of 2' that you desire. Moreover, it's not (necessarily) guaranteed that traj2.items()
will be turned into a list in the way you desire, so it'll be better to be more explicit:
slce = [key, val for key, val in traj.items() if key in range(2**(x-1), 2**x)]
Together, along with splitting our imports onto different lines, adding a if __name__ == '__main__':
guard, and some other minor reorganization yields the following code:
import math
import pprint
def find_key(target_value, dct):
for key, val in dct.items():
if val == target_value:
return key
raise Exception('Value not found!')
def load_trajectories(file_name):
traj = {}
with open(file_name, 'r') as f:
for line in f.readlines():
S = line.split(' ')
num, traj_len = int(S[0]), int(S[1])
traj[num] = traj_len
return traj
def conjecture(traj):
"""Checks the conjecture that the maximum 'Collatz length' among the integers in [2**n, 2**(n+1)) is 2**n - 1.
The conjecture is false.
"""
bound = math.floor(math.log(len(traj),2))
exp_bound = 2**bound
traj = {key:val for key, val in traj.items() if key < exp_bound} # Only include numbers of interest.
result = {}
for n in range(1,bound+1):
exp_n = 2**n
slce = [key, val for key, val in traj.items() if key in range(2**(n-1), exp_n)]
result[exp_n] = find_key(max([val for key, val in slce]), traj)
return pprint(pformat(result))
if __name__ == "__main__":
file_name = "CollatzConjecture.txt"
print(conjecture(file_name))
answered yesterday
Michael EngenMichael Engen
1286
1286
$begingroup$
"I would iterate over thekey:value
pairs to simplify things a bit": it's not just a question of simplifying. It's also more efficient, because advancing the iterator doesn't involve any search so is as cheap as possible, whereas the lookup potentially has to resolve hash collisions.
$endgroup$
– Peter Taylor
21 hours ago
add a comment |
$begingroup$
"I would iterate over thekey:value
pairs to simplify things a bit": it's not just a question of simplifying. It's also more efficient, because advancing the iterator doesn't involve any search so is as cheap as possible, whereas the lookup potentially has to resolve hash collisions.
$endgroup$
– Peter Taylor
21 hours ago
$begingroup$
"I would iterate over the
key:value
pairs to simplify things a bit": it's not just a question of simplifying. It's also more efficient, because advancing the iterator doesn't involve any search so is as cheap as possible, whereas the lookup potentially has to resolve hash collisions.$endgroup$
– Peter Taylor
21 hours ago
$begingroup$
"I would iterate over the
key:value
pairs to simplify things a bit": it's not just a question of simplifying. It's also more efficient, because advancing the iterator doesn't involve any search so is as cheap as possible, whereas the lookup potentially has to resolve hash collisions.$endgroup$
– Peter Taylor
21 hours ago
add a comment |
$begingroup$
The existing answers cover most of the issues that I spotted. There's just one thing that I think it's useful to add:
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
This converts the dict
into a list every time through the loop. The suggestions, which don't rely on undocumented behaviour, instead filter the dict
each time through the loop. But you can instead invert the loop:
best = [0] * (bound + 1)
for key, value in traj2.items():
x = int(math.log(key, 2)) + 1
if value > best[x] or (value == best[x] and key < result[2**x]):
best[x] = value
result[2**x] = key
This gives straightforward linear running time and allows you explicit control over tie-breaking. (I'm not sure whether I've got the tie-breaking correct for your conjecture, but you can easily fix that).
$endgroup$
add a comment |
$begingroup$
The existing answers cover most of the issues that I spotted. There's just one thing that I think it's useful to add:
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
This converts the dict
into a list every time through the loop. The suggestions, which don't rely on undocumented behaviour, instead filter the dict
each time through the loop. But you can instead invert the loop:
best = [0] * (bound + 1)
for key, value in traj2.items():
x = int(math.log(key, 2)) + 1
if value > best[x] or (value == best[x] and key < result[2**x]):
best[x] = value
result[2**x] = key
This gives straightforward linear running time and allows you explicit control over tie-breaking. (I'm not sure whether I've got the tie-breaking correct for your conjecture, but you can easily fix that).
$endgroup$
add a comment |
$begingroup$
The existing answers cover most of the issues that I spotted. There's just one thing that I think it's useful to add:
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
This converts the dict
into a list every time through the loop. The suggestions, which don't rely on undocumented behaviour, instead filter the dict
each time through the loop. But you can instead invert the loop:
best = [0] * (bound + 1)
for key, value in traj2.items():
x = int(math.log(key, 2)) + 1
if value > best[x] or (value == best[x] and key < result[2**x]):
best[x] = value
result[2**x] = key
This gives straightforward linear running time and allows you explicit control over tie-breaking. (I'm not sure whether I've got the tie-breaking correct for your conjecture, but you can easily fix that).
$endgroup$
The existing answers cover most of the issues that I spotted. There's just one thing that I think it's useful to add:
for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
result[st] = key(max([n[1] for n in slce]), traj2)
This converts the dict
into a list every time through the loop. The suggestions, which don't rely on undocumented behaviour, instead filter the dict
each time through the loop. But you can instead invert the loop:
best = [0] * (bound + 1)
for key, value in traj2.items():
x = int(math.log(key, 2)) + 1
if value > best[x] or (value == best[x] and key < result[2**x]):
best[x] = value
result[2**x] = key
This gives straightforward linear running time and allows you explicit control over tie-breaking. (I'm not sure whether I've got the tie-breaking correct for your conjecture, but you can easily fix that).
answered 21 hours ago
Peter TaylorPeter Taylor
16.1k2759
16.1k2759
add a comment |
add a comment |
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f212377%2ftesting-a-collatz-conjecture-conjecture-python%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
Can you add an example input textfile?
$endgroup$
– Ludisposed
yesterday
1
$begingroup$
Where is the
key
function actually used?$endgroup$
– Graipher
yesterday
$begingroup$
@Ludisposed I'll link the actual text file I used: oeis.org/A006577/b006577.txt
$endgroup$
– Tobi Alafin
yesterday
1
$begingroup$
Now that you have improved code it might make sense to ask a new question with the updated code, linking back to this one, as stated in the link provided by Mast.
$endgroup$
– Graipher
yesterday
3
$begingroup$
@TobiAlafin: Hm, what a nice edit mess :) I'll let you figure it out, but if I were you, I would just roll it back to your first first revision (again) and then forget about this question. Instead spend the effort on making the follow-up question look good.
$endgroup$
– Graipher
yesterday