Wrong str.join implementation behavior?

Consider the following code:



class A(object):
    def __init__(self):
        self.a = '123'

    def __len__(self):
        print('len')
        return 2

    def __getitem__(self, pos):
        print('get pos', pos)
        return self.a[pos]

a = A()
print(''.join(a))


My expected output:



> len
> get pos 0
> get pos 1
> 12


The real output:



> len
> get pos 0
> get pos 1
> get pos 2
> get pos 3
> 123


Try it yourself. I cannot believe what happens here.

If I understand the behavior correctly, str.join() calls __len__ but ignores the returned value, then calls __getitem__ with increasing indices until an IndexError is raised.

I must be overlooking something, because the implementation of join seems to work differently:



https://github.com/python/cpython/blob/3.6/Objects/stringlib/join.h



My current workaround is:



def __getitem__(self, pos):
    if pos >= len(self):
        raise IndexError()
    return self.a[pos]
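
For reference, here is the workaround above placed into the full class, as a minimal sketch. The class name `BoundedA` is made up for this illustration, and the print calls from the question are left out to keep it short.

```python
class BoundedA(object):
    """The class from the question with the bounded __getitem__ workaround."""

    def __init__(self):
        self.a = '123'

    def __len__(self):
        return 2

    def __getitem__(self, pos):
        # Raising IndexError past the advertised length is what ends
        # iteration when the class defines no __iter__.
        if pos >= len(self):
            raise IndexError(pos)
        return self.a[pos]


joined = ''.join(BoundedA())
print(joined)  # 12
```

With the bound in place, the third lookup (index 2) raises IndexError, so only two characters are joined.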


This is ridiculous.



I tested it with Python 3.6 and 3.7 (CPython).

  • 3
    That's nothing to do with str.join specifically; list(a) behaves just the same, it iterates until StopIteration or IndexError is raised.
    – jonrsharpe, Nov 17 at 22:00

  • 1
    See: What exactly are iterator, iterable, and iteration?
    – Patrick Haugh, Nov 17 at 22:04

  • You should implement __iter__.
    – Ry-, Nov 17 at 22:09

  • join checks if the object is a list or tuple (just to get the size right away). Not sure why it calls len...
    – Jean-François Fabre, Nov 17 at 22:30
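
The points made in the comments about list(a) and __iter__ can be illustrated with a short sketch. It assumes the old sequence-iteration protocol that Python falls back to when a class defines __getitem__ but no __iter__: indices 0, 1, 2, ... are requested until IndexError is raised, and __len__ plays no part in deciding when to stop. The names `legacy_iter` and `WithIter` are hypothetical, introduced only for this example.

```python
class A(object):
    def __init__(self):
        self.a = '123'

    def __len__(self):
        return 2

    def __getitem__(self, pos):
        return self.a[pos]


def legacy_iter(obj):
    """Rough emulation of iterating an object that only has __getitem__."""
    pos = 0
    while True:
        try:
            item = obj[pos]
        except IndexError:
            return  # IndexError, not __len__, ends the iteration
        yield item
        pos += 1


class WithIter(A):
    def __iter__(self):
        # Explicitly stop after len(self) items.
        for pos in range(len(self)):
            yield self[pos]


print(list(A()))               # ['1', '2', '3']
print(list(legacy_iter(A())))  # ['1', '2', '3']
print(''.join(A()))            # 123
print(''.join(WithIter()))     # 12
```

Defining __iter__ puts the class in charge of where iteration stops, which is probably the cleaner fix if the advertised length is meant to be authoritative.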















python c python-3.x join cpython






asked Nov 17 at 21:56 by Viatorus

1 Answer

How str.join works (from analysing the source code)



First, it checks whether the object is iterable and creates a sequence out of it if needed:

seq = PySequence_Fast(iterable, "can only join an iterable");

If the object is a list or tuple, PySequence_Fast just returns the object itself; there is no need to iterate.

Otherwise, it iterates over the object to build a list. That is where the object is iterated in full.

From there on, only the list copy is used. If iterable was not a list or tuple, it has been fully iterated by this point and is not touched again.
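
The steps described so far can be sketched in Python. This is only an approximation of what the C function PySequence_Fast does, written from the description above; the helper name `sequence_fast` is hypothetical and not part of any Python API, and the exact-type check is an assumption about how "list or tuple" is decided.

```python
def sequence_fast(iterable, message):
    """Approximate, in Python, the behaviour described for PySequence_Fast."""
    # Assumption: only exact lists and tuples take the shortcut;
    # they are returned unchanged, with no iteration and no copy.
    if type(iterable) in (list, tuple):
        return iterable
    try:
        iterator = iter(iterable)
    except TypeError:
        # Not iterable at all: report the caller-supplied message.
        raise TypeError(message) from None
    # Anything else is iterated in full to build a new list.
    return list(iterator)


items = ['a', 'b']
print(sequence_fast(items, "can only join an iterable") is items)      # True
print(sequence_fast((c for c in 'xyz'), "can only join an iterable"))  # ['x', 'y', 'z']
```

Since the list is built by plain iteration, an object like the one in the question is read until its __getitem__ raises IndexError.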



(I couldn't track down the call to len; it would take a debugging session to find it inside the PySequence_Fast call, and it looks unnecessary anyway. Your iterable has a __len__ method, but since it is not a list or tuple, the returned value does not determine how many items are read.)
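
One possible explanation for the len call, offered as an assumption rather than something traced through the C source here: when CPython builds a list from an iterator, it may ask for a length hint (see operator.length_hint and PEP 424) to presize the list, and the iterator created for a __getitem__-only object answers that hint by calling len() on the underlying object. The hint would only affect preallocation, not where iteration stops. The sketch below records calls in a list instead of printing them; the `calls` attribute is added purely for this illustration.

```python
import operator


class A(object):
    def __init__(self):
        self.a = '123'
        self.calls = []  # record calls instead of printing (illustration only)

    def __len__(self):
        self.calls.append('len')
        return 2

    def __getitem__(self, pos):
        self.calls.append(('get', pos))
        return self.a[pos]


probe = A()
hint = operator.length_hint(iter(probe))
print(hint)         # 2
print(probe.calls)  # ['len']

other = A()
items = list(iter(other))
print(items)        # ['1', '2', '3']
print(other.calls)  # shows which calls list() triggered on this interpreter
```

The hint comes back as 2, yet the list still ends up with three items, which matches the output in the question.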






answered Nov 17 at 22:38 (edited Nov 17 at 22:51) by Jean-François Fabre
