Collect subsequences of consecutive integers












4












$begingroup$


I've written a function to split a list into several lists with consecutive integers, e.g.:



extract_subsequences([1,2,3,7,8,9])



[[1, 2, 3], [7, 8, 9]]




extract_subsequences([1,2,3])



[[1, 2, 3]]




extract_subsequences([1,2,3, 10])



[[1, 2, 3], [10]]




extract_subsequences([1,2, 4,5, 8,9])



[[1, 2], [4, 5], [8, 9]]




This is the code that I came up with:



import numpy as np

def extract_subsequences(seq):
def split_sequence(seq):
for i in np.arange(1, len(seq)):
if seq[i] != seq[i - 1] + 1:
break

if i < len(seq) - 1:
return seq[:i], seq[i:]
else:
if seq[-1] == seq[-2] + 1:
return seq,
else:
return seq[:-1], [seq[-1]]

res =
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res


Can someone help me improve this code? It's kind of difficult to read with all the indices and special cases, and for the same reasons it was also quite difficult to write. I would very much appreciate a better way of thinking about this problem.










share|improve this question









New contributor




C. E. is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.







$endgroup$












  • $begingroup$
    Are all sequences sorted?
    $endgroup$
    – Ludisposed
    yesterday










  • $begingroup$
    @Ludisposed Yes
    $endgroup$
    – C. E.
    yesterday
















4












$begingroup$


I've written a function to split a list into several lists with consecutive integers, e.g.:



extract_subsequences([1,2,3,7,8,9])



[[1, 2, 3], [7, 8, 9]]




extract_subsequences([1,2,3])



[[1, 2, 3]]




extract_subsequences([1,2,3, 10])



[[1, 2, 3], [10]]




extract_subsequences([1,2, 4,5, 8,9])



[[1, 2], [4, 5], [8, 9]]




This is the code that I came up with:



import numpy as np

def extract_subsequences(seq):
def split_sequence(seq):
for i in np.arange(1, len(seq)):
if seq[i] != seq[i - 1] + 1:
break

if i < len(seq) - 1:
return seq[:i], seq[i:]
else:
if seq[-1] == seq[-2] + 1:
return seq,
else:
return seq[:-1], [seq[-1]]

res =
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res


Can someone help me improve this code? It's kind of difficult to read with all the indices and special cases, and for the same reasons it was also quite difficult to write. I would very much appreciate a better way of thinking about this problem.










share|improve this question









New contributor




C. E. is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.







$endgroup$












  • $begingroup$
    Are all sequences sorted?
    $endgroup$
    – Ludisposed
    yesterday










  • $begingroup$
    @Ludisposed Yes
    $endgroup$
    – C. E.
    yesterday














4












4








4





$begingroup$


I've written a function to split a list into several lists with consecutive integers, e.g.:



extract_subsequences([1,2,3,7,8,9])



[[1, 2, 3], [7, 8, 9]]




extract_subsequences([1,2,3])



[[1, 2, 3]]




extract_subsequences([1,2,3, 10])



[[1, 2, 3], [10]]




extract_subsequences([1,2, 4,5, 8,9])



[[1, 2], [4, 5], [8, 9]]




This is the code that I came up with:



import numpy as np

def extract_subsequences(seq):
def split_sequence(seq):
for i in np.arange(1, len(seq)):
if seq[i] != seq[i - 1] + 1:
break

if i < len(seq) - 1:
return seq[:i], seq[i:]
else:
if seq[-1] == seq[-2] + 1:
return seq,
else:
return seq[:-1], [seq[-1]]

res =
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res


Can someone help me improve this code? It's kind of difficult to read with all the indices and special cases, and for the same reasons it was also quite difficult to write. I would very much appreciate a better way of thinking about this problem.










share|improve this question









New contributor




C. E. is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.







$endgroup$




I've written a function to split a list into several lists with consecutive integers, e.g.:



extract_subsequences([1,2,3,7,8,9])



[[1, 2, 3], [7, 8, 9]]




extract_subsequences([1,2,3])



[[1, 2, 3]]




extract_subsequences([1,2,3, 10])



[[1, 2, 3], [10]]




extract_subsequences([1,2, 4,5, 8,9])



[[1, 2], [4, 5], [8, 9]]




This is the code that I came up with:



import numpy as np

def extract_subsequences(seq):
def split_sequence(seq):
for i in np.arange(1, len(seq)):
if seq[i] != seq[i - 1] + 1:
break

if i < len(seq) - 1:
return seq[:i], seq[i:]
else:
if seq[-1] == seq[-2] + 1:
return seq,
else:
return seq[:-1], [seq[-1]]

res =
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res


Can someone help me improve this code? It's kind of difficult to read with all the indices and special cases, and for the same reasons it was also quite difficult to write. I would very much appreciate a better way of thinking about this problem.







python python-3.x programming-challenge interval






share|improve this question









New contributor




C. E. is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











share|improve this question









New contributor




C. E. is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









share|improve this question




share|improve this question








edited yesterday









200_success

129k15153415




129k15153415






New contributor




C. E. is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









asked yesterday









C. E.C. E.

1234




1234




New contributor




C. E. is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





C. E. is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






C. E. is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.












  • $begingroup$
    Are all sequences sorted?
    $endgroup$
    – Ludisposed
    yesterday










  • $begingroup$
    @Ludisposed Yes
    $endgroup$
    – C. E.
    yesterday


















  • $begingroup$
    Are all sequences sorted?
    $endgroup$
    – Ludisposed
    yesterday










  • $begingroup$
    @Ludisposed Yes
    $endgroup$
    – C. E.
    yesterday
















$begingroup$
Are all sequences sorted?
$endgroup$
– Ludisposed
yesterday




$begingroup$
Are all sequences sorted?
$endgroup$
– Ludisposed
yesterday












$begingroup$
@Ludisposed Yes
$endgroup$
– C. E.
yesterday




$begingroup$
@Ludisposed Yes
$endgroup$
– C. E.
yesterday










3 Answers
3






active

oldest

votes


















6












$begingroup$

Using yield is often simpler than building up a list to return. Here it would reduce




    res = 
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res



to



last = seq
while len(last) > 1:
first, last = split_sequence(last)
yield first

if len(last) == 1:
yield last


at which point you could inline split_sequence.





split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



current = 
for val in seq:
if current != and val != current[-1] + 1:
yield current
current =
current += [val]

# Test is only necessary because seq might be empty
if current != :
yield current





share|improve this answer









$endgroup$





















    9












    $begingroup$

    @Peter Taylor has a good idea, but there is more to be improved




    • You should add some tests



    • Python is often described as Batteries included



      This is a perfect example to use itertools.groupby()






    from itertools import groupby
    import doctest

    def extract_seq(seq):
    """
    Splits sequence into consecutive lists

    args:
    seq (list): A sorted sequence

    >>> extract_seq([1,2, 4,5, 8,9])
    [[1, 2], [4, 5], [8, 9]]

    >>> extract_seq([1,2,3])
    [[1, 2, 3]]

    >>> extract_seq([1,2,3,7,8,9])
    [[1, 2, 3], [7, 8, 9]]

    >>> extract_seq([1,2,3,10])
    [[1, 2, 3], [10]]
    """
    return [
    [x for _, x in g]
    for k, g in groupby(
    enumerate(seq),
    lambda i_x : i_x[0] - i_x[1]
    )
    ]

    if __name__ == "__main__":
    doctest.testmod()





    share|improve this answer











    $endgroup$





















      4












      $begingroup$

      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



      for a, b in pairwise(seq):
      if b != a + 1:
      break





      share|improve this answer









      $endgroup$









      • 2




        $begingroup$
        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
        $endgroup$
        – C. E.
        17 hours ago











      Your Answer





      StackExchange.ifUsing("editor", function () {
      return StackExchange.using("mathjaxEditing", function () {
      StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
      StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
      });
      });
      }, "mathjax-editing");

      StackExchange.ifUsing("editor", function () {
      StackExchange.using("externalEditor", function () {
      StackExchange.using("snippets", function () {
      StackExchange.snippets.init();
      });
      });
      }, "code-snippets");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "196"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });






      C. E. is a new contributor. Be nice, and check out our Code of Conduct.










      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f212848%2fcollect-subsequences-of-consecutive-integers%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      6












      $begingroup$

      Using yield is often simpler than building up a list to return. Here it would reduce




          res = 
      last = seq
      while len(last) > 1:
      first, last = split_sequence(last)
      res.append(first)

      if len(last) == 1:
      res.append(last)

      return res



      to



      last = seq
      while len(last) > 1:
      first, last = split_sequence(last)
      yield first

      if len(last) == 1:
      yield last


      at which point you could inline split_sequence.





      split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



      current = 
      for val in seq:
      if current != and val != current[-1] + 1:
      yield current
      current =
      current += [val]

      # Test is only necessary because seq might be empty
      if current != :
      yield current





      share|improve this answer









      $endgroup$


















        6












        $begingroup$

        Using yield is often simpler than building up a list to return. Here it would reduce




            res = 
        last = seq
        while len(last) > 1:
        first, last = split_sequence(last)
        res.append(first)

        if len(last) == 1:
        res.append(last)

        return res



        to



        last = seq
        while len(last) > 1:
        first, last = split_sequence(last)
        yield first

        if len(last) == 1:
        yield last


        at which point you could inline split_sequence.





        split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



        current = 
        for val in seq:
        if current != and val != current[-1] + 1:
        yield current
        current =
        current += [val]

        # Test is only necessary because seq might be empty
        if current != :
        yield current





        share|improve this answer









        $endgroup$
















          6












          6








          6





          $begingroup$

          Using yield is often simpler than building up a list to return. Here it would reduce




              res = 
          last = seq
          while len(last) > 1:
          first, last = split_sequence(last)
          res.append(first)

          if len(last) == 1:
          res.append(last)

          return res



          to



          last = seq
          while len(last) > 1:
          first, last = split_sequence(last)
          yield first

          if len(last) == 1:
          yield last


          at which point you could inline split_sequence.





          split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



          current = 
          for val in seq:
          if current != and val != current[-1] + 1:
          yield current
          current =
          current += [val]

          # Test is only necessary because seq might be empty
          if current != :
          yield current





          share|improve this answer









          $endgroup$



          Using yield is often simpler than building up a list to return. Here it would reduce




              res = 
          last = seq
          while len(last) > 1:
          first, last = split_sequence(last)
          res.append(first)

          if len(last) == 1:
          res.append(last)

          return res



          to



          last = seq
          while len(last) > 1:
          first, last = split_sequence(last)
          yield first

          if len(last) == 1:
          yield last


          at which point you could inline split_sequence.





          split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



          current = 
          for val in seq:
          if current != and val != current[-1] + 1:
          yield current
          current =
          current += [val]

          # Test is only necessary because seq might be empty
          if current != :
          yield current






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered yesterday









          Peter TaylorPeter Taylor

          16.3k2860




          16.3k2860

























              9












              $begingroup$

              @Peter Taylor has a good idea, but there is more to be improved




              • You should add some tests



              • Python is often described as Batteries included



                This is a perfect example to use itertools.groupby()






              from itertools import groupby
              import doctest

              def extract_seq(seq):
              """
              Splits sequence into consecutive lists

              args:
              seq (list): A sorted sequence

              >>> extract_seq([1,2, 4,5, 8,9])
              [[1, 2], [4, 5], [8, 9]]

              >>> extract_seq([1,2,3])
              [[1, 2, 3]]

              >>> extract_seq([1,2,3,7,8,9])
              [[1, 2, 3], [7, 8, 9]]

              >>> extract_seq([1,2,3,10])
              [[1, 2, 3], [10]]
              """
              return [
              [x for _, x in g]
              for k, g in groupby(
              enumerate(seq),
              lambda i_x : i_x[0] - i_x[1]
              )
              ]

              if __name__ == "__main__":
              doctest.testmod()





              share|improve this answer











              $endgroup$


















                9












                $begingroup$

                @Peter Taylor has a good idea, but there is more to be improved




                • You should add some tests



                • Python is often described as Batteries included



                  This is a perfect example to use itertools.groupby()






                from itertools import groupby
                import doctest

                def extract_seq(seq):
                """
                Splits sequence into consecutive lists

                args:
                seq (list): A sorted sequence

                >>> extract_seq([1,2, 4,5, 8,9])
                [[1, 2], [4, 5], [8, 9]]

                >>> extract_seq([1,2,3])
                [[1, 2, 3]]

                >>> extract_seq([1,2,3,7,8,9])
                [[1, 2, 3], [7, 8, 9]]

                >>> extract_seq([1,2,3,10])
                [[1, 2, 3], [10]]
                """
                return [
                [x for _, x in g]
                for k, g in groupby(
                enumerate(seq),
                lambda i_x : i_x[0] - i_x[1]
                )
                ]

                if __name__ == "__main__":
                doctest.testmod()





                share|improve this answer











                $endgroup$
















                  9












                  9








                  9





                  $begingroup$

                  @Peter Taylor has a good idea, but there is more to be improved




                  • You should add some tests



                  • Python is often described as Batteries included



                    This is a perfect example to use itertools.groupby()






                  from itertools import groupby
                  import doctest

                  def extract_seq(seq):
                  """
                  Splits sequence into consecutive lists

                  args:
                  seq (list): A sorted sequence

                  >>> extract_seq([1,2, 4,5, 8,9])
                  [[1, 2], [4, 5], [8, 9]]

                  >>> extract_seq([1,2,3])
                  [[1, 2, 3]]

                  >>> extract_seq([1,2,3,7,8,9])
                  [[1, 2, 3], [7, 8, 9]]

                  >>> extract_seq([1,2,3,10])
                  [[1, 2, 3], [10]]
                  """
                  return [
                  [x for _, x in g]
                  for k, g in groupby(
                  enumerate(seq),
                  lambda i_x : i_x[0] - i_x[1]
                  )
                  ]

                  if __name__ == "__main__":
                  doctest.testmod()





                  share|improve this answer











                  $endgroup$



                  @Peter Taylor has a good idea, but there is more to be improved




                  • You should add some tests



                  • Python is often described as Batteries included



                    This is a perfect example to use itertools.groupby()






                  from itertools import groupby
                  import doctest

                  def extract_seq(seq):
                  """
                  Splits sequence into consecutive lists

                  args:
                  seq (list): A sorted sequence

                  >>> extract_seq([1,2, 4,5, 8,9])
                  [[1, 2], [4, 5], [8, 9]]

                  >>> extract_seq([1,2,3])
                  [[1, 2, 3]]

                  >>> extract_seq([1,2,3,7,8,9])
                  [[1, 2, 3], [7, 8, 9]]

                  >>> extract_seq([1,2,3,10])
                  [[1, 2, 3], [10]]
                  """
                  return [
                  [x for _, x in g]
                  for k, g in groupby(
                  enumerate(seq),
                  lambda i_x : i_x[0] - i_x[1]
                  )
                  ]

                  if __name__ == "__main__":
                  doctest.testmod()






                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited 18 hours ago

























                  answered yesterday









                  LudisposedLudisposed

                  7,93721960




                  7,93721960























                      4












                      $begingroup$

                      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



                      for a, b in pairwise(seq):
                      if b != a + 1:
                      break





                      share|improve this answer









                      $endgroup$









                      • 2




                        $begingroup$
                        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                        $endgroup$
                        – C. E.
                        17 hours ago
















                      4












                      $begingroup$

                      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



                      for a, b in pairwise(seq):
                      if b != a + 1:
                      break





                      share|improve this answer









                      $endgroup$









                      • 2




                        $begingroup$
                        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                        $endgroup$
                        – C. E.
                        17 hours ago














                      4












                      4








                      4





                      $begingroup$

                      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



                      for a, b in pairwise(seq):
                      if b != a + 1:
                      break





                      share|improve this answer









                      $endgroup$



                      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



                      for a, b in pairwise(seq):
                      if b != a + 1:
                      break






                      share|improve this answer












                      share|improve this answer



                      share|improve this answer










                      answered yesterday









                      Solomon UckoSolomon Ucko

                      1,152415




                      1,152415








                      • 2




                        $begingroup$
                        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                        $endgroup$
                        – C. E.
                        17 hours ago














                      • 2




                        $begingroup$
                        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                        $endgroup$
                        – C. E.
                        17 hours ago








                      2




                      2




                      $begingroup$
                      Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                      $endgroup$
                      – C. E.
                      17 hours ago




                      $begingroup$
                      Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                      $endgroup$
                      – C. E.
                      17 hours ago










                      C. E. is a new contributor. Be nice, and check out our Code of Conduct.










                      draft saved

                      draft discarded


















                      C. E. is a new contributor. Be nice, and check out our Code of Conduct.













                      C. E. is a new contributor. Be nice, and check out our Code of Conduct.












                      C. E. is a new contributor. Be nice, and check out our Code of Conduct.
















                      Thanks for contributing an answer to Code Review Stack Exchange!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      Use MathJax to format equations. MathJax reference.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f212848%2fcollect-subsequences-of-consecutive-integers%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      "Incorrect syntax near the keyword 'ON'. (on update cascade, on delete cascade,)

                      Alcedinidae

                      RAC Tourist Trophy