Pandas Split DataFrame using row index

I want to split a DataFrame into chunks with an uneven number of rows, using the row index.

The code below:

groups = df.groupby((np.arange(len(df.index))/l[1]).astype(int))

works only when every chunk has the same number of rows.

df

a b c
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7

l = [2, 5, 7]

df1
1 1 1
2 2 2

df2
3 3 3
4 4 4
5 5 5

df3
6 6 6
7 7 7

df4
8 8 8

asked Nov 20 '18 at 10:51 by Pradeep Tummala
edited Nov 20 '18 at 11:04 by anky_91

Have you tried df.loc?
– Mohit Motwani
Nov 20 '18 at 11:10

Do you want to split randomly or do you have some set of indexes you'd like to split with?
– Mohit Motwani
Nov 20 '18 at 11:12

Not random. I would like to split based on the array l: the first 2 rows, then the 3rd to 5th rows, and so on.
– Pradeep Tummala
Nov 21 '18 at 7:37
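
To see why the groupby expression in the question only produces equal-sized chunks, it can help to look at the labels it generates. The sketch below is an illustration under assumptions, not the asker's exact setup: it assumes an 8-row frame like the one used in the answers and a hypothetical fixed chunk size k = 3.

```python
import numpy as np
import pandas as pd

# Assumed sample data: 8 rows, columns a, b, c all holding 1..8.
df = pd.DataFrame({col: np.arange(1, 9) for col in "abc"})

k = 3  # hypothetical fixed chunk size
labels = np.arange(len(df)) // k  # integer division: the label changes every k rows
print(labels)  # [0 0 0 1 1 1 2 2]

# Every group has k rows, except possibly the last one.
sizes = df.groupby(labels).size().tolist()
print(sizes)  # [3, 3, 2]
```

Because the label only depends on the row position divided by a constant, uneven chunk sizes need a different label array or explicit slicing.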

Tags: python, pandas, dataframe, pandas-groupby

5 Answers

You could use a list comprehension after first making a small modification to your list, l.

print(df)

   a  b  c
0  1  1  1
1  2  2  2
2  3  3  3
3  4  4  4
4  5  5  5
5  6  6  6
6  7  7  7
7  8  8  8

l = [2,5,7]
l_mod = [0] + l + [max(l)+1]

list_of_dfs = [df.iloc[l_mod[n]:l_mod[n+1]] for n in range(len(l_mod)-1)]

Output:

list_of_dfs[0]

   a  b  c
0  1  1  1
1  2  2  2

list_of_dfs[1]

   a  b  c
2  3  3  3
3  4  4  4
4  5  5  5

list_of_dfs[2]

   a  b  c
5  6  6  6
6  7  7  7

list_of_dfs[3]

   a  b  c
7  8  8  8

answered Nov 20 '18 at 14:40 by Scott Boston

Thanks. Works pretty well in minimum lines
– Pradeep Tummala
Nov 21 '18 at 7:47

@PradeepTummala If this answer helped you, would you consider upvoting and accepting it?
– Scott Boston
Dec 4 '18 at 14:20
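
The list-comprehension idea above can be wrapped in a small function that takes the final boundary from len(df) rather than max(l)+1, so a tail of any length is kept. This is only a sketch: split_at is a hypothetical helper name, not a pandas function, and it assumes the cut positions are sorted and lie within the frame.

```python
import numpy as np
import pandas as pd


def split_at(df, cuts):
    """Split df into consecutive chunks at the given row positions.

    split_at is a hypothetical helper, not part of pandas.
    Assumes cuts is sorted and every cut is within 0..len(df).
    """
    bounds = [0, *cuts, len(df)]
    # Pair each boundary with the next one, e.g. (0, 2), (2, 5), (5, 7), (7, 8).
    return [df.iloc[start:stop] for start, stop in zip(bounds, bounds[1:])]


# Assumed sample data: 8 rows, columns a, b, c all holding 1..8.
df = pd.DataFrame({col: np.arange(1, 9) for col in "abc"})
chunks = split_at(df, [2, 5, 7])
print([len(chunk) for chunk in chunks])  # [2, 3, 2, 1]
```

If the last cut equals len(df), the final chunk is an empty DataFrame, which callers may want to drop.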


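I think this is what you are looking for:

l = [2, 5, 7]
dfs = []  # list that collects the chunks
start = 0
for val in l:
    # rows from the previous boundary up to, but not including, val
    dfs.append(df.iloc[start:val])
    start = val
if start < len(df):
    # any rows left after the last boundary
    dfs.append(df.iloc[start:])

Another solution:

import numpy as np
import pandas as pd

l = [2, 5, 7]
t = np.arange(l[-1])  # one label slot per row; assumes len(df) == l[-1]
for val in reversed(l):
    # label the first val rows with val; smaller boundaries overwrite larger ones
    t[:val] = val
temp = pd.DataFrame(t)
temp = pd.concat([df, temp], axis=1)
for u, v in temp.groupby(0):
    print(v)

Output:

   a  b  c  0
0  1  1  1  2
1  2  2  2  2
   a  b  c  0
2  3  3  3  5
3  4  4  4  5
4  5  5  5  5
   a  b  c  0
5  6  6  6  7
6  7  7  7  7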

answered Nov 20 '18 at 11:13 by Mohamed Thasin ah
edited Nov 20 '18 at 11:31
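
The label array t used in the second solution can also be built without a loop. The sketch below swaps in np.diff and np.repeat, which is an alternative technique and not the answer author's method; it assumes, as the answer does, that l holds sorted chunk end positions and that the frame has l[-1] rows.

```python
import numpy as np

l = [2, 5, 7]              # chunk end positions, as in the answer
sizes = np.diff([0, *l])   # chunk lengths: [2 3 2]
t = np.repeat(l, sizes)    # repeat each end position once per row in its chunk
print(t)                   # [2 2 5 5 5 7 7]
```

The resulting array can be passed directly to df.groupby(t) for a frame with seven rows, which avoids adding a helper column.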


Do this:

l = [2,5,7]
c = 0
d = dict()  # A dictionary to hold multiple dataframes

# Note: the masks below compare the values in column a with l, so this
# relies on column a matching the 1-based row position (1, 2, 3, ...).
for i in l:
    if c == 0:
        index_list = df[df.a <= i].index
    else:
        index_list = df[(df.a > l[c-1]) & (df.a <= l[c])].index
    min_index = index_list[0]
    max_index = index_list[-1] + 1
    d[i] = df.iloc[min_index:max_index]
    c += 1

for key in d.keys():
    print(d[key])

   a  b  c
0  1  1  1
1  2  2  2
   a  b  c
2  3  3  3
3  4  4  4
4  5  5  5
   a  b  c
5  6  6  6
6  7  7  7

answered Nov 20 '18 at 11:20 by Mayank Porwal
edited Nov 20 '18 at 12:24


You can create an array to use for indexing via NumPy:

import pandas as pd, numpy as np

df = pd.DataFrame(np.arange(24).reshape((8, 3)), columns=list('abc'))

L = [2, 5, 7]
# Mark the rows where a new chunk starts, then take the cumulative sum
# to get one group label per row.
# (Newer NumPy versions recommend np.isin in place of np.in1d.)
idx = np.cumsum(np.in1d(np.arange(len(df.index)), L))

for _, chunk in df.groupby(idx):
    print(chunk, '\n')

   a  b  c
0  0  1  2
1  3  4  5

    a   b   c
2   6   7   8
3   9  10  11
4  12  13  14

    a   b   c
5  15  16  17
6  18  19  20

    a   b   c
7  21  22  23

Instead of defining a new variable for each dataframe, you can use a dictionary:

d = dict(tuple(df.groupby(idx)))

print(d[1])  # print second groupby value

    a   b   c
2   6   7   8
3   9  10  11
4  12  13  14

answered Nov 20 '18 at 14:04 by jpp
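
It may help to look at the intermediate arrays behind idx in the NumPy answer above. The sketch below assumes the same 8 rows and the same L; it uses np.isin, which for 1-D inputs like these gives the same result as the np.in1d call in the answer.

```python
import numpy as np

n_rows = 8      # assumed number of rows, as in the answer's example
L = [2, 5, 7]   # row positions where a new chunk starts

positions = np.arange(n_rows)
starts = np.isin(positions, L)  # True exactly at positions 2, 5 and 7
idx = np.cumsum(starts)         # running count of chunk starts seen so far

print(starts.astype(int))  # [0 0 1 0 0 1 0 1]
print(idx)                 # [0 0 1 1 1 2 2 3]
```

Rows that share a value in idx end up in the same group, so the group sizes here are 2, 3, 2 and 1.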


I think this is what you need:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(1, 8),
                  'b': np.arange(1, 8),
                  'c': np.arange(1, 8)})
df

    a   b   c
0   1   1   1
1   2   2   2
2   3   3   3
3   4   4   4
4   5   5   5
5   6   6   6
6   7   7   7

last_check = 0
dfs = []  # list that collects the chunks
for ind in [2, 5, 7]:
    # .loc slices by label and includes the end label, hence ind-1
    dfs.append(df.loc[last_check:ind-1])
    last_check = ind

Although list comprehensions are often faster than an explicit for loop, the last_check variable is needed when your list of indices does not follow a regular pattern.

dfs[0]

    a   b   c
0   1   1   1
1   2   2   2

dfs[2]

    a   b   c
5   6   6   6
6   7   7   7

answered Nov 21 '18 at 9:37 by Mohit Motwani
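
The df.loc[last_check:ind-1] slice in the answer above works because the frame has the default integer index, where labels and positions coincide, and because .loc slices include the end label. A minimal sketch of the difference, assuming a hypothetical frame with string row labels:

```python
import pandas as pd

# Assumed example: values 1..7 with string row labels instead of 0..6.
df = pd.DataFrame({"a": range(1, 8)}, index=list("tuvwxyz"))

by_position = df.iloc[0:2]   # positions 0 and 1; the stop is excluded
by_label = df.loc["t":"u"]   # labels 't' through 'u'; the stop is included

print(by_position["a"].tolist())  # [1, 2]
print(by_label["a"].tolist())     # [1, 2]
```

When the split points are row positions rather than labels, .iloc is likely the safer choice, since it does not depend on what the index contains.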


