Python 3 pandas.groupby.filter

I am trying to perform a groupby filter that is very similar to the example in this documentation: pandas groupby filter

>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
...                           'foo', 'bar'],
...                    'B' : [1, 2, 3, 4, 5, 6],
...                    'C' : [2.0, 5., 8., 1., 2., 9.]})
>>> grouped = df.groupby('A')
>>> grouped.filter(lambda x: x['B'].mean() > 3.)
     A  B    C
1  bar  2  5.0
3  bar  4  1.0
5  bar  6  9.0

I am trying to return a DataFrame that has all 3 columns but only 2 rows: for each group of column A, the row with the minimum value of column B. I tried the following line of code:

grouped.filter(lambda x: x['B'] == x['B'].min())

But this doesn't work, and I get this error:
TypeError: filter function returned a Series, but expected a scalar bool

The DataFrame I am trying to return should look like this:

     A  B    C
0  foo  1  2.0
1  bar  2  5.0

I would appreciate any help you can provide. Thank you, in advance, for your help.

asked 6 hours ago by FinProg, edited 1 hour ago by weliketocode

The docstring wording can seem a bit ambiguous: "Return a copy of a DataFrame excluding elements from groups that do not satisfy..." You aren't excluding elements from within groups; you are excluding from the DataFrame all elements of the groups that do not satisfy the single condition.

– ALollz
5 hours ago

Thank you for your help.

– FinProg
1 hour ago

@ALollz: please file a docbug to improve the docstring

– smci
1 hour ago

python pandas dataframe


5 Answers

>>> df.loc[df.groupby('A')['B'].idxmin()]
     A  B    C
1  bar  2  5.0
0  foo  1  2.0

answered 3 hours ago by BallpointBen

Thank you very much for your solution.

– FinProg
1 hour ago


df.groupby('A').apply(lambda x: x.loc[x['B'].idxmin(), ['B','C']]).reset_index()

answered 6 hours ago by kudeh

Thank you very much for your solution.

– FinProg
1 hour ago


No need for groupby :-)

df.sort_values('B').drop_duplicates('A')
Out[288]:
     A  B    C
0  foo  1  2.0
1  bar  2  5.0

answered 5 hours ago by Wen-Ben


There's a fundamental difference: In the documentation example, there is a single Boolean value per group. That is, you return the entire group if the mean is greater than 3. In your example, you want to filter specific rows within a group.

For your task the usual trick is to sort values and use .head or .tail to filter to the row with the smallest or largest value respectively:

df.sort_values('B').groupby('A').head(1)

#     A  B    C
#0  foo  1  2.0
#1  bar  2  5.0

For more complicated queries you can use .transform or .apply to create a Boolean Series to slice with. This is also the safer choice here if multiple rows share the minimum and you need all of them:

df[df.groupby('A').B.transform(lambda x: x == x.min())]

#     A  B    C
#0  foo  1  2.0
#1  bar  2  5.0
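The two approaches only differ when a group has a tied minimum, which the question's data does not. A small sketch of that case follows; the frame `ties` is made up for illustration and is not from the question, and the mask uses the string form `transform('min')` rather than the lambda above.

```python
import pandas as pd

# Hypothetical data: group 'foo' has two rows that share the minimum B.
ties = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar'],
                     'B': [1, 1, 2, 3],
                     'C': [2.0, 8.0, 5.0, 1.0]})

# sort + head(1) keeps exactly one row per group, even when the minimum is tied.
one_per_group = ties.sort_values('B').groupby('A').head(1)

# The transform mask keeps every row whose B equals its group's minimum.
mask = ties['B'] == ties.groupby('A')['B'].transform('min')
all_minima = ties[mask]

print(len(one_per_group))  # 2
print(len(all_minima))     # 3
```

Which of the two is right depends on whether you want one representative row per group or every row that attains the minimum.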

answered 5 hours ago by ALollz, edited 5 hours ago

Thank you very much for your solutions. I really appreciate your help.

– FinProg
1 hour ago


The short answer:

grouped.apply(lambda x: x[x['B'] == x['B'].min()])

... and the longer one:

Your grouped object has 2 groups:

In[25]: for df in grouped:
   ...:     print(df)
   ...:
('bar',
     A  B    C
1  bar  2  5.0
3  bar  4  1.0
5  bar  6  9.0)

('foo',
     A  B    C
0  foo  1  2.0
2  foo  3  8.0
4  foo  5  2.0)

The filter() method of a GroupBy object is for filtering groups as whole entities, NOT for filtering their individual rows. So, using the filter() method, you can obtain only 4 possible results:

an empty DataFrame (0 rows),

rows of the group 'bar' (3 rows),

rows of the group 'foo' (3 rows),

rows of both groups (6 rows)

Nothing else, regardless of the Boolean function you pass to the filter() method.
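As a rough check of the four outcomes listed above, here is a sketch using the question's DataFrame. The thresholds are chosen only to produce each case (the mean of B is 3.0 for 'foo' and 4.0 for 'bar'); the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [2.0, 5., 8., 1., 2., 9.]})
grouped = df.groupby('A')

# Each function returns one scalar bool per group,
# so a group is either kept whole or dropped whole.
none_kept = grouped.filter(lambda g: g['B'].mean() > 100)
bar_only = grouped.filter(lambda g: g['B'].mean() > 3.)
foo_only = grouped.filter(lambda g: g['B'].mean() <= 3.)
both = grouped.filter(lambda g: g['B'].mean() > 0)

print(len(none_kept), len(bar_only), len(foo_only), len(both))  # 0 3 3 6
```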

So you have to use some other method. An appropriate one is the very flexible apply() method, which lets you apply an arbitrary function that

takes a DataFrame (one group of the GroupBy object) as its only parameter,

returns either a pandas object or a scalar.

In your case that function should return, for each of your 2 groups, the 1-row DataFrame that has the minimal value in column 'B', so we will use the Boolean mask

group['B'] == group['B'].min()

to select such a row (or possibly several rows, if the minimum is tied):

In[26]: def select_min_b(group):
   ...:     return group[group['B'] == group['B'].min()]

Now, passing this function to the apply() method of the GroupBy object grouped, we obtain

In[27]: grouped.apply(select_min_b)
Out[27]:
         A  B    C
A
bar 1  bar  2  5.0
foo 0  foo  1  2.0
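The result above carries the group label as an extra index level, while the output asked for in the question has a flat index. One possible way to get that, sketched here as an assumption rather than as part of the original answer, is to create the GroupBy with `group_keys=False` and then sort by the original index. Depending on the pandas version, `apply` may warn about, or leave out, the grouping column `A`.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [2.0, 5., 8., 1., 2., 9.]})

def select_min_b(group):
    # Keep the row(s) of this group whose B equals the group's minimum B.
    return group[group['B'] == group['B'].min()]

# group_keys=False asks pandas not to prepend the group label to the index;
# sort_index() then restores the original row order.
flat = df.groupby('A', group_keys=False).apply(select_min_b).sort_index()

print(list(flat.index))  # [0, 1]
```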

Note:

The same, but as a single command (using a lambda function):

grouped.apply(lambda group: group[group['B'] == group['B'].min()])

answered 5 hours ago by MarianD, edited 4 hours ago

Wow, thank you so very much for your detailed solutions! Thank you for taking the time to provide me with such thorough explanations. I hope to return the favor one day.

– FinProg
1 hour ago


