Dropping DataFrame rows based on values in another DataFrame
I am working on the IPL dataset from Kaggle (https://www.kaggle.com/manasgarg/ipl). It has two .csv files with a primary key to connect the data.
I want to drop the rows where the batting team lost the match.
df_deliv has the batting team
df_match has the winner of the match
I achieved this with the code below, but it's very slow because of the for loop.
import pandas as pd
import numpy as np
df_deliv = pd.read_csv("deliveries.csv")
df_match = pd.read_csv("matches.csv")
df_deliv = df_deliv[["match_id", "batting_team", "batsman", "batsman_runs"]]
df_deliv["winner"] = [df_match.loc[i-1]["winner"] for i in df_deliv["match_id"]] #makes it very slow
df_deliv.drop(df_deliv[df_deliv["batting_team"] != df_deliv["winner"]].index, inplace = True)
print(df_deliv)
Is there a way to do this in a single df.drop statement instead of the for loop?
python pandas
Please post a reproducible example. Why don't you join them and then just filter by the conditions you want instead of using a drop?
– Manrique
Nov 21 '18 at 17:44
You could probably join the two dataframes using merge(). Please post df_deliv.head() and df_match.head() so we can see the structure of the dataframes and offer a more complete solution.
– Gal Sivan
Nov 21 '18 at 17:45
@AntonioManrique Sir, I am very new to asking questions and to data science. Please let me know what a reproducible example is.
– Yash Mishra
Nov 21 '18 at 18:29
@YashMishra Of course I can :) It basically means posting the code that allows us to reproduce your dataset and your error. Here is a better explanation: stackoverflow.com/questions/20109391/…
– Manrique
Nov 21 '18 at 19:03
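A minimal sketch of the join-then-filter idea suggested in the comments above, in case it helps. The sample rows are made up, and the key column name id on the matches side is an assumption (the question only shows match_id on the deliveries side), so adjust it to whatever the real CSV uses.

```python
import pandas as pd

# Made-up stand-ins for deliveries.csv and matches.csv.
df_deliv = pd.DataFrame({
    "match_id": [1, 1, 2, 2],
    "batting_team": ["A", "B", "A", "B"],
    "batsman": ["p1", "p2", "p1", "p2"],
    "batsman_runs": [4, 1, 0, 6],
})
# Assumption: the matches table has an "id" key and a "winner" column.
df_match = pd.DataFrame({"id": [1, 2], "winner": ["A", "B"]})

# Attach each delivery's match winner with a left join on the key.
merged = df_deliv.merge(
    df_match[["id", "winner"]],
    left_on="match_id",
    right_on="id",
    how="left",
).drop(columns="id")

# Keep only deliveries where the batting team won the match.
winners_only = merged[merged["batting_team"] == merged["winner"]]
print(winners_only)
```

A left join keeps every delivery even when a match has no recorded winner; those rows get NaN in winner and are dropped by the comparison.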
asked Nov 21 '18 at 17:42 by Yash Mishra
edited Nov 21 '18 at 18:46
1 Answer
Instead of dropping, you can just filter the rows that you need. Something like this:
df_deliv = df_deliv[df_deliv['batting_team'] == df_deliv['winner']]
answered Nov 21 '18 at 17:52 by Ronnie
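Note that the filter above presumes df_deliv already has a winner column, which is the step the question found slow. One possible loop-free way to build that column is Series.map with a lookup keyed by match id. This is only a sketch with made-up rows; the id column name in df_match is an assumption, not something shown in the question.

```python
import pandas as pd

# Made-up rows; match ids are deliberately non-contiguous.
df_deliv = pd.DataFrame({
    "match_id": [7, 7, 9, 9],
    "batting_team": ["A", "B", "A", "B"],
    "batsman": ["p1", "p2", "p1", "p2"],
    "batsman_runs": [4, 1, 0, 6],
})
# Assumption: the matches table has an "id" key and a "winner" column.
df_match = pd.DataFrame({"id": [7, 9], "winner": ["B", "A"]})

# Build a Series that maps match id -> winner, then look up every row at once.
winner_by_match = df_match.set_index("id")["winner"]
df_deliv["winner"] = df_deliv["match_id"].map(winner_by_match)

# Boolean filter, as in the answer: keep rows where the batting team won.
df_deliv = df_deliv[df_deliv["batting_team"] == df_deliv["winner"]]
print(df_deliv)
```

Looking up by the id value rather than by position (the i-1 in the question) means the result does not depend on match ids being consecutive or on the row order of df_match.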