How to implement 'in' and 'not in' for Pandas dataframe












182














How can I achieve the equivalents of SQL's IN and NOT IN?



I have a list with the required values.
Here's the scenario:



import pandas as pd

df = pd.DataFrame({'countries':['US','UK','Germany','China']})
countries = ['UK','China']

# pseudo-code:
df[df['countries'] not in countries]


My current way of doing this is as follows:



df = pd.DataFrame({'countries':['US','UK','Germany','China']})
countries = pd.DataFrame({'countries':['UK','China'], 'matched':True})

# IN
df.merge(countries,how='inner',on='countries')

# NOT IN
not_in = df.merge(countries,how='left',on='countries')
not_in = not_in[pd.isnull(not_in['matched'])]


But this seems like a horrible kludge. Can anyone improve on it?






























  • 1




    I think your solution is the best solution. Yours can cover IN, NOT_IN of multiple columns.
    – Bruce Jung
    Mar 17 '15 at 1:55










  • Do you want to test on single column or multiple columns?
    – smci
    Jul 17 '15 at 20:26










  • Related (performance / pandas internals): Pandas pd.Series.isin performance with set versus array
    – jpp
    Jun 28 at 0:06
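
For the multi-column case raised in the comments above, one possible sketch keeps the merge idea from the question but lets pandas add the marker column itself through the indicator parameter of merge. The two-column data and the names wanted, in_rows and not_in_rows are assumptions made up for illustration; they are not from the original post.

```python
import pandas as pd

# Hypothetical two-column data; names and values are illustrative only.
df = pd.DataFrame({
    "country": ["US", "UK", "Germany", "China"],
    "year": [2012, 2012, 2013, 2013],
})
wanted = pd.DataFrame({
    "country": ["UK", "China"],
    "year": [2012, 2014],
})

# indicator=True adds a "_merge" column holding "both", "left_only" or "right_only".
merged = df.merge(wanted, how="left", on=["country", "year"], indicator=True)

# IN: the (country, year) pair also appears in `wanted`.
in_rows = merged.loc[merged["_merge"] == "both", ["country", "year"]]

# NOT IN: the pair appears only in `df`.
not_in_rows = merged.loc[merged["_merge"] == "left_only", ["country", "year"]]
```

If wanted can contain duplicate pairs, dropping them first (wanted.drop_duplicates()) avoids repeated rows in the merged result.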





































Tags: python, pandas, dataframe, sql-function














edited Jul 17 '15 at 20:25 by smci

asked Nov 13 '13 at 17:11 by LondonRob




















5 Answers


















413














You can use pd.Series.isin.



For "IN" use: something.isin(somewhere)



Or for "NOT IN": ~something.isin(somewhere)



As a worked example:



>>> df
  countries
0        US
1        UK
2   Germany
3     China
>>> countries
['UK', 'China']
>>> df.countries.isin(countries)
0    False
1     True
2    False
3     True
Name: countries, dtype: bool
>>> df[df.countries.isin(countries)]
  countries
1        UK
3     China
>>> df[~df.countries.isin(countries)]
  countries
0        US
2   Germany
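
The same idea can be written as a short script. This is only a sketch: the list is renamed to wanted (a name chosen here, not in the original post) so it is not confused with the countries column, and the boolean mask is stored once and reused for both filters.

```python
import pandas as pd

df = pd.DataFrame({"countries": ["US", "UK", "Germany", "China"]})
wanted = ["UK", "China"]  # hypothetical name, chosen to avoid clashing with the column

# Boolean Series: True where the row's value appears in `wanted`.
mask = df["countries"].isin(wanted)

in_rows = df[mask]       # SQL-style IN
not_in_rows = df[~mask]  # SQL-style NOT IN (~ inverts the boolean mask)
```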
























  • 27




    isin is not inverse sin()? :D
    – Kos
    Nov 13 '13 at 17:15






  • 1




    Just an FYI, the @LondonRob had his as a DataFrame and yours is a Series. DataFrame's isin was added in .13.
    – TomAugspurger
    Nov 13 '13 at 18:07










  • Any suggestions for how to do this with pandas 0.12.0? It's the current released version. (Maybe I should just wait for 0.13?!)
    – LondonRob
    Nov 13 '13 at 18:41






  • 2




    @TomAugspurger: like usual, I'm probably missing something. df, both mine and his, is a DataFrame. countries is a list. df[~df.countries.isin(countries)] produces a DataFrame, not a Series, and seems to work even back in 0.11.0.dev-14a04dd.
    – DSM
    Nov 14 '13 at 16:10






  • 2




    This answer is confusing because you keep reusing the countries variable. Well, the OP does it, and that's inherited, but that something is done badly before does not justify doing it badly now.
    – ifly6
    May 18 at 22:20
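
The comments above mention that DataFrame also has an isin method. A minimal sketch of how it might be used with a dict follows; the extra code column and its values are hypothetical and only there to show per-column matching.

```python
import pandas as pd

df = pd.DataFrame({
    "countries": ["US", "UK", "Germany", "China"],
    "code": [1, 44, 49, 86],  # hypothetical second column
})

# With a dict, keys are column names and values are the allowed entries
# for that column; the result is a boolean DataFrame of the same shape.
flags = df.isin({"countries": ["UK", "China"], "code": [44, 49]})

# Keep rows where every column matched its own list.
both = df[flags.all(axis=1)]
```

For a single column, Series.isin as shown in the answer is usually the simpler choice.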





















17














An alternative solution uses the .query() method:

In [5]: df.query("countries in @countries")
Out[5]:
  countries
1        UK
3     China

In [6]: df.query("countries not in @countries")
Out[6]:
  countries
0        US
2   Germany
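
As a self-contained sketch of the query approach: inside the query string, a bare name refers to a column and an @-prefixed name refers to a Python variable in the calling scope. The variable name wanted is an assumption used here to keep the two apart.

```python
import pandas as pd

df = pd.DataFrame({"countries": ["US", "UK", "Germany", "China"]})
wanted = ["UK", "China"]  # hypothetical name; referenced as @wanted in the query

# `countries` is the column, `@wanted` is the local Python list.
in_rows = df.query("countries in @wanted")
not_in_rows = df.query("countries not in @wanted")
```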






















  • 3




    Note that this is currently marked as "experimental" in the docs...
    – LondonRob
    Jul 19 '17 at 14:49



















9














I usually do generic filtering over rows like this:



criterion = lambda row: row['countries'] not in countries
not_in = df[df.apply(criterion, axis=1)]






















  • 6




    FYI, this is much slower than @DSM soln which is vectorized
    – Jeff
    Nov 13 '13 at 17:47










  • @Jeff I'd expect that, but that's what I fall back to when I need to filter over something unavailable in pandas directly. (I was about to say "like .startswith or regex matching", but just found out about Series.str that has all of that!)
    – Kos
    Nov 14 '13 at 7:42
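
To make the trade-off discussed in these comments concrete, here is a small sketch comparing the row-wise apply filter with vectorized alternatives. It assumes the same toy data as the question; the names wanted, slow, fast and starts_with_u are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"countries": ["US", "UK", "Germany", "China"]})
wanted = ["UK", "China"]  # hypothetical name for the lookup list

# Row-wise filter: calls the Python lambda once per row.
slow = df[df.apply(lambda row: row["countries"] not in wanted, axis=1)]

# Vectorized filter: same result without a per-row Python call.
fast = df[~df["countries"].isin(wanted)]

# String predicates are also available in vectorized form via the .str accessor.
starts_with_u = df[df["countries"].str.startswith("U")]
```

The apply form remains a reasonable fallback when no vectorized method exists for the condition being tested.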



















1














I wanted to filter out the dfbc rows whose BUSINESS_ID also appeared in the BUSINESS_ID column of dfProfilesBusIds.



Finally got it working:



dfbc = dfbc[(dfbc['BUSINESS_ID'].isin(dfProfilesBusIds['BUSINESS_ID']) == False)]
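
A sketch of the same filter on made-up data, showing that comparing the mask to False and negating it with ~ select the same rows. The frames below are hypothetical stand-ins for dfbc and dfProfilesBusIds, and the name column is invented for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames named in the answer.
dfbc = pd.DataFrame({"BUSINESS_ID": [1, 2, 3, 4], "name": list("abcd")})
profiles = pd.DataFrame({"BUSINESS_ID": [2, 4, 5]})

# True where the ID also occurs in the other frame's column.
exclude = dfbc["BUSINESS_ID"].isin(profiles["BUSINESS_ID"])

via_compare = dfbc[exclude == False]  # form used in the answer
via_negate = dfbc[~exclude]           # negated form
```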






















  • 3




    You can negate the isin (as done in the accepted answer) rather than comparing to False
    – cricket_007
    Jul 19 '17 at 12:17










  • This solution is working for me. Thank you
    – Malek B.
    Jun 21 at 13:33



















1














import numpy as np
import pandas as pd

df = pd.DataFrame({'countries':['US','UK','Germany','China']})
countries = ['UK','China']

To implement IN:

df[df.countries.isin(countries)]

To implement NOT IN, apply IN to the remaining countries:

df[df.countries.isin([x for x in np.unique(df.countries) if x not in countries])]


































Answer using pd.Series.isin: answered Nov 13 '13 at 17:13 by DSM, edited Apr 15 at 17:52 by jpp


















Answer using .query(): answered Jul 19 '17 at 12:19 by MaxU


















Answer using df.apply: answered Nov 13 '13 at 17:14 by Kos


















Answer filtering on BUSINESS_ID: answered Jul 13 '17 at 3:12 by Sam Henderson


















Answer using np.unique: answered Apr 4 at 11:51 by Ioannis Nasios





























