Compare dataframe columns with conditions
I have 2 dataframes as below:
df1:
ID col1 col2
1 A1 B1
2 A2 B2
3 A3 B3
4 A4 B4
5 A5 B5
6 A6 B6
df2:
col1 col2
A1 B1
A2 O5
H3 B3
A4 B4
A5 66
A6 C6
Expected Result: I would like to generate a result df with an Error column, based on this condition: each value in col1 and col2 of df1 should exist among the values of the same column in df2.
Expected Result df:
ID col1 col2 Error
1 A1 B1 No mismatch with df2
2 A2 B2 col2 mismatch with df2
3 A3 B3 col1 mismatch with df2
4 A4 B4 No mismatch with df2
5 A5 B5 col2 mismatch with df2
6 A6 B6 col2 mismatch with df2
python pandas dataframe
You do not have list in df2
– Wen-Ben
Nov 22 '18 at 1:25
list is a column in df1 and its value list1 and list2 are just dropdownlist names ; the accepted values are given in columns list1,list2 in df2. So, the data from column "value" of df1 based on its list value should be checked with df2 list1 & list2 values.
– Osceria
Nov 22 '18 at 9:54
Edited the Question
– Osceria
Nov 22 '18 at 11:46
asked Nov 21 '18 at 23:54 by Osceria, edited Nov 22 '18 at 11:45
2 Answers
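Create a helper DataFrame with a dictionary comprehension, comparing each column of df1 against the same column of df2 with isin:
import numpy as np
import pandas as pd

# True marks a df1 value that does not occur in the same-named df2 column
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ['col1','col2']})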
print (m)
col1 col2
0 False False
1 False True
2 True False
3 False False
4 False True
5 False True
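For anyone who wants to reproduce the mask, here is a minimal self-contained sketch. The two frames are typed in from the tables in the question; the construction itself is an assumption about how the data was loaded.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "col1": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "col2": ["B1", "B2", "B3", "B4", "B5", "B6"],
})
df2 = pd.DataFrame({
    "col1": ["A1", "A2", "H3", "A4", "A5", "A6"],
    "col2": ["B1", "O5", "B3", "B4", "66", "C6"],
})

# True where the df1 value is NOT found anywhere in the same-named df2 column
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ["col1", "col2"]})
print(m)
```

Note that isin is a membership test against the whole df2 column, not a comparison against the value in the same row.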
Then use numpy.where with a mask built by any, which tests for at least one True per row, and dot, whose matrix multiplication joins the names of the mismatched columns:
df1['Error'] = np.where(m.any(axis=1),
m.dot(m.columns + ', ').str.rstrip(', ') + ' mismatch with df2',
'No mismatch with df2')
print (df1)
ID col1 col2 Error
0 1 A1 B1 No mismatch with df2
1 2 A2 B2 col2 mismatch with df2
2 3 A3 B3 col1 mismatch with df2
3 4 A4 B4 No mismatch with df2
4 5 A5 B5 col2 mismatch with df2
5 6 A6 B6 col2 mismatch with df2
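If the dot step looks opaque, the sketch below spells out in plain Python what it computes for each row: a boolean times a string keeps or drops the label, and adding the pieces concatenates them. This is an illustration of the idea rather than the answer's exact code; the names labels, joined and result are made up for the example, and the frames are copied from the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "col1": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "col2": ["B1", "B2", "B3", "B4", "B5", "B6"],
})
df2 = pd.DataFrame({
    "col1": ["A1", "A2", "H3", "A4", "A5", "A6"],
    "col2": ["B1", "O5", "B3", "B4", "66", "C6"],
})

m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ["col1", "col2"]})

# Each label carries a trailing separator, as in m.columns + ', '
labels = [c + ", " for c in m.columns]

# Per row: True * label keeps the label, False * label yields '',
# and joining the pieces mirrors the "sum" inside a dot product.
joined = [
    "".join(bool(flag) * label for flag, label in zip(row, labels)).rstrip(", ")
    for row in m.to_numpy()
]

result = df1.copy()
result["Error"] = np.where(
    m.any(axis=1),
    pd.Series(joined, index=m.index) + " mismatch with df2",
    "No mismatch with df2",
)
print(result)
```

When a row mismatches in both columns, the joined label reads "col1, col2", which is why the trailing separator is stripped afterwards.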
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ['col1','col2']}) - col1 and col2 are hard-coded. When I try to pass the column names directly from the dataframe using df.cols, it says the below error "ValueError: Must pass DataFrame with boolean values only" - Any help with this?
– Osceria
Nov 22 '18 at 14:37
code should work if I pass all the columns from the dataframe like this lovcols = df2.columns m = pd.DataFrame({c: ~dfCSDataset[c].isin(dfLOVRules[c]) for c in [lovcols]}
– Osceria
Nov 22 '18 at 14:41
@Osceria - yes, you are right. You can also pass the columns to the dict comprehension, like m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in df2.columns})
– jezrael
Nov 22 '18 at 14:43
yeah, it works in this way too
– Osceria
Nov 22 '18 at 15:07
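To avoid hard-coding the column names, as discussed in the comments above, one option is to iterate over the columns the two frames share. The sketch below uses Index.intersection for that; the variable name shared is hypothetical, and the frames are copied from the question. Iterating over the Index itself (not over a list wrapping it, as in [lovcols]) gives one column name per step.

```python
import pandas as pd

df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "col1": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "col2": ["B1", "B2", "B3", "B4", "B5", "B6"],
})
df2 = pd.DataFrame({
    "col1": ["A1", "A2", "H3", "A4", "A5", "A6"],
    "col2": ["B1", "O5", "B3", "B4", "66", "C6"],
})

# Columns present in both frames; ID drops out because df2 has no ID column
shared = df1.columns.intersection(df2.columns)

# c is a single column name on each iteration, so every value is a boolean Series
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in shared})
print(m.dtypes)
```

Every column of m is then boolean, which is what the later any and dot steps expect.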
Something like this should do the trick, but there may be an easier way.
# Compare row by row, using only the columns df2 also has (df1's ID has no counterpart)
diff = pd.concat([df1[col] == df2[col] for col in df2.columns], axis=1)
def m(row):
    # Collect the names of the columns whose values differ in this row
    mismatches = []
    for col in diff.columns:
        if not row[col]:
            mismatches.append(col)
    if not mismatches:
        return 'No mismatch'
    return 'Mismatches: ' + ', '.join(mismatches)
df1['Error'] = diff.apply(m, axis=1)
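A self-contained variant of the row-wise comparison above, as a sketch: it compares the underlying arrays position by position, which assumes both frames have the same number of rows in the same order. That is a different condition from an isin membership test, although both give the same flags on the sample data. The names cols, describe and errors are made up for the example.

```python
import pandas as pd

df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "col1": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "col2": ["B1", "B2", "B3", "B4", "B5", "B6"],
})
df2 = pd.DataFrame({
    "col1": ["A1", "A2", "H3", "A4", "A5", "A6"],
    "col2": ["B1", "O5", "B3", "B4", "66", "C6"],
})

cols = ["col1", "col2"]

# Position-wise comparison on the raw arrays, so index labels play no role
diff = pd.DataFrame(
    df1[cols].to_numpy() != df2[cols].to_numpy(),
    columns=cols,
    index=df1.index,
)

def describe(row):
    # Names of the columns that differ in this row
    bad = [c for c in cols if row[c]]
    return "Mismatches: " + ", ".join(bad) if bad else "No mismatch"

errors = diff.apply(describe, axis=1)
print(errors)
```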
When I try this, I get the error "ValueError: Can only compare identically-labeled Series objects"
– Osceria
Nov 22 '18 at 10:23
Edited the Question
– Osceria
Nov 22 '18 at 11:46
@Osceria do you get the same error with the following reproducible datasets:
df1 = pd.DataFrame({'col1': ["A1", "A2", "A3", "A4", "A5", "A6"], 'col2': ["B1", "B2", "B3", "B4", "B5", "B6"]})
df2 = pd.DataFrame({'col1': ["A1", "A2", "H3", "A4", "A5", "A6"], 'col2': ["B1", "O5", "B3", "B4", "66", "C6"]})
– leoburgy
Nov 22 '18 at 12:01
It's because your df1 and df2 had different columns, right? I noticed you edited the question now; does it work with those dataframes?
– lieblos
Nov 22 '18 at 12:47
If I run what I answered with the dataframes above, it seems like it works.
– lieblos
Nov 22 '18 at 12:49
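For reference, one common trigger of "Can only compare identically-labeled Series objects" is comparing two Series whose index labels differ. Whether that was the cause in this thread is not known; the sketch below only reproduces the error on made-up Series a and b and shows one way around it.

```python
import pandas as pd

# Same length, but different index labels
a = pd.Series(["A1", "A2", "A3"], index=[0, 1, 2])
b = pd.Series(["A1", "X2", "A3"], index=[10, 11, 12])

try:
    a == b
    raised = False
except ValueError as exc:
    # pandas refuses to compare Series that are not identically labeled
    raised = True
    print(exc)

# Dropping the old labels gives both Series a fresh 0..n-1 index,
# so the comparison becomes position by position
aligned = a.reset_index(drop=True) == b.reset_index(drop=True)
print(aligned.tolist())
```

Resetting the index only makes sense when the rows of both frames really correspond by position.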
answered Nov 22 '18 at 12:08 by jezrael (the isin answer)
answered Nov 22 '18 at 0:20 by lieblos (the row-wise comparison answer)