Populating a data frame using from separate table using loc
up vote
0
down vote
favorite
data1={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5]}
data2={'TERR':[1,2,3,4,5],'CHH':[0,.15,.65,.35,.20],'FSH':[0,.15,.25,.35,.20]}
output={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5],'Test':[0,.15,0,0,0]}
df1=pd.DataFrame(data1)
df2=pd.DataFrame(data2)
df3=pd.DataFrame(output)
Test data above.
I am trying to create a new column in df1 call it df1['Test'], which contains the values of df2['FSH'] based on the following criteria:
- The state is 'TX'
- The Policy Number contains 'FSH'
3.The value of df1["Terr"] = value of df2['TERR']
View df3 for correct output.
What I tried doing was the following;
if df1.State.any()=="TX":
if df1["Policy Number"].str.contains("FSH").any():
for i in df["TERR"]:
df1['% TERR']=df2.loc[[i],["FSH"]]
However, my output is riddled with NAN, as well as 1 unique correct answer.
I tried checking to ensure the correct i values were being fed into df2 via
print(df2.loc[[i],["FSH"]]
and it is printing correctly.
Any thoughts?
python pandas
add a comment |
up vote
0
down vote
favorite
data1={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5]}
data2={'TERR':[1,2,3,4,5],'CHH':[0,.15,.65,.35,.20],'FSH':[0,.15,.25,.35,.20]}
output={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5],'Test':[0,.15,0,0,0]}
df1=pd.DataFrame(data1)
df2=pd.DataFrame(data2)
df3=pd.DataFrame(output)
Test data above.
I am trying to create a new column in df1 call it df1['Test'], which contains the values of df2['FSH'] based on the following criteria:
- The state is 'TX'
- The Policy Number contains 'FSH'
3.The value of df1["Terr"] = value of df2['TERR']
View df3 for correct output.
What I tried doing was the following;
if df1.State.any()=="TX":
if df1["Policy Number"].str.contains("FSH").any():
for i in df["TERR"]:
df1['% TERR']=df2.loc[[i],["FSH"]]
However, my output is riddled with NAN, as well as 1 unique correct answer.
I tried checking to ensure the correct i values were being fed into df2 via
print(df2.loc[[i],["FSH"]]
and it is printing correctly.
Any thoughts?
python pandas
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
data1={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5]}
data2={'TERR':[1,2,3,4,5],'CHH':[0,.15,.65,.35,.20],'FSH':[0,.15,.25,.35,.20]}
output={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5],'Test':[0,.15,0,0,0]}
df1=pd.DataFrame(data1)
df2=pd.DataFrame(data2)
df3=pd.DataFrame(output)
Test data above.
I am trying to create a new column in df1 call it df1['Test'], which contains the values of df2['FSH'] based on the following criteria:
- The state is 'TX'
- The Policy Number contains 'FSH'
3.The value of df1["Terr"] = value of df2['TERR']
View df3 for correct output.
What I tried doing was the following;
if df1.State.any()=="TX":
if df1["Policy Number"].str.contains("FSH").any():
for i in df["TERR"]:
df1['% TERR']=df2.loc[[i],["FSH"]]
However, my output is riddled with NAN, as well as 1 unique correct answer.
I tried checking to ensure the correct i values were being fed into df2 via
print(df2.loc[[i],["FSH"]]
and it is printing correctly.
Any thoughts?
python pandas
data1={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5]}
data2={'TERR':[1,2,3,4,5],'CHH':[0,.15,.65,.35,.20],'FSH':[0,.15,.25,.35,.20]}
output={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5],'Test':[0,.15,0,0,0]}
df1=pd.DataFrame(data1)
df2=pd.DataFrame(data2)
df3=pd.DataFrame(output)
Test data above.
I am trying to create a new column in df1 call it df1['Test'], which contains the values of df2['FSH'] based on the following criteria:
- The state is 'TX'
- The Policy Number contains 'FSH'
3.The value of df1["Terr"] = value of df2['TERR']
View df3 for correct output.
What I tried doing was the following;
if df1.State.any()=="TX":
if df1["Policy Number"].str.contains("FSH").any():
for i in df["TERR"]:
df1['% TERR']=df2.loc[[i],["FSH"]]
However, my output is riddled with NAN, as well as 1 unique correct answer.
I tried checking to ensure the correct i values were being fed into df2 via
print(df2.loc[[i],["FSH"]]
and it is printing correctly.
Any thoughts?
python pandas
python pandas
asked Nov 19 at 18:27
Bjc51192
527
527
add a comment |
add a comment |
4 Answers
4
active
oldest
votes
up vote
1
down vote
I dont know if this is the best or fastest solution but one option is to merge your 2 dataframes then filter based on your conditions then update:
new = df1.merge(df2, on='TERR')
mask = new[((new['State']=='TX') & (new['Policy Number'].str.contains('FSH')))]
df1['Test'] = 0
df1['Test'].update(mask['FSH'])
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
add a comment |
up vote
1
down vote
You can use numpy where by passing conditions,
cond1 = (df1['State'] == 'TX')
cond2 = (df1['Policy Number'].str.contains('FSH'))
cond3 = (df1["TERR"] == df2['TERR'])
df1['Test'] = np.where(cond1 & cond2 & cond3, df2['FSH'], 0)
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
This is a great solution, however how would you take into consideration the case where the shapes do not match exactly?
– Bjc51192
Nov 19 at 20:32
You can use map when the shapes of dataframes do not match but in this case, the mapping is possible only based TERR columns. The other two conditions are still index based
– Vaishali
Nov 19 at 20:37
add a comment |
up vote
0
down vote
Are you just trying to get the data from df2 into df1? If so, you could reshape df2 using melt and then do a merge.
df1['policy_prefix'] = df1['Policy Number'].str[:3]
df2 = df2.melt(id_vars='TERR', value_vars=['CHH', 'FSH'],
value_name='Test',
var_name='policy_prefix')
df1 = df1.merge(df2, on=['policy_prefix', 'TERR'])
if you only want this to apply to rows where the state is 'TX' then you could set the other values to null after the merge:
import numpy as np
df1.loc[df1.State!='TX', 'Test'] = np.nan
add a comment |
up vote
0
down vote
Here is your solution:
# ... initialize df1 and df2 here
df3 = df1.join(df2.FSH) # Merge df1 and df2 into a single dataframe
df3 = df3.rename({"FSH": "TEST"}, axis=1) # Change column name
def set_tx_fsh(row):
if row.State == "TX" and "FSH" in row["Policy Number"]:
return row.TEST
else:
return 0
df3.TEST = df3.apply(set_tx_fsh, axis=1) # Set values in "TEST" column based on your condition
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53380615%2fpopulating-a-data-frame-using-from-separate-table-using-loc%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
I dont know if this is the best or fastest solution but one option is to merge your 2 dataframes then filter based on your conditions then update:
new = df1.merge(df2, on='TERR')
mask = new[((new['State']=='TX') & (new['Policy Number'].str.contains('FSH')))]
df1['Test'] = 0
df1['Test'].update(mask['FSH'])
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
add a comment |
up vote
1
down vote
I dont know if this is the best or fastest solution but one option is to merge your 2 dataframes then filter based on your conditions then update:
new = df1.merge(df2, on='TERR')
mask = new[((new['State']=='TX') & (new['Policy Number'].str.contains('FSH')))]
df1['Test'] = 0
df1['Test'].update(mask['FSH'])
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
add a comment |
up vote
1
down vote
up vote
1
down vote
I dont know if this is the best or fastest solution but one option is to merge your 2 dataframes then filter based on your conditions then update:
new = df1.merge(df2, on='TERR')
mask = new[((new['State']=='TX') & (new['Policy Number'].str.contains('FSH')))]
df1['Test'] = 0
df1['Test'].update(mask['FSH'])
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
I dont know if this is the best or fastest solution but one option is to merge your 2 dataframes then filter based on your conditions then update:
new = df1.merge(df2, on='TERR')
mask = new[((new['State']=='TX') & (new['Policy Number'].str.contains('FSH')))]
df1['Test'] = 0
df1['Test'].update(mask['FSH'])
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
answered Nov 19 at 18:39
Chris
1,5001214
1,5001214
add a comment |
add a comment |
up vote
1
down vote
You can use numpy where by passing conditions,
cond1 = (df1['State'] == 'TX')
cond2 = (df1['Policy Number'].str.contains('FSH'))
cond3 = (df1["TERR"] == df2['TERR'])
df1['Test'] = np.where(cond1 & cond2 & cond3, df2['FSH'], 0)
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
This is a great solution, however how would you take into consideration the case where the shapes do not match exactly?
– Bjc51192
Nov 19 at 20:32
You can use map when the shapes of dataframes do not match but in this case, the mapping is possible only based TERR columns. The other two conditions are still index based
– Vaishali
Nov 19 at 20:37
add a comment |
up vote
1
down vote
You can use numpy where by passing conditions,
cond1 = (df1['State'] == 'TX')
cond2 = (df1['Policy Number'].str.contains('FSH'))
cond3 = (df1["TERR"] == df2['TERR'])
df1['Test'] = np.where(cond1 & cond2 & cond3, df2['FSH'], 0)
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
This is a great solution, however how would you take into consideration the case where the shapes do not match exactly?
– Bjc51192
Nov 19 at 20:32
You can use map when the shapes of dataframes do not match but in this case, the mapping is possible only based TERR columns. The other two conditions are still index based
– Vaishali
Nov 19 at 20:37
add a comment |
up vote
1
down vote
up vote
1
down vote
You can use numpy where by passing conditions,
cond1 = (df1['State'] == 'TX')
cond2 = (df1['Policy Number'].str.contains('FSH'))
cond3 = (df1["TERR"] == df2['TERR'])
df1['Test'] = np.where(cond1 & cond2 & cond3, df2['FSH'], 0)
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
You can use numpy where by passing conditions,
cond1 = (df1['State'] == 'TX')
cond2 = (df1['Policy Number'].str.contains('FSH'))
cond3 = (df1["TERR"] == df2['TERR'])
df1['Test'] = np.where(cond1 & cond2 & cond3, df2['FSH'], 0)
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
answered Nov 19 at 19:23
Vaishali
17.1k31027
17.1k31027
This is a great solution, however how would you take into consideration the case where the shapes do not match exactly?
– Bjc51192
Nov 19 at 20:32
You can use map when the shapes of dataframes do not match but in this case, the mapping is possible only based TERR columns. The other two conditions are still index based
– Vaishali
Nov 19 at 20:37
add a comment |
This is a great solution, however how would you take into consideration the case where the shapes do not match exactly?
– Bjc51192
Nov 19 at 20:32
You can use map when the shapes of dataframes do not match but in this case, the mapping is possible only based TERR columns. The other two conditions are still index based
– Vaishali
Nov 19 at 20:37
This is a great solution, however how would you take into consideration the case where the shapes do not match exactly?
– Bjc51192
Nov 19 at 20:32
This is a great solution, however how would you take into consideration the case where the shapes do not match exactly?
– Bjc51192
Nov 19 at 20:32
You can use map when the shapes of dataframes do not match but in this case, the mapping is possible only based TERR columns. The other two conditions are still index based
– Vaishali
Nov 19 at 20:37
You can use map when the shapes of dataframes do not match but in this case, the mapping is possible only based TERR columns. The other two conditions are still index based
– Vaishali
Nov 19 at 20:37
add a comment |
up vote
0
down vote
Are you just trying to get the data from df2 into df1? If so, you could reshape df2 using melt and then do a merge.
df1['policy_prefix'] = df1['Policy Number'].str[:3]
df2 = df2.melt(id_vars='TERR', value_vars=['CHH', 'FSH'],
value_name='Test',
var_name='policy_prefix')
df1 = df1.merge(df2, on=['policy_prefix', 'TERR'])
if you only want this to apply to rows where the state is 'TX' then you could set the other values to null after the merge:
import numpy as np
df1.loc[df1.State!='TX', 'Test'] = np.nan
add a comment |
up vote
0
down vote
Are you just trying to get the data from df2 into df1? If so, you could reshape df2 using melt and then do a merge.
df1['policy_prefix'] = df1['Policy Number'].str[:3]
df2 = df2.melt(id_vars='TERR', value_vars=['CHH', 'FSH'],
value_name='Test',
var_name='policy_prefix')
df1 = df1.merge(df2, on=['policy_prefix', 'TERR'])
if you only want this to apply to rows where the state is 'TX' then you could set the other values to null after the merge:
import numpy as np
df1.loc[df1.State!='TX', 'Test'] = np.nan
add a comment |
up vote
0
down vote
up vote
0
down vote
Are you just trying to get the data from df2 into df1? If so, you could reshape df2 using melt and then do a merge.
df1['policy_prefix'] = df1['Policy Number'].str[:3]
df2 = df2.melt(id_vars='TERR', value_vars=['CHH', 'FSH'],
value_name='Test',
var_name='policy_prefix')
df1 = df1.merge(df2, on=['policy_prefix', 'TERR'])
if you only want this to apply to rows where the state is 'TX' then you could set the other values to null after the merge:
import numpy as np
df1.loc[df1.State!='TX', 'Test'] = np.nan
Are you just trying to get the data from df2 into df1? If so, you could reshape df2 using melt and then do a merge.
df1['policy_prefix'] = df1['Policy Number'].str[:3]
df2 = df2.melt(id_vars='TERR', value_vars=['CHH', 'FSH'],
value_name='Test',
var_name='policy_prefix')
df1 = df1.merge(df2, on=['policy_prefix', 'TERR'])
if you only want this to apply to rows where the state is 'TX' then you could set the other values to null after the merge:
import numpy as np
df1.loc[df1.State!='TX', 'Test'] = np.nan
edited Nov 19 at 18:45
answered Nov 19 at 18:40
Robert
33429
33429
add a comment |
add a comment |
up vote
0
down vote
Here is your solution:
# ... initialize df1 and df2 here
df3 = df1.join(df2.FSH) # Merge df1 and df2 into a single dataframe
df3 = df3.rename({"FSH": "TEST"}, axis=1) # Change column name
def set_tx_fsh(row):
if row.State == "TX" and "FSH" in row["Policy Number"]:
return row.TEST
else:
return 0
df3.TEST = df3.apply(set_tx_fsh, axis=1) # Set values in "TEST" column based on your condition
add a comment |
up vote
0
down vote
Here is your solution:
# ... initialize df1 and df2 here
df3 = df1.join(df2.FSH) # Merge df1 and df2 into a single dataframe
df3 = df3.rename({"FSH": "TEST"}, axis=1) # Change column name
def set_tx_fsh(row):
if row.State == "TX" and "FSH" in row["Policy Number"]:
return row.TEST
else:
return 0
df3.TEST = df3.apply(set_tx_fsh, axis=1) # Set values in "TEST" column based on your condition
add a comment |
up vote
0
down vote
up vote
0
down vote
Here is your solution:
# ... initialize df1 and df2 here
df3 = df1.join(df2.FSH) # Merge df1 and df2 into a single dataframe
df3 = df3.rename({"FSH": "TEST"}, axis=1) # Change column name
def set_tx_fsh(row):
if row.State == "TX" and "FSH" in row["Policy Number"]:
return row.TEST
else:
return 0
df3.TEST = df3.apply(set_tx_fsh, axis=1) # Set values in "TEST" column based on your condition
Here is your solution:
# ... initialize df1 and df2 here
df3 = df1.join(df2.FSH) # Merge df1 and df2 into a single dataframe
df3 = df3.rename({"FSH": "TEST"}, axis=1) # Change column name
def set_tx_fsh(row):
if row.State == "TX" and "FSH" in row["Policy Number"]:
return row.TEST
else:
return 0
df3.TEST = df3.apply(set_tx_fsh, axis=1) # Set values in "TEST" column based on your condition
answered Nov 19 at 18:47
jadelord
463411
463411
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53380615%2fpopulating-a-data-frame-using-from-separate-table-using-loc%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown