Schedule YARN application on active/standby nodes
I would like to have a cluster that is split into two sub-clusters: "active" nodes and "standby" nodes.
Normally, when an application is scheduled, I would like it to run on the "active" nodes. But if no "active" node is healthy, I would like it to run on the "standby" nodes.
Is there a way to achieve such behavior in YARN?
To give a bit more detail, the "active" nodes of the cluster will be located in a different zone than the "standby" nodes (but not far from them).
We are thus trying to achieve multi-zone high availability for our application: upon a disaster in the "active" zone, the application would be recovered and scheduled in the "standby" zone.
yarn
What version of Hadoop are you running?
– tk421
Nov 22 '18 at 1:31
Currently we are just checking our options. We are open to any version that gives us that functionality. Thanks.
– Shay
Nov 22 '18 at 8:32
asked Nov 21 '18 at 21:28 by Shay
1 Answer
To route jobs to specific nodes, you will need Node Labels. The Capacity Scheduler has had them for a while (Hadoop 2.6 or earlier), but for the Fair Scheduler I think support was only planned for Hadoop 3.x.
Another option to consider is YARN Federation, where you run more than one YARN cluster: the second cluster would be in zone 2, and you can re-route your job to zone 2 if zone 1 has issues.
References
- YARN Node Labels
- Hadoop: YARN Federation
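As a rough illustration of the Node Labels route, the sketch below builds Capacity Scheduler properties for two queues, each pinned to one label. The queue names `active` and `standby`, the even split of the default partition, and the helper name `label_queue_properties` are assumptions made for illustration. The property keys follow the pattern described in the Node Labels documentation, but verify them against the docs for your Hadoop version.

```python
def label_queue_properties(labels=("active", "standby")):
    """Build capacity-scheduler.xml properties for one queue per node label.

    Assumption: each queue is named after its label and gets 100% of that
    label's partition; the default (unlabeled) partition is split evenly.
    """
    prefix = "yarn.scheduler.capacity.root"
    props = {
        f"{prefix}.queues": ",".join(labels),
        f"{prefix}.accessible-node-labels": ",".join(labels),
    }
    share = 100 / len(labels)
    for label in labels:
        queue = f"{prefix}.{label}"
        # The documented examples also set the label capacity on root itself.
        props[f"{prefix}.accessible-node-labels.{label}.capacity"] = "100"
        props[f"{queue}.capacity"] = f"{share:g}"
        props[f"{queue}.accessible-node-labels"] = label
        props[f"{queue}.accessible-node-labels.{label}.capacity"] = "100"
        # Requests in this queue default to this label unless the app overrides it.
        props[f"{queue}.default-node-label-expression"] = label
    return props


if __name__ == "__main__":
    for key, value in sorted(label_queue_properties().items()):
        print(f"{key}={value}")
```

The labels themselves still have to be enabled (`yarn.node-labels.enabled`), created with `yarn rmadmin -addToClusterNodeLabels`, and attached to hosts with `yarn rmadmin -replaceLabelsOnNode`.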
Thanks @tk421. Using Node Labels, can I configure something like "prefer nodes with the 'active' label, and if none are healthy, select others"? As far as I understand, I can't (though in k8s it is possible).
– Shay
Nov 29 '18 at 17:48
You'd have to do this via scheduling queues. Node health is handled by YARN automatically, meaning that if your NodeManager is unusable, it will mark itself as unavailable. See hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/….
– tk421
Nov 29 '18 at 20:06
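Since labels alone do not express "prefer active, otherwise standby", one way to read the queue suggestion is to make that choice on the submitting side. The sketch below is only that: it assumes two queues named `active` and `standby`, and a list of node records shaped like `{"state": ..., "labels": [...]}` that you would collect yourself (for example from the ResourceManager's node listing). The field names and the helper `choose_queue` are hypothetical.

```python
def choose_queue(nodes, preferred="active", fallback="standby"):
    """Return the queue to submit to, preferring `preferred` while it has a usable node.

    Assumption: a node is usable when its state is "RUNNING"; YARN itself moves
    unhealthy or lost NodeManagers out of that state.
    """
    for node in nodes:
        if node.get("state") == "RUNNING" and preferred in node.get("labels", []):
            return preferred
    return fallback


if __name__ == "__main__":
    cluster = [
        {"host": "a1", "state": "UNHEALTHY", "labels": ["active"]},
        {"host": "a2", "state": "LOST", "labels": ["active"]},
        {"host": "s1", "state": "RUNNING", "labels": ["standby"]},
    ]
    # No "active" node is RUNNING here, so the fallback queue is chosen.
    print(choose_queue(cluster))
```

This only covers the choice at submission time; recovering an application that was already running when the "active" zone failed is a separate concern.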
answered Nov 28 '18 at 22:20 by tk421