How to fetch data in a batch from HBase in GeoMesa?
The GeoTools API is one way to get data from HBase through GeoMesa. However, when I use org.geotools.data.simple.SimpleFeatureCollection, it seems that the results can only be accessed through the iterator returned by SimpleFeatureCollection.features(). The problem is that when I traverse the results, the iterator's hasNext() method takes too much time. Can I fetch data from HBase in GeoMesa in batches, rather than only through the iterator?
hbase geomesa
asked Nov 23 '18 at 1:07
luway
1 Answer
Behind the scenes, some batching is done, but the batches are fetched lazily (i.e. on a call to hasNext, if there isn't any local data, it will do a remote fetch). You can control the HBase read-ahead through the system property geomesa.hbase.client.scanner.caching.size (see here). The GeoTools API doesn't provide any batch mechanisms per se, however.
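As a rough illustration, the property could be set from Java before the data store runs its first query. This is only a sketch: the class and method names are made up for the example, the value 1000 is an assumed placeholder rather than a recommendation, and whether the setting must be in place before the data store is created is something to check against the GeoMesa documentation. The same property could also be passed on the command line with -D.

```java
class ScannerCachingConfig {
    // Property name taken from the answer above.
    static final String CACHING_PROPERTY = "geomesa.hbase.client.scanner.caching.size";

    // Stores the desired read-ahead size as a JVM system property.
    static void configure(int rowsPerFetch) {
        System.setProperty(CACHING_PROPERTY, Integer.toString(rowsPerFetch));
    }

    public static void main(String[] args) {
        // 1000 is an assumed example value; tune it for your row size and memory.
        configure(1000);
        System.out.println(CACHING_PROPERTY + "=" + System.getProperty(CACHING_PROPERTY));
    }
}
```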
For simple use cases, if you just want to fetch everything up front, you can pull the iterator into an ArrayList, then operate on it afterwards. To avoid waiting for the entire result set to be fetched, you could set up producer/consumer threads, so that one thread is continuously pre-fetching data and the second thread is operating on the results that have come back.
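Both options can be sketched with plain java.util types. The helper names below (FeaturePrefetch, drainToList, prefetch) are hypothetical, and the code works on a java.util.Iterator; as far as I know, the GeoTools SimpleFeatureIterator does not implement that interface, so it would need a small adapter and should still be closed when you are done.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

class FeaturePrefetch {
    // Sentinel placed on the queue to signal that the producer has finished.
    private static final Object DONE = new Object();

    // Option 1: pull everything into memory up front, then work on the list.
    static <T> List<T> drainToList(Iterator<T> source) {
        List<T> all = new ArrayList<>();
        while (source.hasNext()) {
            all.add(source.next());
        }
        return all;
    }

    // Option 2: a producer thread reads ahead while the calling thread consumes.
    // Elements must be non-null, because ArrayBlockingQueue rejects null.
    static <T> void prefetch(Iterator<T> source, int queueSize, Consumer<? super T> handler) {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(queueSize);
        Thread producer = new Thread(() -> {
            try {
                try {
                    while (source.hasNext()) {
                        queue.put(source.next()); // blocks while the queue is full
                    }
                } finally {
                    queue.put(DONE); // always signal completion, even if the source fails
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true); // do not keep the JVM alive if the consumer stops early
        producer.start();
        try {
            for (Object item = queue.take(); item != DONE; item = queue.take()) {
                @SuppressWarnings("unchecked")
                T typed = (T) item;
                handler.accept(typed);
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // A list of strings stands in for the query results here.
        Iterator<String> features = Arrays.asList("a", "b", "c").iterator();
        prefetch(features, 2, f -> System.out.println("processing " + f));
    }
}
```

The bounded queue caps how far the producer can run ahead, so memory use stays limited while the slow hasNext() calls overlap with processing. Error handling is deliberately minimal: if the handler throws, the producer is simply left blocked as a daemon thread.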
For more advanced use cases, you can use Spark (or map/reduce directly) to load an entire result set at once.
thank you very much
– luway
Dec 5 '18 at 9:13
answered Nov 26 '18 at 14:01
Emilio Lahr-Vivaz