find elements by xpath selenium phantomjs
I'm using RSelenium for scraping. For this, I have installed Java and the JDK, chromedriver, Selenium Server standalone, and the headless browser PhantomJS on my Google Cloud VM instance.
I need to get the text of the first rating:
remDr <- remoteDriver(browserName = 'chrome', port = 4444L)
remDr$open()
remDr$setWindowSize(1280L, 1024L)
remDr$navigate("https://www.ratebeer.com/reviews/sullerica-1561/294423")
text_post = remDr$findElements("xpath",'//*[@id="root"]/div/div[2]/div/div[2]/div[2]/div/div[1]/div[3]/div/div[2]/div[1]/div/div[2]/div/div[1]/div/div/div[1]')
text_post
## list()
In the end, text_post is empty.
However, if I run the same script on my local laptop with RSelenium, the Chrome browser, and the same XPath, it works!
What's going on?
Is it due to using PhantomJS?
Thanks in advance.
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS
r selenium selenium-webdriver phantomjs rselenium
edited Nov 23 '18 at 13:04 by hrbrmstr
asked Nov 23 '18 at 12:23 by Mario M.
2 Answers
Based on the page's HTML, you can use this XPath:
//div[@id="root"]//span[contains(.,'20')]//following::div[contains(@class,'LinesEllipsis')]
Note: Because the elements are generated dynamically, you have to induce a wait (WebDriverWait) until the elements are visible before looking them up.
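Waiting for dynamically generated elements can be approximated in R with a small polling loop. This is only a sketch: `wait_for_elements` is a hypothetical helper (it is not part of RSelenium), and a stand-in finder is used so the code runs without a Selenium server.

```r
# Poll `find_fn` until it returns a non-empty list or `timeout` seconds pass.
# `wait_for_elements` is a hypothetical helper, not an RSelenium function.
wait_for_elements <- function(find_fn, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # Treat lookup errors the same as "nothing found yet"
    found <- tryCatch(find_fn(), error = function(e) list())
    if (length(found) > 0) return(found)
    if (Sys.time() >= deadline) return(list())
    Sys.sleep(interval)
  }
}

# Stand-in finder: returns nothing on the first two calls, then one fake
# element, imitating content that appears after the page's XHR completes.
calls <- 0
fake_finder <- function() {
  calls <<- calls + 1
  if (calls < 3) list() else list("element")
}

result <- wait_for_elements(fake_finder, timeout = 5, interval = 0.1)
print(length(result))
```

With a live session, the finder would presumably wrap the real lookup, for example `function() remDr$findElements(using = "xpath", value = my_xpath)`, where `my_xpath` holds the XPath above.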
@MarioM. Upvote the answer if this or any answer was helpful to you, for the benefit of future readers. – DebanjanB, Nov 26 '18 at 11:15
You don't need a heavyweight, third-party dependency. That site uses GraphQL POST requests under the hood, sent as asynchronous XHR requests, to retrieve the data. You can see them if you open Developer Tools and watch the network requests.
I did a "Copy POST Data" (usually the same or a really similar context-menu item in all browsers) and un-minified the GraphQL query in the Response tab just to show you what it is and, perhaps, to make it easier for you to see the query and augment it on your own (that last part is out of scope for "but what about…" follow-on questions in comments; please file a new question if you want help with that).
'[
  {
    "operationName": "beer",
    "query": "query beer($beerId: ID!) {\n info: beer(id: $beerId) {\n id\n name\n __typename\n }\n}\n",
    "variables": {
      "beerId": "294423"
    }
  },
  {
    "operationName": "beer",
    "query": "query beer($beerId: ID!) {\n info: beer(id: $beerId) {\n id\n name\n styleScore\n overallScore\n averageRating\n ratingCount\n __typename\n }\n}\n",
    "variables": {
      "beerId": "294423"
    }
  },
  {
    "operationName": "beerReviews",
    "query": "query beerReviews($beerId: ID!, $authorId: ID, $order: ReviewOrder, $after: ID) {\n beerReviewsArr: beerReviews(beerId: $beerId, authorId: $authorId, order: $order, after: $after) {\n items {\n ...ReviewItem\n __typename\n }\n totalCount\n last\n __typename\n }\n}\n\nfragment ReviewItem on Review {\n id\n comment\n score\n scores {\n appearance\n aroma\n flavor\n mouthfeel\n overall\n __typename\n }\n author {\n id\n username\n reviewCount\n __typename\n }\n checkin {\n id\n place {\n id\n name\n city\n state {\n id\n name\n __typename\n }\n country {\n id\n name\n __typename\n }\n __typename\n }\n __typename\n }\n servedIn\n likeCount\n likedByMe\n createdAt\n updatedAt\n __typename\n}\n",
    "variables": {
      "beerId": "294423",
      "first": 7,
      "order": "RECENT"
    }
  }
]' -> graphql_query
We will need to scrunch that back into one line for the API call (which I do with the gsub() below). We also need to manually specify the content type and ensure httr does not try to mangle the body data by setting the encoding to raw:
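httr::POST(
  url = "https://beta.ratebeer.com/v1/api/graphql/",
  httr::content_type("application/json"),  # declare the JSON body explicitly
  encode = "raw",                          # send the body string as-is
  body = gsub("\n", " ", graphql_query),   # collapse newlines into one line
  httr::verbose()                          # print the request/response for debugging
) -> res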
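To see why the newline squashing matters: inside an R string literal, `\n` becomes a real newline, and JSON does not allow a raw newline inside a string value. The sketch below uses a tiny made-up stand-in (`small_query`) rather than the full graphql_query, and assumes the pattern passed to gsub() is the newline escape.

```r
# A tiny stand-in for graphql_query. "\n" inside an R string literal is a
# real newline, both between the JSON lines and inside the "query" value.
small_query <- '{
  "query": "query {\n id\n}\n"
}'

# Replace every newline with a single space, as in the POST call.
one_line <- gsub("\n", " ", small_query, fixed = TRUE)

print(grepl("\n", small_query, fixed = TRUE))  # are there newlines before?
print(grepl("\n", one_line, fixed = TRUE))     # and after?
print(one_line)
```

Because each newline is swapped for exactly one space, the string keeps its length and only the whitespace changes, which GraphQL ignores.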
Now we have a structured, but heavily nested, list with your info in it. Pretty sure you're after items below:
str(httr::content(res), 4)
## List of 3
## $ :List of 1
## ..$ data:List of 1
## .. ..$ info:List of 3
## .. .. ..$ id : chr "294423"
## .. .. ..$ name : chr "Sullerica 1561"
## .. .. ..$ __typename: chr "Beer"
## $ :List of 1
## ..$ data:List of 1
## .. ..$ info:List of 7
## .. .. ..$ id : chr "294423"
## .. .. ..$ name : chr "Sullerica 1561"
## .. .. ..$ styleScore : num 35.1
## .. .. ..$ overallScore : num 51.8
## .. .. ..$ averageRating: num 3.25
## .. .. ..$ ratingCount : int 21
## .. .. ..$ __typename : chr "Beer"
## $ :List of 1
## ..$ data:List of 1
## .. ..$ beerReviewsArr:List of 4
## .. .. ..$ items :List of 10
## .. .. ..$ totalCount: int 21
## .. .. ..$ last : chr "7177326"
## .. .. ..$ __typename: chr "ReviewList"
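Pulling the review text out of that nesting could look roughly like the sketch below. To keep it runnable offline, `parsed` is a hypothetical stand-in for httr::content(res) with two placeholder reviews; the field names (id, score, comment) come from the ReviewItem fragment in the query, while the values are made up.

```r
# Hypothetical stand-in for httr::content(res): same nesting as the str()
# output, but with two placeholder reviews instead of real data.
parsed <- list(
  list(data = list(info = list(id = "294423", name = "Sullerica 1561"))),
  list(data = list(info = list(id = "294423", name = "Sullerica 1561"))),
  list(data = list(beerReviewsArr = list(
    items = list(
      list(id = "r1", comment = "placeholder comment 1", score = 3.5),
      list(id = "r2", comment = "placeholder comment 2", score = 3.0)
    ),
    totalCount = 2L
  )))
)

# The reviews live in the third response of the batch.
items <- parsed[[3]]$data$beerReviewsArr$items

# Flatten the fields of interest into a data frame, one row per review.
reviews <- data.frame(
  id = vapply(items, function(x) x$id, character(1)),
  score = vapply(items, function(x) x$score, numeric(1)),
  comment = vapply(items, function(x) x$comment, character(1)),
  stringsAsFactors = FALSE
)

print(reviews$comment[1])
```

Real responses may contain missing (NULL) fields, in which case the vapply() calls would need a guard; that is not handled here.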
The response only has 10 of the 21 reviews, so scroll down in your browser window with Developer Tools open and look at the second POST request that gets made. See which parameters changed, and you will have an even better idea of how to access the site's back-end API instead of having to scrape for content.
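Collecting the remaining pages might be sketched as a loop like the one below. It rests on an unverified assumption: that the `last` value in the response is what the query's `$after` variable expects. Check this against the second POST request before relying on it. `collect_reviews`, `fetch_page`, and the mock back end are all hypothetical and exist only so the sketch runs offline.

```r
# Hypothetical helper: `fetch_page(after)` should POST the beerReviews query
# with `after` in its variables and return the parsed beerReviewsArr list.
collect_reviews <- function(fetch_page, max_pages = 10) {
  items <- list()
  after <- NULL
  for (i in seq_len(max_pages)) {
    page <- fetch_page(after)
    if (length(page$items) == 0) break
    items <- c(items, page$items)
    if (length(items) >= page$totalCount) break
    after <- page$last  # ASSUMPTION: `last` is the cursor expected by `after`
  }
  items
}

# Mock back end: five made-up review ids served two at a time.
all_ids <- as.character(101:105)
mock_fetch <- function(after) {
  start <- if (is.null(after)) 1 else match(after, all_ids) + 1
  if (start > length(all_ids)) {
    return(list(items = list(), totalCount = length(all_ids), last = NULL))
  }
  idx <- seq(start, min(start + 1, length(all_ids)))
  list(
    items = lapply(all_ids[idx], function(id) list(id = id)),
    totalCount = length(all_ids),
    last = all_ids[max(idx)]
  )
}

reviews <- collect_reviews(mock_fetch)
print(length(reviews))
```

The `max_pages` cap is a safety net so that a wrong cursor assumption cannot loop forever against the real API.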
Answer 1 (XPath): answered Nov 23 '18 at 12:42 by DebanjanB
Answer 2 (GraphQL API): answered Nov 23 '18 at 13:02 by hrbrmstr