How to get R to read all the other human languages?
Can someone tell me how to get R to display text in all human languages correctly? My problem is that I have a data frame of news article headlines written in all the languages of the EU. Comments about poor database design aside, how can I get R to show each row in its respective language?
I read this R-bloggers post, and changing the locale with Sys.setlocale to one of the languages makes sense, but then only the last call executed counts. Splitting the database manually into one bin per language and running the script once per language is a possibility, but I would rather not do it.
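To illustrate the "last command counts" behaviour described above, here is a minimal sketch. It assumes only base R and uses the portable "C" locale as a stand-in for a language-specific one, since the locale names available differ between systems. The variable names are made up for the illustration.

```r
# Remember the current character-type locale so it can be restored later.
original <- Sys.getlocale("LC_CTYPE")

# The "C" locale exists on every platform, so this call is safe anywhere.
Sys.setlocale("LC_CTYPE", "C")
after_first <- Sys.getlocale("LC_CTYPE")

# A second call replaces the previous setting; locales do not accumulate.
Sys.setlocale("LC_CTYPE", original)
after_second <- Sys.getlocale("LC_CTYPE")

cat("after first call: ", after_first, "\n")
cat("after second call:", after_second, "\n")
```

The locale is a single process-wide setting, which is why switching it per language cannot make one vector display several languages at once.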
Gratitude!
Edit:
Link to base .xls document
R code to import:
library(data.table)
library(XLConnect)
library(stringr)
library(stringi)
library(dplyr)
# Load the .xls workbook and read its first sheet
wb <- loadWorkbook('D:/MOMUT1/GIS_Workload/Other/alex/Book2_1.xls')
df <- readWorksheet(wb, 1, header = TRUE)
# Drop rows that have no headline
df_final <- subset(df, !is.na(HEADLINE))
# Pull out the HEADLINE column to work on
head_col <- data.table(df_final$HEADLINE)
Running on: Windows 10 Pro 1803 64bit
RStudio 3.4.4
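Before changing anything, it may help to check how the imported headlines are actually encoded. The sketch below is an assumption-laden illustration: the small data frame is a hypothetical stand-in for the sheet returned by readWorksheet(), with non-ASCII text written as \u escapes so the snippet runs on any system.

```r
# Hypothetical stand-in for the imported sheet; the real data would come
# from readWorksheet() as in the import code above.
df <- data.frame(
  HEADLINE = c("Gr\u00fc\u00dfe aus Berlin",  # German
               NA,
               "\u039d\u03ad\u03b1",          # Greek
               "Plain ASCII"),
  stringsAsFactors = FALSE
)

# Drop rows that have no headline.
df_final <- subset(df, !is.na(HEADLINE))

# Encoding() reports the declared encoding of each string; validUTF8()
# checks whether the bytes form valid UTF-8.
report <- data.frame(
  declared   = Encoding(df_final$HEADLINE),
  valid_utf8 = validUTF8(df_final$HEADLINE),
  stringsAsFactors = FALSE
)
print(report)
```

If every row is valid UTF-8, the data itself is intact and the remaining problem is how the console or locale displays it.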
r string non-english
My first thought is that a vector of strings is typically shown in only one locale. If you provide a sample (perhaps 3-4 different languages), we can play with it. (I suggest pasting the output from dput(head(x, n = 4)), with only the columns needed.)
– r2evans
Nov 19 at 19:24
What OS are you using? Are you using just R or RStudio?
– MrFlick
Nov 19 at 19:32
I edited the question to include info
– Momchill
Nov 19 at 19:58
edited Nov 19 at 19:57
asked Nov 19 at 19:09
Momchill
1 Answer
accepted
One solution when dealing with multiple languages is to run R on Linux, where UTF-8 is the standard encoding. Since you're on Windows 10 Pro, you can do this in the Windows Subsystem for Linux (WSL) without having to install an OS from scratch.
- Install WSL: https://docs.microsoft.com/en-us/windows/wsl/install-win10 (Ubuntu is probably the best choice of distro)
- Install R: http://sites.psu.edu/theubunturblog/installing-r-in-ubuntu/
- Install any packages you need via install.packages. You may have to install system library dependencies yourself.
- Run your analysis.
Caveat: I haven't actually tried this. Also, you'll be running R from the command line rather than in RStudio.
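Once R is running under WSL, a quick check along these lines can confirm that the session really is in a UTF-8 locale and that mixed-language strings survive. This is a rough sketch using base R only. The sample headlines are hypothetical and written with \u escapes, so they are an assumption rather than data from the question.

```r
# l10n_info() describes the current locale; its "UTF-8" element is TRUE
# when the session runs in a UTF-8 locale (the usual case on Linux).
is_utf8_locale <- isTRUE(l10n_info()[["UTF-8"]])
cat("UTF-8 locale:", is_utf8_locale, "\n")

# Hypothetical sample headlines in several EU languages.
headlines <- c(
  bg = "\u0417\u0434\u0440\u0430\u0432\u0435\u0439",  # Bulgarian
  el = "\u0393\u03b5\u03b9\u03b1",                    # Greek
  de = "Gr\u00fc\u00dfe",                             # German
  pl = "Cze\u015b\u0107"                              # Polish
)

# enc2utf8() converts natively encoded strings to UTF-8 and leaves
# strings that are already UTF-8 unchanged.
headlines_utf8 <- enc2utf8(headlines)

# Count characters rather than bytes to confirm the strings are intact.
char_counts <- nchar(headlines_utf8, type = "chars")
print(char_counts)
print(headlines_utf8)
```

In a UTF-8 locale the final print should show each headline in its own script within the same vector, which is the behaviour the question asks for.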
Linux it is then! Thank you kindly for the explanation!
– Momchill
Nov 19 at 20:27
edited Nov 19 at 20:30
answered Nov 19 at 20:10
Hong Ooi