How to get R to read all the other human languages?











Can someone tell me how to get R to display text in all human languages correctly? My problem is that I have a data frame of news article headlines written in all the languages of the EU. Comments about poor database design aside, how can I get R to show each row in its respective language?



I read this R-bloggers post, and setting the locale with Sys.setlocale works for one language at a time, but only the most recent call takes effect. Splitting the database manually by language and running the script once per language is possible, but I would rather not do it.
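
The "last call wins" behaviour follows from the locale being a single session-wide setting. A minimal sketch, assuming the always-available "C" locale as a stand-in for a language-specific one (real locale names vary by operating system):

```r
# Remember the character-type locale the session started with.
original <- Sys.getlocale("LC_CTYPE")

# Each call replaces the previous setting for the whole session; there is no
# per-row or per-string locale. "C" is used only because it exists everywhere.
invisible(Sys.setlocale("LC_CTYPE", "C"))
after_change <- Sys.getlocale("LC_CTYPE")

# Restore the starting locale so later code is unaffected.
invisible(Sys.setlocale("LC_CTYPE", original))
restored <- Sys.getlocale("LC_CTYPE")
```

Because only one LC_CTYPE value is active at a time, switching locales cannot make rows in different languages display correctly side by side.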



Gratitude!



Edit:



Link to base .xls document



R code to import:



library(data.table)
library(XLConnect)
library(stringr)
library(stringi)
library(dplyr)

#load .xls
wb <- loadWorkbook('D:/MOMUT1/GIS_Workload/Other/alex/Book2_1.xls')
df <- readWorksheet(wb, 1, header = TRUE)

#remove NAs
df_final <- subset(df, !is.na(df$HEADLINE))

#take out HEADLINE column to work on
head_col <- data.table(df_final$HEADLINE)


Running on: Windows 10 Pro 1803 64bit
RStudio 3.4.4










  • My first thought is that a vector of strings is typically only shown in one locale. If you provide a sample (perhaps 3-4 different languages), perhaps we can play with it. (I suggest pasting the output from dput(head(x,n=4)), with only the columns needed.)
    – r2evans
    Nov 19 at 19:24










  • What OS are you using? Are you using just R or RStudio?
    – MrFlick
    Nov 19 at 19:32










  • I edited the question to include info
    – Momchill
    Nov 19 at 19:58















r string non-english






edited Nov 19 at 19:57

























asked Nov 19 at 19:09









Momchill

347
















1 Answer
accepted










One solution when dealing with multiple languages is to run R on Linux, where UTF-8 is the standard encoding. Since you're on Windows 10 Pro, you can do this in the Windows Subsystem for Linux (WSL) without having to install an OS from scratch.




  1. Install WSL: https://docs.microsoft.com/en-us/windows/wsl/install-win10 (Ubuntu is probably the best choice of distro)

  2. Install R: http://sites.psu.edu/theubunturblog/installing-r-in-ubuntu/

  3. Install any packages you need via install.packages. You may have to install system library dependencies yourself.

  4. Run your analysis.


Caveat: I haven't actually tried this. Also, you'll be running R from the command line rather than in RStudio.
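
Whichever platform is used, it can help to confirm that the session and the strings are actually UTF-8 before running the analysis. A minimal sketch, in which headlines is a hypothetical stand-in for the HEADLINE column (written with \u escapes so the script itself stays ASCII):

```r
# Report whether the current R session uses UTF-8 as its native encoding.
# This is typically the case on Linux, including Ubuntu under WSL.
session_is_utf8 <- isTRUE(l10n_info()[["UTF-8"]])

# Hypothetical sample data, not taken from the original spreadsheet.
headlines <- c(
  "Gr\u00fc\u00dfe aus Berlin",                          # German
  "\u0417\u0434\u0440\u0430\u0432\u0435\u0439",          # Bulgarian
  "\u039a\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1"     # Greek
)

# Declared encoding of each element, and whether the bytes are valid UTF-8.
declared <- Encoding(headlines)
all_valid <- all(validUTF8(headlines))

# Count characters rather than bytes; a mismatch with what you expect is a
# common sign that a string was read with the wrong encoding.
char_counts <- nchar(headlines, type = "chars")
```

If session_is_utf8 is TRUE and all_valid is TRUE, strings in different languages can sit in the same vector and print without changing the locale per language.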






  • Linux it is then! Thank you kindly for the explanation!
    – Momchill
    Nov 19 at 20:27











edited Nov 19 at 20:30

























answered Nov 19 at 20:10









Hong Ooi

41.9k reputation; 10 gold, 90 silver, 133 bronze badges



