Filter out all rows with only one period in R



























I have a column, Identifier, with character values:



df1 <- structure(list(Identifier = c("RL.K", "RL.K.1", "RL.K.2", "RL.K.3",
"RL.K.4", "RL.K.5", "RL.K.6", "RL.K.7", "RL.K.9", "RL.K.10",
"RI.K", "RI.K.1", "RI.K.2", "RI.K.3", "RI.K.4", "RI.K.5", "RI.K.6",
"RI.K.7", "RI.K.9", "RI.K.10", "RF.K", "RF.K.1")), row.names = c(NA,
-22L), class = c("tbl_df", "tbl", "data.frame"))


How do I filter out the values with only one period, so that rows 1, 11, and 21 are removed?










      r dplyr














      asked Nov 20 '18 at 17:43 by JasonBaik
      edited Nov 20 '18 at 17:46 by Wiktor Stribiżew
























          4 Answers














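          We can count the number of . characters in 'Identifier' and use that count as a logical condition for filtering the rows. Comparing with == 1 keeps only the rows with exactly one period:



          library(tidyverse)
          df1 %>%
            filter(str_count(Identifier, "[.]") == 1)
          # A tibble: 3 x 1
          #   Identifier
          #   <chr>
          # 1 RL.K
          # 2 RI.K
          # 3 RF.K


          To remove those rows instead, as the question asks, change == 1 to != 1.


          Or, as @WiktorStribizew mentioned, the pattern can be wrapped in fixed() so that it is matched as a literal string, which is faster:



          df1 %>%
            filter(str_count(Identifier, fixed(".")) == 1)




          Or without using any external libraries:



          df1[nchar(gsub("[^.]*", "", df1$Identifier)) == 1,]


          Or using gregexpr from base R (note that gregexpr returns a single -1 when there is no match, so this version also selects strings that contain no period at all):



          df1[lengths(gregexpr(".", df1$Identifier, fixed = TRUE)) == 1,]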
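A minimal base R sketch of the counting idea, for the case where the one-period rows should be removed rather than kept. The data frame is a made-up subset of the question's values plus "RF" (an invented value with no period), and count_dots is a hypothetical helper name, not something from the answer above:

```r
# Toy data: a few identifiers from the question plus "RF", a made-up value
# with no period at all, to show the zero-match case.
df1 <- data.frame(
  Identifier = c("RL.K", "RL.K.1", "RI.K", "RI.K.10", "RF", "RF.K.1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count literal periods in each string.
# regmatches() returns character(0) when nothing matches, so the count is 0.
count_dots <- function(x) {
  lengths(regmatches(x, gregexpr(".", x, fixed = TRUE)))
}

n_dots <- count_dots(df1$Identifier)

# Drop the rows with exactly one period; drop = FALSE keeps a data frame
# even though there is only one column.
kept <- df1[n_dots != 1, , drop = FALSE]
print(kept)
```

Going through regmatches() is one way to get a true count of 0 for strings without a period; with a plain data.frame, drop = FALSE avoids the result collapsing to a character vector.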





          • Why regex? There is just a dot to find, use str_count(Identifier, fixed("."))

            – Wiktor Stribiżew
            Nov 20 '18 at 17:45






          • Wow, that was a quickie!

            – JasonBaik
            Nov 20 '18 at 17:47

































          If we're going to use base R and grepl, there's a simpler regex:



          df1[grepl("\\..*\\.", df1$Identifier),]


          (Explanation of the regex: \. matches a literal ., and .* matches any run of characters, so this finds the values that contain at least two literal dots with anything in between. The backslashes are doubled because the pattern is written inside an R string.)
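As a small check of the escaping, here is a sketch on a made-up character vector (the variable names are illustrative only). It assumes R 4.0.0 or later for the raw-string form:

```r
ids <- c("RL.K", "RL.K.1", "RI.K", "RI.K.10")

# "\\." in R source is the regex \. (a literal dot).
two_or_more <- grepl("\\..*\\.", ids)

# The same regex written as a raw string, which needs no doubled backslashes.
two_or_more_raw <- grepl(r"(\..*\.)", ids)

# Keep only the values with at least two dots.
kept_ids <- ids[two_or_more]
print(kept_ids)
```

Note that this test keeps values with two or more dots, so a value with no dot at all would be dropped together with the one-dot values.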




















            A solution using base R. This pattern finds all strings with exactly one dot:



            grepl("^[^.]*[.][^.]*$", df1$Identifier)




            To remove the rows with one dot, negate it:



            df1[!grepl("^[^.]*[.][^.]*$", df1$Identifier), ]
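To see how the anchored pattern behaves at the edges, here is a short sketch on a made-up vector (none of these values except "RL.K" and "RL.K.1" come from the question):

```r
# Made-up values: no dot, one dot, two dots, a lone dot, three dots.
ids <- c("RLK", "RL.K", "RL.K.1", ".", "a.b.c.d")

# ^[^.]*  any run of non-dot characters from the start
# [.]     exactly one literal dot
# [^.]*$  only non-dot characters up to the end
exactly_one <- grepl("^[^.]*[.][^.]*$", ids)

# Negating the test removes the one-dot values and keeps everything else.
remaining <- ids[!exactly_one]
print(remaining)
```

Because the pattern is anchored at both ends, values with zero dots are not matched and therefore survive the negated filter.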


























            • It needs a ! in front of the grepl expression, since you want to filter out the values with only one ., which is what the regex searches for.

              – Gwang-Jin Kim
              Nov 20 '18 at 17:59











            • thanks @Gwang-JinKim. I just realized "filter out" meant "remove".

              – Andre Elrico
              Nov 22 '18 at 11:09

































            With as little regex as possible ;):



            has.only.one.dot <- function(str_vec) sapply(strsplit(str_vec, "\\."), function(vec) length(vec) == 2)
            df1[!has.only.one.dot(df1$Identifier), ]


            However, the list functions sapply and strsplit are slower than a regex solution:



            has.only.one.dot <- function(str_vec) grepl("\\.", str_vec) & !grepl("\\..*\\.", str_vec)
            df1[!has.only.one.dot(df1$Identifier), ]
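The two versions above agree on the question's data but can differ when a value ends in a dot, because strsplit does not return a trailing empty piece. A small sketch on made-up values ("RL." and ".RL" are invented edge cases, and the variable names are illustrative):

```r
ids <- c("RL.K", "RL.K.1", "RL.", ".RL")

# Splitting on a literal dot (fixed = TRUE, so no regex is involved).
# A trailing dot yields no extra empty piece, so "RL." gives one piece.
n_pieces <- lengths(strsplit(ids, ".", fixed = TRUE))
split_says_one_dot <- n_pieces == 2

# grepl version: at least one dot, but not two.
grepl_says_one_dot <- grepl(".", ids, fixed = TRUE) & !grepl("\\..*\\.", ids)

print(data.frame(ids, n_pieces, split_says_one_dot, grepl_says_one_dot))
```

If identifiers can end with a period, the grepl form (or a direct count of dots) is the safer test.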





              akrun (str_count answer): answered Nov 20 '18 at 17:44, edited Nov 20 '18 at 18:12













                  iod (grepl two-dot answer): answered Nov 20 '18 at 17:58























                      Andre Elrico (anchored grepl answer): answered Nov 20 '18 at 17:48, edited Nov 22 '18 at 11:11








                          Gwang-Jin Kim (strsplit answer): answered Nov 20 '18 at 18:06, edited Nov 20 '18 at 18:11





























