A function to fill in a column with NA of the same type











Score: 14 (favorited 2 times)












I have a data frame with many columns of different types. I would like to replace each column with NA of the corresponding class.



for example:



df = data_frame(x = c(1,2,3), y = c("a", "b", "c"))

df[, 1:2] <- NA


yields a data frame with two logical columns, rather than numeric and character.
I know I can tell R:



df[,1] = as.numeric(NA)
df[,2] = as.character(NA)


But how do I do this collectively in a loop for all columns with all possible types of NA?






























  • 3




    Good question +1, but why does this matter?
    – Tim Biegeleisen
    Dec 11 at 7:48










  • It's a very weird problem, I later need to join the data frame with another frame of the original type...
    – Omry Atia
    Dec 11 at 7:49






  • 1




    But why? Please give us more context, this seems like a pointless (but fun) step.
    – zx8754
    Dec 11 at 8:45










  • I have a data frame created at the beginning of my program, which sometimes needs to get all NAs in some columns based on a condition. This data frame needs to be joined with another data frame at the end of the program, which might not get these NAs. For the join to work, the two data frames need to have exactly the same column types.
    – Omry Atia
    Dec 11 at 9:27






  • 1




    Just a minor correction: you shouldn't talk about classes here but about atomic types, and it would be more idiomatic to use NA_character_ and NA_real_ than as.character(NA) and as.numeric(NA).
    – Moody_Mudskipper
    2 days ago
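
To illustrate the comment above, here is a small sketch of the typed missing-value constants that base R provides. The variable names `typed_nas` and `na_types` are only for illustration.

```r
# Each atomic type has its own missing-value constant.
typed_nas <- list(
  logical   = NA,
  integer   = NA_integer_,
  double    = NA_real_,
  character = NA_character_
)
na_types <- vapply(typed_nas, typeof, character(1))
print(na_types)

# Coercing the logical NA gives the same values as the typed constants.
same_double    <- identical(as.numeric(NA), NA_real_)
same_character <- identical(as.character(NA), NA_character_)
```

A bare `NA` is logical, which is why assigning it to a whole column turns the column into a logical one.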















r dplyr na














asked Dec 11 at 7:36 by Omry Atia; edited Dec 11 at 8:18 by zx8754




















5 Answers

















Score: 9 (accepted answer)










You can use this "trick" :



df[1:nrow(df),1] <- NA
df[1:nrow(df),2] <- NA


The [1:nrow(df), ] row index tells R to assign NA into the existing elements of the column instead of replacing the column itself, so the logical NA is coerced to the column's original type before it overwrites the other values.
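
A minimal sketch of the difference, using a base data.frame rather than the tibble from the question so that no packages are needed. The names `replaced` and `assigned` are only for illustration.

```r
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Whole-column assignment: each column is replaced by a logical NA vector.
replaced <- df
replaced[, 1:2] <- NA
types_replaced <- vapply(replaced, typeof, character(1))

# Element assignment: NA is coerced to the type of each existing column.
assigned <- df
assigned[seq_len(nrow(assigned)), 1:2] <- NA
types_assigned <- vapply(assigned, typeof, character(1))

print(types_replaced)
print(types_assigned)
```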



Also, if you have a lot of columns to replace and the data frame has a lot of rows, I suggest storing the row indexes and reusing them:

rowIdxs <- seq_len(nrow(df))  # same as 1:nrow(df) when df has at least one row
df[rowIdxs, 1] <- NA
df[rowIdxs, 2] <- NA
df[rowIdxs, 3] <- NA
...




As cleverly suggested by @RonakShah, you can also use :



df[TRUE, 1] <- NA
df[TRUE, 2] <- NA
...




As pointed out by @Cath both the methods still work when you select more than one column e.g. :



df[TRUE, 1:3] <- NA
# or
df[1:nrow(df), 1:3] <- NA
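
Since the question asks for a function, the row-index trick could be wrapped as in the sketch below. `fill_with_na` is a hypothetical helper name, not something from the answer, and the example uses a base data.frame with a factor and a Date column to show that those keep their class too.

```r
# Hypothetical helper: set the chosen columns to NA without changing their types.
fill_with_na <- function(df, cols = names(df)) {
  df[seq_len(nrow(df)), cols] <- NA
  df
}

df <- data.frame(
  x = c(1, 2, 3),
  y = c("a", "b", "c"),
  f = factor(c("u", "v", "u")),
  d = as.Date(c("2018-12-11", "2018-12-12", "2018-12-13")),
  stringsAsFactors = FALSE
)

out <- fill_with_na(df, c("x", "y", "f", "d"))
str(out)
```

Passing a subset such as `cols = c("x", "y")` leaves the remaining columns untouched.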




























  • This doesn't seem to work... df is still logical :(
    – Omry Atia
    Dec 11 at 7:45










  • @OmryAtia : edited. it should work now ;)
    – digEmAll
    Dec 11 at 7:50










  • Awesome... so simple :)
    – Omry Atia
    Dec 11 at 7:51






  • 3




    why not directly df[TRUE, 1:2] <- NA?
    – Cath
    Dec 11 at 8:42










  • @Cath: sure, added in the answer, thanks !
    – digEmAll
    Dec 11 at 8:47


















Score: 8













Another solution that applies to all columns is to select the non-NA values and replace them with NA:



df[!is.na(df)] <- NA


which gives,




# A tibble: 3 x 2
      x y
  <dbl> <chr>
1    NA <NA>
2    NA <NA>
3    NA <NA>
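
Here is a small sketch of the same idea on a base data.frame that already contains some missing values. The intermediate name `mask` is only for illustration.

```r
df <- data.frame(x = c(1, NA, 3), y = c("a", "b", NA),
                 stringsAsFactors = FALSE)

mask <- !is.na(df)   # logical matrix, TRUE where a value is present
df[mask] <- NA       # assigns element by element, so column types are kept

str(df)
```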


































    Score: 5













    Another way to change all columns at once while keeping the variables' classes:

    # df[] <- keeps the data frame structure; a bare lapply() would return a list
    df[] <- lapply(df, function(x) {type <- class(x); x <- NA; class(x) <- type; x})

    df
    # A tibble: 3 x 2
    #       x y
    #   <dbl> <chr>
    # 1    NA <NA>
    # 2    NA <NA>
    # 3    NA <NA>

    As @digEmAll noted in the comments, there is a similar but shorter way:

    df[] <- lapply(df, function(x) as(NA, class(x)))
























    • 2




      Also lapply(df, function(x)as(NA,class(x))) should work
      – digEmAll
      Dec 11 at 8:34










    • @digEmAll indeed and much shorter thanks!
      – Cath
      Dec 11 at 8:35






    • 2




      another base option df <- lapply(df, replace, TRUE, NA)
      – docendo discimus
      Dec 11 at 10:05










    • This works in many cases, but not always. Some classes don't have methods that automatically convert the underlying typeof, and sometimes as() doesn't know how to handle a class. Try it with a POSIXct: as() throws an error, and manually setting the class to c("POSIXt", "POSIXct") seems to work but does not convert the underlying NA, so the result is different from as.POSIXct(NA).
      – Emil Bode
      Dec 11 at 17:33










    • It would be better to use typeof instead of class, here it works "by chance" but will fail in the general case (e.g. factors).
      – Moody_Mudskipper
      2 days ago
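
Given the caveats in the comments about factors and POSIXct, the `replace()` variant suggested above might be the safer of these options. A minimal sketch on a base data.frame, with `out[] <-` used (my addition) so that the result stays a data frame rather than a list:

```r
df <- data.frame(
  f = factor(c("u", "v", "u")),
  t = as.POSIXct(c("2018-12-11 07:36:00", "2018-12-11 08:18:00",
                   "2018-12-11 09:27:00"), tz = "UTC"),
  n = c(1.5, 2.5, 3.5)
)

# replace(x, TRUE, NA) does x[TRUE] <- NA, so each column's own
# assignment method handles the NA and the class is kept.
out <- df
out[] <- lapply(df, replace, TRUE, NA)

sapply(out, function(col) class(col)[1])
```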


















    Score: 4













    Using dplyr::na_if:



    library(dplyr)

    df %>%
      mutate(x = na_if(x, x),
             y = na_if(y, y))

    # # A tibble: 3 x 2
    #       x y
    #   <dbl> <chr>
    # 1    NA NA
    # 2    NA NA
    # 3    NA NA


    If we want to mutate only a subset of columns to NA, then:

    # data frame with an extra column that stays unchanged
    df = data_frame(x = c(1,2,3), y = c("a", "b", "c"), z = c(4:6))

    df %>%
      mutate_at(vars(x, y), funs(na_if(.,.)))

    # # A tibble: 3 x 3
    #       x y         z
    #   <dbl> <chr> <int>
    # 1    NA NA        4
    # 2    NA NA        5
    # 3    NA NA        6
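
In later dplyr releases `funs()` was deprecated and `mutate_at()` superseded, so the same idea might be written with `across()` as sketched below. This assumes dplyr 1.0.0 or later; it is not part of the original answer.

```r
library(dplyr)

df <- tibble(x = c(1, 2, 3), y = c("a", "b", "c"), z = 4:6)

# na_if(v, v) turns every element that equals itself into NA,
# so x and y become all-NA while z is left alone.
out <- df %>%
  mutate(across(c(x, y), ~ na_if(.x, .x)))

out
```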




























    • Or df <- mutate_all(df,~na_if(.,.)) (or modify(df,~na_if(.,.))) while you're there :)
      – Moody_Mudskipper
      2 days ago












    • @Moody_Mudskipper I am using mutate_at as OP might want to do this on subset of columns. If they want to apply this to all columns, then why not just create an empty dataframe with 0 rows...
      – zx8754
      2 days ago










    • I don't know... OP's use case is obscure, but he mentions replacing each column by NAs.
      – Moody_Mudskipper
      2 days ago
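
The zero-row idea mentioned in the comments can be sketched like this on a base data.frame. The name `empty` is only for illustration, and whether an empty frame fits the join described in the question is left open.

```r
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Dropping all rows keeps the column names and types.
empty <- df[0, ]
str(empty)
```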


















    Score: 0













    Using bind_cols() from dplyr you can also do:



    library(dplyr)

    df <- data_frame(x = c(1,2,3), y = c("a", "b", "c"))
    classes <- sapply(df, class)
    df[,1:2] <- NA

    # Rebuild each column by evaluating as.<class>(...) on its logical NA values
    bind_cols(lapply(colnames(df), function(x){eval(parse(text=paste0("as.", classes[names(classes[x])], "(", df[,x],")")))}))

      V1    V2
      <dbl> <chr>
    1 NA    NA
    2 NA    NA
    3 NA    NA


    Please note that this will change the colnames.



























First answer (accepted): answered Dec 11 at 7:43 by digEmAll; edited Dec 11 at 8:47












Second answer (df[!is.na(df)] <- NA): answered Dec 11 at 8:00 by Sotos






















Third answer (lapply): answered Dec 11 at 8:32 by Cath; edited Dec 11 at 8:37








              • 2




                Also lapply(df, function(x)as(NA,class(x))) should work
                – digEmAll
                Dec 11 at 8:34










              • @digEmAll indeed and much shorter thanks!
                – Cath
                Dec 11 at 8:35






              • 2




                another base option df <- lapply(df, replace, TRUE, NA)
                – docendo discimus
                Dec 11 at 10:05










              • This works in many cases, but not always. The problem is that some classes don't have methods that automatically convert the underlying typeof, and sometimes as doesn't know how to handle classes. Try it with a POSIXct: as throws an error, and manually setting the class to c("POSIXt", "POSIXct") seems to work, but does not convert the underlying NA, the result is different then as.POSIXct(NA)
                – Emil Bode
                Dec 11 at 17:33










              • It would be better to use typeof instead of class, here it works "by chance" but will fail in the general case (e.g. factors).
                – Moody_Mudskipper
                2 days ago
























              up vote
              4
              down vote













              Using dplyr::na_if:



              library(dplyr)

              df %>%
                mutate(x = na_if(x, x),
                       y = na_if(y, y))

              # # A tibble: 3 x 2
              #       x y
              #   <dbl> <chr>
              # 1    NA NA
              # 2    NA NA
              # 3    NA NA


              If we want to set only a subset of the columns to NA, then:



              # data frame with an extra column that stays unchanged
              df = data_frame(x = c(1,2,3), y = c("a", "b", "c"), z = c(4:6))

              df %>%
                mutate_at(vars(x, y), funs(na_if(.,.)))

              # # A tibble: 3 x 3
              #       x y         z
              #   <dbl> <chr> <int>
              # 1    NA NA        4
              # 2    NA NA        5
              # 3    NA NA        6
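
In more recent dplyr releases funs() is deprecated and mutate_at() is superseded by across(), so the same idea can be written as shown here. This is a sketch that assumes dplyr 1.0.0 or later; it reuses the x, y, z example from this answer and is not part of the original post. The result name out is hypothetical.

```r
library(dplyr)

# Same example data as in the answer, built with tibble() instead of data_frame().
df <- tibble(x = c(1, 2, 3), y = c("a", "b", "c"), z = 4:6)

# na_if(.x, .x) compares each selected column with itself, so every value is
# set to an NA of that column's own type; z is not selected and stays unchanged.
out <- df %>%
  mutate(across(c(x, y), ~ na_if(.x, .x)))

print(out)
```

Swapping c(x, y) for everything() would apply the same replacement to all columns.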




























              • Or df <- mutate_all(df,~na_if(.,.)) (or modify(df,~na_if(.,.))) while you're there :)
                – Moody_Mudskipper
                2 days ago












              • @Moody_Mudskipper I am using mutate_at as OP might want to do this on subset of columns. If they want to apply this to all columns, then why not just create an empty dataframe with 0 rows...
                – zx8754
                2 days ago










              • I don't know... OP's use case is obscure, but he mentions replacing each column by NAs.
                – Moody_Mudskipper
                2 days ago























              edited Dec 11 at 8:21

























              answered Dec 11 at 8:11









              zx8754

              29.1k76396


























              up vote
              0
              down vote













              Using bind_cols() from dplyr you can also do:



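              library(dplyr)

              df <- data_frame(x = c(1,2,3), y = c("a", "b", "c"))
              classes <- sapply(df, class)  # remember each column's class before overwriting
              df[,1:2] <- NA                # both columns are now logical

              # rebuild every column by calling as.<class>() on its NA values
              bind_cols(lapply(colnames(df), function(x){eval(parse(text=paste0("as.", classes[names(classes[x])], "(", df[,x],")")))}))

                   V1 V2
                <dbl> <chr>
              1    NA NA
              2    NA NA
              3    NA NA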


              Please note that this will change the colnames.
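
If the changed column names are a problem, a base R alternative without eval(parse()) is to look up a typed NA constant from each column's typeof() and assign the result back into df[]. The helper name typed_na is hypothetical and this is only a sketch: it covers atomic columns, and attributes such as factor levels are dropped because only the storage type is kept.

```r
# Map a column's storage type to the matching typed NA constant.
typed_na <- function(col) {
  switch(typeof(col),
    logical   = NA,
    integer   = NA_integer_,
    double    = NA_real_,
    character = NA_character_,
    complex   = NA_complex_,
    stop("no typed NA for type ", typeof(col))
  )
}

df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"), z = 4:6,
                 stringsAsFactors = FALSE)

# Assigning into df[] keeps the data frame and its column names.
df[] <- lapply(df, function(col) rep(typed_na(col), length(col)))

str(df)
```

The typed constants NA_real_, NA_character_ and NA_integer_ are the ones suggested in the comments on the question in place of as.numeric(NA) and as.character(NA).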









































                  answered Dec 11 at 7:59









                  alex_555

                  666315

































