Function call to c_str() vs const char* in hash function












-2















I was looking at hash functions on Stack Overflow when I found one that was pretty interesting. It involves casting a const char* to a size_t* and then dereferencing the size_t*. The result is then bit-shifted to a certain precision. This works for a const char*, producing the same value each time. However, when I use an actual string type and call c_str() instead, the two values produced do not match. Furthermore, the string produces a different value on each run of the code. Does anyone have an idea why this is occurring?



const string l = "BA";
const char* k = l.c_str();
const char* p = "BA";
cout << k << " " << *((size_t*)k) << endl;
cout << p << " " << *((size_t*)p) << endl;


Run 1:



BA 140736766951746
BA 7162260525311607106


Run 2:



BA 140736985055554
BA 7162260525311607106


Original question: Have a good hash function for a C++ hash table?

















c++ string casting size-t c-str
















asked Nov 22 '18 at 2:18









bandittoaxe














  • A size_t is 32 or 64 bits long, depending on whether you compile for 32 or 64 bits. Where do you think the missing 8 or 40 bits come from?

    – tkausl
    Nov 22 '18 at 2:21











  • Null terminator?

    – bandittoaxe
    Nov 22 '18 at 2:28






  • 4





    Firstly, it's undefined behaviour due to strict aliasing violation.

    – HolyBlackCat
    Nov 22 '18 at 4:50































3 Answers


















3














*((size_t*)k) causes undefined behaviour by violating the strict aliasing rule. This code is only valid if k actually points to an object of type size_t.



Since this is undefined behaviour, seeing weird numbers is one possible result (as would be anything else).





I guess you intended something akin to:



size_t x;
memcpy(&x, k, sizeof x);  // copies sizeof(size_t) bytes starting at k
cout << k << " " << x << '\n';


It should now be clear what the problem is. Your string only contains 3 characters (2 plus the null terminator), but you attempt to read sizeof(size_t) bytes, which is more than 3; reading past the end of the array also causes undefined behaviour.
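A minimal sketch of a bounded variant of the memcpy approach above, assuming the goal is just to pack the leading bytes of the string into a size_t. The helper name load_prefix is hypothetical (it is not from the original post); it copies at most strlen(s) bytes, so it never reads past the terminator:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: packs at most sizeof(std::size_t) leading bytes of a
// NUL-terminated string into an integer without reading past the terminator.
std::size_t load_prefix(const char* s) {
    assert(s != nullptr);
    std::size_t x = 0;               // bytes that are not copied stay zero
    std::size_t n = std::strlen(s);  // number of bytes that belong to the string
    if (n > sizeof x) {
        n = sizeof x;                // never copy more than fits into x
    }
    std::memcpy(&x, s, n);           // byte-wise copy, no aliasing cast involved
    return x;
}
```

With this, load_prefix(l.c_str()) and load_prefix("BA") compare equal on every run, because only bytes that belong to the string are read and the remaining bytes of x stay zero.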





























































          answered Nov 22 '18 at 5:07









          M.M


























              0














              I'll start by saying that in:



              const string l = "BA";
              const char* k = l.c_str();
              const char* p = "BA";
              cout << k << " " << *((size_t*)k) << endl;
              cout << p << " " << *((size_t*)p) << endl;


              Both *((size_t*)k) and *((size_t*)p) invoke undefined behavior. One reason is that, on most systems, the read accesses data beyond the boundary of the char array. Note that sizeof(size_t) > 3 * sizeof(char) on both 32-bit and 64-bit systems, so *((size_t*)k) accesses at least one byte beyond the boundary.



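              In your example, the string literal lives in read-only data that is laid out when the program is built, so whatever bytes follow "BA" and its NUL terminator are the same on every run (don't count on their values, though). They are not necessarily zero padding: assuming a little-endian machine, the printed value 7162260525311607106 is the byte sequence 42 41 00 20 00 76 65 63 (hex), i.e. "BA", NUL, " ", NUL, "vec", which looks like neighbouring literals such as the " " from your cout statements.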



              In the case of k, which comes from std::string, you are not so lucky. The string is short, so most implementations will employ the short string optimization. This means that the char buffer is inside the std::string object itself. In your case, the string is so short that the bytes read past its end are still in the buffer dedicated to the short string optimization. It seems that the remainder of the buffer is not initialized and contains junk left over from before the function was called. As a result, everything other than the first 3 bytes ("BA" plus the NUL terminator) is random junk.
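To see which bytes were actually read, the printed values can be split back into bytes. The following is a small sketch, assuming a little-endian machine such as x86-64 and a 64-bit size_t; the helper name low_to_high_bytes is made up for illustration:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: splits a 64-bit value into its bytes, least significant
// byte first. On a little-endian machine this is the order in memory.
std::array<unsigned char, 8> low_to_high_bytes(std::uint64_t v) {
    std::array<unsigned char, 8> out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
    }
    return out;
}
```

For the values in the question, 7162260525311607106 decodes to 42 41 00 20 00 76 65 63 and 140736766951746 to 42 41 00 D5 FF 7F 00 00 (hex). Both start with "BA" and the NUL terminator; only the bytes after that differ. The tail of the second value resembles part of a stack address, which would explain why it changes between runs, though that is a guess.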



              You were lucky that this case of undefined behavior ends up with some additional junk, and not something more perplexing (like always returning zero, or calling unrelated functions). Don't rely on UB, ever.
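If the goal is simply a hash of the string's contents, one option that stays within defined behaviour is to process exactly the bytes of the string. Below is a minimal sketch using the 64-bit FNV-1a hash; this is a swapped-in technique, not the function from the linked question, and it assumes C++17 for std::string_view:

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// 64-bit FNV-1a over exactly the bytes of the string (no terminator,
// no bytes beyond the end of the buffer).
std::uint64_t fnv1a_64(std::string_view s) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char c : s) {
        h ^= c;                                 // mix in one byte
        h *= 1099511628211ULL;                  // multiply by the 64-bit FNV prime
    }
    return h;
}
```

Both fnv1a_64(l) for a std::string l and fnv1a_64("BA") give the same value, since std::string and string literals both convert to std::string_view. The standard library's std::hash<std::string> is another option.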
















                  answered Nov 22 '18 at 4:56









                  Michael Veksler
























                      0














                      // A simple null-terminated character array, represented in memory as:
                      //
                      // ['B', 'A', '\0']
                      const char* p = "BA";

                      // On the other hand, `std::string` isn't so simple.
                      //
                      // c_str() returns a pointer to some kind of buffer:
                      //
                      // ['B', 'A', '\0', ... reserved_memory]
                      //
                      const std::string l = "BA";
                      const char* k = l.c_str();

                      // Then you do a C-style cast.
                      //
                      // (size_t*)k gives you the address of the beginning of the underlying
                      // data of the std::string (it may point to the heap or to the stack,
                      // depending on SSO), and after that you dereference it to receive
                      // the value. BTW it can lead to the undefined behavior because you
                      // attempt to receive the value for 8 bytes (depending on the size_t size)
                      // but your actual string may be shorter than that, e.g. 4 bytes. As a
                      // result you will receive garbage.
                      std::cout << k << " " << *((size_t*)k) << std::endl;

                      // Two strings created as
                      //
                      // const char* foo = "foo";
                      // const char* bar = "foo";
                      //
                      // are typically stored in the read-only data segment of your executable.
                      // The two pointers may even point to the same string in this segment.
                      // Also note the same undefined behavior mentioned earlier.
                      std::cout << p << " " << *((size_t*)p) << std::endl;


























                      • 1





                        BTW it can lead to the undefined behavior because you attempt to receive the value for 8 bytes (depending on the size_t size) – No "can" necessary here. *((size_t*)k) with k being anything other than a size_t* (or a pointer to a struct with a first member of type size_t) IS undefined behaviour. Always. And the round trip to some random implementation of std::string is not needed at all.

                        – Swordfish
                        Nov 22 '18 at 14:19













                      • @Swordfish, agreed about the round trip. I just wanted to point out that sometimes a person can go and explore the sources and understand the reason for the unknown garbage. But you are right, this stuff is redundant; removed.

                        – dshil
                        Nov 22 '18 at 15:56
























                      edited Nov 22 '18 at 15:49

























                      answered Nov 22 '18 at 4:35









                      dshil








                      • 1





                        BTW it can lead to the undefined behavior because you attempt to receive the value for 8 bytes (depending on the size_t size) – No "can" necessary here. *((size_t*)k) with k being anything other than a size_t* (or a pointer to a struct with a first member of type size_t) IS undefined behaviour. Always. And the round trip to some random implementation of std::string is not needed at all.

                        – Swordfish
                        Nov 22 '18 at 14:19













                      • @Swordfish, agreed about the round trip. I just wanted to point out that sometimes a person can go and explore the sources to understand where the unknown garbage comes from. But you are right, that part is redundant, so I removed it.

                        – dshil
                        Nov 22 '18 at 15:56































