The necessity to limit VARCHAR length for indexed columns using coallation utf8_unicode_ci












0















I am trying to understand a MySQL error and it is related to some defaults Rails generates with ActiveRecord migrations. Given this:



rails generate migration AddDetailsToProducts supplier:index:references{polymorphic}

class AddDetailsToProducts < ActiveRecord::Migration
def change
add_reference :products, :supplier, polymorphic: true, index: true, foreign_key: true
end
end


So what does this do? First, let's look at polymorphic. Polymorphic will create a supplier_type VARCHAR(255) and supplier_id INT column in products. I believe the VARCHAR is set to 256 characters, since early versions of MySQL did not support more. But remember that VARCHAR columns in the database are variable length, so there's no storage advantage to a ten character value in a VARCHAR(255) versus a VARCHAR(20).



references adds the supplier_type and supplier_id as foreign keys of products to the suppliers primary key. A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table. It is a key used to link two tables together.



So I think "add_reference" does something like this:



CREATE TABLE products (
PRIMARY KEY (id),
FOREIGN KEY (supplier_id) REFERENCES suppliers(id)
FOREIGN KEY (supplier_type) REFERENCES suppliers(id)
)
CREATE INDEX `index_suppliers_on_supplier_type` ON `suppliers` (`supplier_type`)
CREATE INDEX `index_suppliers_on_supplier_id` ON `suppliers` (`supplier_id`)


Now I received an error like this:




Specified key was too long; max key length is 767 bytes: CREATE INDEX
index_suppliers_on_supplier_type ON suppliers (supplier_type)




So what we have is a supplier_type column which is VARCHAR(255) and we attempted to place an index on it. I am using the utf8_unicode_ci coallation. My understanding is this uses 1 to 3 bytes per character. So even if this was to use 3 bytes for all characters with a maximum of 256 characters, that is 256 * 3 = 768. One byte over. It really doesn't make sense. Is the solution really just to add a limit to the maximum character size for the column? Am I understanding this correctly?



Because when I do this, the error goes away:



class ChangeSuppliers < ActiveRecord::Migration
def change
change_column :suppliers, :supplier_type, :string, limit: 191
end
end









share|improve this question



























    0















    I am trying to understand a MySQL error and it is related to some defaults Rails generates with ActiveRecord migrations. Given this:



    rails generate migration AddDetailsToProducts supplier:index:references{polymorphic}

    class AddDetailsToProducts < ActiveRecord::Migration
    def change
    add_reference :products, :supplier, polymorphic: true, index: true, foreign_key: true
    end
    end


    So what does this do? First, let's look at polymorphic. Polymorphic will create a supplier_type VARCHAR(255) and supplier_id INT column in products. I believe the VARCHAR is set to 256 characters, since early versions of MySQL did not support more. But remember that VARCHAR columns in the database are variable length, so there's no storage advantage to a ten character value in a VARCHAR(255) versus a VARCHAR(20).



    references adds the supplier_type and supplier_id as foreign keys of products to the suppliers primary key. A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table. It is a key used to link two tables together.



    So I think "add_reference" does something like this:



    CREATE TABLE products (
    PRIMARY KEY (id),
    FOREIGN KEY (supplier_id) REFERENCES suppliers(id)
    FOREIGN KEY (supplier_type) REFERENCES suppliers(id)
    )
    CREATE INDEX `index_suppliers_on_supplier_type` ON `suppliers` (`supplier_type`)
    CREATE INDEX `index_suppliers_on_supplier_id` ON `suppliers` (`supplier_id`)


    Now I received an error like this:




    Specified key was too long; max key length is 767 bytes: CREATE INDEX
    index_suppliers_on_supplier_type ON suppliers (supplier_type)




    So what we have is a supplier_type column which is VARCHAR(255) and we attempted to place an index on it. I am using the utf8_unicode_ci coallation. My understanding is this uses 1 to 3 bytes per character. So even if this was to use 3 bytes for all characters with a maximum of 256 characters, that is 256 * 3 = 768. One byte over. It really doesn't make sense. Is the solution really just to add a limit to the maximum character size for the column? Am I understanding this correctly?



    Because when I do this, the error goes away:



    class ChangeSuppliers < ActiveRecord::Migration
    def change
    change_column :suppliers, :supplier_type, :string, limit: 191
    end
    end









    share|improve this question

























      0












      0








      0








      I am trying to understand a MySQL error and it is related to some defaults Rails generates with ActiveRecord migrations. Given this:



      rails generate migration AddDetailsToProducts supplier:index:references{polymorphic}

      class AddDetailsToProducts < ActiveRecord::Migration
      def change
      add_reference :products, :supplier, polymorphic: true, index: true, foreign_key: true
      end
      end


      So what does this do? First, let's look at polymorphic. Polymorphic will create a supplier_type VARCHAR(255) and supplier_id INT column in products. I believe the VARCHAR is set to 256 characters, since early versions of MySQL did not support more. But remember that VARCHAR columns in the database are variable length, so there's no storage advantage to a ten character value in a VARCHAR(255) versus a VARCHAR(20).



      references adds the supplier_type and supplier_id as foreign keys of products to the suppliers primary key. A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table. It is a key used to link two tables together.



      So I think "add_reference" does something like this:



      CREATE TABLE products (
      PRIMARY KEY (id),
      FOREIGN KEY (supplier_id) REFERENCES suppliers(id)
      FOREIGN KEY (supplier_type) REFERENCES suppliers(id)
      )
      CREATE INDEX `index_suppliers_on_supplier_type` ON `suppliers` (`supplier_type`)
      CREATE INDEX `index_suppliers_on_supplier_id` ON `suppliers` (`supplier_id`)


      Now I received an error like this:




      Specified key was too long; max key length is 767 bytes: CREATE INDEX
      index_suppliers_on_supplier_type ON suppliers (supplier_type)




      So what we have is a supplier_type column which is VARCHAR(255) and we attempted to place an index on it. I am using the utf8_unicode_ci coallation. My understanding is this uses 1 to 3 bytes per character. So even if this was to use 3 bytes for all characters with a maximum of 256 characters, that is 256 * 3 = 768. One byte over. It really doesn't make sense. Is the solution really just to add a limit to the maximum character size for the column? Am I understanding this correctly?



      Because when I do this, the error goes away:



      class ChangeSuppliers < ActiveRecord::Migration
      def change
      change_column :suppliers, :supplier_type, :string, limit: 191
      end
      end









      share|improve this question














      I am trying to understand a MySQL error and it is related to some defaults Rails generates with ActiveRecord migrations. Given this:



      rails generate migration AddDetailsToProducts supplier:index:references{polymorphic}

      class AddDetailsToProducts < ActiveRecord::Migration
      def change
      add_reference :products, :supplier, polymorphic: true, index: true, foreign_key: true
      end
      end


      So what does this do? First, let's look at polymorphic. Polymorphic will create a supplier_type VARCHAR(255) and supplier_id INT column in products. I believe the VARCHAR is set to 256 characters, since early versions of MySQL did not support more. But remember that VARCHAR columns in the database are variable length, so there's no storage advantage to a ten character value in a VARCHAR(255) versus a VARCHAR(20).



      references adds the supplier_type and supplier_id as foreign keys of products to the suppliers primary key. A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table. It is a key used to link two tables together.



      So I think "add_reference" does something like this:



      CREATE TABLE products (
      PRIMARY KEY (id),
      FOREIGN KEY (supplier_id) REFERENCES suppliers(id)
      FOREIGN KEY (supplier_type) REFERENCES suppliers(id)
      )
      CREATE INDEX `index_suppliers_on_supplier_type` ON `suppliers` (`supplier_type`)
      CREATE INDEX `index_suppliers_on_supplier_id` ON `suppliers` (`supplier_id`)


      Now I received an error like this:




      Specified key was too long; max key length is 767 bytes: CREATE INDEX
      index_suppliers_on_supplier_type ON suppliers (supplier_type)




      So what we have is a supplier_type column which is VARCHAR(255) and we attempted to place an index on it. I am using the utf8_unicode_ci coallation. My understanding is this uses 1 to 3 bytes per character. So even if this was to use 3 bytes for all characters with a maximum of 256 characters, that is 256 * 3 = 768. One byte over. It really doesn't make sense. Is the solution really just to add a limit to the maximum character size for the column? Am I understanding this correctly?



      Because when I do this, the error goes away:



      class ChangeSuppliers < ActiveRecord::Migration
      def change
      change_column :suppliers, :supplier_type, :string, limit: 191
      end
      end






      mysql ruby-on-rails database






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 20 '18 at 19:59









      DonatoDonato

      1,92231230




      1,92231230
























          1 Answer
          1






          active

          oldest

          votes


















          0














          The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



          Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



          In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.






          share|improve this answer























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53400663%2fthe-necessity-to-limit-varchar-length-for-indexed-columns-using-coallation-utf8%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            0














            The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



            Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



            In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.






            share|improve this answer




























              0














              The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



              Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



              In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.






              share|improve this answer


























                0












                0








                0







                The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



                Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



                In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.






                share|improve this answer













                The collation is just the order, the real cause is the character set. This answer shows a couple of work-arounds. An index limit like you've done it also works.



                Its been a very long time (MySQL-4.0) since 255 what a varchar limit - its rather and arbitrary choice and a choice of limit based on your data is the best approach.



                In addition to index sizes being larger for larger lengths, joins involving these sometimes use the MEMORY storage engine result in varchar(X) being converted to char(X) which can use a lot more memory.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Nov 20 '18 at 21:45









                danblackdanblack

                1,6221214




                1,6221214






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Stack Overflow!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53400663%2fthe-necessity-to-limit-varchar-length-for-indexed-columns-using-coallation-utf8%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    "Incorrect syntax near the keyword 'ON'. (on update cascade, on delete cascade,)

                    Alcedinidae

                    Origin of the phrase “under your belt”?