Why does the `updatedb` program run so fast?

21

Usually, when I run programs that do a full disk scan and go over all files in the system, they take a very long time to run. Why does updatedb run so fast in comparison?

Tags: performance, updatedb

edited 2 days ago by Jeff Schaller
asked Jan 2 at 16:14 by hugomg

2 Answers

21

The answer depends on the version of locate you’re using, but there’s a fair chance it’s mlocate, whose updatedb runs quickly by avoiding full disk scans:

    mlocate is a locate/updatedb implementation. The 'm' stands for "merging":
    updatedb reuses the existing database to avoid rereading most of the file
    system, which makes updatedb faster and does not trash the system caches as
    much.

(The database stores each directory’s timestamp: its ctime or mtime, whichever is newer. If that timestamp has not changed since the previous run, updatedb can reuse the entries already recorded for the directory instead of reading it again.)
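
To make the merging idea concrete, here is a minimal Python sketch of a timestamp-based incremental scan. It is not mlocate's code or database format: `scan_dir`, `update_db` and the dictionary layout are names made up for this illustration, and a real implementation also has to cope with directories that change again within the timestamp granularity.

```python
import os

def scan_dir(path):
    """Read one directory and return a sorted list of (name, is_dir) pairs."""
    with os.scandir(path) as it:
        return sorted((e.name, e.is_dir(follow_symlinks=False)) for e in it)

def update_db(root, old_db):
    """Rebuild the database, rereading only directories whose timestamp moved.

    Both databases map a directory path to (timestamp, entries).
    Returns (new_db, reread), where reread lists the directories actually read.
    """
    new_db, reread = {}, []
    pending = [root]
    while pending:
        path = pending.pop()
        try:
            st = os.lstat(path)
        except OSError:
            continue  # vanished or unreadable: leave it out of the new database
        # Same rule as described above: the newer of ctime and mtime.
        stamp = max(st.st_mtime_ns, st.st_ctime_ns)
        cached = old_db.get(path)
        if cached is not None and cached[0] == stamp:
            entries = cached[1]  # unchanged: reuse the recorded names
        else:
            try:
                entries = scan_dir(path)
            except OSError:
                continue
            reread.append(path)
        new_db[path] = (stamp, entries)
        # Subdirectories must still be visited, because a change inside them
        # does not alter this directory's own timestamp.
        pending.extend(os.path.join(path, name)
                       for name, is_dir in entries if is_dir)
    return new_db, reread
```

Calling `update_db(root, {})` reads everything; passing its result back in as `old_db` on the next run reads only the directories whose timestamp moved. Every directory is still visited with one `lstat` call, which is typically much cheaper than listing its contents.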



Like most implementations of updatedb, mlocate’s also skips the file systems and paths it is configured to ignore. By default mlocate ignores none, but distributions typically ship a basic updatedb.conf which excludes networked file systems, virtual file systems and so on (see Debian’s configuration file for an example; this is standard practice in Debian, so GNU’s updatedb is configured similarly there).
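
As a rough illustration of why exclusions save time, the Python sketch below prunes directories during the walk, so nothing beneath an excluded directory is ever read. The function name `walk_pruned` and its two parameters are assumptions for this example, not part of any updatedb implementation, and exclusion by file system type is left out because it needs the mount table.

```python
import os

def walk_pruned(root, prune_paths=(), prune_names=()):
    """Yield every path under root, skipping excluded directories entirely."""
    prune_paths = {os.path.normpath(p) for p in prune_paths}
    prune_names = set(prune_names)
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into
        # the removed directories, so their contents are never listed.
        dirnames[:] = [
            d for d in dirnames
            if d not in prune_names
            and os.path.normpath(os.path.join(dirpath, d)) not in prune_paths
        ]
        for name in dirnames + filenames:
            yield os.path.join(dirpath, name)
```

For example, `walk_pruned("/", prune_paths=["/tmp"], prune_names=[".git"])` would never open `/tmp` or any `.git` directory, however many files they contain.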






edited 2 days ago
answered Jan 2 at 16:20 by Stephen Kitt

• Fairly good question and answer; I did not even know there were "differential" scans.
  – Rui F Ribeiro, Jan 2 at 16:25

• Thanks! I had never noticed that modifying a file also changes the ctime and mtime of all its parent directories.
  – hugomg, Jan 2 at 17:21 (score: 1)

• @hugomg I don't think it actually does. It should only change the mtime of its immediate parent.
  – Kusalananda, 2 days ago (score: 4)

• So if I understand correctly, mlocate cares about ctime and mtime, which implies it only cares whether the list of directory entries is still the same (no removed or added files), and so it doesn't care about the files themselves. Is that correct?
  – Sergiy Kolodyazhnyy, 2 days ago

• @Sergiy: Of course. locate isn't grep -R. It does not read file content.
  – Kevin, 2 days ago
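
The point about parent directories is easy to check. The following Python sketch (the helper name `mtimes_after_changes` and the directory names are made up for illustration) backdates two nested directories, creates a file in the inner one, and then appends to that file:

```python
import os
import tempfile

OLD = 1_000_000_000  # arbitrary timestamp in the past (seconds since the epoch)

def mtimes_after_changes():
    """Return (outer_after_create, inner_after_create, inner_after_append)."""
    with tempfile.TemporaryDirectory() as top:
        outer = os.path.join(top, "outer")
        inner = os.path.join(outer, "inner")
        os.makedirs(inner)
        # Backdate both directories so that any later update stands out.
        for d in (inner, outer):
            os.utime(d, (OLD, OLD))

        # Adding a new entry changes the directory that holds it.
        path = os.path.join(inner, "new-file")
        with open(path, "w") as f:
            f.write("hello\n")
        outer_after_create = int(os.stat(outer).st_mtime)
        inner_after_create = int(os.stat(inner).st_mtime)

        # Changing the content of an existing file adds no directory entry.
        os.utime(inner, (OLD, OLD))
        with open(path, "a") as f:
            f.write("more\n")
        inner_after_append = int(os.stat(inner).st_mtime)
    return outer_after_create, inner_after_create, inner_after_append
```

On a typical Linux file system only the inner directory's mtime moves, and only when the entry is created. The outer directory keeps its backdated value, and appending to the existing file leaves the inner directory alone, which matches the comments above: the scan tracks names, not contents.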


















9

In addition to checking modification times, mlocate also skips certain subtrees of the file system that contain lots of uninteresting or potentially duplicate files, as specified in /etc/updatedb.conf (and described in man updatedb.conf). With Fedora’s default configuration, these are:

• Bind mounts
• Some kinds of file systems (9p, afs, bdev, etc.)
• VCS repository databases (.git, .hg, etc.)
• Some directories listed by path (/media, /tmp, /var/spool/cups, etc.)
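
These four kinds of exclusion correspond to the variables `PRUNE_BIND_MOUNTS`, `PRUNEFS`, `PRUNENAMES` and `PRUNEPATHS` in mlocate's updatedb.conf. The Python sketch below reads such settings from text. The sample values are only the examples from the list above, not a copy of any distribution's file, and `parse_updatedb_conf` is a hypothetical helper that handles just the simple `NAME = "value"` form.

```python
SAMPLE_CONF = """\
# Illustrative values only, not a copy of any distribution's file.
PRUNE_BIND_MOUNTS = "yes"
PRUNEFS = "9p afs bdev"
PRUNENAMES = ".git .hg"
PRUNEPATHS = "/media /tmp /var/spool/cups"
"""

def parse_updatedb_conf(text):
    """Map each variable name to the list of words in its value."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        name, sep, value = line.partition("=")
        if not sep:
            continue  # blank line, or something this sketch does not understand
        settings[name.strip()] = value.strip().strip('"').split()
    return settings
```

To see what your own system excludes, the same function can be applied to the contents of /etc/updatedb.conf, bearing in mind that this sketch does not implement the full syntax described in the man page.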






answered 2 days ago by hugomg

• This isn’t the case by default though, so the base behaviour depends on the distribution being used. (Other updatedb implementations also support configured exclusions.)
  – Stephen Kitt, 2 days ago

• Indeed. I was describing the defaults for Fedora.
  – hugomg, 2 days ago