Why does awk stop and wait if the filename contains = and how to work around that?












23














awk 'processing_script_here' my=file.txt


seems to stop and wait indefinitely...

What's going on here, and how do I make it work?




























  • 3




    I've seen several comments about this so I thought I might as well ask a question so that we have an answer that can be easily found and linked to...
    – don_crissti
    Dec 22 at 20:44










  • related: unix.stackexchange.com/a/475013/308316
    – mosvy
    Dec 23 at 4:11

















awk filenames
















asked Dec 22 at 20:44









don_crissti



















3 Answers


















16














As Chris says, arguments of the form variablename=anything are treated as variable assignments instead of input file names. Those assignments are performed at the time the arguments are processed, as opposed to the (newer) -v var=value ones, which are performed before the BEGIN statements.



That can be useful in things like:



awk '{print $1}' FS=/ RS='\n' file1 FS='\n' RS= file2


Where you can specify a different FS/RS per file. It's also commonly used in:



awk '!file1_processed{a[$0]; next}; {...}' file1 file1_processed=1 file2


Which is a safer version of:



awk 'NR==FNR{a[$0]; next}; {...}' file1 file2


(which doesn't work if file1 is empty)
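To see the difference with an empty first file, here is a small sketch. The scratch directory, the file contents and the array name seen are made up for illustration, and a print action stands in for the {...} placeholder above.

```shell
# Hypothetical setup: an empty file1 and a two-line file2 in a scratch directory.
dir=$(mktemp -d) || exit 1
: > "$dir/file1"
printf 'a\nb\n' > "$dir/file2"

# NR==FNR stays true throughout file2 because file1 contributed no records,
# so file2's lines are stored in the array instead of being printed.
nr_fnr_out=$(awk 'NR==FNR{seen[$0]; next} {print "second:", $0}' "$dir/file1" "$dir/file2")

# The flag is assigned once awk reaches that operand, i.e. after file1 has
# been read, whether or not file1 had any lines.
flag_out=$(awk '!file1_processed{seen[$0]; next} {print "second:", $0}' \
    "$dir/file1" file1_processed=1 "$dir/file2")

printf 'NR==FNR version: [%s]\n' "$nr_fnr_out"
printf 'flag version:    [%s]\n' "$flag_out"
rm -rf "$dir"
```

The NR==FNR run should print nothing between the brackets, while the flag version prints both lines of file2.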



But that gets in the way when you have files whose name contains = characters.



Now, that's only a problem when what's left of the first = is a valid awk variable name.



What constitutes a valid variable name in awk is stricter than in sh.



POSIX requires it to be something like:



[_a-zA-Z][_a-zA-Z0-9]*


With only characters of the portable character set. However, the /usr/xpg4/bin/awk of Solaris 11, at least, is not compliant in that regard: it allows any character that is alphabetic in the locale in variable names, not just a-zA-Z.



So an argument like x+y=foo or =bar or ./foo=bar is still treated as an input file name and not an assignment, as what's left of the first = is not a valid variable name. An argument like Stéphane=Chazelas.txt may or may not be treated as an assignment, depending on the awk implementation and locale.
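If you want to check from a shell script whether a given argument would be taken as an assignment by a POSIX-conforming awk, something along these lines may do. The function name is_awk_assignment is made up for this sketch, and it only models the portable [_a-zA-Z][_a-zA-Z0-9]* rule, not the locale-dependent behaviour of implementations like the Solaris one mentioned above.

```shell
# Hypothetical helper: succeeds if a POSIX awk would read "$1" as var=value.
is_awk_assignment() {
    # expr anchors the pattern at the start of the string; the leading x
    # keeps expr from mistaking the argument for one of its own operators.
    LC_ALL=C expr "x$1" : 'x[_a-zA-Z][_a-zA-Z0-9]*=' > /dev/null
}

for arg in 'my=file.txt' 'x+y=foo' '=bar' './foo=bar' '_v1=2'; do
    if is_awk_assignment "$arg"; then
        printf '%s: assignment\n' "$arg"
    else
        printf '%s: file name\n' "$arg"
    fi
done
```

Only my=file.txt and _v1=2 should be reported as assignments.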



That's why with awk, it's recommended to use:



awk '...' ./*.txt


instead of



awk '...' *.txt


for instance to avoid the problem if you can't guarantee the name of the txt files won't contain = characters.
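A quick way to convince yourself, as a sketch: the scratch directory and the two file names below are invented for the demonstration.

```shell
# Hypothetical scratch directory with one awkward and one ordinary file name.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf 'one\n' > 'a=b.txt'
printf 'two\n' > 'plain.txt'

# With ./ every operand starts with ".", so none can look like var=value.
prefixed_out=$(awk '{ print FILENAME ": " $0 }' ./*.txt)
printf '%s\n' "$prefixed_out"

cd / && rm -rf "$dir"
```

Both files are read, and FILENAME carries the ./ prefix for each of them.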



Also, beware that an argument like -vfoo=bar.txt may be treated as an option if you use:



awk -f file.awk -vfoo=bar.txt


Again, using ./*.txt works around that (using a ./ prefix also helps with files called - which otherwise awk understands as meaning standard input instead).



That's also why



#! /usr/bin/awk -f


shebangs don't really work. While the var=value ones can be worked around by fixing the ARGV values (add a ./ prefix) in a BEGIN statement:



#! /usr/bin/awk -f
BEGIN {
  for (i = 1; i < ARGC; i++)
    if (ARGV[i] ~ /^[_[:alpha:]][_[:alnum:]]*=/)
      ARGV[i] = "./" ARGV[i]
}
# rest of awk script


That won't help with the option ones, as those are seen by the awk interpreter itself and not by the awk script.



One potential cosmetic issue with using that ./ prefix is it ends up in FILENAME, but you can always use substr(FILENAME, 3) to strip it if you don't want it.
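For instance, a sketch of stripping the prefix only when it is actually there (the variable name name and the scratch file are assumptions of this example):

```shell
# Hypothetical scratch directory holding the file from the question.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf 'hello\n' > 'my=file.txt'

stripped_out=$(awk '{
    name = FILENAME
    # Only strip when the prefix is really there (an absolute path has none).
    if (substr(name, 1, 2) == "./") name = substr(name, 3)
    print name ": " $0
}' ./my=file.txt)
printf '%s\n' "$stripped_out"

cd / && rm -rf "$dir"
```

Checking for the prefix first keeps the script usable with paths like /path/to/foo, where a blind substr(FILENAME, 3) would eat real characters.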



The GNU implementation of awk fixes all those issues with its -E option.



After -E, gawk expects only the path of the awk script (where - still means stdin) and then a list of input file paths only (and there, not even - is treated specially).



It's specially designed for:



#! /usr/bin/gawk -E


shebangs where the arguments are always input files (note that you're still free to edit that ARGV list in a BEGIN statement).



You can also use it as:



gawk -e '...awk code here...' -E /dev/null *.txt


We use -E with an empty script (/dev/null) just to make sure those *.txt afterwards are always treated as input files, even if they contain = characters.
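As a sketch of that last form in use, assuming gawk is installed (the check with command -v and the fallback message are additions of this example, as is the scratch file):

```shell
# Hypothetical scratch directory holding the file from the question.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf 'hello\n' > 'my=file.txt'

if command -v gawk > /dev/null 2>&1; then
    # No ./ prefix needed: after -E, gawk takes the operand as a file.
    gawk_out=$(gawk -e '{ print FILENAME ": " $0 }' -E /dev/null my=file.txt)
else
    gawk_out='gawk not installed'
fi
printf '%s\n' "$gawk_out"

cd / && rm -rf "$dir"
```

Here FILENAME comes out without any ./ prefix, so there is nothing to strip afterwards.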





























  • I don't see how the explicit path ending up in FILENAME is a problem. Either the awk script is general, in which case it should handle all kind of paths ending up in FILENAME (including but not limited to ../foo, /path/to/foo and paths that are in a different encoding) -- in which case substr(FILENAME,3) won't be enough, or it's a one shot script where the user basically knows what the filenames are -- in which case s/he probably shouldn't bother with any of them containing = either ;-)
    – mosvy
    Dec 23 at 4:09






  • 2




    @mosvy I don't think it states so much that ./ is a problem, but that it may be undesirable under certain conditions, such as cases where filename has to be included in the output, in which case ./ should be redundant and unnecessary, so you'll need to get rid of it somehow. Here's at least one example. As for user knowing what filenames are - well, in this case we also know what filename is, but = still gets in the way of proper processing. So can leading - get in the way.
    – Sergiy Kolodyazhnyy
    2 days ago










  • @mosvy, yes the idea is that you want to use the ./ prefix to work around that awk (mis)feature, but then you end up with that ./ on output, which you may want to strip. See how to check if the first line of file contain a specific string? as an example.
    – Stéphane Chazelas
    2 days ago





















18














In most versions of awk, arguments after the program to execute are either:




  1. A file

  2. An assignment of the form x=y


Since your filename is being interpreted as case #2, awk is still waiting for something to read on stdin (since it doesn't perceive that there has been any filename passed).
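You can observe this without hanging your terminal by giving awk an empty stdin. This is only a sketch: the scratch directory and file content are made up, and the END block just reports the value of the awk variable my.

```shell
# Hypothetical scratch directory holding the file from the question.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf 'hello\n' > 'my=file.txt'

# stdin comes from /dev/null so awk has nothing to wait for.
as_operand=$(awk '{ print } END { print "my is [" my "]" }' my=file.txt < /dev/null)
printf '%s\n' "$as_operand"

# With a ./ prefix the same argument is opened as a file instead.
with_prefix=$(awk '{ print } END { print "my is [" my "]" }' ./my=file.txt)
printf '%s\n' "$with_prefix"

cd / && rm -rf "$dir"
```

The first run prints no input lines and reports my as file.txt; the second prints the file's line and leaves my empty.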



Portably, this behaviour is documented in POSIX:




Either of the following two types of argument can be intermixed:




  • file: A pathname of a file that contains the input to be read, which is matched against the set of patterns in the program. If no file operands are specified, or if a file operand is '-', the standard input shall be used.

  • assignment: An operand that begins with an underscore or alphabetic character from the portable character set (see the table in the Base Definitions volume of IEEE Std 1003.1-2001, Section 6.1, Portable Character Set), followed by a sequence of underscores, digits, and alphabetics from the portable character set, followed by the '=' character, shall specify a variable assignment rather than a pathname.




As such, portably, you have a few options (#1 is likely the least intrusive):




  1. Use awk ... ./my=file, which sidesteps this since . is not "an underscore or alphabetic character from the portable character set".

  2. Put the file on stdin using awk ... < my=file. However, this doesn't work well with multiple files.

  3. Make a hardlink to the file temporarily, and use that. You can do something like ln my=file my_file, and then use my_file as normal. No copying will be performed, and both files will be backed by the same data and inode metadata. After using it, it's safe to remove the link created as the number of references to the inode will still be greater than 0.
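Options 2 and 3 could look roughly like this. The scratch directory, the file content and the link name my_file.txt are placeholders for the sketch, and the hard link assumes a filesystem that supports them.

```shell
# Hypothetical scratch directory holding the file from the question.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
printf 'hello\n' > 'my=file.txt'

# Option 2: the shell opens the file, so awk never sees the name.
stdin_out=$(awk '{ print "stdin: " $0 }' < my=file.txt)

# Option 3: a temporary hard link under a name without "=".
ln my=file.txt my_file.txt
link_out=$(awk '{ print FILENAME ": " $0 }' my_file.txt)
rm my_file.txt

printf '%s\n%s\n' "$stdin_out" "$link_out"

cd / && rm -rf "$dir"
```

Note that with option 2 FILENAME is not set to the file's name, and with option 3 it holds the link's name rather than the original one.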

























  • 6




    Doesn't ./my=file work? % awk 'processing_script_here' ./my=file.txt awk: fatal: cannot open file `./my=file.txt' for reading (No such file or directory). This should be portable because ./my isn't a valid variable name, so shouldn't be parsed that way.
    – Stephen Harris
    Dec 22 at 21:17








  • 2




    As that POSIX text says, the problem is only when the first = is preceded by an underscore or alphabetic character from the portable character set (see the table in the Base Definitions volume of IEEE Std 1003.1-2001, Section 6.1, Portable Character Set), followed by a sequence of underscores, digits, and alphabetics from the portable character set. so a file path like ++foo=bar.txt or =foo or ./foo=bar are all OK as that . or + is not a [_a-zA-Z].
    – Stéphane Chazelas
    Dec 22 at 22:04








  • 1




    @SergiyKolodyazhnyy awk is external to the shell, so it doesn't matter which you use. ./my=file will be passed through verbatim.
    – Chris Down
    Dec 23 at 0:00






  • 1




    @SergiyKolodyazhnyy, same for awk '{print $1,$2}' /etc/passwd. The point is that having the shell open the file as opposed to awk doesn't make any difference as to whether it makes it seekable or not. Actually, in awk '{exit}' < /etc/passwd, you'd expect awk to seek back to the end of the first record upon that exit to make sure it leaves the position within stdin there. POSIX requires that. /usr/xpg4/bin/awk does it on Solaris, but neither gawk nor mawk seem to do it on GNU/Linux.
    – Stéphane Chazelas
    Dec 23 at 0:28






  • 3




    @mosvy, see INPUT FILES section at pubs.opengroup.org/onlinepubs/9699919799/utilities/… It's useful in a number of usage patterns that only make sense with regular files like when you want to truncate a file or write data into it at a position identified by awk that way.
    – Stéphane Chazelas
    2 days ago



















3














To quote the gawk documentation (note: emphasis added):




Any additional arguments on the command line are normally treated as input files to be processed in the order specified. However, an argument that has the form var=value, assigns the value value to the variable var—it does not specify a file at all.




Why does the command stop and wait? Because in the form awk 'processing_script_here' my=file.txt there is no file specified by the above definition: my=file.txt is interpreted as a variable assignment, and if no file is given, awk reads stdin. This is also evident from strace, which shows awk in such a command waiting on a read(0, ...) syscall.

This is also documented in the POSIX awk specification; see the OPERANDS section and the assignment part of it.



The variable assignment is evident in awk '{print foo}' foo=bar /etc/passwd, where the value of foo is printed for every line in /etc/passwd. Specifying ./foo=bar or the full path, however, does work.



Note that running strace on awk '1' foo=bar, as well as checking with cat foo=bar, shows that this is an awk-specific issue: execve does show the filename as an argument passed, so the shell has nothing to do with environment variable assignments in this case.



Additionally, please note that awk '...script...' foo=bar will not cause the shell to create an environment variable, since environment variable assignments must precede the command to take effect. See POSIX Shell Grammar Rules, point number 7. This can also be verified via awk '{print ENVIRON["foo"]}' foo=bar /etc/passwd
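A compact variant of that check, as a sketch: it uses /dev/null as the input file and an END block so that it prints exactly two lines, and it unsets foo first in case the calling shell already exports it (both are choices of this example).

```shell
unset foo   # make sure the calling shell is not already exporting foo

env_out=$(awk 'END {
    print (("foo" in ENVIRON) ? "foo is in the environment" : "foo is not in the environment")
    print "awk variable foo = " foo
}' foo=bar /dev/null)
printf '%s\n' "$env_out"
```

The operand sets the awk variable foo but leaves ENVIRON untouched.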





























The answer beginning "As Chris says" was answered Dec 22 at 22:34 by Stéphane Chazelas and edited 2 days ago by ilkkachu.












    • I don't see how the explicit path ending up in FILENAME is a problem. Either the awk script is general, in which case it should handle all kinds of paths ending up in FILENAME (including but not limited to ../foo, /path/to/foo and paths that are in a different encoding) -- in which case substr(FILENAME,3) won't be enough, or it's a one-shot script where the user basically knows what the filenames are -- in which case s/he probably shouldn't bother with any of them containing = either ;-)
      – mosvy
      Dec 23 at 4:09






    • @mosvy I don't think the answer claims that ./ is a problem so much as that it may be undesirable in some cases, such as when the filename has to be included in the output; there the ./ is redundant, so you'll need to get rid of it somehow. Here's at least one example. As for the user knowing what the filenames are: in this case we also know what the filename is, but = still gets in the way of proper processing. So can a leading -.
      – Sergiy Kolodyazhnyy
      2 days ago










    • @mosvy, yes, the idea is that you want to use the ./ prefix to work around that awk (mis)feature, but then you end up with that ./ in the output, which you may want to strip. See "how to check if the first line of file contain a specific string?" as an example.
      – Stéphane Chazelas
      2 days ago

















































    In most versions of awk, arguments after the program to execute are either:




    1. A file

    2. An assignment of the form x=y


    Since your filename is interpreted as case #2, awk sees no file operand at all, so it falls back to reading standard input and waits there.
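To watch the assignment happen without the command hanging, stdin can be redirected from /dev/null. This is only a sketch; the assigned_value variable is my own name:

```shell
#!/bin/sh
# my=file.txt is taken as "set awk variable my to file.txt"; since no file
# operand is left, awk reads stdin, which is /dev/null here.
assigned_value=$(awk 'END { print "my is set to: " my }' my=file.txt </dev/null)
printf '%s\n' "$assigned_value"    # my is set to: file.txt
```

No file called my=file.txt needs to exist for this to run, which is itself a hint that awk never tries to open it.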



    Portably, this behaviour is documented in POSIX:




    Either of the following two types of argument can be intermixed:




    • file: A pathname of a file that contains the input to be read, which is matched against the set of patterns in the program. If no file operands are specified, or if a file operand is '-', the standard input shall be used.

    • assignment: An operand that begins with an underscore or alphabetic character from the portable character set (see the table in the Base Definitions volume of IEEE Std 1003.1-2001, Section 6.1, Portable Character Set), followed by a sequence of underscores, digits, and alphabetics from the portable character set, followed by the '=' character, shall specify a variable assignment rather than a pathname.
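The quoted rule can be turned into a small check, sketched below. looks_like_awk_assignment is a hypothetical helper of mine, not something awk or POSIX provides, and it approximates "the portable character set" with the A-Za-z0-9_ ranges:

```shell
#!/bin/sh
# Sketch only: looks_like_awk_assignment is a hypothetical helper.
looks_like_awk_assignment() {
    case $1 in
        [A-Za-z_]*=*)
            # Text before the first "=" must contain only name characters.
            name=${1%%=*}
            case $name in
                *[!A-Za-z0-9_]*) return 1 ;;
                *) return 0 ;;
            esac ;;
        *) return 1 ;;
    esac
}

for arg in 'my=file.txt' './my=file.txt' '++foo=bar.txt' '=foo' '_x1=y' '1x=y'; do
    if looks_like_awk_assignment "$arg"; then kind=assignment; else kind=file; fi
    printf '%-16s %s\n' "$arg" "$kind"
done
```

Only my=file.txt and _x1=y should be reported as assignments; the others start with a character that cannot begin a variable name.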




    As such, portably, you have a few options (#1 is likely the least intrusive):




    1. Use awk ... ./my=file, which sidesteps this since . is not "an underscore or alphabetic character from the portable character set".

    2. Put the file on stdin using awk ... < my=file. However, this doesn't work well with multiple files.

    3. Make a hardlink to the file temporarily, and use that. You can do something like ln my=file my_file, and then use my_file as normal. No copying will be performed, and both files will be backed by the same data and inode metadata. After using it, it's safe to remove the link created as the number of references to the inode will still be greater than 0.
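The three options can be compared side by side with a sketch like the following. It works in a throwaway directory created with mktemp -d (an assumption), and the variable names are mine:

```shell
#!/bin/sh
# Sketch only: runs in a temporary directory; variable names are invented.
demo_dir=$(mktemp -d) || exit 1
trap 'rm -rf "$demo_dir"' EXIT
cd "$demo_dir" || exit 1
printf 'hello\n' > ./my=file

# Option 1: a ./ prefix makes the operand a plain path.
via_prefix=$(awk '{ print }' ./my=file)

# Option 2: let the shell open the file and hand it to awk on stdin.
via_stdin=$(awk '{ print }' < ./my=file)

# Option 3: a temporary hard link under a name without "=".
ln ./my=file my_file
via_link=$(awk '{ print }' my_file)
rm my_file    # only the extra link goes away; my=file is untouched

printf '%s\n' "$via_prefix" "$via_stdin" "$via_link"
```

All three should print the same content, and my=file still exists after the link is removed.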

























    • Doesn't ./my=file work? % awk 'processing_script_here' ./my=file.txt gives awk: fatal: cannot open file `./my=file.txt' for reading (No such file or directory), which shows that awk tried to open it as a file. This should be portable because ./my isn't a valid variable name, so it shouldn't be parsed as an assignment.
      – Stephen Harris
      Dec 22 at 21:17








    • As that POSIX text says, the problem only arises when the first = is preceded by an underscore or alphabetic character from the portable character set (see the table in the Base Definitions volume of IEEE Std 1003.1-2001, Section 6.1, Portable Character Set), followed by a sequence of underscores, digits, and alphabetics from the portable character set. So file paths like ++foo=bar.txt, =foo or ./foo=bar are all OK, as their first character (+, = or .) is not a [_a-zA-Z].
      – Stéphane Chazelas
      Dec 22 at 22:04








    • @SergiyKolodyazhnyy awk is external to the shell, so it doesn't matter which you use. ./my=file will be passed through verbatim.
      – Chris Down
      Dec 23 at 0:00






    • @SergiyKolodyazhnyy, same for awk '{print $1,$2}' /etc/passwd. The point is that having the shell open the file as opposed to awk doesn't make any difference as to whether it makes it seekable or not. Actually, in awk '{exit}' < /etc/passwd, you'd expect awk to seek back to the end of the first record upon that exit to make sure it leaves the position within stdin there. POSIX requires that. /usr/xpg4/bin/awk does it on Solaris, but neither gawk nor mawk seem to do it on GNU/Linux.
      – Stéphane Chazelas
      Dec 23 at 0:28






    • @mosvy, see INPUT FILES section at pubs.opengroup.org/onlinepubs/9699919799/utilities/… It's useful in a number of usage patterns that only make sense with regular files like when you want to truncate a file or write data into it at a position identified by awk that way.
      – Stéphane Chazelas
      2 days ago
























    edited 2 days ago

























    answered Dec 22 at 20:53









    Chris Down


































    To quote the gawk documentation:




    Any additional arguments on the command line are normally treated as input files to be processed in the order specified. However, an argument that has the form var=value, assigns the value value to the variable var—it does not specify a file at all.




    Why does the command stop and wait? Because in awk 'processing_script_here' my=file.txt no file is specified according to the above definition: my=file.txt is interpreted as a variable assignment, and when no file is given awk reads stdin. (This is also evident from strace, which shows awk in such a command waiting on a read(0, ...) syscall.)



    This is also documented in the POSIX awk specification; see the OPERANDS section and the assignment part of it.



    The variable assignment is evident in awk '{print foo}' foo=bar /etc/passwd, where the value of foo is printed for every line of /etc/passwd. Specifying ./foo=bar or a full path, however, does make awk treat the argument as a file.



    Note that running strace on awk '1' foo=bar, as well as checking with cat foo=bar, shows that this is an awk-specific issue: execve does show the filename being passed as an argument, so the shell and its environment variable assignments have nothing to do with it.



    Additionally, please note that awk '...script...' foo=bar will not make the shell create an environment variable, since environment variable assignments have to precede the command to take effect. See POSIX Shell Grammar Rules, point number 7. This can be verified via awk '{print ENVIRON["foo"]}' foo=bar /etc/passwd, which prints only empty lines unless foo already happens to be set in the environment.
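The difference between the two kinds of assignment can be checked with a sketch like this one. It uses a small temporary input file instead of /etc/passwd so that the output is predictable; the file and variable names are made up:

```shell
#!/bin/sh
# Sketch only: input.txt and the variable names are invented for the demo.
demo_dir=$(mktemp -d) || exit 1
trap 'rm -rf "$demo_dir"' EXIT
printf 'line1\nline2\n' > "$demo_dir/input.txt"

# foo=bar after the program text: an awk variable, set before input.txt is read.
as_operand=$(awk '{ print foo }' foo=bar "$demo_dir/input.txt")

# The same operand does not reach the environment (foo is unset first so
# that an inherited value cannot blur the result).
in_environ=$(unset foo; awk '{ print "[" ENVIRON["foo"] "]" }' foo=bar "$demo_dir/input.txt")

# foo=bar before the command name: a shell environment assignment.
as_env=$(foo=bar awk '{ print "[" ENVIRON["foo"] "]" }' "$demo_dir/input.txt")

printf '%s\n' "$as_operand" "$in_environ" "$as_env"
```

The first call prints bar for each input line, the second prints empty brackets, and the third prints [bar], which matches the distinction made above.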














        edited Dec 23 at 3:19

























        answered Dec 22 at 22:11









        Sergiy Kolodyazhnyy






























