Bash script to truncate subject line of incoming email












7












$begingroup$


I'm going to put this script into production in a mail server /etc/aliases file: we have a system that receives email but the subject line must be limited to a certain size. The proposed usage in the aliases file:



alias_name: "| email_filter.sh -s 200 actual.recipient@example.com"


It will be deployed on a GNU/Linux system. I'd appreciate any feedback.



#!/bin/bash
shopt -s extglob

usage="$(basename $BASH_SOURCE) [-h] [-s n] recipient"
where="where: -s n == truncate the subject to n characters"
subject_length=''

while getopts :hs: opt; do
    case $opt in
        h) echo "$usage"; echo "$where"; exit ;;
        s) subject_length=$OPTARG ;;
        *) echo "Error: $usage" >&2; exit 1 ;;
    esac
done
shift $((OPTIND - 1))

# validation
if [[ "$#" -eq 1 ]]; then
    recipient=$1
else
    echo "Error: $usage" >&2
    exit 1
fi
if [[ -n $subject_length ]] && [[ $subject_length != +([0-9]) ]]; then
    echo "Error: subject length must be a whole number"
    exit 1
fi

sed_filters=()
if [[ -n $subject_length ]]; then
    sed_filters+=( -e "s/^(Subject: .{1,$subject_length}).*/\1/" )
fi
# other filters can go here

if [[ ${#sed_filters[@]} > 0 ]]; then
    cmd=( sed -E "${sed_filters[@]}" )
else
    # no command line filters given
    cmd=( cat )
fi

# now, filter the incoming email (on stdin) and pass to sendmail
"${cmd[@]}" | /usr/sbin/sendmail -oi "$recipient"









$endgroup$

















      validation bash linux email sed






      edited 9 hours ago by 200_success

      asked 11 hours ago by glenn jackman






















          3 Answers


















          5












          $begingroup$



          Be sure to read the relevant RFCs that govern e-mail headers! Specifically:





          • RFC 2822, Section 1.2.2: Header names are case-insensitive.


          • RFC 2822, Section 2.2.3: Header fields may be line-folded:




            2.2.3. Long Header Fields



            Each header field is logically a single line of characters
            comprising the field name, the colon, and the field body. For
            convenience however, and to deal with the 998/78 character
            limitations per line, the field body portion of a header field can
            be split into a multiple line representation; this is called
            "folding". The general rule is that wherever this standard allows
            for folding white space (not simply WSP characters), a CRLF may be
            inserted before any WSP. For example, the header field:



            Subject: This is a test


            can be represented as:



            Subject: This
             is a test



            Since your sed operates on the raw representation of the header, you will miss headers that are logically longer than subject_length characters but start with a physically short line.



            What is your rationale for developing this filter? Is the application that processes the incoming messages unable to handle long subject texts, or is it unable to handle long physical lines? If it's the latter, maybe all you need is a filter that performs line folding, rather than truncation.




          • RFC 2047: Encoding mechanisms for non-ASCII headers. A logical subject line



            Subject: this is some text


            … could also be represented physically as



            Subject: =?iso-8859-1?q?this=20is=20some=20text?=


            … or by many other representations. Is your limit based on the number of bytes in the raw representation, the number of bytes in the UTF-8 representation, the number of Unicode characters, or something else? You didn't specify clearly. If you are truncating the raw representation, you might truncate a MIME-encoded header at a point that makes it syntactically invalid.
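The folding issue above can be handled by unfolding the header block before truncating. The following is only a minimal sketch under stated assumptions, not part of the original answer: `truncate_subject` is a hypothetical helper name, the message is assumed to arrive with LF line endings, and the length is counted the way the local awk counts characters (which may be bytes). It does not decode RFC 2047 encoded-words, so it can still cut an encoded subject in the middle of a token.

```shell
# truncate_subject N  (hypothetical helper)
# Reads a message on stdin and writes it to stdout. Header continuation
# lines are unfolded, and the Subject field body is cut to at most N
# characters. The message body is passed through unchanged.
truncate_subject() {
    awk -v max="$1" '
        # Print the header field collected so far, truncating it if it is Subject.
        function flush_header() {
            if (!have) return
            if (tolower(substr(held, 1, 8)) == "subject:") {
                body = substr(held, 9)
                sub(/^[ \t]+/, "", body)          # drop whitespace after the colon
                held = substr(held, 1, 8) " " substr(body, 1, max)
            }
            print held
            have = 0
        }
        in_body { print; next }                   # after the headers: copy verbatim
        /^$/ { flush_header(); in_body = 1; print; next }
        /^[ \t]/ && have {                        # continuation line: unfold it
            sub(/^[ \t]+/, " ")
            held = held $0
            next
        }
        { flush_header(); held = $0; have = 1 }   # start of a new header field
        END { flush_header() }
    '
}
```

In the script under review, a function like this could stand in for the `sed` stage of the pipeline, for example `truncate_subject "$subject_length" | /usr/sbin/sendmail -oi -- "$recipient"`.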








          answered 8 hours ago by 200_success (edited 8 hours ago)
          $endgroup$













          • $begingroup$
            OK, I can use formail -czx subject to extract the folded subject into a single line, but do you know what tools are available to help with encoded subjects?
            $endgroup$
            – glenn jackman
            8 hours ago












          • $begingroup$
            mailutils 2047 --decode can perform RFC2047 decoding. (Note, however, that the feature is broken in Debian/Ubuntu's GNU mailutils package.)
            $endgroup$
            – 200_success
            5 hours ago



















          6












          $begingroup$

          Generally good code - plus points for good use of stdout/stderr and exit status.



          Shellcheck reported some issues:



          shellcheck -f gcc  214327.sh
          214327.sh:4:19: warning: Expanding an array without an index only gives the first element. [SC2128]
          214327.sh:4:19: note: Double quote to prevent globbing and word splitting. [SC2086]
          214327.sh:31:61: note: Backslash is literal in "\1". Prefer explicit escaping: "\\1". [SC1117]
          214327.sh:35:26: error: > is for string comparisons. Use -gt instead. [SC2071]


          Taking the first two, I'd simply use $0 to reproduce the program name as it was invoked, rather than messing about with basename to modify it. The other two appear to be mere typos in the code, and the fixes are obvious.



          We might want to perform some sanity checks on $recipient; in any case, it's wise to indicate that it's an argument and not an option when invoking sendmail, by using -- as a separator.
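What counts as a "sane" recipient depends on the site, so the check below is only a sketch under assumptions: `is_plausible_recipient` is a hypothetical name, and the rules (a single `@`, a conservative character set, no leading `-`) are illustrative choices rather than a full address parser. Bare local names are rejected here on purpose.

```shell
# is_plausible_recipient ADDRESS  (hypothetical helper)
# Returns 0 if ADDRESS looks like a simple local@domain address, 1 otherwise.
is_plausible_recipient() {
    case $1 in
        '' | -*)             return 1 ;;  # empty, or would be read as an option
        *[!A-Za-z0-9@._+-]*) return 1 ;;  # character outside the conservative set
        *@*@*)               return 1 ;;  # more than one @
        ?*@?*)               return 0 ;;  # something@something
        *)                   return 1 ;;  # no @ at all
    esac
}
```

The script could call it right after assigning `recipient`, printing the usage message to stderr and exiting with status 1 when it fails.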



          The repeated tests for [[ -n $subject_length ]] could be combined into a single block:



          sed_filters=()

          if [[ -n $subject_length ]]
          then
              if [[ $subject_length != +([0-9]) ]]
              then
                  echo "Error: subject length must be a whole number"
                  exit 1
              fi

              sed_filters+=( -e "s/^(Subject: .{1,$subject_length}).*/\1/" )
          fi

          # other filters can go here


          Instead of choosing between sed and cat, we could simplify by unconditionally using sed, even if we do no filtering, by priming the filters list with an empty command:



          sed_filters=(-e '')

          # conditionally add to sed_filters

          # now, filter the incoming email (on stdin) and pass to sendmail
          sed -E "${sed_filters[@]}" | /usr/sbin/sendmail -oi -- "$recipient"


          sed with an empty program acts as cat.



          The sed line may match body text as well as headers; we probably want to replace only the latter. We can do that by adding an address prefix:



          1,/^$/s/^(Subject: .{1,$subject_length}).*/\1/i


          (Note that RFC-822 headers are specified case-insensitively, so let's take that into account, using /i).
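Getting the quoting right for that expression inside a double-quoted shell string is a little fiddly, so here is a small sketch of how it might be wrapped. It assumes GNU sed (the `I` flag is a GNU extension and is equivalent to the `i` used above), the function name `truncate_subject_in_headers` is hypothetical, and the caller is expected to have already checked that the argument is a whole number. It still works on physical lines only, so folded subjects are not handled.

```shell
# truncate_subject_in_headers N  (hypothetical helper)
# Reads a message on stdin and truncates the Subject line to N characters,
# but only inside the header block.
truncate_subject_in_headers() {
    # 1,/^$/  limits the substitution to lines up to the first empty line
    # I       makes the match on "Subject:" case-insensitive (GNU sed)
    # \$      keeps the shell from treating "$/" specially
    # \\1     reaches sed as \1, the captured "Subject: ..." prefix
    sed -E "1,/^\$/s/^(Subject: .{1,$1}).*/\\1/I"
}
```

A line in the message body that happens to start with `Subject:` is left alone because it falls outside the address range.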






          $endgroup$





















            0












            $begingroup$

            Both good suggestions. 200_success's answer was more alarming. I have re-implemented it in Perl:



            #!/usr/bin/perl

            use strict;
            use warnings;

            use Email::Simple;
            use Email::Sender::Simple qw/sendmail/;
            use Encode qw/encode decode/;
            use Getopt::Std;

            my $usage = "usage: $0 [-s n] recipient";
            my $where = "where: -s n => truncate the subject to n characters";
            my $subject_length;

            our($opt_h, $opt_s);
            getopts('hs:');

            my $recipient = shift @ARGV;
            unless ($recipient) {
                warn "Error: no recipient\n$usage\n";
                exit 1;
            }

            if ($opt_h) {
                warn "$usage\n$where\n";
                exit;
            }

            if (defined $opt_s) {
                if ($opt_s =~ /^\d+$/) {
                    $subject_length = $opt_s;
                } else {
                    warn "Error: length must be a whole number: '$opt_s'\nIgnoring option.\n";
                }
            }

            # slurp in the email from stdin
            my $email_text;
            {
                local $/;
                $email_text = <>;
            }
            my $email = Email::Simple->new($email_text);

            # Subject line truncation
            if (defined $subject_length) {
                my $subject = decode("MIME-Header", $email->header("subject"));
                $subject =~ s/^\s+|\s+$//g; # trim
                $email->header_set(
                    "subject",
                    encode("MIME-Header", substr($subject, 0, $subject_length))
                );
            }

            sendmail($email, { to => $recipient });





            $endgroup$































































              6












              $begingroup$

              Generally good code - plus points for good use of stdout/stderr and exit status.



              Shellcheck reported some issues:



              shellcheck -f gcc  214327.sh
              214327.sh:4:19: warning: Expanding an array without an index only gives the first element. [SC2128]
              214327.sh:4:19: note: Double quote to prevent globbing and word splitting. [SC2086]
              214327.sh:31:61: note: Backslash is literal in "1". Prefer explicit escaping: "\1". [SC1117]
              214327.sh:35:26: error: > is for string comparisons. Use -gt instead. [SC2071]


              Taking the first two, I'd simply use $0 to reproduce the program name as it was invoked, rather than messing about with basename to modify it. The other two appear to be mere typos in the code, and the fixes are obvious.



              We might want to perform some sanity checks on $recipient; in any case, it's wise to indicate that it's an argument and not an option when invoking sendmail, by using -- as a separator.



              The repeated tests for [[ -n $subject_length ]] could be combined into a single block:



              sed_filters=()

              if [[ -n $subject_length ]]
              then
                  if [[ $subject_length != +([0-9]) ]]
                  then
                      echo "Error: subject length must be a whole number"
                      exit 1
                  fi

                  sed_filters+=( -e "s/^(Subject: .{1,$subject_length}).*/\1/" )
              fi

              # other filters can go here


              Instead of choosing between sed and cat, we could simplify by unconditionally using sed, even if we do no filtering, by priming the filters list with an empty command:



              sed_filters=(-e '')

              # conditionally add to sed_filters

              # now, filter the incoming email (on stdin) and pass to sendmail
              sed -E "${sed_filters[@]}" | /usr/sbin/sendmail -oi -- "$recipient"


              sed with an empty program acts as cat.
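This is easy to check without involving sendmail. In the sketch below, `pass_through` is a hypothetical wrapper added only so the pipeline can be run on sample input:

```shell
#!/bin/bash
# Priming the list with an empty program means sed always has something to run.
sed_filters=(-e '')

# pass_through: hypothetical wrapper standing in for the sed half of the pipeline.
pass_through() {
    sed -E "${sed_filters[@]}"
}

# With only the empty program, the message comes out unchanged.
printf 'Subject: hello\n\nbody line\n' | pass_through
```

Any filters appended later simply become further `-e` arguments after the empty one.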



              The sed line may match body text as well as headers; we probably want to replace only the latter. We can do that by adding an address prefix:



              1,/^$/s/^(Subject: .{1,$subject_length}).*/\1/i


              (Note that RFC 822 header field names are case-insensitive, so let's take that into account with the /i flag.)
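A small sketch of the address-prefixed command, assuming GNU sed (the case-insensitive flag on `s` is a GNU extension). `truncate_subject` is a hypothetical wrapper and the sample message is made up for illustration:

```shell
#!/bin/bash
# truncate_subject: hypothetical wrapper around the sed command discussed above.
# $1 is the maximum number of characters to keep after "Subject: ".
truncate_subject() {
    local n=$1
    # 1,/^$/ limits the substitution to the lines up to the first empty line,
    # i.e. the header block; the i flag ignores case when matching "Subject".
    sed -E -e "1,/^\$/s/^(Subject: .{1,$n}).*/\1/i"
}

# The header line is shortened; the look-alike line in the body is left alone.
printf 'SUBJECT: abcdefgh\nTo: someone@example.com\n\nSubject: body text stays\n' |
    truncate_subject 3
```

This still treats the subject as a single physical line, so folded or encoded subjects are not handled by it.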

















              $endgroup$













































                  edited 9 hours ago

























                  answered 11 hours ago









                  Toby Speight
























                      0












                      $begingroup$

                      Both good suggestions. 200_success's answer was more alarming. I have re-implemented in perl



                      #!/usr/bin/perl

                      use strict;
                      use warnings;

                      use Email::Simple;
                      use Email::Sender::Simple qw/sendmail/;
                      use Encode qw/encode decode/;
                      use Getopt::Std;

                      my $usage = "usage: $0 [-s n] recipient";
                      my $where = "where: -s n => truncate the subject to n characters";
                      my $subject_length;

                      our($opt_h, $opt_s);
                      getopts('hs:');

                      my $recipient = shift @ARGV;
                      unless ($recipient) {
                          warn "Error: no recipient\n$usage\n";
                          exit 1;
                      }

                      if ($opt_h) {
                          warn "$usage\n$where\n";
                          exit;
                      }

                      if (defined $opt_s) {
                          if ($opt_s =~ /^\d+$/) {
                              $subject_length = $opt_s;
                          } else {
                              warn "Error: length must be a whole number: '$opt_s'\nIgnoring option.\n";
                          }
                      }

                      # slurp in the email from stdin
                      my $email_text;
                      {
                          local $/;
                          $email_text = <>;
                      }
                      my $email = Email::Simple->new($email_text);

                      # Subject line truncation
                      if (defined $subject_length) {
                          my $subject = decode("MIME-Header", $email->header("subject"));
                          $subject =~ s/^\s+|\s+$//g; # trim
                          $email->header_set(
                              "subject",
                              encode("MIME-Header", substr($subject, 0, $subject_length))
                          );
                      }

                      sendmail($email, { to => $recipient });














                      $endgroup$















































                          answered 1 hour ago









                          glenn jackman






























