How to sort by odd lines then remove repeated values?












6















I have the following type of file:



transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_20649 +
YBL100C -
transcr_7135 +
YBL029C-A -
transcr_11317 +
YBL067C -
transcr_25793 +
YAL038W +
transcr_7135 +
YBL029W +


I was trying to get something like this:



transcr_7135 +
YBL029C-A -
transcr_7135 +
YBL029W +
transcr_11317 +
YBL067C -
transcr_20649 +
YBL100C -
transcr_25793 +
YAL039C -
transcr_25793 +
YAL037C-B -
transcr_25793 +
YAL038W +


Then, afterward, I was looking for something like this:



transcr_7135 +
YBL029C-A -
YBL029W +
transcr_11317 +
YBL067C -
transcr_20649 +
YBL100C -
transcr_25793 +
YAL039C -
YAL037C-B -
YAL038W +


I've looked through the sort manual and some posts, but couldn't find anything close to this, only ways to sort by numerical values to get the odd lines.










      text-processing sort














      asked 2 days ago by Lucas Farinazzo Marques, edited 2 days ago by Rui F Ribeiro






















          5 Answers


















          3














          Not exactly the sort order you showed, but maybe this works as well?



          $ cat input.txt | paste - - | sort -k1,1V -k2,2 | tr "\t" "\n" | awk '{if($0 in line == 0) {line[$0]; print}}'
          transcr_7135 +
          YBL029C-A -
          YBL029W +
          transcr_11317 +
          YBL067C -
          transcr_20649 +
          YBL100C -
          transcr_25793 +
          YAL037C-B -
          YAL038W +
          YAL039C -


          EDIT:



          Inserting the line number and using it as a secondary sort key should produce exactly the output you want:



          $ cat input.txt | paste - - | nl | sort -k2,2V -k1,1g | cut -f2- | tr "\t" "\n" | awk '{if($0 in line == 0) {line[$0]; print}}'































          • It doesn't show exactly the way I posted, but for my case it doesn't matter, thanks a lot!

            – Lucas Farinazzo Marques
            2 days ago



















          7














          Pure gawk solution:



          awk -F_ 'NR%2{i=$2;next}{a[i]=a[i]"\n"$0}
          END{PROCINFO["sorted_in"]="@ind_num_asc";
          for(i in a) printf "%s","transcr_"i""a[i]"\n"}' file


          The trick is to sort indexes of array a numerically with a little help of gawk's PROCINFO special array.



          transcr_7135
          YBL029C-A -
          YBL029W +
          transcr_11317
          YBL067C -
          transcr_20649
          YBL100C -
          transcr_25793
          YAL039C -
          YAL037C-B -
          YAL038W +


          BTW, it's a pity awk doesn't offer an option to sort naturally, a.k.a. version sort (according to text with numbers).
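For an awk without gawk's PROCINFO["sorted_in"], the grouping can be done in awk and the ordering left to sort. The following is only a sketch under assumptions: the function name group_portable is made up for illustration, the odd lines are assumed to always start with the 8-character prefix "transcr_", and no line is assumed to contain a TAB.

```shell
# Hypothetical helper (name is an assumption): group even lines under their
# odd "transcr_N" header, then order the groups numerically by N.
group_portable() {
    awk '
        NR % 2 { key = $0; next }             # odd line: remember the header
        !((key, $0) in seen) {                # even line: first time under this header
            seen[key, $0]
            if (!(key in vals)) order[++n] = key
            vals[key] = vals[key] "\t" $0     # append, TAB-separated
        }
        # one output line per header: header TAB value TAB value ...
        END { for (i = 1; i <= n; i++) print order[i] vals[order[i]] }
    ' |
        sort -k1.9n |                         # numeric sort starting after "transcr_"
        tr '\t' '\n'                          # split the groups back into lines
}

# Usage: group_portable < file
```

Because each header ends up on exactly one line before sorting, the values under it keep their input order without needing a stable sort.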







































            3














            With GNU sort and assuming the lines don't contain TAB characters:



            paste - - < file | sort -V | tr '\t' '\n' | awk '!seen[$0]++'


            Or sort -t$'\t' -sk1,1V to preserve the original order for entries with identical odd lines like in your expected output.



            If you don't have GNU sort, and assuming the odd lines always follow that pattern, you can replace sort -V with sort -k1.9n.
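Written out in full, the order-preserving variant might look like the sketch below. It assumes GNU sort (for -V and -s) and data without TAB characters; the function name group_pairs is only illustrative, and printf is used to build the TAB so the snippet does not depend on the $'...' quoting of bash, ksh or zsh.

```shell
# Illustrative wrapper (the name is an assumption, not part of the answer).
tab=$(printf '\t')

group_pairs() {
    # paste joins each odd/even pair into one TAB-separated record;
    # sort -s -k1,1V orders records by the first field only and keeps
    # the input order of records whose first field is identical;
    # tr splits the records back into two lines each;
    # awk prints every distinct line only once.
    paste - - |
        sort -t "$tab" -s -k1,1V |
        tr '\t' '\n' |
        awk '!seen[$0]++'
}

# Usage: group_pairs < file
```

Note that the final awk step drops any repeated line, so an even line that occurs under two different transcr entries would be printed only under the first one.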







































              2














              for element in $(sed -n 'p;n' a.txt | sort -nk 1.9 | uniq | awk '{print $1}')
              do
              echo "$element"; grep -A1 "$element" a.txt | grep -v trans | grep -v -e '--'
              done


              Where a.txt is your input. Tested:



              [root@megatron ~]# cat a.txt
              transcr_25793 +
              YAL039C -
              transcr_25793 +
              YAL037C-B -
              transcr_20649 +
              YBL100C -
              transcr_7135 +
              YBL029C-A -
              transcr_11317 +
              YBL067C -
              transcr_25793 +
              YAL038W +
              transcr_7135 +
              YBL029W +
              [root@megatron ~]# for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
              do
              echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
              done
              transcr_7135
              YBL029C-A -
              YBL029W +
              transcr_11317
              YBL067C -
              transcr_20649
              YBL100C -
              transcr_25793
              YAL039C -
              YAL037C-B -
              YAL038W +
              [root@megatron ~]#































              • It appeared this transcr_45 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_193 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_231 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_282 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. and so on... Do you know what I can do?

                – Lucas Farinazzo Marques
                2 days ago











              • I corrected. You have to use |grep -v \-- . Or just try to re-use the updated code.

                – Zatarra
                2 days ago













              • Even when exactly copying your solution and even the file with the same name, it doesn't work out... It keeps showing the same error

                – Lucas Farinazzo Marques
                2 days ago











              • You can download it from here: wget 54.38.222.163/sort.sh and run it with sh sort.sh . The text must be in a.txt

                – Zatarra
                2 days ago













              • Now it worked, thanks!

                – Lucas Farinazzo Marques
                2 days ago



















              0














              Pre- and postprocessing with awk; this does not assume that a transcr line is followed by just one Y* line; it's also idempotent -- its output could be piped back as input and it will give the same result.



              awk '{print $0~/^transcr/ ? t=$0 : t" "$0}' /tmp/foo | sort -t_ -k2n -k2 -u | awk '{print (NF > 2) ? $3" "$4 : $0}'
              transcr_7135 +
              YBL029C-A -
              YBL029W +
              transcr_11317 +
              YBL067C -
              transcr_20649 +
              YBL100C -
              transcr_25793 +
              YAL037C-B -
              YAL038W +
              YAL039C -
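The idempotence claim can be checked by running the pipeline on its own output. The sketch below wraps the same three stages in a function; the name regroup is hypothetical, the awk conditionals are spelled out with if/else, and LC_ALL=C is added (an assumption, not part of the answer) so the sort order does not depend on the locale.

```shell
# Hypothetical wrapper around the decorate / sort / undecorate pipeline.
regroup() {
    # 1. print each header as is and prefix every other line with its header
    awk '{ if ($0 ~ /^transcr/) { t = $0; print } else print t " " $0 }' |
        # 2. sort by the number after "_", then by the rest of the line; drop duplicates
        LC_ALL=C sort -t_ -k2n -k2 -u |
        # 3. strip the header prefix from the decorated lines again
        awk '{ if (NF > 2) print $3 " " $4; else print $0 }'
}

# Usage: regroup < /tmp/foo            (first pass)
#        regroup < /tmp/foo | regroup  (second pass, same result)
```

As in the answer, the lines within each group come out sorted rather than in input order, since the whole decorated line takes part in the sort.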





























































                answered 2 days ago by finswimmer (edited 2 days ago)





















                    answered 2 days ago by jimmij (edited 2 days ago)































                            answered 2 days ago by Stéphane Chazelas (edited 2 days ago)























                                2














                                for element in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $element; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done


                                Where a.txt is your input. Tested:



                                [root@megatron ~]# cat a.txt
                                transcr_25793 +
                                YAL039C -
                                transcr_25793 +
                                YAL037C-B -
                                transcr_20649 +
                                YBL100C -
                                transcr_7135 +
                                YBL029C-A -
                                transcr_11317 +
                                YBL067C -
                                transcr_25793 +
                                YAL038W +
                                transcr_7135 +
                                YBL029W +
                                [root@megatron ~]# for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done
                                transcr_7135
                                YBL029C-A -
                                YBL029W +
                                transcr_11317
                                YBL067C -
                                transcr_20649
                                YBL100C -
                                transcr_25793
                                YAL039C -
                                YAL037C-B -
                                YAL038W +
                                [root@megatron ~]#





                                share|improve this answer


























                                • It appeared this transcr_45 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_193 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_231 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_282 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. and so on... Do you know what I can do?

                                  – Lucas Farinazzo Marques
                                  2 days ago











                                • I corrected. You have to use |grep -v \-- . Or just try to re-use the updated code.

                                  – Zatarra
                                  2 days ago













                                • Even when exactly copying your solution and even the file with the same name, it doesn't work out... It keeps showing the same error

                                  – Lucas Farinazzo Marques
                                  2 days ago











                                • You can download it from here: wget 54.38.222.163/sort.sh and run it with sh sort.sh . The text must be in a.txt

                                  – Zatarra
                                  2 days ago













                                • Now it worked, thanks!

                                  – Lucas Farinazzo Marques
                                  2 days ago
















                                2














                                for element in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $element; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done


                                Where a.txt is your input. Tested:



                                [root@megatron ~]# cat a.txt
                                transcr_25793 +
                                YAL039C -
                                transcr_25793 +
                                YAL037C-B -
                                transcr_20649 +
                                YBL100C -
                                transcr_7135 +
                                YBL029C-A -
                                transcr_11317 +
                                YBL067C -
                                transcr_25793 +
                                YAL038W +
                                transcr_7135 +
                                YBL029W +
                                [root@megatron ~]# for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done
                                transcr_7135
                                YBL029C-A -
                                YBL029W +
                                transcr_11317
                                YBL067C -
                                transcr_20649
                                YBL100C -
                                transcr_25793
                                YAL039C -
                                YAL037C-B -
                                YAL038W +
                                [root@megatron ~]#













                                2







                                for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done


                                Here a.txt is your input file. Tested:



                                [root@megatron ~]# cat a.txt
                                transcr_25793 +
                                YAL039C -
                                transcr_25793 +
                                YAL037C-B -
                                transcr_20649 +
                                YBL100C -
                                transcr_7135 +
                                YBL029C-A -
                                transcr_11317 +
                                YBL067C -
                                transcr_25793 +
                                YAL038W +
                                transcr_7135 +
                                YBL029W +
                                [root@megatron ~]# for i in $(sed -n 'p;n' a.txt |sort -nk 1.9 |uniq |awk '{print $1}')
                                do
                                echo $i; cat a.txt |grep -A1 $i |grep -v trans |grep -v \\--
                                done
                                transcr_7135
                                YBL029C-A -
                                YBL029W +
                                transcr_11317
                                YBL067C -
                                transcr_20649
                                YBL100C -
                                transcr_25793
                                YAL039C -
                                YAL037C-B -
                                YAL038W +
                                [root@megatron ~]#
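
A possibly more robust variant of the loop above, offered as a sketch rather than a tested replacement. It quotes the variable, because an empty or unset variable passed unquoted to grep leaves grep with no pattern, which is one likely cause of a "Usage: grep" message. It also anchors the pattern so that transcr_45 does not also match transcr_450. The function name group_pairs is hypothetical, grep -A is a GNU/BSD extension rather than POSIX, and the sketch assumes each transcr line is followed by exactly one other line.

```sh
#!/bin/sh
# Sketch only: group_pairs is a hypothetical helper; $1 is the input file.
group_pairs() {
    file=$1
    # Odd lines only, sorted numerically on the digits after "transcr_",
    # de-duplicated, keeping just the identifier column.
    sed -n 'p;n' "$file" | sort -n -k 1.9 | uniq | awk '{print $1}' |
    while read -r id
    do
        printf '%s\n' "$id"
        # "^$id " (with the trailing space) matches this identifier only.
        # -e '^--$' drops the group separators that grep -A1 prints.
        grep -A1 "^$id " "$file" | grep -v -e '^transcr' -e '^--$'
    done
}
```

Call it as group_pairs a.txt. Like the loop above, it prints the identifier without the trailing + sign.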













                                edited 2 days ago by andcoz

                                answered 2 days ago by Zatarra













                                • This is what appeared: transcr_45 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_193 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_231 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. transcr_282 Usage: grep [OPTION]... PATTERN [FILE]... Try 'grep --help' for more information. And so on. Do you know what I can do?

                                  – Lucas Farinazzo Marques
                                  2 days ago











                                • I corrected it. You have to use |grep -v \-- . Or just try re-using the updated code.

                                  – Zatarra
                                  2 days ago













                                • Even when I copy your solution exactly and use a file with the same name, it doesn't work. It keeps showing the same error.

                                  – Lucas Farinazzo Marques
                                  2 days ago











                                • You can download it from here: wget 54.38.222.163/sort.sh and run it with sh sort.sh . The text must be in a.txt

                                  – Zatarra
                                  2 days ago













                                • Now it worked, thanks!

                                  – Lucas Farinazzo Marques
                                  2 days ago






























                                0














                                Pre- and post-processing with awk. This does not assume that a transcr line is followed by just one Y* line. It is also idempotent: its output can be piped back in as input and it will give the same result.



                                awk '{print $0~/^transcr/ ? t=$0 : t" "$0}' /tmp/foo | sort -t_ -k2n -k2 -u | awk '{print (NF > 2) ? $3" "$4 : $0}'
                                transcr_7135 +
                                YBL029C-A -
                                YBL029W +
                                transcr_11317 +
                                YBL067C -
                                transcr_20649 +
                                YBL100C -
                                transcr_25793 +
                                YAL037C-B -
                                YAL038W +
                                YAL039C -
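
Note that the pipeline above lists the Y* lines of each group alphabetically (YAL037C-B before YAL039C), while the output asked for in the question keeps them in input order. If input order matters, a variation on the same decorate, sort, undecorate idea might look like the sketch below. The function name group_in_order is hypothetical. Unlike sort -u above, this version does not drop repeated Y* lines, and it assumes no line contains a tab character.

```sh
#!/bin/sh
# Sketch only: group_in_order is a hypothetical helper; $1 is the input file.
group_in_order() {
    tab=$(printf '\t')
    # Decorate: prefix every non-transcr line with its header and line number.
    awk '/^transcr/ { hdr = $0; next }
         { print hdr "\t" NR "\t" $0 }' "$1" |
    # Sort on the digits after "transcr_" (from character 9 of field 1),
    # then on the original line number to keep input order within a group.
    sort -t "$tab" -k 1.9,1n -k 2,2n |
    # Undecorate: print each header once, then the lines that belong to it.
    awk -F '\t' '$1 != prev { print $1; prev = $1 } { print $3 }'
}
```

Call it as group_in_order /tmp/foo. Because the header is carried along with every line, a transcr line may be followed by any number of Y* lines, as in the pipeline above.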















                                answered yesterday by mosvy





























