Compare XML files using Python












I want to compare these two xml files:



File1.xml:

<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
    <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
      <type st="9999" />
    </gastro_prelim_st>
  </results>
</ngs_sample>

File2.xml:

<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
    <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
    </gastro_prelim_st>
  </results>
</ngs_sample>


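I've used xmldiff to compare a.xml (File1.xml above) with b.xml (File2.xml above):

from xmldiff import main, formatting

def compare_xmls(observed, expected):
    # DiffFormatter renders the edit script as plain text
    formatter = formatting.DiffFormatter()
    diff = main.diff_files(observed, expected, formatter=formatter)
    return diff

# The file names must be passed as strings
out = compare_xmls("a.xml", "b.xml")
print(out)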


OUTPUT:

[delete, /ngs_sample/results/gastro_prelim_st/type[2]]

Does anyone know how to identify the actual difference between the two XML files, i.e. what has been deleted relative to b.xml? Can anyone recommend another way of comparing XML files in Python?







python xml xmldiff






asked Nov 22 '18 at 13:58









Mark













  • For comparing differences in general I use WinMerge, so if you don't need to do it in Python, it's a pretty handy tool. But if you must, it seems the output already tells you the difference exactly (that the second type tag under ngs_sample/...prelim_st/ was deleted). Did you mean you wanted to see the values being deleted?

    – Idlehands
    Nov 22 '18 at 14:33

  • Yes, I want to see what has been deleted, i.e. what the difference between the two XML files is.

    – Mark
    Nov 22 '18 at 15:45

  • What exactly are you expecting from the output that's missing, then? It's already telling you that the second type tag has been deleted. As it stands, the question is not clear; it would be helpful if you stated your expected output instead.

    – Idlehands
    Nov 22 '18 at 15:52

  • It would be helpful for the output to say that <type st="9999" /> is deleted.

    – Mark
    Nov 22 '18 at 16:25
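One way to get the kind of output asked for in the last comment, without xmldiff, is a plain line-based comparison with the standard library's difflib. This is only a sketch: the FILE1/FILE2 strings and the removed_lines helper are names made up for this illustration, and the approach assumes both files are laid out the same way (one element per line), as they are in the question.

```python
import difflib

# The two documents from the question, inlined so the example is self-contained.
FILE1 = """<ngs_sample id="40332">
<workflow value="salmonella" version="101_provisional" />
<results>
<gastro_prelim_st reason="not novel" success="false">
<type st="1364" />
<type st="9999" />
</gastro_prelim_st>
</results>
</ngs_sample>"""

FILE2 = """<ngs_sample id="40332">
<workflow value="salmonella" version="101_provisional" />
<results>
<gastro_prelim_st reason="not novel" success="false">
<type st="1364" />
</gastro_prelim_st>
</results>
</ngs_sample>"""


def removed_lines(old_text, new_text):
    """Return the lines that appear in old_text but not in new_text."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff prefixes lines that exist only in the first input with "- ".
    return [line[2:].strip() for line in diff if line.startswith("- ")]


print(removed_lines(FILE1, FILE2))
# ['<type st="9999" />']
```

Because this compares text rather than XML structure, differences in indentation or attribute order would also be reported as changes, so it is better suited to files produced by the same tool than to arbitrary XML.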



















3 Answers
You can switch to the XMLFormatter and manually filter the results:

...
# Change formatter:
formatter = formatting.XMLFormatter(normalize=formatting.WS_BOTH)

...

# After `out` has been retrieved:
import re

for line in out.splitlines():
    # Keep only the lines that carry a diff: annotation such as diff:delete
    if re.search(r'\bdiff:\w+', line):
        print(line)

# Result:
# <type st="9999" diff:delete=""/>
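If the annotated output spans several lines per element, filtering with a regular expression can miss things, so parsing it as XML may be more robust. The sketch below uses only the standard library and runs on a hand-written stand-in for the formatter output (the ANNOTATED string); the namespace URI in it and the annotated_elements helper are assumptions made for this illustration, not something taken from the answer.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for annotated output, modelled on the "Result" line above.
# The namespace URI is an assumption; the code below does not depend on its value.
ANNOTATED = """<ngs_sample xmlns:diff="http://namespaces.shoobx.com/diff" id="40332">
  <results>
    <gastro_prelim_st reason="not novel" success="false">
      <type st="1364"/>
      <type st="9999" diff:delete=""/>
    </gastro_prelim_st>
  </results>
</ngs_sample>"""


def annotated_elements(xml_text):
    """List (annotation, tag, ordinary attributes) for every annotated element."""
    found = []
    for elem in ET.fromstring(xml_text).iter():
        for key in elem.attrib:
            # ElementTree exposes namespaced attributes as "{uri}localname".
            if key.startswith("{"):
                annotation = key.split("}", 1)[1]
                ordinary = {k: v for k, v in elem.attrib.items()
                            if not k.startswith("{")}
                found.append((annotation, elem.tag, ordinary))
    return found


print(annotated_elements(ANNOTATED))
# [('delete', 'type', {'st': '9999'})]
```

Any namespaced attribute is treated as a diff annotation here, which is fine for the question's files but would need tightening for documents that use namespaced attributes of their own.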






answered Nov 22 '18 at 18:01
Idlehands

























Use xmldiff to perform this exact task.

main.py

from xmldiff import main

# Without a formatter, diff_files returns a list of edit actions
diff = main.diff_files("file1.xml", "file2.xml")
print(diff)

output

[DeleteNode(node='/ngs_sample/results/gastro_prelim_st/type[2]')]
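The node value in that output is a path into the first file, so the deleted element itself can be recovered by looking the path up there. The following is a minimal standard-library sketch: FILE1 inlines the question's first document, the path is copied from the output above, and the resolve helper is a name invented for this example. It assumes a document without namespaces.

```python
import xml.etree.ElementTree as ET

# The first document from the question, inlined so the example is self-contained.
FILE1 = """<ngs_sample id="40332">
<workflow value="salmonella" version="101_provisional" />
<results>
<gastro_prelim_st reason="not novel" success="false">
<type st="1364" />
<type st="9999" />
</gastro_prelim_st>
</results>
</ngs_sample>"""


def resolve(root, node_path):
    """Return the element that an absolute path such as /a/b/c[2] points at."""
    steps = node_path.strip("/").split("/")
    # The first step names the root element itself, e.g. "ngs_sample".
    if steps[0].split("[")[0] != root.tag:
        raise ValueError(f"{node_path!r} does not start at <{root.tag}>")
    if len(steps) == 1:
        return root
    # ElementTree only accepts relative paths, so look up the remaining steps
    # relative to the root element.
    return root.find("./" + "/".join(steps[1:]))


root = ET.fromstring(FILE1)
deleted = resolve(root, "/ngs_sample/results/gastro_prelim_st/type[2]")
print(deleted.tag, deleted.attrib)
# type {'st': '9999'}
```

With real xmldiff results, the same lookup could be applied to the node field of each DeleteNode action shown in the output above.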






answered Nov 22 '18 at 14:09
Victor 'Chris' Cabral













  • Not sure if you read the question, but this doesn't answer my query.

    – Mark
    Nov 22 '18 at 14:10



















Try XmlXdiff. Currently only SVG output is available, but it should be quite simple to provide a text output or an interface class.

Some example code from the XmlXdiff website:

from XmlXdiff.XReport import DrawXmlDiff

_xml1 = """<root><deleted>with content</deleted><unchanged/><changed name="test1" /></root>"""
_xml2 = """<root><unchanged/><changed name="test2" /><added/></root>"""

# Write the two sample documents to disk
with open("test1.xml", "w") as f:
    f.write(_xml1)

with open("test2.xml", "w") as f:
    f.write(_xml2)

# Render the differences between the two files as an SVG image
x = DrawXmlDiff("test1.xml", "test2.xml")
x.saveSvg('xdiff.svg')

Example output: [SVG image]







edited Jan 6 at 21:54
Kingsley

answered Jan 6 at 21:05
mmoosstt





























