Python: How to read and load an Excel file from AWS S3?



























I have uploaded an Excel file to an AWS S3 bucket and now I want to read it in Python. Any help would be appreciated. Here is what I have so far:



import boto3

aws_id = 'aws_id'
aws_secret = 'aws_secret_key'

client = boto3.client('s3', aws_access_key_id=aws_id, aws_secret_access_key=aws_secret)
bucket_name = 'my_bucket'
object_key = 'my_excel_file.xlsm'
# Fetch the object; the response body is a stream of raw bytes.
object_file = client.get_object(Bucket=bucket_name, Key=object_key)
body = object_file['Body']
data = body.read()  # bytes, not yet parsed as a workbook


What do I need to do next in order to read this data and work on it?










      python python-3.x amazon-web-services amazon-s3














      edited Nov 23 '18 at 5:51







      exan

















      asked Nov 23 '18 at 1:04









      exan
























          3 Answers
































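          After spending quite some time on it, here's how I got it working: read the object body into memory, wrap the bytes in io.BytesIO, and pass that to pandas.read_excel.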



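          import io

          import boto3
          import pandas as pd

          aws_id = ''
          aws_secret = ''
          bucket_name = ''
          object_key = ''

          s3 = boto3.client('s3', aws_access_key_id=aws_id, aws_secret_access_key=aws_secret)
          obj = s3.get_object(Bucket=bucket_name, Key=object_key)
          # The body is a stream of raw bytes; S3 returns the file contents unchanged.
          data = obj['Body'].read()
          # Wrap the bytes in a file-like object. Excel files are binary, so no text
          # encoding is passed (recent pandas versions reject encoding= here).
          df = pd.read_excel(io.BytesIO(data))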
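
The same idea can be wrapped in a small helper. The sketch below is not part of the original answer: `fetch_excel_bytes` and `load_excel_from_s3` are hypothetical names, and the client is passed in as an argument so the download step can be tried without touching AWS. pandas is imported lazily, so only the second function needs it installed.

```python
import io


def fetch_excel_bytes(s3_client, bucket_name, object_key):
    """Download an S3 object and return its contents as an in-memory buffer.

    s3_client is anything with a boto3-style get_object(Bucket=..., Key=...)
    method, for example boto3.client('s3').
    """
    response = s3_client.get_object(Bucket=bucket_name, Key=object_key)
    # response['Body'] is a streaming object; read() returns the raw bytes.
    return io.BytesIO(response['Body'].read())


def load_excel_from_s3(s3_client, bucket_name, object_key, **read_excel_kwargs):
    """Read the object into a DataFrame (hypothetical convenience wrapper)."""
    import pandas as pd  # third-party; only needed when this function is called

    buffer = fetch_excel_bytes(s3_client, bucket_name, object_key)
    return pd.read_excel(buffer, **read_excel_kwargs)
```

With a real client this might be called as `df = load_excel_from_s3(boto3.client('s3'), 'my_bucket', 'my_excel_file.xlsm', engine='openpyxl')`. The `engine='openpyxl'` argument is an assumption for .xlsx/.xlsm files and requires the openpyxl package to be installed.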





























          • I tried the above, but received this error: TypeError: expected str, bytes or os.PathLike object, not NoneType. Any clue?

            – SaTa
            yesterday

































          Python doesn't support Excel files natively. You could use the pandas library's read_excel function.






























          • Apparently files retrieved from S3 using the boto3 library are encoded in some particular format. I am not able to read them using the read_excel function without knowing the encoding.

            – exan
            Nov 23 '18 at 4:29











          • @exan Is that with the additional library? If so, please edit your question to show your updated code. (S3 does not change file contents.)

            – John Rotenstein
            Nov 23 '18 at 5:23











          • @JohnRotenstein: Not sure what you mean, mate. Can you kindly elaborate a bit? I have uploaded a solution though.

            – exan
            Nov 23 '18 at 5:52
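
On the encoding question raised in these comments: an Excel workbook is a binary container rather than encoded text, so there is no text encoding to guess. One quick way to check what was actually downloaded is to look at the first few bytes. The helper below is a hypothetical sketch (`sniff_excel_container` is not part of boto3 or pandas); it only recognises the two well-known container signatures and does not validate the workbook itself.

```python
# Leading bytes of the two container formats Excel workbooks use.
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls
ZIP_SIGNATURE = b"PK\x03\x04"  # .xlsx / .xlsm are ZIP archives


def sniff_excel_container(data):
    """Return 'ole2', 'zip' or 'unknown' based on the leading bytes of data."""
    if data.startswith(OLE2_SIGNATURE):
        return "ole2"
    if data.startswith(ZIP_SIGNATURE):
        return "zip"
    return "unknown"
```

A result of 'zip' suggests an .xlsx or .xlsm workbook, 'ole2' suggests a legacy .xls file, and 'unknown' usually means the object is not an Excel file at all, for example a CSV export.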

































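          You can read an .xls file directly from S3 without downloading or saving it locally. The xlrd module can build a workbook object from raw bytes passed as file_contents (the public xlrd.open_workbook function accepts the same argument). Note that this reader handles the legacy .xls format only.
          The following snippet shows this: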



          from boto3 import Session
          from xlrd.book import open_workbook_xls

          aws_id = ''
          aws_secret = ''
          bucket_name = ''
          object_key = ''

          s3_session = Session(aws_access_key_id=aws_id, aws_secret_access_key=aws_secret)
          bucket_object = s3_session.resource('s3').Bucket(bucket_name).Object(object_key)
          # Read the whole object into memory as bytes; nothing is written to disk.
          content = bucket_object.get()['Body'].read()
          # Parse the bytes as a legacy .xls workbook.
          workbook = open_workbook_xls(file_contents=content)
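
Once the workbook is open, its cells still have to be pulled out. The sketch below is an addition to the answer, not part of it: `sheet_to_records` is a hypothetical helper that relies only on the `nrows` attribute and `row_values()` method of an xlrd sheet, and it assumes the first row holds the column headers.

```python
def sheet_to_records(sheet):
    """Turn an xlrd-style sheet into a list of dicts keyed by the header row.

    Assumes row 0 contains the column names; every later row becomes one dict.
    """
    if sheet.nrows == 0:
        return []
    header = [str(name) for name in sheet.row_values(0)]
    return [
        dict(zip(header, sheet.row_values(row_index)))
        for row_index in range(1, sheet.nrows)
    ]
```

With the workbook from the snippet above, this might be used as `records = sheet_to_records(workbook.sheet_by_index(0))`.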




























            answered Nov 23 '18 at 5:50









            exan













            answered Nov 23 '18 at 1:33









            sshevlyagin













                answered Feb 7 at 7:33









                Rhythm Chopra





























