How to export a Hive table into a CSV file?

I used this Hive query to export a table into a CSV file:

INSERT OVERWRITE DIRECTORY '/user/data/output/test' select column1, column2 from table1;

The generated file '000000_0' does not have a comma separator.

Is this the right way to generate a CSV file? If not, how can I generate one?

csv hive

asked Jun 13 '13 at 12:04 by Dunith Dhanushka


13 Answers


45 votes

If you're using Hive 11 or better, you can use the INSERT statement with the LOCAL keyword.

Example:

    insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;

Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.

Using this approach means you don't need to worry about the format of the source tables, can export based on an arbitrary SQL query, and can select your own delimiters and output formats.
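
For instance, a minimal sketch of that client-side concatenation, assuming the staging directory from the example above:

    # merge all part files Hive wrote into one CSV
    cat /home/carter/staging/* > /home/carter/hugetable.csv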






answered Nov 1 '13 at 15:59 by Carter Shanklin

• Thank you, this created a folder with multiple csv files. Is there any way to put everything into one file? Also, is there any way to include a header (column names) in the csv file? – mike, Jun 14 '17 at 13:36

• How do you concatenate them on the client side after exporting? – user2205916, May 24 '18 at 20:45

• For me this command produced a bunch of files ending with the extension .snappy, which looks like a compressed format. I am not sure how to un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine. – Ravi Chandra, Nov 27 '18 at 6:51

44 votes

Or use this:

    hive -e 'select * from your_Table' | sed 's/[\t]/,/g' > /home/yourfile.csv

You can also specify the property set hive.cli.print.header=true before the SELECT to ensure that a header is created and copied to the file along with the data. For example:

    hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[\t]/,/g' > /home/yourfile.csv

If you don't want to write to the local file system, pipe the output of the sed command back into HDFS using the hadoop fs -put command.
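
A sketch of that HDFS variant; hadoop fs -put reads from stdin when the source is given as -, and the target path here is illustrative:

    hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[\t]/,/g' | hadoop fs -put - /user/data/output/your_table.csv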






answered May 29 '14 at 23:46 by user1922900

• By using this command the hive data types such as 'double' are not carried forward in CSV, so when I read the CSV everything is read as a string. – Aman Mathur, Jun 25 '15 at 12:17

• This will read the table from bash? – Thomas Decaux, May 9 '17 at 18:27

29 votes

That should work for you:

• tab separated

    hive -e 'select * from some_table' > /home/yourfile.tsv

• comma separated

    hive -e 'select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv







answered May 2 '14 at 10:24 by Saad

• this will export as tab-separated – Brett Bonner, Aug 2 '15 at 2:08

• It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv – JGS, May 12 '16 at 10:30

• Excellent!! Made my day! – prashanth, Jan 30 '18 at 18:30

21 votes

You cannot set a delimiter for the query output after generating the files (as you did), but you can change the delimiter to a comma afterwards.

The output comes with the default delimiter \001 (an invisible control character):

    hadoop fs -cat /user/data/output/test/* | tr "\01" "," >> outputwithcomma.csv
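
If you want to confirm that \001 separator first, a quick check (not part of the original answer) is to dump one line through od, which renders the control byte as octal 001:

    # inspect the raw bytes of one exported file; \001 shows up as "001" in od -c output
    hadoop fs -cat /user/data/output/test/000000_0 | head -n 1 | od -c | head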






answered Jun 13 '13 at 12:44 by Balaswamy Vaddeman


6 votes

Recent versions of Hive come with this feature:

    INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp'
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    select * from table;

This way you can choose your own delimiter and file name. Just be careful with the OVERWRITE: it will try to delete everything from the mentioned folder.
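
On versions that support the clause, the same ROW FORMAT should also work without LOCAL, writing comma-delimited files to HDFS, which is essentially what the query in the question was missing. A sketch using the original paths:

    INSERT OVERWRITE DIRECTORY '/user/data/output/test'
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    select column1, column2 from table1;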






answered Mar 23 '15 at 9:16 by sunil


6 votes

    INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table;

is the correct answer.

If the number of records is really big, then depending on the number of files generated, the following command would give only a partial result:

    hive -e 'select * from some_table' > /home/yourfile.csv
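
A sketch of that safer pattern for big tables, combining the INSERT above with a client-side merge (directory and file names illustrative):

    hive -e "INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table;"
    cat /home/lvermeer/temp/* > /home/lvermeer/output.csv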





answered Oct 2 '15 at 18:29 by Jsim


4 votes

I have used simple linux shell piping + perl to convert hive-generated output from tsv to csv.

    hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\\"/g; s/^|$/"/g; s/\t/","/g' > output_file.csv

(I got the updated perl regex from someone on stackoverflow some time ago.)

The result will be like regular csv:

"col1","col2","col3"... and so on

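A quick sanity check of that pipeline on a hand-made row (the values here are made up, not from the answer):

    # one tab-separated line with an embedded quote
    printf 'a\tb "quoted"\tc\n' | perl -lpe 's/"/\\"/g; s/^|$/"/g; s/\t/","/g'
    # prints: "a","b \"quoted\"","c"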





answered Mar 22 '15 at 4:45 by Firman Gautama


4 votes

The following script should work for you:

    #!/bin/bash
    hive -e "insert overwrite local directory '/LocalPath/'
    row format delimited fields terminated by ','
    select * from Mydatabase.Mytable limit 100"
    cat /LocalPath/* > /LocalPath/table.csv

I used limit 100 to limit the size of the data since I had a huge table, but you can delete it to export the entire table.






answered Mar 21 '18 at 12:07 by HISI


2 votes

Here you can export the data using the Hive warehouse dir instead of querying the Hive table. First give the Hive warehouse path, and after it the local path where you want to store the .csv file. The command for this is below:

    hadoop fs -cat /user/hdusr/warehouse/HiveDb/tableName/* > /users/hadoop/test/nilesh/sample.csv
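
Note this only yields a real CSV when the table is stored as comma-delimited text. For a table using Hive's default \001 delimiter you could combine it with the tr trick from the earlier answer; a sketch with the same illustrative paths:

    hadoop fs -cat /user/hdusr/warehouse/HiveDb/tableName/* | tr "\01" "," > /users/hadoop/test/nilesh/sample.csv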






1 vote

I had a similar issue and this is how I was able to address it.

Step 1 - Load the data from the Hive table into another table as follows:

    DROP TABLE IF EXISTS TestHiveTableCSV;
    CREATE TABLE TestHiveTableCSV
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' AS
    SELECT Column List FROM TestHiveTable;

Step 2 - Copy the blob from the Hive warehouse to the new location with the appropriate extension:

    Start-AzureStorageBlobCopy `
        -DestContext $destContext `
        -SrcContainer "Source Container" `
        -SrcBlob "hive/warehouse/TestHiveTableCSV/000000_0" `
        -DestContainer "Destination Container" `
        -DestBlob "CSV/TestHiveTable.csv"

Hope this helps!

Best Regards,
Dattatrey Sindol (Datta)
http://dattatreysindol.com







1 vote

There are ways to change the default delimiter, as shown by other answers.

There are also ways to convert the raw output to csv with some bash scripting. There are 3 delimiters to consider though, not just \001. Things get a bit more complicated when your hive table has maps.

I wrote a bash script that can handle all 3 default delimiters (\001, \002 and \003) from hive and output a csv. The script and some more info are here:

Hive Default Delimiters to CSV

Hive's default delimiters are

    Row Delimiter => Control-A ('\001')
    Collection Item Delimiter => Control-B ('\002')
    Map Key Delimiter => Control-C ('\003')

There are ways to change these delimiters when exporting tables, but sometimes you might still get stuck needing to convert this to csv.

Here's a quick bash script that can handle a DB export that's segmented in multiple files and has the default delimiters. It will output a single CSV file.

It is assumed that the segments all have the naming convention 000*_0

    INDIRECTORY="path/to/input/directory"
    for f in $INDIRECTORY/000*_0; do
        echo "Processing $f file..";
        cat -v $f |
            LC_ALL=C sed -e "s/^/\"/g" |
            LC_ALL=C sed -e "s/\^A/\",\"/g" |
            LC_ALL=C sed -e "s/\^C\^B/\"\":\"\"\"\",\"\"/g" |
            LC_ALL=C sed -e "s/\^B/\"\",\"\"/g" |
            LC_ALL=C sed -e "s/\^C/\"\":\"\"/g" |
            LC_ALL=C sed -e "s/$/\"/g" > $f-temp
    done
    echo "you,can,echo,your,header,here,if,you,like" > $INDIRECTORY/final_output.csv
    cat $INDIRECTORY/*-temp >> $INDIRECTORY/final_output.csv
    rm $INDIRECTORY/*-temp

More explanation on the gist







1 vote

In case you are doing it from Windows, you can use the Python script hivehoney to extract table data to a local CSV file.

It will:

• Log in to the bastion host.
• pbrun.
• kinit.
• beeline (with your query).
• Save the echo from beeline to a file on Windows.

Execute it like this:

    set PROXY_HOST=your_bastion_host
    set SERVICE_USER=you_func_user
    set LINUX_USER=your_SOID
    set LINUX_PWD=your_pwd
    python hh.py --query_file=query.sql






0 votes

The proposed solutions are fine, but I found some problems in both:

• As Carter Shanklin said, with this command we will obtain a csv file with the results of the query in the path specified:

    insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;

  The problem with this solution is that the csv obtained won't have headers and will create a file that is not a CSV (so we have to rename it).

• As user1922900 said, with the following command we will obtain a CSV file with the results of the query in the specified file and with headers:

    hive -e 'select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv

  With this solution we will get a CSV file with the result rows of our query, but with log messages between these rows too. As a solution to this problem I tried this, but without results.

So, to solve all these issues I created a script that executes a list of queries, creates a folder (with a timestamp) where it stores the results, renames the files obtained, removes the unnecessary files and also adds the respective headers.

    #!/bin/bash
    QUERIES=("select * from table1" "select * from table2")
    IFS=""
    directoryname=$(echo "ScriptResults$timestamp")
    mkdir $directoryname
    counter=1
    for query in ${QUERIES[*]}
    do
        tablename="query"$counter
        hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '/data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' $query ;"
        hive -S -e "set hive.cli.print.header=true; $query limit 1" | head -1 | sed 's/[\t]/,/g' >> /data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename/header.csv
        mv $tablename/000000_0 $tablename/$tablename.csv
        cat $tablename/$tablename.csv >> $tablename/header.csv
        rm $tablename/$tablename.csv
        mv $tablename/header.csv $tablename/$tablename.csv
        mv $tablename/$tablename.csv $directoryname
        counter=$((counter+1))
        rm -rf $tablename/
    done





                            share|improve this answer































                              13 Answers
                              13






                              active

                              oldest

                              votes








                              13 Answers
                              13






                              active

                              oldest

                              votes









                              active

                              oldest

                              votes






                              active

                              oldest

                              votes









                              45














                              If you're using Hive 11 or better you can use the INSERT statement with the LOCAL keyword.



                              Example:



                              insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                              Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.



                              Using this approach means you don't need to worry about the format of the source tables, can export based on arbitrary SQL query, and can select your own delimiters and output formats.






                              share|improve this answer


























                              • Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                                – mike
                                Jun 14 '17 at 13:36






                              • 1





                                How do you concatenate them on the client side after exporting?

                                – user2205916
                                May 24 '18 at 20:45











                              • For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                                – Ravi Chandra
                                Nov 27 '18 at 6:51
















                              45














                              If you're using Hive 11 or better you can use the INSERT statement with the LOCAL keyword.



                              Example:



                              insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                              Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.



                              Using this approach means you don't need to worry about the format of the source tables, can export based on arbitrary SQL query, and can select your own delimiters and output formats.






                              share|improve this answer


























                              • Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                                – mike
                                Jun 14 '17 at 13:36






                              • 1





                                How do you concatenate them on the client side after exporting?

                                – user2205916
                                May 24 '18 at 20:45











                              • For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                                – Ravi Chandra
                                Nov 27 '18 at 6:51














                              45












                              45








                              45







                              If you're using Hive 11 or better you can use the INSERT statement with the LOCAL keyword.



                              Example:



                              insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                              Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.



                              Using this approach means you don't need to worry about the format of the source tables, can export based on arbitrary SQL query, and can select your own delimiters and output formats.






                              share|improve this answer















                              If you're using Hive 11 or better you can use the INSERT statement with the LOCAL keyword.



                              Example:



                              insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                              Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.



                              Using this approach means you don't need to worry about the format of the source tables, can export based on arbitrary SQL query, and can select your own delimiters and output formats.







                              share|improve this answer














                              share|improve this answer



                              share|improve this answer








                              edited Nov 13 '16 at 14:12









                              Ram Ghadiyaram

                              16.4k64477




                              16.4k64477










                              answered Nov 1 '13 at 15:59









                              Carter ShanklinCarter Shanklin

                              1,9861112




                              1,9861112













                              • Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                                – mike
                                Jun 14 '17 at 13:36






                              • 1





                                How do you concatenate them on the client side after exporting?

                                – user2205916
                                May 24 '18 at 20:45











                              • For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                                – Ravi Chandra
                                Nov 27 '18 at 6:51



















                              • Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                                – mike
                                Jun 14 '17 at 13:36






                              • 1





                                How do you concatenate them on the client side after exporting?

                                – user2205916
                                May 24 '18 at 20:45











                              • For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                                – Ravi Chandra
                                Nov 27 '18 at 6:51

















                              Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                              – mike
                              Jun 14 '17 at 13:36





                              Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                              – mike
                              Jun 14 '17 at 13:36




                              1




                              1





                              How do you concatenate them on the client side after exporting?

                              – user2205916
                              May 24 '18 at 20:45





                              How do you concatenate them on the client side after exporting?

                              – user2205916
                              May 24 '18 at 20:45













                              For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                              – Ravi Chandra
                              Nov 27 '18 at 6:51





                              For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                              – Ravi Chandra
                              Nov 27 '18 at 6:51













                              44














                              or use this



                              hive -e 'select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              You can also specify property set hive.cli.print.header=true before the SELECT to ensure that header along with data is created and copied to file.
                              For example:



                              hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              If you don't want to write to local file system, pipe the output of sed command back into HDFS using the hadoop fs -put command.






                              share|improve this answer


























                              • By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                                – Aman Mathur
                                Jun 25 '15 at 12:17











                              • This will read table from bash ?

                                – Thomas Decaux
                                May 9 '17 at 18:27
















                              44














                              or use this



                              hive -e 'select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              You can also specify property set hive.cli.print.header=true before the SELECT to ensure that header along with data is created and copied to file.
                              For example:



                              hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              If you don't want to write to local file system, pipe the output of sed command back into HDFS using the hadoop fs -put command.






                              share|improve this answer


























                              • By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                                – Aman Mathur
                                Jun 25 '15 at 12:17











                              • This will read table from bash ?

                                – Thomas Decaux
                                May 9 '17 at 18:27














                              44












                              44








                              44







                              or use this



                              hive -e 'select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              You can also specify property set hive.cli.print.header=true before the SELECT to ensure that header along with data is created and copied to file.
                              For example:



                              hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              If you don't want to write to local file system, pipe the output of sed command back into HDFS using the hadoop fs -put command.






                              share|improve this answer















                              or use this



                              hive -e 'select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              You can also specify property set hive.cli.print.header=true before the SELECT to ensure that header along with data is created and copied to file.
                              For example:



                              hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              If you don't want to write to local file system, pipe the output of sed command back into HDFS using the hadoop fs -put command.







                              share|improve this answer














                              share|improve this answer



                              share|improve this answer








                              edited Apr 23 '15 at 23:01









                              Dave Sag

                              8,233663104




                              8,233663104










                              answered May 29 '14 at 23:46









                              user1922900user1922900

                              55953




                              55953













                              • By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                                – Aman Mathur
                                Jun 25 '15 at 12:17











                              • This will read table from bash ?

                                – Thomas Decaux
                                May 9 '17 at 18:27



















                              • By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                                – Aman Mathur
                                Jun 25 '15 at 12:17











                              • This will read table from bash ?

                                – Thomas Decaux
                                May 9 '17 at 18:27

















                              By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                              – Aman Mathur
                              Jun 25 '15 at 12:17





                              By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                              – Aman Mathur
                              Jun 25 '15 at 12:17













                              This will read table from bash ?

                              – Thomas Decaux
                              May 9 '17 at 18:27





                              This will read table from bash ?

                              – Thomas Decaux
                              May 9 '17 at 18:27











                              29














                              That should work for you





                              • tab separated



                                hive -e 'select * from some_table' > /home/yourfile.tsv



                              • comma separated



                                hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv







                              share|improve this answer





















                              • 1





                                this will export as tab-separated

                                – Brett Bonner
                                Aug 2 '15 at 2:08











                              • It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                                – JGS
                                May 12 '16 at 10:30






                              • 1





                                Excellent!! Made my day!

                                – prashanth
                                Jan 30 '18 at 18:30
















                              29














                              That should work for you





                              • tab separated



                                hive -e 'select * from some_table' > /home/yourfile.tsv



                              • comma separated



                                hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv







                              share|improve this answer





















                              • 1





                                this will export as tab-separated

                                – Brett Bonner
                                Aug 2 '15 at 2:08











                              • It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                                – JGS
                                May 12 '16 at 10:30






                              • 1





                                Excellent!! Made my day!

                                – prashanth
                                Jan 30 '18 at 18:30














                              29












                              29








                              29







                              That should work for you





                              • tab separated



                                hive -e 'select * from some_table' > /home/yourfile.tsv



                              • comma separated



                                hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv







                              share|improve this answer















                              That should work for you





                              • tab separated



                                hive -e 'select * from some_table' > /home/yourfile.tsv



                              • comma separated



                                hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv








                              share|improve this answer














                              share|improve this answer



                              share|improve this answer








                              edited May 22 '17 at 9:27









                              Anton Protopopov

                              14.8k34659




                              14.8k34659










                              answered May 2 '14 at 10:24









                              SaadSaad

                              1,03811421




                              1,03811421








                              • 1





                                this will export as tab-separated

                                – Brett Bonner
                                Aug 2 '15 at 2:08











                              • It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                                – JGS
                                May 12 '16 at 10:30






                              • 1





                                Excellent!! Made my day!

                                – prashanth
                                Jan 30 '18 at 18:30














                              • 1





                                this will export as tab-separated

                                – Brett Bonner
                                Aug 2 '15 at 2:08











                              • It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                                – JGS
                                May 12 '16 at 10:30






                              • 1





                                Excellent!! Made my day!

                                – prashanth
                                Jan 30 '18 at 18:30








                              1




                              1





                              this will export as tab-separated

                              – Brett Bonner
                              Aug 2 '15 at 2:08





                              this will export as tab-separated

                              – Brett Bonner
                              Aug 2 '15 at 2:08













                              It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                              – JGS
                              May 12 '16 at 10:30





                              It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                              – JGS
                              May 12 '16 at 10:30




                              1




                              1





                              Excellent!! Made my day!

                              – prashanth
                              Jan 30 '18 at 18:30





                              Excellent!! Made my day!

                              – prashanth
                              Jan 30 '18 at 18:30











                              21














                              You can not have a delimiter for query output,after generating the report (as you did).



                              you can change the delimiter to comma.



                              It comes with default delimiter 01 (inivisible character).



                              hadoop fs -cat /user/data/output/test/* |tr "1" "," >>outputwithcomma.csv


                              check this also






                              share|improve this answer






























                                21














                                You can not have a delimiter for query output,after generating the report (as you did).



                                you can change the delimiter to comma.



                                It comes with default delimiter 01 (inivisible character).



                                hadoop fs -cat /user/data/output/test/* |tr "1" "," >>outputwithcomma.csv


                                check this also






                                share|improve this answer




























                                  21












                                  21








                                  21







                                  You can not have a delimiter for query output,after generating the report (as you did).



                                  you can change the delimiter to comma.



                                  It comes with default delimiter 01 (inivisible character).



                                  hadoop fs -cat /user/data/output/test/* |tr "1" "," >>outputwithcomma.csv


                                  check this also






                                  share|improve this answer















                                  You can not have a delimiter for query output,after generating the report (as you did).



                                  you can change the delimiter to comma.



                                  It comes with default delimiter 01 (inivisible character).



                                  hadoop fs -cat /user/data/output/test/* |tr "1" "," >>outputwithcomma.csv


                                  check this also







                                  share|improve this answer














                                  share|improve this answer



                                  share|improve this answer








                                  edited May 23 '17 at 11:55









                                  Community

                                  11




                                  11










                                  answered Jun 13 '13 at 12:44









                                  Balaswamy VaddemanBalaswamy Vaddeman

                                  6,17032137




                                  6,17032137























                                      6














                                      Recent versions of hive comes with this feature.



                                      INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' 
                                      ROW FORMAT DELIMITED
                                      FIELDS TERMINATED BY ','
                                      select * from table;


                                      this way you can choose your own delimiter and file name.
                                      Just be careful with the "OVERWRITE" it will try to delete everything from the mentioned folder.






                                      share|improve this answer






























                                        6














                                        Recent versions of hive comes with this feature.



                                        INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' 
                                        ROW FORMAT DELIMITED
                                        FIELDS TERMINATED BY ','
                                        select * from table;


                                        this way you can choose your own delimiter and file name.
                                        Just be careful with the "OVERWRITE" it will try to delete everything from the mentioned folder.






                                        share|improve this answer




























                                          6












                                          6








                                          6







                                          Recent versions of hive comes with this feature.



                                          INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' 
                                          ROW FORMAT DELIMITED
                                          FIELDS TERMINATED BY ','
                                          select * from table;


                                          this way you can choose your own delimiter and file name.
                                          Just be careful with the "OVERWRITE" it will try to delete everything from the mentioned folder.






                                          share|improve this answer















                                          Recent versions of hive comes with this feature.



                                          INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' 
                                          ROW FORMAT DELIMITED
                                          FIELDS TERMINATED BY ','
                                          select * from table;


                                          this way you can choose your own delimiter and file name.
                                          Just be careful with the "OVERWRITE" it will try to delete everything from the mentioned folder.







                                          share|improve this answer














                                          share|improve this answer



                                          share|improve this answer








                                          edited Nov 13 '16 at 14:10









                                          Ram Ghadiyaram

                                          16.4k64477




                                          16.4k64477










                                          answered Mar 23 '15 at 9:16









                                          sunilsunil

                                          7501819




                                          7501819























                                              6














                                              INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table; 


                                              is the correct answer.



                                              If the number of records is really big, based on the number of files generated



                                              the following command would give only partial result.



                                              hive -e 'select * from some_table' > /home/yourfile.csv





                                              share|improve this answer






























                                                6














                                                INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table; 


                                                is the correct answer.



                                                If the number of records is really big, based on the number of files generated



                                                the following command would give only partial result.



                                                hive -e 'select * from some_table' > /home/yourfile.csv





                                                share|improve this answer




























                                                  6












                                                  6








                                                  6







                                                  INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table; 


                                                  is the correct answer.



                                                  If the number of records is really big, based on the number of files generated



                                                  the following command would give only partial result.



                                                  hive -e 'select * from some_table' > /home/yourfile.csv





                                                  share|improve this answer















                                                  INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table; 


                                                  is the correct answer.



                                                  If the number of records is really big, based on the number of files generated



                                                  the following command would give only partial result.



                                                  hive -e 'select * from some_table' > /home/yourfile.csv






                                                  share|improve this answer














                                                  share|improve this answer



                                                  share|improve this answer








                                                  edited Dec 2 '16 at 9:11









                                                  Kishore

                                                  3,84531238




                                                  3,84531238










                                                  answered Oct 2 '15 at 18:29









                                                  JsimJsim

                                                  6112




                                                  6112























                                                      4














                                                      I have used simple linux shell piping + perl to convert hive generated output from tsv to csv.



                                                      hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\"/g; s/^|$/"/g; s/t/","/g' > output_file.csv


                                                      (I got the updated perl regex from someone in stackoverflow some time ago)



                                                      The result will be like regular csv:



                                                      "col1","col2","col3"... and so on






                                                      share|improve this answer






























                                                        4














                                                        I have used simple linux shell piping + perl to convert hive generated output from tsv to csv.



                                                        hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\"/g; s/^|$/"/g; s/t/","/g' > output_file.csv


                                                        (I got the updated perl regex from someone in stackoverflow some time ago)



                                                        The result will be like regular csv:



                                                        "col1","col2","col3"... and so on






                                                        share|improve this answer




























                                                          4












                                                          4








                                                          4







                                                          I have used simple linux shell piping + perl to convert hive generated output from tsv to csv.



                                                          hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\"/g; s/^|$/"/g; s/t/","/g' > output_file.csv


                                                          (I got the updated perl regex from someone in stackoverflow some time ago)



                                                          The result will be like regular csv:



                                                          "col1","col2","col3"... and so on






                                                          share|improve this answer















                                                          I have used simple linux shell piping + perl to convert hive generated output from tsv to csv.



                                                          hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\"/g; s/^|$/"/g; s/t/","/g' > output_file.csv


                                                          (I got the updated perl regex from someone in stackoverflow some time ago)



                                                          The result will be like regular csv:



                                                          "col1","col2","col3"... and so on







                                                          share|improve this answer














                                                          share|improve this answer



                                                          share|improve this answer








                                                          edited Nov 13 '16 at 14:10









                                                          Ram Ghadiyaram

                                                          16.4k64477




                                                          16.4k64477










                                                          answered Mar 22 '15 at 4:45









                                                          Firman GautamaFirman Gautama

                                                          815




                                                          815























                                                              4














                                                              The following script should work for you:



                                                              #!/bin/bash
                                                              hive -e "insert overwrite local directory '/LocalPath/'
                                                              row format delimited fields terminated by ','
                                                              select * from Mydatabase,Mytable limit 100"
                                                              cat /LocalPath/* > /LocalPath/table.csv


                                                              I used limit 100 to limit the size of data since I had a huge table, but you can delete it to export the entire table.






                                                              share|improve this answer




























answered Mar 21 '18 at 12:07 by HISI























2

You can also export the data straight from the Hive warehouse directory instead of going through the table. Give the Hive warehouse path first, then the local path where you want to store the .csv file. The command is below:

hadoop fs -cat /user/hdusr/warehouse/HiveDb/tableName/* > /users/hadoop/test/nilesh/sample.csv
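Note that this gives you a comma-separated file only if the table was stored with ',' as its field delimiter; otherwise Hive's default \001 separators come through unchanged. An equivalent one-step sketch using hadoop fs -getmerge (same placeholder paths as above):

# getmerge concatenates every segment file under the table directory
# into a single local file.
hadoop fs -getmerge /user/hdusr/warehouse/HiveDb/tableName /users/hadoop/test/nilesh/sample.csv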




























answered Nov 3 '17 at 3:40 by Nilesh Shinde























1

I had a similar issue and this is how I was able to address it.

Step 1 - Load the data from the Hive table into another table as follows:

DROP TABLE IF EXISTS TestHiveTableCSV;
CREATE TABLE TestHiveTableCSV
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' AS
SELECT Column List FROM TestHiveTable;

Step 2 - Copy the blob from the Hive warehouse to the new location with the appropriate extension:

Start-AzureStorageBlobCopy `
    -DestContext $destContext `
    -SrcContainer "Source Container" `
    -SrcBlob "hive/warehouse/TestHiveTableCSV/000000_0" `
    -DestContainer "Destination Container" `
    -DestBlob "CSV/TestHiveTable.csv"

Hope this helps!

Best Regards,
Dattatrey Sindol (Datta)
http://dattatreysindol.com
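Step 2 above is specific to Azure blob storage. On a plain HDFS cluster the equivalent step is just a copy out of the warehouse; a sketch, assuming the default warehouse location (the path varies by install, and the table directory name is typically lowercased):

# Pull the CTAS output down from HDFS and give it a .csv name.
hadoop fs -copyToLocal /user/hive/warehouse/testhivetablecsv/000000_0 TestHiveTable.csv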




























answered May 29 '14 at 14:19 by Dattatrey Sindol























1

There are ways to change the default delimiter, as shown by other answers.

There are also ways to convert the raw output to CSV with some bash scripting. There are 3 delimiters to consider though, not just \001. Things get a bit more complicated when your Hive table has maps.

I wrote a bash script that can handle all 3 default delimiters (\001, \002 and \003) from Hive and output a CSV. The script and some more info are here:

Hive Default Delimiters to CSV

Hive's default delimiters are

Field Delimiter => Control-A ('\001')
Collection Item Delimiter => Control-B ('\002')
Map Key Delimiter => Control-C ('\003')

There are ways to change these delimiters when exporting tables, but sometimes you might still get stuck needing to convert this to CSV.

Here's a quick bash script that can handle a DB export that's segmented in multiple files and has the default delimiters. It will output a single CSV file.

It is assumed that the segments all have the naming convention 000*_0

INDIRECTORY="path/to/input/directory"
for f in $INDIRECTORY/000*_0; do
  echo "Processing $f file..";
  cat -v $f |
    LC_ALL=C sed -e "s/^/\"/g" |
    LC_ALL=C sed -e "s/\^A/\",\"/g" |
    LC_ALL=C sed -e "s/\^C\^B/\"\":\"\"\"\",\"\"/g" |
    LC_ALL=C sed -e "s/\^B/\"\",\"\"/g" |
    LC_ALL=C sed -e "s/\^C/\"\":\"\"/g" |
    LC_ALL=C sed -e "s/$/\"/g" > $f-temp
done
echo "you,can,echo,your,header,here,if,you,like" > $INDIRECTORY/final_output.csv
cat $INDIRECTORY/*-temp >> $INDIRECTORY/final_output.csv
rm $INDIRECTORY/*-temp

More explanation on the gist
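For simple tables with only scalar columns, the same conversion can be done in one pass. A sketch assuming GNU sed (which understands the \x01 escape) and field values free of embedded commas and quotes:

INDIRECTORY="path/to/input/directory"
# Replace the \001 field delimiter with a comma in every segment at once.
cat "$INDIRECTORY"/000*_0 | sed 's/\x01/,/g' > "$INDIRECTORY"/final_output.csv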






























edited Mar 18 '16 at 14:28; answered Mar 18 '16 at 13:11 by alex9311























1

In case you are doing it from Windows, you can use the Python script hivehoney to extract table data to a local CSV file.

It will:

• Log in to the bastion host.

• pbrun.

• kinit.

• beeline (with your query).

• Save the echo from beeline to a file on Windows.

Execute it like this:

set PROXY_HOST=your_bastion_host

set SERVICE_USER=you_func_user

set LINUX_USER=your_SOID

set LINUX_PWD=your_pwd

python hh.py --query_file=query.sql






























edited Sep 20 '18 at 19:19; answered Sep 4 '18 at 18:42 by Alex B























0

The proposed solutions are fine, but I found problems with both:

• As Carter Shanklin said, with this command we obtain a CSV file with the results of the query in the specified path:

  insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;

  The problem with this solution is that the CSV obtained won't have headers, and the file created is not named as a CSV (so we have to rename it).

• As user1922900 said, with the following command we obtain a CSV file with the results of the query in the specified file, with headers:

  hive -e 'select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv

  With this solution we get a CSV file with the result rows of our query, but with log messages between those rows too. As a solution to this problem I tried a suggested fix, but without results (see also the stderr-redirection sketch after the script).

So, to solve all these issues, I created a script that executes a list of queries, creates a folder (with a timestamp) where it stores the results, renames the files obtained, removes the unnecessary files, and also adds the respective headers:

#!/bin/bash
QUERIES=("select * from table1" "select * from table2")
IFS=""
directoryname=$(echo "ScriptResults$timestamp")
mkdir $directoryname
counter=1
for query in ${QUERIES[*]}
do
  tablename="query"$counter
  hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '/data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' $query ;"
  hive -S -e "set hive.cli.print.header=true; $query limit 1" | head -1 | sed 's/[\t]/,/g' >> /data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename/header.csv
  mv $tablename/000000_0 $tablename/$tablename.csv
  cat $tablename/$tablename.csv >> $tablename/header.csv
  rm $tablename/$tablename.csv
  mv $tablename/header.csv $tablename/$tablename.csv
  mv $tablename/$tablename.csv $directoryname
  counter=$((counter+1))
  rm -rf $tablename/
done
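On the log-noise point specifically: those messages go to stderr, so Hive's silent flag plus a stderr redirect usually keeps them out of the output. A sketch, reusing the placeholder table name from the quoted command:

# -S is Hive's silent mode; 2>/dev/null drops anything still written to stderr.
hive -S -e 'select * from some_table' 2>/dev/null | sed 's/[\t]/,/g' > /home/yourfile.csv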




























                                                                                                        0














                                                                                                        The problem solutions are fine but I found some problems in both:





                                                                                                        • As Carter Shanklin said, with this command we will obtain a csv file with the results of the query in the path specified:



                                                                                                          insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                                                                                                          The problem with this solution is that the csv obtained won´t have headers and will create a file that is not a CSV (so we have to rename it).




                                                                                                        • As user1922900 said, with the following command we will obtain a CSV files with the results of the query in the specified file and with headers:



                                                                                                          hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv


                                                                                                          With this solution we will get a CSV file with the result rows of our query, but with log messages between these rows too. As a solution of this problem I tried this, but without results.




                                                                                                        So, to solve all these issues I created a script that execute a list of queries, create a folder (with a timestamp) where it stores the results, rename the files obtained, remove the unnecesay files and it also add the respective headers.



                                                                                                         #!/bin/sh
                                                                                                        QUERIES=("select * from table1" "select * from table2")
                                                                                                        IFS=""
                                                                                                        directoryname=$(echo "ScriptResults$timestamp")
                                                                                                        mkdir $directoryname
                                                                                                        counter=1
                                                                                                        for query in ${QUERIES[*]}
                                                                                                        do
                                                                                                        tablename="query"$counter
                                                                                                        hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '/data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' $query ;"
                                                                                                        hive -S -e "set hive.cli.print.header=true; $query limit 1" | head -1 | sed 's/[t]/,/g' >> /data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename/header.csv
                                                                                                        mv $tablename/000000_0 $tablename/$tablename.csv
                                                                                                        cat $tablename/$tablename.csv >> $tablename/header.csv.
                                                                                                        rm $tablename/$tablename.csv
                                                                                                        mv $tablename/header.csv $tablename/$tablename.csv
                                                                                                        mv $tablename/$tablename.csv $directoryname
                                                                                                        counter=$((counter+1))
                                                                                                        rm -rf $tablename/
                                                                                                        done





                                                                                                        share|improve this answer


























                                                                                                          0












                                                                                                          0








                                                                                                          0







                                                                                                          The problem solutions are fine but I found some problems in both:





                                                                                                          • As Carter Shanklin said, with this command we will obtain a csv file with the results of the query in the path specified:



The proposed solutions are fine, but I found some problems with both:





• As Carter Shanklin said, with this command we will obtain a CSV file with the results of the query in the specified path:

    insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


The problem with this solution is that the CSV obtained won't have headers, and Hive writes the output to a file that is not named as a CSV (so we have to rename it); see the first sketch after this list.




• As user1922900 said, with the following command we will obtain a CSV file, including headers, with the results of the query in the specified file:

    hive -e 'set hive.cli.print.header=true; select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv


With this solution we will get a CSV file with the result rows of our query, but with Hive log messages interleaved between those rows. I tried to work around this problem, but without results; one possible workaround is sketched after this list.
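For the first point, a minimal merge-and-rename sketch, assuming the export above finished and left uncompressed text part files (such as 000000_0) in the staging directory; the target file name is just an example:

    cat /home/carter/staging/0* > /home/carter/staging/hugetable.csv

For the second point, one workaround that may help is running the Hive CLI in silent mode (-S) and discarding stderr, which is where most of the log noise goes; how much this cleans up can depend on your Hive version:

    hive -S -e 'set hive.cli.print.header=true; select * from some_table' 2>/dev/null | sed 's/[\t]/,/g' > /home/yourfile.csv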




So, to solve all these issues, I created a script that executes a list of queries, creates a timestamped folder where it stores the results, renames the files obtained, removes the unnecessary files, and also adds the respective headers.



    #!/bin/bash
    # Export each query in QUERIES to its own CSV file, with a header row.
    QUERIES=("select * from table1" "select * from table2")
    OUTBASE="/data/2/DOMAIN_USERS/SANUK/users/$USER"
    timestamp=$(date +%Y%m%d%H%M%S)
    directoryname="ScriptResults$timestamp"
    mkdir "$directoryname"
    counter=1
    for query in "${QUERIES[@]}"
    do
        tablename="query$counter"
        # Export the query results (data only, no header) as comma-delimited text.
        hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '$OUTBASE/$tablename' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' $query ;"
        # Build the header row: print one result row with headers enabled, keep only the first line.
        hive -S -e "set hive.cli.print.header=true; $query limit 1" | head -1 | sed 's/[\t]/,/g' > "$OUTBASE/$tablename/header.csv"
        # Assumes the export produced a single part file named 000000_0.
        mv "$OUTBASE/$tablename/000000_0" "$OUTBASE/$tablename/$tablename.csv"
        # Append the data rows after the header and keep the combined file.
        cat "$OUTBASE/$tablename/$tablename.csv" >> "$OUTBASE/$tablename/header.csv"
        rm "$OUTBASE/$tablename/$tablename.csv"
        mv "$OUTBASE/$tablename/header.csv" "$OUTBASE/$tablename/$tablename.csv"
        mv "$OUTBASE/$tablename/$tablename.csv" "$directoryname"
        counter=$((counter+1))
        rm -rf "$OUTBASE/$tablename/"
    done
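To run it (the script file name export_queries.sh is just a placeholder; adjust the output base path to a directory you can write to):

    chmod +x export_queries.sh
    ./export_queries.sh

Each run creates a new ScriptResults folder, suffixed with the timestamp, containing one queryN.csv file per query in the list.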






answered Nov 21 '18 at 9:31

AngryCoder

325
