How to export a Hive table into a CSV file?

I used this Hive query to export a table into a CSV file:

INSERT OVERWRITE DIRECTORY '/user/data/output/test' select column1, column2 from table1;

The generated file '000000_0' does not have a comma separator.

Is this the right way to generate a CSV file? If not, how can I generate one?

csv hive

asked Jun 13 '13 at 12:04 by Dunith Dhanushka


13 Answers


45 votes

If you're using Hive 11 or better, you can use the INSERT statement with the LOCAL keyword.

Example:

    insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;

Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.

Using this approach means you don't need to worry about the format of the source tables, can export based on an arbitrary SQL query, and can select your own delimiters and output formats.
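
For instance, a minimal sketch of that client-side concatenation, assuming the staging directory from the example above:

    # merge all part files Hive wrote into one CSV
    cat /home/carter/staging/* > /home/carter/hugetable.csv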






answered Nov 1 '13 at 15:59 by Carter Shanklin

• Thank you, this created a folder with multiple csv files. Is there any way to put everything into one file? Also, is there any way to include a header (column names) in the csv file? – mike, Jun 14 '17 at 13:36

• How do you concatenate them on the client side after exporting? – user2205916, May 24 '18 at 20:45

• For me this command produced a bunch of files ending with the extension .snappy, which looks like a compressed format. I am not sure how to un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine. – Ravi Chandra, Nov 27 '18 at 6:51

44 votes

Or use this:

    hive -e 'select * from your_Table' | sed 's/[\t]/,/g' > /home/yourfile.csv

You can also specify the property set hive.cli.print.header=true before the SELECT to ensure that a header is created and copied to the file along with the data. For example:

    hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[\t]/,/g' > /home/yourfile.csv

If you don't want to write to the local file system, pipe the output of the sed command back into HDFS using the hadoop fs -put command.
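
A sketch of that HDFS variant; hadoop fs -put reads from stdin when the source is given as -, and the target path here is illustrative:

    hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[\t]/,/g' | hadoop fs -put - /user/data/output/your_table.csv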






answered May 29 '14 at 23:46 by user1922900

• By using this command the hive data types such as 'double' are not carried forward in CSV, so when I read the CSV everything is read as a string. – Aman Mathur, Jun 25 '15 at 12:17

• This will read the table from bash? – Thomas Decaux, May 9 '17 at 18:27

29 votes

That should work for you:

• tab separated

    hive -e 'select * from some_table' > /home/yourfile.tsv

• comma separated

    hive -e 'select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv







answered May 2 '14 at 10:24 by Saad

• this will export as tab-separated – Brett Bonner, Aug 2 '15 at 2:08

• It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv – JGS, May 12 '16 at 10:30

• Excellent!! Made my day! – prashanth, Jan 30 '18 at 18:30

21 votes

You cannot set a delimiter for the query output after generating the files (as you did), but you can change the delimiter to a comma afterwards.

The output comes with the default delimiter \001 (an invisible control character):

    hadoop fs -cat /user/data/output/test/* | tr "\01" "," >> outputwithcomma.csv
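
If you want to confirm that \001 separator first, a quick check (not part of the original answer) is to dump one line through od, which renders the control byte as octal 001:

    # inspect the raw bytes of one exported file; \001 shows up as "001" in od -c output
    hadoop fs -cat /user/data/output/test/000000_0 | head -n 1 | od -c | head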






answered Jun 13 '13 at 12:44 by Balaswamy Vaddeman


6 votes

Recent versions of Hive come with this feature:

    INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp'
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    select * from table;

This way you can choose your own delimiter and file name. Just be careful with the OVERWRITE: it will try to delete everything from the mentioned folder.
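
On versions that support the clause, the same ROW FORMAT should also work without LOCAL, writing comma-delimited files to HDFS, which is essentially what the query in the question was missing. A sketch using the original paths:

    INSERT OVERWRITE DIRECTORY '/user/data/output/test'
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    select column1, column2 from table1;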






answered Mar 23 '15 at 9:16 by sunil


6 votes

    INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table;

is the correct answer.

If the number of records is really big, then depending on the number of files generated, the following command would give only a partial result:

    hive -e 'select * from some_table' > /home/yourfile.csv
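
A sketch of that safer pattern for big tables, combining the INSERT above with a client-side merge (directory and file names illustrative):

    hive -e "INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table;"
    cat /home/lvermeer/temp/* > /home/lvermeer/output.csv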





answered Oct 2 '15 at 18:29 by Jsim


4 votes

I have used simple linux shell piping + perl to convert hive-generated output from tsv to csv.

    hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\\"/g; s/^|$/"/g; s/\t/","/g' > output_file.csv

(I got the updated perl regex from someone on stackoverflow some time ago.)

The result will be like regular csv:

"col1","col2","col3"... and so on

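A quick sanity check of that pipeline on a hand-made row (the values here are made up, not from the answer):

    # one tab-separated line with an embedded quote
    printf 'a\tb "quoted"\tc\n' | perl -lpe 's/"/\\"/g; s/^|$/"/g; s/\t/","/g'
    # prints: "a","b \"quoted\"","c"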





answered Mar 22 '15 at 4:45 by Firman Gautama


4 votes

The following script should work for you:

    #!/bin/bash
    hive -e "insert overwrite local directory '/LocalPath/'
    row format delimited fields terminated by ','
    select * from Mydatabase.Mytable limit 100"
    cat /LocalPath/* > /LocalPath/table.csv

I used limit 100 to limit the size of the data since I had a huge table, but you can delete it to export the entire table.






answered Mar 21 '18 at 12:07 by HISI


2 votes

Here you can export the data using the Hive warehouse dir instead of querying the Hive table. First give the Hive warehouse path, and after it the local path where you want to store the .csv file. The command for this is below:

    hadoop fs -cat /user/hdusr/warehouse/HiveDb/tableName/* > /users/hadoop/test/nilesh/sample.csv
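
Note this only yields a real CSV when the table is stored as comma-delimited text. For a table using Hive's default \001 delimiter you could combine it with the tr trick from the earlier answer; a sketch with the same illustrative paths:

    hadoop fs -cat /user/hdusr/warehouse/HiveDb/tableName/* | tr "\01" "," > /users/hadoop/test/nilesh/sample.csv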






1 vote

I had a similar issue and this is how I was able to address it.

Step 1 - Load the data from the Hive table into another table as follows:

    DROP TABLE IF EXISTS TestHiveTableCSV;
    CREATE TABLE TestHiveTableCSV
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' AS
    SELECT Column List FROM TestHiveTable;

Step 2 - Copy the blob from the Hive warehouse to the new location with the appropriate extension:

    Start-AzureStorageBlobCopy `
        -DestContext $destContext `
        -SrcContainer "Source Container" `
        -SrcBlob "hive/warehouse/TestHiveTableCSV/000000_0" `
        -DestContainer "Destination Container" `
        -DestBlob "CSV/TestHiveTable.csv"

Hope this helps!

Best Regards,
Dattatrey Sindol (Datta)
http://dattatreysindol.com







1 vote

There are ways to change the default delimiter, as shown by other answers.

There are also ways to convert the raw output to csv with some bash scripting. There are 3 delimiters to consider though, not just \001. Things get a bit more complicated when your hive table has maps.

I wrote a bash script that can handle all 3 default delimiters (\001, \002 and \003) from hive and output a csv. The script and some more info are here:

Hive Default Delimiters to CSV

Hive's default delimiters are

    Row Delimiter => Control-A ('\001')
    Collection Item Delimiter => Control-B ('\002')
    Map Key Delimiter => Control-C ('\003')

There are ways to change these delimiters when exporting tables, but sometimes you might still get stuck needing to convert this to csv.

Here's a quick bash script that can handle a DB export that's segmented in multiple files and has the default delimiters. It will output a single CSV file.

It is assumed that the segments all have the naming convention 000*_0

    INDIRECTORY="path/to/input/directory"
    for f in $INDIRECTORY/000*_0; do
        echo "Processing $f file..";
        cat -v $f |
            LC_ALL=C sed -e "s/^/\"/g" |
            LC_ALL=C sed -e "s/\^A/\",\"/g" |
            LC_ALL=C sed -e "s/\^C\^B/\"\":\"\"\"\",\"\"/g" |
            LC_ALL=C sed -e "s/\^B/\"\",\"\"/g" |
            LC_ALL=C sed -e "s/\^C/\"\":\"\"/g" |
            LC_ALL=C sed -e "s/$/\"/g" > $f-temp
    done
    echo "you,can,echo,your,header,here,if,you,like" > $INDIRECTORY/final_output.csv
    cat $INDIRECTORY/*-temp >> $INDIRECTORY/final_output.csv
    rm $INDIRECTORY/*-temp

More explanation on the gist







1 vote

In case you are doing it from Windows, you can use the Python script hivehoney to extract table data to a local CSV file.

It will:

• Log in to the bastion host.
• pbrun.
• kinit.
• beeline (with your query).
• Save the echo from beeline to a file on Windows.

Execute it like this:

    set PROXY_HOST=your_bastion_host
    set SERVICE_USER=you_func_user
    set LINUX_USER=your_SOID
    set LINUX_PWD=your_pwd
    python hh.py --query_file=query.sql






0 votes

The proposed solutions are fine, but I found some problems in both:

• As Carter Shanklin said, with this command we will obtain a csv file with the results of the query in the path specified:

    insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;

  The problem with this solution is that the csv obtained won't have headers and will create a file that is not a CSV (so we have to rename it).

• As user1922900 said, with the following command we will obtain a CSV file with the results of the query in the specified file and with headers:

    hive -e 'select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv

  With this solution we will get a CSV file with the result rows of our query, but with log messages between these rows too. As a solution to this problem I tried this, but without results.

So, to solve all these issues I created a script that executes a list of queries, creates a folder (with a timestamp) where it stores the results, renames the files obtained, removes the unnecessary files and also adds the respective headers.

    #!/bin/bash
    QUERIES=("select * from table1" "select * from table2")
    IFS=""
    directoryname=$(echo "ScriptResults$timestamp")
    mkdir $directoryname
    counter=1
    for query in ${QUERIES[*]}
    do
        tablename="query"$counter
        hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '/data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' $query ;"
        hive -S -e "set hive.cli.print.header=true; $query limit 1" | head -1 | sed 's/[\t]/,/g' >> /data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename/header.csv
        mv $tablename/000000_0 $tablename/$tablename.csv
        cat $tablename/$tablename.csv >> $tablename/header.csv
        rm $tablename/$tablename.csv
        mv $tablename/header.csv $tablename/$tablename.csv
        mv $tablename/$tablename.csv $directoryname
        counter=$((counter+1))
        rm -rf $tablename/
    done





                            share|improve this answer































                              13 Answers
                              13






                              active

                              oldest

                              votes








                              13 Answers
                              13






                              active

                              oldest

                              votes









                              active

                              oldest

                              votes






                              active

                              oldest

                              votes









                              45














                              If you're using Hive 11 or better you can use the INSERT statement with the LOCAL keyword.



                              Example:



                              insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                              Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.



                              Using this approach means you don't need to worry about the format of the source tables, can export based on arbitrary SQL query, and can select your own delimiters and output formats.






                              share|improve this answer


























                              • Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                                – mike
                                Jun 14 '17 at 13:36






                              • 1





                                How do you concatenate them on the client side after exporting?

                                – user2205916
                                May 24 '18 at 20:45











                              • For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                                – Ravi Chandra
                                Nov 27 '18 at 6:51
















                              45














                              If you're using Hive 11 or better you can use the INSERT statement with the LOCAL keyword.



                              Example:



                              insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                              Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.



                              Using this approach means you don't need to worry about the format of the source tables, can export based on arbitrary SQL query, and can select your own delimiters and output formats.






                              share|improve this answer


























                              • Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                                – mike
                                Jun 14 '17 at 13:36






                              • 1





                                How do you concatenate them on the client side after exporting?

                                – user2205916
                                May 24 '18 at 20:45











                              • For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                                – Ravi Chandra
                                Nov 27 '18 at 6:51














                              45












                              45








                              45







                              If you're using Hive 11 or better you can use the INSERT statement with the LOCAL keyword.



                              Example:



                              insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                              Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.



                              Using this approach means you don't need to worry about the format of the source tables, can export based on arbitrary SQL query, and can select your own delimiters and output formats.






                              share|improve this answer















                              If you're using Hive 11 or better you can use the INSERT statement with the LOCAL keyword.



                              Example:



                              insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                              Note that this may create multiple files and you may want to concatenate them on the client side after it's done exporting.



                              Using this approach means you don't need to worry about the format of the source tables, can export based on arbitrary SQL query, and can select your own delimiters and output formats.







                              share|improve this answer














                              share|improve this answer



                              share|improve this answer








                              edited Nov 13 '16 at 14:12









                              Ram Ghadiyaram

                              16.4k64477




                              16.4k64477










                              answered Nov 1 '13 at 15:59









                              Carter ShanklinCarter Shanklin

                              1,9861112




                              1,9861112













                              • Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                                – mike
                                Jun 14 '17 at 13:36






                              • 1





                                How do you concatenate them on the client side after exporting?

                                – user2205916
                                May 24 '18 at 20:45











                              • For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                                – Ravi Chandra
                                Nov 27 '18 at 6:51



















                              • Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                                – mike
                                Jun 14 '17 at 13:36






                              • 1





                                How do you concatenate them on the client side after exporting?

                                – user2205916
                                May 24 '18 at 20:45











                              • For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                                – Ravi Chandra
                                Nov 27 '18 at 6:51

















                              Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                              – mike
                              Jun 14 '17 at 13:36





                              Thank you, this created folder with multiple csv files. Is there anyway to put everything into one file? Also is there anyway to include header (column name) in the csv file?

                              – mike
                              Jun 14 '17 at 13:36




                              1




                              1





                              How do you concatenate them on the client side after exporting?

                              – user2205916
                              May 24 '18 at 20:45





                              How do you concatenate them on the client side after exporting?

                              – user2205916
                              May 24 '18 at 20:45













                              For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                              – Ravi Chandra
                              Nov 27 '18 at 6:51





                              For me this command has produced a bunch of files ending with the extension .snappy which looks like a compressed format. I am not sure how to convert un-compress them. I know how to merge files locally using the command cat file1 file2 > file on my local machine.

                              – Ravi Chandra
                              Nov 27 '18 at 6:51













                              44














                              or use this



                              hive -e 'select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              You can also specify property set hive.cli.print.header=true before the SELECT to ensure that header along with data is created and copied to file.
                              For example:



                              hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              If you don't want to write to local file system, pipe the output of sed command back into HDFS using the hadoop fs -put command.






                              share|improve this answer


























                              • By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                                – Aman Mathur
                                Jun 25 '15 at 12:17











                              • This will read table from bash ?

                                – Thomas Decaux
                                May 9 '17 at 18:27
















                              44














                              or use this



                              hive -e 'select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              You can also specify property set hive.cli.print.header=true before the SELECT to ensure that header along with data is created and copied to file.
                              For example:



                              hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              If you don't want to write to local file system, pipe the output of sed command back into HDFS using the hadoop fs -put command.






                              share|improve this answer


























                              • By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                                – Aman Mathur
                                Jun 25 '15 at 12:17











                              • This will read table from bash ?

                                – Thomas Decaux
                                May 9 '17 at 18:27














                              44












                              44








                              44







                              or use this



                              hive -e 'select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              You can also specify property set hive.cli.print.header=true before the SELECT to ensure that header along with data is created and copied to file.
                              For example:



                              hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              If you don't want to write to local file system, pipe the output of sed command back into HDFS using the hadoop fs -put command.






                              share|improve this answer















                              or use this



                              hive -e 'select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              You can also specify property set hive.cli.print.header=true before the SELECT to ensure that header along with data is created and copied to file.
                              For example:



                              hive -e 'set hive.cli.print.header=true; select * from your_Table' | sed 's/[t]/,/g'  > /home/yourfile.csv


                              If you don't want to write to local file system, pipe the output of sed command back into HDFS using the hadoop fs -put command.







                              share|improve this answer














                              share|improve this answer



                              share|improve this answer








                              edited Apr 23 '15 at 23:01









                              Dave Sag

                              8,233663104




                              8,233663104










                              answered May 29 '14 at 23:46









                              user1922900user1922900

                              55953




                              55953













                              • By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                                – Aman Mathur
                                Jun 25 '15 at 12:17











                              • This will read table from bash ?

                                – Thomas Decaux
                                May 9 '17 at 18:27



















                              • By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                                – Aman Mathur
                                Jun 25 '15 at 12:17











                              • This will read table from bash ?

                                – Thomas Decaux
                                May 9 '17 at 18:27

















                              By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                              – Aman Mathur
                              Jun 25 '15 at 12:17





                              By using this command the hive data types such as 'double' are not carried forward in CSV. So when I read the CSV all are read as a string.

                              – Aman Mathur
                              Jun 25 '15 at 12:17













                              This will read table from bash ?

                              – Thomas Decaux
                              May 9 '17 at 18:27





                              This will read table from bash ?

                              – Thomas Decaux
                              May 9 '17 at 18:27











                              29














                              That should work for you





                              • tab separated



                                hive -e 'select * from some_table' > /home/yourfile.tsv



                              • comma separated



                                hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv







                              share|improve this answer





















                              • 1





                                this will export as tab-separated

                                – Brett Bonner
                                Aug 2 '15 at 2:08











                              • It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                                – JGS
                                May 12 '16 at 10:30






                              • 1





                                Excellent!! Made my day!

                                – prashanth
                                Jan 30 '18 at 18:30
















                              29














                              That should work for you





                              • tab separated



                                hive -e 'select * from some_table' > /home/yourfile.tsv



                              • comma separated



                                hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv







                              share|improve this answer





















                              • 1





                                this will export as tab-separated

                                – Brett Bonner
                                Aug 2 '15 at 2:08











                              • It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                                – JGS
                                May 12 '16 at 10:30






                              • 1





                                Excellent!! Made my day!

                                – prashanth
                                Jan 30 '18 at 18:30














                              29












                              29








                              29







                              That should work for you





                              • tab separated



                                hive -e 'select * from some_table' > /home/yourfile.tsv



                              • comma separated



                                hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv







                              share|improve this answer















                              That should work for you





                              • tab separated



                                hive -e 'select * from some_table' > /home/yourfile.tsv



                              • comma separated



                                hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv








                              share|improve this answer














                              share|improve this answer



                              share|improve this answer








                              edited May 22 '17 at 9:27









                              Anton Protopopov

                              14.8k34659




                              14.8k34659










                              answered May 2 '14 at 10:24









                              SaadSaad

                              1,03811421




                              1,03811421








                              • 1





                                this will export as tab-separated

                                – Brett Bonner
                                Aug 2 '15 at 2:08











                              • It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                                – JGS
                                May 12 '16 at 10:30






                              • 1





                                Excellent!! Made my day!

                                – prashanth
                                Jan 30 '18 at 18:30














                              • 1





                                this will export as tab-separated

                                – Brett Bonner
                                Aug 2 '15 at 2:08











                              • It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                                – JGS
                                May 12 '16 at 10:30






                              • 1





                                Excellent!! Made my day!

                                – prashanth
                                Jan 30 '18 at 18:30








                              1




                              1





                              this will export as tab-separated

                              – Brett Bonner
                              Aug 2 '15 at 2:08





                              this will export as tab-separated

                              – Brett Bonner
                              Aug 2 '15 at 2:08













                              It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                              – JGS
                              May 12 '16 at 10:30





                              It is working: hive -e 'use <database or schema name>; select * from <table_name>;' > <absolute path for the csv file>/<csv file name>.csv

                              – JGS
                              May 12 '16 at 10:30




                              1




                              1





                              Excellent!! Made my day!

                              – prashanth
                              Jan 30 '18 at 18:30





                              Excellent!! Made my day!

                              – prashanth
                              Jan 30 '18 at 18:30











                              21














                              You can not have a delimiter for query output,after generating the report (as you did).



                              you can change the delimiter to comma.



                              It comes with default delimiter 01 (inivisible character).



                              hadoop fs -cat /user/data/output/test/* |tr "1" "," >>outputwithcomma.csv


                              check this also






                              share|improve this answer






























                                21














                                You can not have a delimiter for query output,after generating the report (as you did).



                                you can change the delimiter to comma.



                                It comes with default delimiter 01 (inivisible character).



                                hadoop fs -cat /user/data/output/test/* |tr "1" "," >>outputwithcomma.csv


                                check this also






                                share|improve this answer




























                                  21












                                  21








                                  21







                                  You can not have a delimiter for query output,after generating the report (as you did).



                                  you can change the delimiter to comma.



                                  It comes with default delimiter 01 (inivisible character).



                                  hadoop fs -cat /user/data/output/test/* |tr "1" "," >>outputwithcomma.csv


                                  check this also






                                  share|improve this answer















                                  You can not have a delimiter for query output,after generating the report (as you did).



                                  you can change the delimiter to comma.



                                  It comes with default delimiter 01 (inivisible character).



                                  hadoop fs -cat /user/data/output/test/* |tr "1" "," >>outputwithcomma.csv


                                  check this also







                                  share|improve this answer














                                  share|improve this answer



                                  share|improve this answer








                                  edited May 23 '17 at 11:55









                                  Community

                                  11




                                  11










                                  answered Jun 13 '13 at 12:44









                                  Balaswamy VaddemanBalaswamy Vaddeman

                                  6,17032137




                                  6,17032137























                                      6














                                      Recent versions of hive comes with this feature.



                                      INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' 
                                      ROW FORMAT DELIMITED
                                      FIELDS TERMINATED BY ','
                                      select * from table;


                                      this way you can choose your own delimiter and file name.
                                      Just be careful with the "OVERWRITE" it will try to delete everything from the mentioned folder.






                                      share|improve this answer






























                                        6














                                        Recent versions of hive comes with this feature.



                                        INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' 
                                        ROW FORMAT DELIMITED
                                        FIELDS TERMINATED BY ','
                                        select * from table;


                                        this way you can choose your own delimiter and file name.
                                        Just be careful with the "OVERWRITE" it will try to delete everything from the mentioned folder.






                                        share|improve this answer




























                                          6












                                          6








                                          6







                                          Recent versions of hive comes with this feature.



                                          INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' 
                                          ROW FORMAT DELIMITED
                                          FIELDS TERMINATED BY ','
                                          select * from table;


                                          this way you can choose your own delimiter and file name.
                                          Just be careful with the "OVERWRITE" it will try to delete everything from the mentioned folder.






                                          share|improve this answer















                                          Recent versions of hive comes with this feature.



                                          INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' 
                                          ROW FORMAT DELIMITED
                                          FIELDS TERMINATED BY ','
                                          select * from table;


                                          this way you can choose your own delimiter and file name.
                                          Just be careful with the "OVERWRITE" it will try to delete everything from the mentioned folder.







                                          share|improve this answer














                                          share|improve this answer



                                          share|improve this answer








                                          edited Nov 13 '16 at 14:10









                                          Ram Ghadiyaram

                                          16.4k64477




                                          16.4k64477










                                          answered Mar 23 '15 at 9:16









                                          sunilsunil

                                          7501819




                                          7501819























                                              6














                                              INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table; 


                                              is the correct answer.



                                              If the number of records is really big, based on the number of files generated



                                              the following command would give only partial result.



                                              hive -e 'select * from some_table' > /home/yourfile.csv





                                              share|improve this answer






























                                                6














                                                INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table; 


                                                is the correct answer.



                                                If the number of records is really big, based on the number of files generated



                                                the following command would give only partial result.



                                                hive -e 'select * from some_table' > /home/yourfile.csv





                                                share|improve this answer




























                                                  6












                                                  6








                                                  6







                                                  INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table; 


                                                  is the correct answer.



                                                  If the number of records is really big, based on the number of files generated



                                                  the following command would give only partial result.



                                                  hive -e 'select * from some_table' > /home/yourfile.csv





                                                  share|improve this answer















                                                  INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select * from table; 


                                                  is the correct answer.



                                                  If the number of records is really big, based on the number of files generated



                                                  the following command would give only partial result.



                                                  hive -e 'select * from some_table' > /home/yourfile.csv






                                                  share|improve this answer














                                                  share|improve this answer



                                                  share|improve this answer








                                                  edited Dec 2 '16 at 9:11









                                                  Kishore

                                                  3,84531238




                                                  3,84531238










                                                  answered Oct 2 '15 at 18:29









                                                  JsimJsim

                                                  6112




                                                  6112























                                                      4














                                                      I have used simple linux shell piping + perl to convert hive generated output from tsv to csv.



                                                      hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\"/g; s/^|$/"/g; s/t/","/g' > output_file.csv


                                                      (I got the updated perl regex from someone in stackoverflow some time ago)



                                                      The result will be like regular csv:



                                                      "col1","col2","col3"... and so on






                                                      share|improve this answer






























                                                        4














                                                        I have used simple linux shell piping + perl to convert hive generated output from tsv to csv.



                                                        hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\"/g; s/^|$/"/g; s/t/","/g' > output_file.csv


                                                        (I got the updated perl regex from someone in stackoverflow some time ago)



                                                        The result will be like regular csv:



                                                        "col1","col2","col3"... and so on






                                                        share|improve this answer




























                                                          4












                                                          4








                                                          4







                                                          I have used simple linux shell piping + perl to convert hive generated output from tsv to csv.



                                                          hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\"/g; s/^|$/"/g; s/t/","/g' > output_file.csv


                                                          (I got the updated perl regex from someone in stackoverflow some time ago)



                                                          The result will be like regular csv:



                                                          "col1","col2","col3"... and so on






                                                          share|improve this answer















                                                          I have used simple linux shell piping + perl to convert hive generated output from tsv to csv.



                                                          hive -e "SELECT col1, col2, … FROM table_name" | perl -lpe 's/"/\"/g; s/^|$/"/g; s/t/","/g' > output_file.csv


                                                          (I got the updated perl regex from someone in stackoverflow some time ago)



                                                          The result will be like regular csv:



                                                          "col1","col2","col3"... and so on







                                                          share|improve this answer














                                                          share|improve this answer



                                                          share|improve this answer








                                                          edited Nov 13 '16 at 14:10









                                                          Ram Ghadiyaram

                                                          16.4k64477




                                                          16.4k64477










                                                          answered Mar 22 '15 at 4:45









                                                          Firman GautamaFirman Gautama

                                                          815




                                                          815























                                                              4














                                                              The following script should work for you:



                                                              #!/bin/bash
                                                              hive -e "insert overwrite local directory '/LocalPath/'
                                                              row format delimited fields terminated by ','
                                                              select * from Mydatabase,Mytable limit 100"
                                                              cat /LocalPath/* > /LocalPath/table.csv


                                                              I used limit 100 to limit the size of data since I had a huge table, but you can delete it to export the entire table.






                                                              share|improve this answer




























answered Mar 21 '18 at 12:07 by HISI























2

You can also export the data straight from the Hive warehouse directory instead of going through the table. Give the Hive warehouse path first, then the local path where you want to store the .csv file. The command is below:

hadoop fs -cat /user/hdusr/warehouse/HiveDb/tableName/* > /users/hadoop/test/nilesh/sample.csv
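Note that this gives you a comma-separated file only if the table was stored with ',' as its field delimiter; otherwise Hive's default \001 separators come through unchanged. An equivalent one-step sketch using hadoop fs -getmerge (same placeholder paths as above):

# getmerge concatenates every segment file under the table directory
# into a single local file.
hadoop fs -getmerge /user/hdusr/warehouse/HiveDb/tableName /users/hadoop/test/nilesh/sample.csv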




























answered Nov 3 '17 at 3:40 by Nilesh Shinde























1

I had a similar issue and this is how I was able to address it.

Step 1 - Load the data from the Hive table into another table as follows:

DROP TABLE IF EXISTS TestHiveTableCSV;
CREATE TABLE TestHiveTableCSV
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' AS
SELECT Column List FROM TestHiveTable;

Step 2 - Copy the blob from the Hive warehouse to the new location with the appropriate extension:

Start-AzureStorageBlobCopy `
    -DestContext $destContext `
    -SrcContainer "Source Container" `
    -SrcBlob "hive/warehouse/TestHiveTableCSV/000000_0" `
    -DestContainer "Destination Container" `
    -DestBlob "CSV/TestHiveTable.csv"

Hope this helps!

Best Regards,
Dattatrey Sindol (Datta)
http://dattatreysindol.com
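Step 2 above is specific to Azure blob storage. On a plain HDFS cluster the equivalent step is just a copy out of the warehouse; a sketch, assuming the default warehouse location (the path varies by install, and the table directory name is typically lowercased):

# Pull the CTAS output down from HDFS and give it a .csv name.
hadoop fs -copyToLocal /user/hive/warehouse/testhivetablecsv/000000_0 TestHiveTable.csv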




























answered May 29 '14 at 14:19 by Dattatrey Sindol























1

There are ways to change the default delimiter, as shown by other answers.

There are also ways to convert the raw output to CSV with some bash scripting. There are 3 delimiters to consider though, not just \001. Things get a bit more complicated when your Hive table has maps.

I wrote a bash script that can handle all 3 default delimiters (\001, \002 and \003) from Hive and output a CSV. The script and some more info are here:

Hive Default Delimiters to CSV

Hive's default delimiters are

Field Delimiter => Control-A ('\001')
Collection Item Delimiter => Control-B ('\002')
Map Key Delimiter => Control-C ('\003')

There are ways to change these delimiters when exporting tables, but sometimes you might still get stuck needing to convert this to CSV.

Here's a quick bash script that can handle a DB export that's segmented in multiple files and has the default delimiters. It will output a single CSV file.

It is assumed that the segments all have the naming convention 000*_0

INDIRECTORY="path/to/input/directory"
for f in $INDIRECTORY/000*_0; do
  echo "Processing $f file..";
  cat -v $f |
    LC_ALL=C sed -e "s/^/\"/g" |
    LC_ALL=C sed -e "s/\^A/\",\"/g" |
    LC_ALL=C sed -e "s/\^C\^B/\"\":\"\"\"\",\"\"/g" |
    LC_ALL=C sed -e "s/\^B/\"\",\"\"/g" |
    LC_ALL=C sed -e "s/\^C/\"\":\"\"/g" |
    LC_ALL=C sed -e "s/$/\"/g" > $f-temp
done
echo "you,can,echo,your,header,here,if,you,like" > $INDIRECTORY/final_output.csv
cat $INDIRECTORY/*-temp >> $INDIRECTORY/final_output.csv
rm $INDIRECTORY/*-temp

More explanation on the gist
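For simple tables with only scalar columns, the same conversion can be done in one pass. A sketch assuming GNU sed (which understands the \x01 escape) and field values free of embedded commas and quotes:

INDIRECTORY="path/to/input/directory"
# Replace the \001 field delimiter with a comma in every segment at once.
cat "$INDIRECTORY"/000*_0 | sed 's/\x01/,/g' > "$INDIRECTORY"/final_output.csv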






























edited Mar 18 '16 at 14:28; answered Mar 18 '16 at 13:11 by alex9311























1

In case you are doing it from Windows, you can use the Python script hivehoney to extract table data to a local CSV file.

It will:

• Log in to the bastion host.

• pbrun.

• kinit.

• beeline (with your query).

• Save the echo from beeline to a file on Windows.

Execute it like this:

set PROXY_HOST=your_bastion_host

set SERVICE_USER=you_func_user

set LINUX_USER=your_SOID

set LINUX_PWD=your_pwd

python hh.py --query_file=query.sql






























edited Sep 20 '18 at 19:19; answered Sep 4 '18 at 18:42 by Alex B























0

The proposed solutions are fine, but I found problems with both:

• As Carter Shanklin said, with this command we obtain a CSV file with the results of the query in the specified path:

  insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;

  The problem with this solution is that the CSV obtained won't have headers, and the file created is not named as a CSV (so we have to rename it).

• As user1922900 said, with the following command we obtain a CSV file with the results of the query in the specified file, with headers:

  hive -e 'select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv

  With this solution we get a CSV file with the result rows of our query, but with log messages between those rows too. As a solution to this problem I tried a suggested fix, but without results (see also the stderr-redirection sketch after the script).

So, to solve all these issues, I created a script that executes a list of queries, creates a folder (with a timestamp) where it stores the results, renames the files obtained, removes the unnecessary files, and also adds the respective headers:

#!/bin/bash
QUERIES=("select * from table1" "select * from table2")
IFS=""
directoryname=$(echo "ScriptResults$timestamp")
mkdir $directoryname
counter=1
for query in ${QUERIES[*]}
do
  tablename="query"$counter
  hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '/data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' $query ;"
  hive -S -e "set hive.cli.print.header=true; $query limit 1" | head -1 | sed 's/[\t]/,/g' >> /data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename/header.csv
  mv $tablename/000000_0 $tablename/$tablename.csv
  cat $tablename/$tablename.csv >> $tablename/header.csv
  rm $tablename/$tablename.csv
  mv $tablename/header.csv $tablename/$tablename.csv
  mv $tablename/$tablename.csv $directoryname
  counter=$((counter+1))
  rm -rf $tablename/
done
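On the log-noise point specifically: those messages go to stderr, so Hive's silent flag plus a stderr redirect usually keeps them out of the output. A sketch, reusing the placeholder table name from the quoted command:

# -S is Hive's silent mode; 2>/dev/null drops anything still written to stderr.
hive -S -e 'select * from some_table' 2>/dev/null | sed 's/[\t]/,/g' > /home/yourfile.csv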




























                                                                                                        0














                                                                                                        The problem solutions are fine but I found some problems in both:





                                                                                                        • As Carter Shanklin said, with this command we will obtain a csv file with the results of the query in the path specified:



                                                                                                          insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


                                                                                                          The problem with this solution is that the csv obtained won´t have headers and will create a file that is not a CSV (so we have to rename it).




                                                                                                        • As user1922900 said, with the following command we will obtain a CSV files with the results of the query in the specified file and with headers:



                                                                                                          hive -e 'select * from some_table' | sed 's/[t]/,/g' > /home/yourfile.csv


                                                                                                          With this solution we will get a CSV file with the result rows of our query, but with log messages between these rows too. As a solution of this problem I tried this, but without results.




                                                                                                        So, to solve all these issues I created a script that execute a list of queries, create a folder (with a timestamp) where it stores the results, rename the files obtained, remove the unnecesay files and it also add the respective headers.



                                                                                                         #!/bin/sh
                                                                                                        QUERIES=("select * from table1" "select * from table2")
                                                                                                        IFS=""
                                                                                                        directoryname=$(echo "ScriptResults$timestamp")
                                                                                                        mkdir $directoryname
                                                                                                        counter=1
                                                                                                        for query in ${QUERIES[*]}
                                                                                                        do
                                                                                                        tablename="query"$counter
                                                                                                        hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '/data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' $query ;"
                                                                                                        hive -S -e "set hive.cli.print.header=true; $query limit 1" | head -1 | sed 's/[t]/,/g' >> /data/2/DOMAIN_USERS/SANUK/users/$USER/$tablename/header.csv
                                                                                                        mv $tablename/000000_0 $tablename/$tablename.csv
                                                                                                        cat $tablename/$tablename.csv >> $tablename/header.csv.
                                                                                                        rm $tablename/$tablename.csv
                                                                                                        mv $tablename/header.csv $tablename/$tablename.csv
                                                                                                        mv $tablename/$tablename.csv $directoryname
                                                                                                        counter=$((counter+1))
                                                                                                        rm -rf $tablename/
                                                                                                        done





                                                                                                        share|improve this answer


























                                                                                                          0












                                                                                                          0








                                                                                                          0







                                                                                                          The problem solutions are fine but I found some problems in both:





                                                                                                          • As Carter Shanklin said, with this command we will obtain a csv file with the results of the query in the path specified:



The proposed solutions are fine, but I found some problems with both:





• As Carter Shanklin said, with this command we will obtain a CSV file with the results of the query in the specified path:

    insert overwrite local directory '/home/carter/staging' row format delimited fields terminated by ',' select * from hugetable;


The problem with this solution is that the CSV obtained won't have headers, and Hive writes the output to a file that is not named as a CSV (so we have to rename it); see the first sketch after this list.




• As user1922900 said, with the following command we will obtain a CSV file, including headers, with the results of the query in the specified file:

    hive -e 'set hive.cli.print.header=true; select * from some_table' | sed 's/[\t]/,/g' > /home/yourfile.csv


With this solution we will get a CSV file with the result rows of our query, but with Hive log messages interleaved between those rows. I tried to work around this problem, but without results; one possible workaround is sketched after this list.
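For the first point, a minimal merge-and-rename sketch, assuming the export above finished and left uncompressed text part files (such as 000000_0) in the staging directory; the target file name is just an example:

    cat /home/carter/staging/0* > /home/carter/staging/hugetable.csv

For the second point, one workaround that may help is running the Hive CLI in silent mode (-S) and discarding stderr, which is where most of the log noise goes; how much this cleans up can depend on your Hive version:

    hive -S -e 'set hive.cli.print.header=true; select * from some_table' 2>/dev/null | sed 's/[\t]/,/g' > /home/yourfile.csv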




So, to solve all these issues, I created a script that executes a list of queries, creates a timestamped folder where it stores the results, renames the files obtained, removes the unnecessary files, and also adds the respective headers.



    #!/bin/bash
    # Export each query in QUERIES to its own CSV file, with a header row.
    QUERIES=("select * from table1" "select * from table2")
    OUTBASE="/data/2/DOMAIN_USERS/SANUK/users/$USER"
    timestamp=$(date +%Y%m%d%H%M%S)
    directoryname="ScriptResults$timestamp"
    mkdir "$directoryname"
    counter=1
    for query in "${QUERIES[@]}"
    do
        tablename="query$counter"
        # Export the query results (data only, no header) as comma-delimited text.
        hive -S -e "INSERT OVERWRITE LOCAL DIRECTORY '$OUTBASE/$tablename' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' $query ;"
        # Build the header row: print one result row with headers enabled, keep only the first line.
        hive -S -e "set hive.cli.print.header=true; $query limit 1" | head -1 | sed 's/[\t]/,/g' > "$OUTBASE/$tablename/header.csv"
        # Assumes the export produced a single part file named 000000_0.
        mv "$OUTBASE/$tablename/000000_0" "$OUTBASE/$tablename/$tablename.csv"
        # Append the data rows after the header and keep the combined file.
        cat "$OUTBASE/$tablename/$tablename.csv" >> "$OUTBASE/$tablename/header.csv"
        rm "$OUTBASE/$tablename/$tablename.csv"
        mv "$OUTBASE/$tablename/header.csv" "$OUTBASE/$tablename/$tablename.csv"
        mv "$OUTBASE/$tablename/$tablename.csv" "$directoryname"
        counter=$((counter+1))
        rm -rf "$OUTBASE/$tablename/"
    done
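To run it (the script file name export_queries.sh is just a placeholder; adjust the output base path to a directory you can write to):

    chmod +x export_queries.sh
    ./export_queries.sh

Each run creates a new ScriptResults folder, suffixed with the timestamp, containing one queryN.csv file per query in the list.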






answered Nov 21 '18 at 9:31

AngryCoder

325
