Using the Hive Python Client with a Hive Server

Contents

1 Abstraction
2 Hive
3 Apache Thrift
4 Hive Server
5 Hive Python Client
6 References



1 Abstraction #

This page shows how to access the Hive Thrift server from a Python client.

2 Hive #

  • Hive is a data warehouse infrastructure built on top of Hadoop. It provides tools for easy data summarization, ad hoc querying, and analysis of large datasets stored in Hadoop files.
  • http://hadoop.apache.org/hive/

3 Apache Thrift #

  • Thrift is a software framework for scalable cross-language services development. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, and OCaml.
  • http://incubator.apache.org/thrift/

4 Hive Server #

Start the Hive server as a Thrift server.



$ hive --service hiveserver &
[1] 9818
$ Starting Hive Thrift Server
09/12/17 16:59:39 INFO service.HiveServer: Starting hive server on port 10000
.
.
$
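
Before running a client, it can help to confirm that something is accepting connections on the Thrift port. The following is a minimal sketch using only the Python standard library. The helper name `port_open` is hypothetical, and it only checks that a TCP connection succeeds; it does not verify that the listener is actually a Hive server.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or timeout.
        return False
    sock.close()
    return True
```

With the server started as shown above, `port_open("localhost", 10000)` would be expected to return True; 10000 is the port reported in the server log, and localhost is an assumption about where the client runs.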

5 Hive Python Client #

Installed Hadoop & Hive versions (Hadoop 0.20.1 & Hive 0.4.0)
$ rpm -qa | grep hadoop-0.20
hadoop-0.20-jobtracker-0.20.1+133-1
hadoop-0.20-libhdfs-0.20.1+133-1
hadoop-0.20-tasktracker-0.20.1+133-1
hadoop-0.20-0.20.1+133-1
hadoop-0.20-datanode-0.20.1+133-1
hadoop-0.20-secondarynamenode-0.20.1+133-1
hadoop-0.20-conf-pseudo-0.20.1+133-1
hadoop-0.20-pipes-0.20.1+133-1
hadoop-0.20-namenode-0.20.1+133-1
hadoop-0.20-native-0.20.1+133-1
hadoop-0.20-docs-0.20.1+133-1
$
$ rpm -qa | grep hive
hadoop-hive-webinterface-0.4.0+14-1
hadoop-hive-0.4.0+14-1

Data (tab-separated)
$ cat /tmp/r.txt
a       1       1.0
b       2       2.0
c       3       3.0
$
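
If the sample file does not exist yet, it can be generated rather than typed by hand. This is a small sketch written for Python 3; the helper name `write_tsv` is hypothetical, and it assumes the columns should be separated by a single tab character so that the file matches a table declared with FIELDS TERMINATED BY '\t'.

```python
def write_tsv(path, rows):
    """Write each row as one line, with values joined by a tab character."""
    with open(path, "w") as f:
        for row in rows:
            f.write("\t".join(str(value) for value in row) + "\n")


# The three sample records shown in /tmp/r.txt.
SAMPLE_ROWS = [("a", 1, 1.0), ("b", 2, 2.0), ("c", 3, 3.0)]
```

Calling `write_tsv("/tmp/r.txt", SAMPLE_ROWS)` would write the three records listed above.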

Set PYTHONPATH (Hive Python library)
$ export PYTHONPATH="/usr/lib/hive/lib/py"
$ env | grep PYTHONPATH
PYTHONPATH=/usr/lib/hive/lib/py
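
As an alternative to exporting PYTHONPATH in the shell, a script can extend its own module search path before importing the Hive modules. The sketch below is one possible way to do that; the helper name `add_to_path` is hypothetical, and the directory is the same one used in the export above, which may differ on other installations.

```python
import sys

# Same directory as the PYTHONPATH export; adjust for your installation.
HIVE_PY_LIB = "/usr/lib/hive/lib/py"


def add_to_path(path):
    """Prepend path to sys.path unless it is already listed."""
    if path not in sys.path:
        sys.path.insert(0, path)
    return sys.path


add_to_path(HIVE_PY_LIB)
```

The call has to run before any `from hive_service import ...` line, because imports are resolved against sys.path at the moment they execute.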


# hive_py.py
import sys

from hive_service import ThriftHive
from hive_service.ttypes import HiveServerException
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

try:
    # Connect to the Hive Thrift server on localhost:10000 over a buffered socket.
    transport = TSocket.TSocket('localhost', 10000)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = ThriftHive.Client(protocol)
    transport.open()

    # Create a tab-delimited text table, load the sample file, and query it.
    client.execute("CREATE TABLE r(a STRING, b INT, c DOUBLE) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' STORED AS TEXTFILE")
    client.execute("LOAD DATA LOCAL INPATH '/tmp/r.txt' OVERWRITE INTO TABLE r")
    client.execute("SELECT * FROM r")
    for row in client.fetchAll():
        print(row)

    transport.close()

# The 'as' form requires Python 2.6 or later.
except Thrift.TException as tx:
    print('%s' % (tx.message))
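
In the script above, transport.close() is skipped whenever one of the calls before it raises. One way to guarantee the close is a small context manager, sketched here under the assumption that the transport object exposes open() and close() methods as the buffered transport in the script does. The name `opened` is hypothetical and is not part of the Thrift library.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open the transport, hand it to the caller, and always close it."""
    transport.open()
    try:
        yield transport
    finally:
        # Runs on normal exit and when the with-body raises.
        transport.close()
```

The queries would then go inside a `with opened(transport):` block, with the `except Thrift.TException` handler wrapped around it.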

Run
$ python hive_py.py
a       1       1.0
b       2       2.0
c       3       3.0
6 References #

  • Hive Wiki
  • Apache Thrift
Last modified 2018-04-13 23:12:52
