HivePython

Write the Python script.
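import sys

from pykospacing import spacing

# Each input line is "pc_id<TAB>msg"; write back "pc_id<TAB>spaced msg".
for line in sys.stdin:
    line = line.rstrip("\n")
    # Split on the first tab only, so a tab inside the message does not raise an error
    pc_id, msg = line.split("\t", 1)
    print("\t".join([pc_id, spacing(msg)]))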
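
Before the script goes to the cluster, the tab-separated line handling can be checked locally without installing pykospacing. The sketch below is only an illustration: `respace_lines` and `fake_spacing` are hypothetical names that are not part of the original script, and `fake_spacing` merely upper-cases the text as a stand-in for the real spacing function.

```python
import io


def respace_lines(lines, space_fn):
    """Yield 'pc_id<TAB>processed msg' for each 'pc_id<TAB>msg' input line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # Skip blank lines instead of failing on the split
            continue
        pc_id, msg = line.split("\t", 1)
        yield "\t".join([pc_id, space_fn(msg)])


def fake_spacing(text):
    # Hypothetical stand-in for pykospacing's spacing(); it only upper-cases
    return text.upper()


# Simulate the stream that the script would read from sys.stdin
sample = io.StringIO("1\thello\n2\tworld\n")
result = list(respace_lines(sample, fake_spacing))
print(result)
```

In the real script, `sys.stdin` would take the place of `sample` and the pykospacing function the place of `fake_spacing`.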

Upload it to HDFS.
hadoop fs -put -f /home/hdfs/py/chat_spacing.py /user/hive/udfs

Call it with transform.
set hive.execution.engine=mr;
add file hdfs:///user/hive/udfs/chat_spacing.py;

select transform(msg) using 'python3.4 chat_spacing.py' as (pc_id bigint, msg string)
from (
    select concat(cast(id as string), "\t", sss) msg
    from sample
) t;
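
By default, Hive's TRANSFORM hands each row to the script as a tab-delimited line and writes NULL as the literal `\N`. If `sss` is NULL, `concat` should return NULL, so the script would receive a bare `\N` line with no tab in it. The following is a minimal sketch of a defensive parser for that case; `parse_row` and `NULL_MARKER` are hypothetical names, and treating such rows as skippable is an assumption, not something the original page specifies.

```python
NULL_MARKER = r"\N"  # how Hive serializes NULL for TRANSFORM by default


def parse_row(line):
    """Return (pc_id, msg), or None when the row cannot be processed."""
    line = line.rstrip("\n")
    if line == NULL_MARKER:
        # The whole concatenated column was NULL
        return None
    pc_id, sep, msg = line.partition("\t")
    if not sep or not pc_id.isdigit():
        # No tab found, or the id is not numeric: treat as malformed
        return None
    return pc_id, msg


rows = ["7\tsome text\n", "\\N\n", "no tab here\n"]
parsed = [parse_row(r) for r in rows]
print(parsed)
```

Rows for which `parse_row` returns `None` could be skipped, or written out with `\N` in the message column, depending on what the downstream table expects.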
Last modified 2018-12-17 15:35:51 by Anonymous