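I was importing MongoDB data into Hive, following the article at https://blog.csdn.net/thriving_fcl/article/details/51471248. After writing a Hive external table mapped to a MongoDB collection, executing the statement failed with: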
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. com/mongodb/util/JSON
The CREATE TABLE statement was:
CREATE EXTERNAL TABLE mongotohive
(
  id string,
  userid string,
  age bigint,
  status string
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","userid":"user_id","age":"age","status":"status"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/mydb.users');
The MongoDB data was:
db.users.find()
{ "_id" : ObjectId("5b456e33a93daf7ae53e6419"), "user_id" : "abc123", "age" : 58, "status" : "D" }
{ "_id" : ObjectId("5b45705ca93daf7ae53e8b2a"), "user_id" : "bcd001", "age" : 45, "status" : "C" }
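Solution:
The trailing com/mongodb/util/JSON in the error names a class that Hive could not load, which points to the connector jars missing from Hive's classpath. After placing the three jars mongo-hadoop-core-2.0.0.jar, mongo-hadoop-hive-2.0.0.jar, and mongo-java-driver-3.7.1.jar in Hive's lib folder, the same statement ran successfully: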
hive> CREATE EXTERNAL TABLE mongotohive
    > (
    > id string,
    > userid string,
    > age bigint,
    > status string
    > )
    > STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
    > WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","userid":"user_id","age":"age","status":"status"}')
    > TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/mydb.users');
OK
Time taken: 1.431 seconds
hive> select * from mongotohive;
OK
5b456e33a93daf7ae53e6419	abc123	58	D
5b45705ca93daf7ae53e8b2a	bcd001	45	C
Time taken: 0.601 seconds, Fetched: 2 row(s)
hive>
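
For reference, the jar-copy step of the fix can be sketched as a small shell function. This is only an illustration: the function name install_mongo_jars and the JAR_SRC_DIR variable are hypothetical, and it assumes Hive's lib folder is $HIVE_HOME/lib, which may differ on your installation.

```shell
# Sketch: copy the three connector jars into Hive's lib directory.
# install_mongo_jars and JAR_SRC_DIR are hypothetical names used for illustration.

install_mongo_jars() {
  local src_dir="$1"   # directory holding the downloaded jars (assumption)
  local lib_dir="$2"   # Hive's lib directory, assumed to be "$HIVE_HOME/lib"
  local jar
  for jar in \
    mongo-hadoop-core-2.0.0.jar \
    mongo-hadoop-hive-2.0.0.jar \
    mongo-java-driver-3.7.1.jar
  do
    # Stop early if a jar is missing, so a partial install is noticed.
    if [ ! -f "$src_dir/$jar" ]; then
      echo "missing: $src_dir/$jar" >&2
      return 1
    fi
    cp "$src_dir/$jar" "$lib_dir/" || return 1
  done
}

# Example call; it only runs when both variables are set.
if [ -n "${JAR_SRC_DIR:-}" ] && [ -n "${HIVE_HOME:-}" ]; then
  install_mongo_jars "$JAR_SRC_DIR" "$HIVE_HOME/lib"
fi
```

A Hive session that is already running will most likely need to be restarted before it sees the new jars.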