Category Archives: apache-hive

Apache Hive PHP Thrift Client Not Working

We have installed a Hadoop 2.7.1 cluster and Hive 1.2.1 successfully, but we are stuck on connecting to Apache Hive from PHP using Thrift.

We get the following error when we try to connect with the PHP library provided in the Hive distribution:

Parse error: syntax error, unexpected ';', expecting identifier (T_STRING) or \ (T_NS_SEPARATOR) or '{' in /usr/local/hive/lib/php/packages/hive_service/ThriftHive.php on line 2

We have tried this tutorial so far, but no luck:

http://phptrends.com/dig_in/php-thrift-hive-client
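
That parse error points at line 2 of the generated file, and it matches what PHP emits for an empty namespace statement (namespace ;), which Thrift-generated Hive PHP bindings of that era are known to contain. Assuming line 2 here is indeed an empty namespace declaration, one workaround is to inspect it and strip it from the generated packages (a sketch, untested):

$ head -n 2 /usr/local/hive/lib/php/packages/hive_service/ThriftHive.php
$ sed -i 's/^namespace *;$//' /usr/local/hive/lib/php/packages/hive_service/ThriftHive.php

Alternatively, regenerating the bindings from the .thrift definitions with a current Thrift compiler avoids patching generated code by hand.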

Apache Hive MSCK REPAIR TABLE: New Partition Not Added

I am new to Apache Hive. While working with external table partitions, if I add a new partition directly to HDFS, the partition is not added after running MSCK REPAIR TABLE. Below are the commands I tried.

-- creating external table

hive> create external table factory(name string, empid int, age int) partitioned by(region string)  
    > row format delimited fields terminated by ','; 

-- Detailed Table Information

Location:  hdfs://localhost.localdomain:8020/user/hive/warehouse/factory     
Table Type:             EXTERNAL_TABLE           
Table Parameters:        
    EXTERNAL                TRUE                
    transient_lastDdlTime   1438579844  
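
(A listing like the one above can be produced with DESCRIBE FORMATTED, which is also a quick way to confirm the table's location:)

hive> describe formatted factory;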

-- creating directories in HDFS to load data for table factory

$ hadoop fs -mkdir 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory1'
$ hadoop fs -mkdir 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory2'

-- Table data

cat factory1.txt
emp1,500,40
emp2,501,45
emp3,502,50

cat factory2.txt
EMP10,200,25
EMP11,201,27
EMP12,202,30

-- copying from local to HDFS

$ hadoop fs -copyFromLocal '/home/cloudera/factory1.txt' 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory1'
$ hadoop fs -copyFromLocal '/home/cloudera/factory2.txt' 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory2'
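
To confirm the files landed where expected, the directories can be listed (a quick check, not part of the original session):

$ hadoop fs -ls '/user/hive/testing/testing1/factory1'
$ hadoop fs -ls '/user/hive/testing/testing1/factory2'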

-- Altering the table to register the partitions in the metastore

hive> alter table factory add partition(region='southregion') location '/user/hive/testing/testing1/factory2';
hive> alter table factory add partition(region='northregion') location '/user/hive/testing/testing1/factory1';            
hive> select * from factory;                                                                      
OK
emp1    500 40  northregion
emp2    501 45  northregion
emp3    502 50  northregion
EMP10   200 25  southregion
EMP11   201 27  southregion
EMP12   202 30  southregion
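
At this point the metastore knows about the two partitions registered by the ALTER statements, which can be verified directly (not part of the original session):

hive> show partitions factory;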

Now I created a new file, factory3.txt, to add as a new partition to the table factory.

cat factory3.txt
user1,100,25
user2,101,27
user3,102,30

-- creating the path and copying table data

$ hadoop fs -mkdir 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory3'
$ hadoop fs -copyFromLocal '/home/cloudera/factory3.txt' 'hdfs://localhost.localdomain:8020/user/hive/testing/testing1/factory3'

Now I executed the query below to update the metastore with the newly added partition:

MSCK REPAIR TABLE factory;

Now the table is not returning the new partition contents from the factory3 file. Where am I making a mistake while adding the partition for the table factory?

However, if I run the ALTER command, the new partition data does show up:

hive> alter table factory add partition(region='eastregion') location '/user/hive/testing/testing1/factory3';

Why is the MSCK REPAIR TABLE command not working?
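
One likely explanation: MSCK REPAIR TABLE only scans the table's own location (here hdfs://localhost.localdomain:8020/user/hive/warehouse/factory) and only registers subdirectories that follow the partition naming convention region=<value>. Partitions stored at arbitrary paths such as /user/hive/testing/testing1/factory3 are invisible to it, which is why the explicit ALTER TABLE ... ADD PARTITION ... LOCATION works while MSCK finds nothing. A sketch of a layout MSCK can discover, assuming the table location shown earlier:

$ hadoop fs -mkdir '/user/hive/warehouse/factory/region=eastregion'
$ hadoop fs -copyFromLocal '/home/cloudera/factory3.txt' '/user/hive/warehouse/factory/region=eastregion'

hive> MSCK REPAIR TABLE factory;
hive> show partitions factory;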