Installing and Using Pig

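This article uses Pig 0.12.0 (pig-0.12.0.tar.gz). Hadoop was already installed before Pig; for the Hadoop installation, see the article "Detailed Guide to Installing hadoop-1.2.1".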

Installing Pig is simple: it only takes a bit of environment configuration. Pig has two execution modes: local mode and MapReduce mode (the default).
1. Upload and extract pig-0.12.0.tar.gz
[hadoop@mdw temp]$ tar zxf pig-0.12.0.tar.gz -C /home/hadoop

2. Set the Pig environment variables and make them take effect

export PIG_HOME=/home/hadoop/pig-0.12.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PIG_HOME/bin
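
Step 2 can be made repeatable with a small script. The following is only a sketch of one way to do it, not necessarily how the exports were added here: it assumes they live in ~/.bashrc (the file step 4 uses), and add_pig_env is a hypothetical helper name. It appends the two lines only if PIG_HOME is not already exported in that file.

```shell
#!/usr/bin/env bash
# Sketch: append the Pig exports to a profile file once.
# add_pig_env is a hypothetical helper; the paths in the usage notes are the
# ones used in this article.

add_pig_env() {
    local profile="$1"    # e.g. ~/.bashrc
    local pig_home="$2"   # e.g. /home/hadoop/pig-0.12.0
    touch "$profile"
    # Skip the append when PIG_HOME is already exported in this file.
    if ! grep -q '^export PIG_HOME=' "$profile"; then
        {
            echo "export PIG_HOME=$pig_home"
            # Single quotes keep $PATH and $PIG_HOME literal in the profile.
            echo 'export PATH=$PATH:$PIG_HOME/bin'
        } >> "$profile"
    fi
}

# Usage (left commented out so that running this file changes nothing):
# add_pig_env ~/.bashrc /home/hadoop/pig-0.12.0
# source ~/.bashrc
# command -v pig    # prints the launcher path if PATH is set correctly
```

Running the helper twice leaves a single pair of export lines, so it is safe to re-run after editing the profile by hand.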

3. Verify the installation with the pig command (local mode)
[hadoop@mdw ~]$  pig -x local
2015-06-12 00:23:30,823 [main] INFO  org.apache.pig.Main - Apache Pig version 0.12.0 (r1529718) compiled Oct 07 2013, 12:20:14
2015-06-12 00:23:30,824 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/hadoop/pig_1434093810822.log
2015-06-12 00:23:30,876 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/hadoop/.pigbootup not found
2015-06-12 00:23:30,964 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  Connecting to hadoop file system at: file:///
grunt>  quit;

[hadoop@mdw ~]$ 

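Seeing the grunt> prompt means the configuration works, and file:/// shows that Pig is currently running in local mode. To use MapReduce mode, the Hadoop cluster must be correctly configured and running, and Pig must be able to read Hadoop's configuration files (the files in Hadoop's conf directory).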
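
The grunt> check above is interactive. As a rough sketch of a scripted equivalent, the snippet below writes a tiny Pig Latin script and runs it with pig -x local. The sample data, the file names, and the field names (name, n) are made up for illustration, and the run is skipped when pig is not on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: a non-interactive local-mode smoke test.
# The sample data and file names below are invented for this example.

work_dir=$(mktemp -d)

# Tab-separated sample input (PigStorage, the default loader, splits on tabs).
printf 'alice\t3\nbob\t5\n' > "$work_dir/input.txt"

# A tiny Pig Latin script: load the file, keep the first column, print it.
cat > "$work_dir/smoke.pig" <<EOF
A = LOAD '$work_dir/input.txt' AS (name:chararray, n:int);
B = FOREACH A GENERATE name;
DUMP B;
EOF

# Run only when the pig launcher is on PATH, so the script is harmless elsewhere.
if command -v pig >/dev/null 2>&1; then
    pig -x local "$work_dir/smoke.pig"
else
    echo "pig not found on PATH; check PIG_HOME and PATH" >&2
fi
```

If the run ends with the DUMP output rather than an error, Pig can parse and execute scripts in local mode, which is a slightly stronger check than reaching the prompt.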
4. Set PIG_CLASSPATH in the .bashrc file and make it take effect
export PIG_CLASSPATH=/home/hadoop/hadoop-1.2.1/conf
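
Before starting Pig in MapReduce mode, it can help to confirm that the directory named in PIG_CLASSPATH really contains the Hadoop configuration. The sketch below assumes a Hadoop 1.x layout, where core-site.xml holds the file system address and mapred-site.xml holds the job tracker address; check_hadoop_conf is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: check that a directory looks like a Hadoop 1.x conf directory.
# check_hadoop_conf is a hypothetical helper, not part of Pig or Hadoop.

check_hadoop_conf() {
    local conf_dir="$1"
    local f missing=0
    for f in core-site.xml mapred-site.xml; do
        if [ ! -f "$conf_dir/$f" ]; then
            echo "missing: $conf_dir/$f" >&2
            missing=1
        fi
    done
    # Returns 0 when both files exist, 1 otherwise.
    return "$missing"
}

# Usage (left commented out; the default path is the one from this article):
# check_hadoop_conf "${PIG_CLASSPATH:-/home/hadoop/hadoop-1.2.1/conf}" \
#     && echo "Hadoop conf directory looks usable"
```

This only checks that the files exist. Whether the addresses inside them are right still shows up in the "Connecting to ..." lines when pig starts.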

5. Verify the installation with the pig command (MapReduce mode)
[hadoop@mdw ~]$ pig
2015-06-12 00:35:43,322 [main] INFO  org.apache.pig.Main - Apache Pig version 0.12.0 (r1529718) compiled Oct 07 2013, 12:20:14
2015-06-12 00:35:43,322 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/hadoop/pig_1434094543321.log
2015-06-12 00:35:43,342 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/hadoop/.pigbootup not found
2015-06-12 00:35:43,463 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  Connecting to hadoop file system at: hdfs://master:9000
2015-06-12 00:35:43,613 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -  Connecting to map-reduce job tracker at: master:9001
grunt>   quit;
[hadoop@mdw ~]$ 


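The line "Connecting to hadoop file system at: hdfs://master:9000" in the output above shows that the file system is now HDFS, unlike local mode, where it was file:///.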

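Pig is now installed. Pig writes its log file to the directory from which the pig command is run, so the log location changes with the directory you start Pig from. That makes the logs harder to analyze and manage, so it is common to specify a fixed directory, as follows: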
1. Create a log directory for Pig; here I put it in the pig/logs folder under the hadoop user's home directory
[hadoop@mdw ~]$  mkdir -p /home/hadoop/pig/logs

2. Edit /home/hadoop/pig-0.12.0/conf/pig.properties, uncomment the pig.logfile parameter, and set it as follows
pig.logfile=/home/hadoop/pig/logs
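
The same edit can be scripted. The sketch below is one possible approach and assumes the file either contains a pig.logfile line (commented out or not) or lacks the key entirely; set_pig_logfile is a hypothetical helper name, and the directory path is assumed not to contain the characters | or &, which would confuse the sed replacement.

```shell
#!/usr/bin/env bash
# Sketch: set pig.logfile in a properties file, whether the key is commented
# out, already set, or absent. set_pig_logfile is a hypothetical helper.

set_pig_logfile() {
    local props="$1"     # path to pig.properties
    local log_dir="$2"   # directory that should receive the logs
    mkdir -p "$log_dir"
    if grep -Eq '^#? *pig\.logfile=' "$props"; then
        # Rewrite the existing (possibly commented) line; a .bak copy is kept.
        sed -i.bak -E "s|^#? *pig\.logfile=.*|pig.logfile=$log_dir|" "$props"
    else
        echo "pig.logfile=$log_dir" >> "$props"
    fi
}

# Usage (left commented out; paths are the ones from this article):
# set_pig_logfile /home/hadoop/pig-0.12.0/conf/pig.properties /home/hadoop/pig/logs
```

Because the helper also creates the directory, it covers step 1 as well, and running it again simply rewrites the same line.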

Pig's logs are now written to the specified directory, as shown below:
[hadoop@mdw conf]$ pig
2015-06-12 00:51:12,399 [main] INFO  org.apache.pig.Main - Apache Pig version 0.12.0 (r1529718) compiled Oct 07 2013, 12:20:14
2015-06-12 00:51:12,399 [main] INFO  org.apache.pig.Main -  Logging error messages to: /home/hadoop/pig/logs/pig_1434095472397.log
2015-06-12 00:51:12,418 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/hadoop/.pigbootup not found
2015-06-12 00:51:12,524 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master:9000
2015-06-12 00:51:12,659 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: master:9001
grunt> 
















