Hadoop 2.4.0 Single-Node Deployment

2014-05-08 13:46:55 | Category: Data Storage

1. Create the hadoop user on Linux

adduser hadoop
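Depending on the distribution, adduser may or may not prompt for a password; a minimal follow-up sketch to set one and switch to the new account:

passwd hadoop   # run as root; sets the hadoop account's password
su - hadoop     # switch to the hadoop user for the remaining steps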

2. Install the JDK
Installation directory: /home/hadoop/java
Package: jdk-7u55-linux-x64.tar.gz (extract with tar -zxvf jdk-7u55-linux-x64.tar.gz)
Configure environment variables (.bashrc):

export JAVA_HOME=/home/hadoop/java/jdk1.7.0_55
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$CLASSPATH
export JRE_HOME=$JAVA_HOME/jre

Run: java -version
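Changes to .bashrc only take effect in a new shell, so reload it before verifying; a sketch (the runtime and VM lines that follow vary by build):

$ source ~/.bashrc
$ java -version
java version "1.7.0_55"
...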

3. Install Hadoop
Installation directory: /home/hadoop/apps
Package: hadoop-2.4.0.tar.gz (extract with tar -zxvf hadoop-2.4.0.tar.gz)
Configure environment variables (.bashrc):

export HADOOP_PREFIX=/home/hadoop/apps/hadoop-2.4.0
export PATH=$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin:$PATH

Run: hadoop version
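The HDFS and YARN daemons are launched over SSH and do not read .bashrc, so they may later fail with a "JAVA_HOME is not set" error. A hedged fix, using the JDK path from step 2, is to hard-code it in Hadoop's own environment file:

echo 'export JAVA_HOME=/home/hadoop/java/jdk1.7.0_55' >> $HADOOP_PREFIX/etc/hadoop/hadoop-env.sh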

4. The three cluster modes supported by Hadoop
(1) Local (Standalone) Mode
(2) Pseudo-Distributed Mode
(3) Fully-Distributed Mode

5. Standalone Mode

By default, Hadoop runs in non-distributed mode as a single Java process. To try it, copy the bundled configuration files into an input directory and run the grep example against them:

mkdir $HADOOP_PREFIX/input
cp $HADOOP_PREFIX/etc/hadoop/*.xml $HADOOP_PREFIX/input
---------------------------------------------------------------------------------
-rw-r-----. 1 hadoop hadoop 3589 May  8 12:51 capacity-scheduler.xml
-rw-r-----. 1 hadoop hadoop  774 May  8 12:51 core-site.xml
-rw-r-----. 1 hadoop hadoop 9257 May  8 12:51 hadoop-policy.xml
-rw-r-----. 1 hadoop hadoop  775 May  8 12:51 hdfs-site.xml
-rw-r-----. 1 hadoop hadoop  620 May  8 12:51 httpfs-site.xml
-rw-r-----. 1 hadoop hadoop  690 May  8 12:51 yarn-site.xml
---------------------------------------------------------------------------------
hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep $HADOOP_PREFIX/input $HADOOP_PREFIX/output 'dfs[a-z.]+' 
cat $HADOOP_PREFIX/output/*
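Note that the example job fails with an "output directory already exists" error if re-run as-is; clear the local output directory first:

rm -r $HADOOP_PREFIX/output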
6. Pseudo-Distributed Mode

Hadoop can also run on a single node in pseudo-distributed mode, where each daemon runs in its own Java process. Edit the two configuration files below:
$HADOOP_PREFIX/etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
$HADOOP_PREFIX/etc/hadoop/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
         <name>dfs.replication</name>
         <value>1</value>
    </property>
</configuration>
Set up passphraseless SSH to localhost:

cd ~
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
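A quick check that key-based login works (assuming sshd is running locally; the first connection will ask to accept the host key):

$ ssh localhost exit
# If a password prompt still appears, tighten the permissions:
$ chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys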
Execute the following steps:

(1)Format the filesystem:
    $ bin/hdfs namenode -format
(2)Start NameNode daemon and DataNode daemon (a jps check after this list confirms they are running):
    $ sbin/start-dfs.sh
(3)Browse the web interface for the NameNode; by default it is available at:
    http://localhost:50070/
(4)Make the HDFS directories required to execute MapReduce jobs:
    $ bin/hdfs dfs -mkdir /user
    $ bin/hdfs dfs -mkdir /user/<username>
(5)Copy the input files into the distributed filesystem:
    $ bin/hdfs dfs -put etc/hadoop input
(6)Run some of the examples provided:
    $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep input output 'dfs[a-z.]+'
(7)Examine the output files: copy them from the distributed filesystem to the local filesystem and inspect them, or view them directly on the distributed filesystem:
    $ bin/hdfs dfs -get output output
    $ cat output/*
    $ bin/hdfs dfs -cat output/*
(8)When you're done, stop the daemons with:
    $ sbin/stop-dfs.sh
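For step (2), a quick way to confirm the daemons are up is the jps tool that ships with the JDK; a sketch of the expected output (the numeric process IDs will differ on every run):

$ jps
3342 NameNode
3458 DataNode
3620 SecondaryNameNode
3801 Jps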

7. Deploy YARN on a Single Node
You can run a MapReduce job on YARN in pseudo-distributed mode by setting a few parameters and additionally running the ResourceManager and NodeManager daemons.
$HADOOP_PREFIX/etc/hadoop/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
$HADOOP_PREFIX/etc/hadoop/yarn-site.xml

<?xml version="1.0"?>
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Execute the following steps:

(1)Start ResourceManager daemon and NodeManager daemon:
    $ sbin/start-yarn.sh
(2)Browse the web interface for the ResourceManager; by default it is available at:
    http://localhost:8088/
(3)Run a MapReduce job (see the sketch after this list).
(4)When you're done, stop the daemons with:
    $ sbin/stop-yarn.sh
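Step (3) above is left abstract; as a concrete sketch, the grep example from section 6 can be resubmitted so that it now runs on YARN (assuming the HDFS input directory from section 6 still exists; output2 is a hypothetical name chosen to avoid colliding with the existing output directory):

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep input output2 'dfs[a-z.]+'
$ bin/hdfs dfs -cat output2/*

The job should also appear in the ResourceManager web UI at http://localhost:8088/.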