I. Basic preparation
Deployment environment: CentOS 7 (64-bit)
1. Stop the local iptables firewall (firewalld) and disable it at boot
# systemctl stop firewalld.service
# systemctl disable firewalld.service
2. Disable SELinux
# vim /etc/sysconfig/selinux
SELINUX=disabled
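The SELinux edit in this step can also be scripted instead of made in vim. The following is only a sketch under assumptions: the function name `set_selinux_disabled` is hypothetical, and the file is rewritten through a temporary copy rather than with `sed -i`, because on CentOS 7 /etc/sysconfig/selinux is normally a symlink to /etc/selinux/config and an in-place sed would replace the link with a regular file.

```shell
# Hypothetical helper: set SELINUX=disabled in the given config file.
# Writes through a temporary copy so a symlinked config path stays a symlink.
set_selinux_disabled() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    # Only the SELINUX= line is rewritten; SELINUXTYPE= and comments are kept.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Example call on the real file (run as root):
#   set_selinux_disabled /etc/sysconfig/selinux
```

The new value is read at boot, so it applies after the next restart; `setenforce 0` switches the running system to permissive mode in the meantime.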
3. Set the hostname
# hostnamectl set-hostname controller
4. Map the local hostname to its IP address
# vim /etc/hosts
192.168.0.104 controller
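If the hosts entry is added from a script rather than by hand, it helps to make the step safe to repeat. A minimal sketch, assuming the hypothetical helper name `add_host_entry`; the IP address and hostname are the ones used in this guide:

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is
# already mapped there, so running it twice does not create duplicates.
add_host_entry() {
    ip="$1"
    name="$2"
    hosts_file="$3"
    # A name counts as present when it appears as a whole field after whitespace.
    if grep -Eq "[[:space:]]${name}([[:space:]]|\$)" "$hosts_file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Example call on the real file (run as root):
#   add_host_entry 192.168.0.104 controller /etc/hosts
```

Afterwards `ping -c 1 controller` should resolve to 192.168.0.104.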
5. Install the NTP time synchronization tool
# yum -y install ntp
# ntpdate asia.pool.ntp.org
6. Install third-party yum repositories
# yum -y install yum-plugin-priorities
# yum -y install http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-2.noarch.rpm
# yum -y install http://rdo.fedorapeople.org/openstack-juno/rdo-release-juno.rpm
7. Upgrade the system packages and reboot the system
II. Install and configure the MariaDB database
1. Install MariaDB
# yum -y install mariadb mariadb-server MySQL-python
2. Configure MariaDB
# cp /etc/my.cnf /etc/my.cnf.bak
# rpm -ql mariadb
# vim /etc/my.cnf.d/server.cnf
bind-address=0.0.0.0
default-storage-engine=innodb
innodb_file_per_table
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
character-set-server=utf8
3. Start MariaDB
# systemctl enable mariadb.service
# systemctl start mariadb.service
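III. Install the message queue service
1. Install the packages required by RabbitMQ
# yum -y install rabbitmq-server
2. Start the RabbitMQ service
# systemctl enable rabbitmq-server.service
# systemctl start rabbitmq-server.service
3. Set the RabbitMQ service password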
Recently the NodeManager (NM) processes in production have been crashing. The error log shows:
2014-06-19 00:01:22,308 FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Error: Shutting down
java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:761)
    at java.util.LinkedList$ListItr.next(LinkedList.java:696)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource.toString(LocalizedResource.java:120)
    at java.lang.String.valueOf(String.java:2826)
    at java.lang.StringBuilder.append(StringBuilder.java:115)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:656)
2014-06-19 00:01:22,308 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
2014-06-19 00:03:40,685 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{hdfs://bipcluster/tmp/hive-hdfs/hive_2014-06-19_00-05-51_049_5891972191087895437/-mr-10004/a1495555-b0dc-4356-8b68-1c881012e123,1403107405580,FILE,null}
2014-06-19 00:03:40,685 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.util.concurrent.RejectedExecutionException
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
    at java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:152)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:618)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:514)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:456)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
    at java.lang.Thread.run(Thread.java:662)
2014-06-19 00:03:40,685 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.
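The NM exited abnormally because of concurrent updates from multiple threads during resource localization. The first trace shows the PublicLocalizer thread hitting a ConcurrentModificationException while iterating a LinkedList inside LocalizedResource.toString() and shutting down; about two minutes later, the dispatcher thread tried to submit a new download to the already stopped localizer, got a RejectedExecutionException, and exited.
This is a known bug. Bug ID:
https://issues.apache.org/jira/browse/YARN-573
Bug description:
Shared data structures in PublicLocalizer and PrivateLocalizer are not thread safe.
PublicLocalizer
1) pending accessed by addResource (part of event handling) and run method (as a part of PublicLocalizer.run()).
PrivateLocalizer (LocalizerRunner?)
1) pending accessed by addResource (part of event handling) and findNextResource (i.remove()).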