Friday, December 19, 2014

Implementing a Hive proxy, part 5: solving the data directory permission problem


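Several HDFS classes are involved when Hive creates a directory:

  org.apache.hadoop.hdfs.DistributedFileSystem: the concrete implementation of FileSystem
  org.apache.hadoop.hdfs.DFSClient: the class a client uses to operate on the HDFS file system
  org.apache.hadoop.fs.permission.FsPermission: the file permission class; its main methods are getUMask and applyUMask

A few methods of org.apache.hadoop.hdfs.DistributedFileSystem deserve attention.
initialize is mainly used to create the DFSClient instance: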

    @Override
    public void initialize(URI uri, Configuration conf) throws IOException {
      super.initialize(uri, conf);
      setConf(conf);
      String host = uri.getHost();
      if (host == null) {
        throw new IOException("Incomplete HDFS URI, no host: " + uri);
      }
      this.dfs = new DFSClient(uri, conf, statistics);
      this.uri = URI.create(uri.getScheme() + "://" + uri.getAuthority());
      this.workingDir = getHomeDirectory();
    }

mkdir creates a single directory, while mkdirs also creates any missing parent directories (similar to mkdir -p):

    public boolean mkdir(Path f, FsPermission permission) throws IOException {
      statistics.incrementWriteOps(1);
      return dfs.mkdirs(getPathName(f), permission, false);
    }

    public boolean mkdirs(Path f, FsPermission permission) throws IOException {
      statistics.incrementWriteOps(1);
      return dfs.mkdirs(getPathName(f), permission, true);
    }
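The difference between the two calls is only the createParent flag. As a rough local-filesystem analogy (java.nio.file stands in for HDFS here, and the class and method names are made up for illustration), the behavior can be sketched like this. HDFS reports a missing parent in its own way; the sketch simply returns false.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Local-filesystem analogy only: java.nio.file stands in for HDFS.
class MkdirVsMkdirs {
    // Analogous to mkdir (createParent = false): fails when the parent is missing.
    static boolean createSingle(Path dir) throws IOException {
        try {
            Files.createDirectory(dir);
            return true;
        } catch (NoSuchFileException e) {
            return false; // the parent directory does not exist
        }
    }

    // Analogous to mkdirs (createParent = true): creates missing parents too.
    static boolean createWithParents(Path dir) throws IOException {
        Files.createDirectories(dir);
        return Files.isDirectory(dir);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("mkdir-demo");
        Path nested = base.resolve("a").resolve("b");
        System.out.println("single:       " + createSingle(nested));
        System.out.println("with parents: " + createWithParents(nested));
    }
}
```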

Both mkdir and mkdirs ultimately call DFSClient.mkdirs. The mkdirs method of org.apache.hadoop.hdfs.DFSClient:

    final Conf dfsClientConf;
    ...
    public boolean mkdirs(String src, FsPermission permission,
        boolean createParent) throws IOException {
      if (permission == null) { // no permission was passed in
        permission = FsPermission.getDefault();
      }
      FsPermission masked = permission.applyUMask(dfsClientConf.uMask);
      return primitiveMkdir(src, masked, createParent); // delegate to primitiveMkdir
    }

Note the FsPermission.getDefault method and the Conf.uMask field here (Conf is an inner class of DFSClient, used mainly to hold default configuration values).
The Conf.uMask field:

    uMask = FsPermission.getUMask(conf); // obtained from getUMask
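Applying a umask clears every permission bit that is set in the mask, which is why the umask value matters for the final directory mode. A minimal sketch of that rule on plain integers (the class and helper names are hypothetical and are not Hadoop APIs):

```java
// Plain-integer sketch of umask application; not the Hadoop FsPermission class.
class UmaskDemo {
    // Clear every permission bit that is set in the umask.
    static int applyUMask(int permission, int umask) {
        return permission & ~umask & 0777;
    }

    // Format a mode as three octal digits.
    static String octal(int mode) {
        return String.format("%03o", mode);
    }

    public static void main(String[] args) {
        System.out.println(octal(applyUMask(0777, 0022))); // 755
        System.out.println(octal(applyUMask(0777, 0000))); // 777
        System.out.println(octal(applyUMask(0700, 0022))); // 700
    }
}
```

With the default umask of 0022, a requested mode of 777 becomes 755; only a umask of 000 leaves 777 intact.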

The getUMask method:

    public static final String DEPRECATED_UMASK_LABEL = "dfs.umask";
    public static final String UMASK_LABEL =
        CommonConfigurationKeys.FS_PERMISSIONS_UMASK_KEY; // fs.permissions.umask-mode
    public static final int DEFAULT_UMASK =
        CommonConfigurationKeys.FS_PERMISSIONS_UMASK_DEFAULT; // 0022

    public static FsPermission getUMask(Configuration conf) {
      int umask = DEFAULT_UMASK;
      if (conf != null) {
        String confUmask = conf.get(UMASK_LABEL);
        // Deprecated key dfs.umask; Integer.MIN_VALUE (-2147483648) means "not set"
        int oldUmask = conf.getInt(DEPRECATED_UMASK_LABEL, Integer.MIN_VALUE);
        try {
          if (confUmask != null) {
            // fs.permissions.umask-mode is set, so use it; otherwise keep the default (0022)
            umask = new UmaskParser(confUmask).getUMask();
          }
        } catch (IllegalArgumentException iae) {
          // Provide more explanation for user-facing message
          String type = iae instanceof NumberFormatException ? "decimal"
              : "octal or symbolic";
          String error = "Unable to parse configuration " + UMASK_LABEL
              + " with value " + confUmask + " as " + type + " umask.";
          LOG.warn(error);
          // If oldUmask is not set, then throw the exception
          if (oldUmask == Integer.MIN_VALUE) {
            throw new IllegalArgumentException(error);
          }
        }
        if (oldUmask != Integer.MIN_VALUE) { // dfs.umask was set explicitly
          if (umask != oldUmask) { // and it differs from the umask resolved above
            LOG.warn(DEPRECATED_UMASK_LABEL
                + " configuration key is deprecated. " + "Convert to "
                + UMASK_LABEL + ", using octal or symbolic umask "
                + "specifications.");
            // Old and new umask values do not match - Use old umask
            umask = oldUmask; // the deprecated value takes precedence
          }
        }
      }
      return new FsPermission((short) umask);
    }
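The precedence in getUMask is easy to misread, so here is a simplified sketch of it. It assumes a plain Map in place of Hadoop's Configuration, handles only octal values for the new key (symbolic umasks and the error-handling branch are left out), and reads the deprecated key as a plain integer in the way conf.getInt does. The class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the umask lookup order; a Map stands in for Configuration.
class UmaskResolution {
    static final int DEFAULT_UMASK = 0022;

    static int resolve(Map<String, String> conf) {
        int umask = DEFAULT_UMASK;
        String newValue = conf.get("fs.permissions.umask-mode");
        if (newValue != null) {
            umask = Integer.parseInt(newValue, 8); // octal only in this sketch
        }
        String oldValue = conf.get("dfs.umask");
        if (oldValue != null) {
            int oldUmask = Integer.parseInt(oldValue); // plain integer, not octal
            if (oldUmask != umask) {
                umask = oldUmask; // the deprecated key wins when the two disagree
            }
        }
        return umask;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(Integer.toOctalString(resolve(conf))); // 22
        conf.put("fs.permissions.umask-mode", "000");
        System.out.println(Integer.toOctalString(resolve(conf))); // 0
        conf.put("dfs.umask", "63"); // decimal 63 is octal 077
        System.out.println(Integer.toOctalString(resolve(conf))); // 77
    }
}
```

The last case is the surprising one: a leftover dfs.umask entry overrides fs.permissions.umask-mode whenever the two values differ.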

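Hive creates HDFS directories in two ways:
1) Through Utilities.createDirsWithPermission, which temporarily resets fs.permissions.umask-mode
2) Directly through DistributedFileSystem.mkdirs
Both eventually call DFSClient.mkdirs. The difference is that a directory created through Utilities.createDirsWithPermission may end up with permission 777 when the proxy is in use (because the permission is explicitly set to 777).
For example, the Context constructor creates its temporary directory through Context.getMRScratchDir, which calls getLocalScratchDir (for local jobs) or getScratchDir (for non-local jobs); getScratchDir in turn calls Utilities.createDirsWithPermission to create the directory: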

    public static boolean createDirsWithPermission(Configuration conf, Path mkdirPath,
        FsPermission fsPermission, boolean recursive) throws IOException {
      String origUmask = null;
      LOG.warn("Create dirs " + mkdirPath + " with permission " + fsPermission + " recursive " +
          recursive);
      if (recursive) {
        // When recursive is true, fs.permissions.umask-mode is set to 000.
        // By default recursive = SessionState.get().isHiveServerQuery() &&
        //   conf.getBoolean(HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.varname,
        //       HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.defaultBoolVal);
        // that is, the request comes from HiveServer and doAs is enabled; in that case the
        // permission is also set to 777. (I added one more condition in the caller:
        // recursive is also true when the custom proxy is enabled.)
        /**
        boolean recursive = false;
        if (SessionState.get() != null) {
          recursive = (SessionState.get().isHiveServerQuery() &&
              conf.getBoolean(HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.varname,
                  HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.defaultBoolVal))
              || (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_USE_CUSTOM_PROXY));
          fsPermission = new FsPermission((short)00777);
        }
        */
        origUmask = conf.get("fs.permissions.umask-mode");
        conf.set("fs.permissions.umask-mode", "000");
      }
      // For HDFS this is a DistributedFileSystem, which wraps a DFSClient
      FileSystem fs = ShimLoader.getHadoopShims().getNonCachedFileSystem(mkdirPath.toUri(), conf);
      boolean retval = false;
      try {
        retval = fs.mkdirs(mkdirPath, fsPermission);
        resetConfAndCloseFS(conf, recursive, origUmask, fs);
      } catch (IOException ioe) {
        try {
          // Restore the fs.permissions.umask-mode setting on failure as well
          resetConfAndCloseFS(conf, recursive, origUmask, fs);
        }
        catch (IOException e) {
          // do nothing - double failure
        }
      }
      return retval;
    }

The resetConfAndCloseFS method restores the fs.permissions.umask-mode setting, so that directories created later by any path other than Utilities.createDirsWithPermission use the restored value:

    private static void resetConfAndCloseFS(Configuration conf, boolean unsetUmask,
        String origUmask, FileSystem fs) throws IOException {
      if (unsetUmask) { // true when recursive is true: fs.permissions.umask-mode must be restored
        if (origUmask != null) { // a value was configured before, so put it back
          conf.set("fs.permissions.umask-mode", origUmask);
        } else {
          conf.unset("fs.permissions.umask-mode"); // unsetting is fine: the default applies afterwards
        }
      }
      fs.close();
    }
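The save, override, restore sequence used by createDirsWithPermission and resetConfAndCloseFS can be sketched in isolation as follows. This is an illustration only: a Map stands in for Hadoop's Configuration, the class and method names are hypothetical, and try/finally replaces the original's duplicated calls in the try and catch branches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative pattern only; a Map stands in for Hadoop's Configuration.
class TemporaryOverride {
    // Runs action with key temporarily set to value, then restores the old state.
    static <T> T withOverride(Map<String, String> conf, String key, String value,
                              Supplier<T> action) {
        boolean hadKey = conf.containsKey(key);
        String original = conf.get(key);
        conf.put(key, value);
        try {
            return action.get();
        } finally {
            if (hadKey) {
                conf.put(key, original); // restore the caller's setting
            } else {
                conf.remove(key);        // key was unset before, so unset it again
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        String key = "fs.permissions.umask-mode";
        String seen = withOverride(conf, key, "000", () -> conf.get(key));
        System.out.println(seen);                  // 000
        System.out.println(conf.containsKey(key)); // false
    }
}
```

Because the restore runs in finally, the override cannot leak into later directory creation even if the action throws.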

Reading the DFSClient source shows that its constructor initializes the ugi (user and group information), which defaults to the current user:

    final UserGroupInformation ugi;
    ...
    this.ugi = UserGroupInformation.getCurrentUser();

After switching to the proxy user, a test with hadoop fs -mkdir showed that the created directory was still owned by the currently logged-in user. The DFSClient constructor was therefore changed as follows:

    //this.ugi = UserGroupInformation.getCurrentUser();
    if (conf.getBoolean("use.custom.proxy", false)) {
      this.ugi = UserGroupInformation.createRemoteUser(conf.get("custom.proxy.user"));
    } else {
      this.ugi = UserGroupInformation.getCurrentUser();
    }

Add the following to the hdfs-site.xml configuration:

  <property>
      <name>use.custom.proxy</name>
      <value>true</value>
  </property>
  <property>
      <name>custom.proxy.user</name>
      <value>ericni</value>
  </property>
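The way these two properties select the effective user can be sketched without Hadoop as below. java.util.Properties stands in for the Hadoop Configuration, and the class and method names are hypothetical. Falling back to the current user when custom.proxy.user is missing is an assumption added in this sketch; the modified constructor above does not include it.

```java
import java.util.Properties;

// Stand-alone model of the user selection; Properties stands in for Configuration.
class ProxyUserChoice {
    static String effectiveUser(Properties conf, String currentUser) {
        boolean useProxy = Boolean.parseBoolean(conf.getProperty("use.custom.proxy", "false"));
        String proxyUser = conf.getProperty("custom.proxy.user");
        // Assumption of this sketch: fall back when the proxy user is not configured.
        if (useProxy && proxyUser != null && !proxyUser.isEmpty()) {
            return proxyUser;
        }
        return currentUser;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("use.custom.proxy", "true");
        conf.setProperty("custom.proxy.user", "ericni");
        System.out.println(effectiveUser(conf, System.getProperty("user.name"))); // ericni
    }
}
```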
MySQL master-master mutual backup mode (Keepalived)

MySQL dual master: high availability

  1. Install a single database instance

    See: Installing MySQL from the binary package

  2. Resource planning

Hostname       OS version   MySQL version                        Host IP        MySQL VIP
db01.lyk.com   centos 6.4   mysql-5.6.21-linux-glibc2.5-x86_64   172.31.30.12   172.31.30.222
db02.lyk.com   centos 6.4   mysql-5.6.21-linux-glibc2.5-x86_64   172.31.30.11   172.31.30.222

3. Modify the MySQL configuration files

Modify the configuration file on DB01:

# Add the following to the [mysqld] section
server-id=100
log-bin=/usr/local/mysql/data/ttpai-bin
binlog_format=MIXED  # optional
relay-log=/usr/local/mysql/data/ttpai-relay-bin
binlog-ignore-db=mysql
binlog-ignore-db=test
binlog-ignore-db=information_schema
binlog-ignore-db=performance_schema
replicate-wild-ignore-table=mysql.%
replicate-wild-ignore-table=test.%
replicate-wild-ignore-table=information_schema.%
replicate-wild-ignore-table=performance_schema.%

Modify the configuration file on DB02:

# Add the following to the [mysqld] section
server-id=110
log-bin=/usr/local/mysql/data/ttpai-bin
binlog_format=MIXED  # optional
relay-log=/usr/local/mysql/data/ttpai-relay-bin
binlog-ignore-db=mysql
binlog-ignore-db=test
binlog-ignore-db=information_schema
binlog-ignore-db=performance_schema
replicate-wild-ignore-table=mysql.%
replicate-wild-ignore-table=test.%
replicate-wild-ignore-table=information_schema.%
replicate-wild-ignore-table=performance_schema.%

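4. Manually synchronize the databases

If DB01 already holds MySQL data, the MySQL instances on DB01 and DB02 must be brought into sync before master-master replication is set up. First, take the backup on DB01, starting with the following SQL statement: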

mysql> FLUSH TABLES WITH READ LOCK;

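Without exiting this terminal (exiting releases the lock), open another session and either archive the MySQL data files directly or export them with the mysqldump tool: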

cd /usr/local/mysql/
tar zcvf data.tar.gz data/

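Transfer data.tar.gz to DB02, then restart DB01 and DB02 in turn.

In fact, you can skip the READ LOCK statement and back up directly with mysqldump; at least in my own testing, no data was lost and no replication errors occurred. Use the following command: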

mysqldump --default-character-set=gbk --opt --triggers -R -E --hex-blob --single-transaction --master-data=2 ttpai > ttpai.sql

Where -
