Installing and Configuring SGE (Grid Engine) on CentOS 6.5

Network Topology

        Hostname        IP address          Role
        master          192.168.35.22       Master Server
        compute1        192.168.52.105      compute1
        compute2        192.168.189.30      compute2



        
Firewall Settings

    1. On every cluster node, run:
    
        #service iptables stop
        #chkconfig iptables off
    
    2. On every cluster node, disable SELinux
    
        #cat /etc/selinux/config
            
            SELINUX=disabled
            
    3. On every cluster node, set the hostname
        
        #cat /etc/sysconfig/network
            
            HOSTNAME=<hostname>
            
        #cat /etc/hosts
       
            *.*.*.*   <hostname>
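
NIS, NFS and SGE all rely on every node resolving the same names, so it can be worth checking the hosts file before continuing. The following is a minimal sketch and not part of the original procedure; lookup_host and check_cluster_hosts are hypothetical helper names, and the functions only read the file they are given.

```shell
# lookup_host FILE NAME
# Print the IP address mapped to NAME in a hosts-format file.
# Returns non-zero if NAME is not listed.
lookup_host() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }
        {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# check_cluster_hosts FILE NAME...
# Report the address of every NAME, or complain on stderr if one is missing.
check_cluster_hosts() {
    file=$1
    shift
    status=0
    for name in "$@"; do
        if ip=$(lookup_host "$file" "$name"); then
            echo "$name -> $ip"
        else
            echo "$name: missing from $file" >&2
            status=1
        fi
    done
    return $status
}
```

A possible call on each node, once the cluster entries have been added, would be: check_cluster_hosts /etc/hosts master compute1 compute2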
            

       
    
Installing the NIS Server

    1. On the master node, install the required packages:
    
        #yum install -y rpcbind yp-tools ypserv
        
    2. On the master node, set the NIS domain name:

        #nisdomainname simcloud.com
        #echo "nisdomainname simcloud.com"  >>/etc/rc.local
        #echo "NISDOMAIN=simcloud.com" >> /etc/sysconfig/network
        
    3. On the master node, add the following configuration
        
        #cat /etc/hosts
        
            192.168.35.22 master
            192.168.52.105 compute1
            192.168.189.30 compute2            
        
        #cat /etc/sysconfig/network
    
            YPSERV_ARGS="-p 1011"
        
        #cat /etc/sysconfig/yppasswdd

            YPPASSWDD_ARGS="--port 1012"
            
    4. On the master node, replace the contents of the following file
        
        #cat /etc/ypserv.conf
        
            dns: no
            files: 30
            xfr_check_port: yes
            * : * : shadow.byname : port
            * : * : passwd.adjunct.byname : port
        
    5. On the master node, start the services
        
        #service rpcbind start
        #service ypserv start                     
        #service yppasswdd start                  
        
        
    6. On the master node, enable the services at boot
    
        #chkconfig rpcbind on
        #chkconfig ypserv on
        #chkconfig yppasswdd on
        
    7. Use rpcinfo to check that the services are registered
    
        #rpcinfo -p localhost
        #rpcinfo -u localhost ypserv
    
    8. On the master node, initialize the NIS database
    
        #/usr/lib64/yp/ypinit -m
    
    9. On the master node, rebuild the NIS maps (accounts and databases)
    
        #make -C /var/yp
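
The echo ... >> commands in step 2 append a new line every time they are run, so repeating the procedure leaves duplicates in /etc/rc.local and /etc/sysconfig/network. A minimal sketch of an idempotent alternative follows; ensure_line is a hypothetical helper name, not something provided by NIS or CentOS.

```shell
# ensure_line FILE LINE
# Append LINE to FILE only if FILE does not already contain exactly that line.
ensure_line() {
    # -q: no output, -x: match the whole line, -F: treat LINE as a fixed string
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

Under that assumption, step 2 could be repeated safely with: ensure_line /etc/sysconfig/network "NISDOMAIN=simcloud.com"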
        
    
            
        
NIS Client Setup
    
    1. On each NIS client node, install the packages:
        
        #yum install -y rpcbind yp-tools ypbind

    2. On each NIS client node, set the NIS domain name:

        #nisdomainname simcloud.com
        #echo "nisdomainname simcloud.com"  >>/etc/rc.local
        #echo "NISDOMAIN=simcloud.com" >> /etc/sysconfig/network
        
    3. On each NIS client node, edit the configuration files
    
        #cat /etc/hosts
        
            192.168.35.22 master
            192.168.52.105 compute1
            192.168.189.30 compute2
            
        #cat /etc/nsswitch.conf
        
            passwd: files nis
            shadow: files nis
            group:  files nis
            hosts:  files nis dns
    
        #cat /etc/sysconfig/authconfig

            USENIS=yes

        #cat /etc/pam.d/system-auth

            password    sufficient    pam_unix.so sha512 shadow nis nullok try_first_pass use_authtok

        #cat /etc/yp.conf

            domain simcloud.com server 192.168.35.22
        
    4. On each NIS client node, start the services
                                
        #service rpcbind restart
        #service ypbind restart         
        
    5. On each NIS client node, enable the services at boot
        
        #chkconfig rpcbind on
        #chkconfig ypbind on

    6. On each NIS client node, run yptest to check that the client and the server can communicate
    
        #yptest
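
If yptest fails, one thing worth ruling out is a missing nis source in /etc/nsswitch.conf. The sketch below checks the databases listed in step 3; the function names nsswitch_uses_nis and check_nis_nsswitch are hypothetical, and the parser assumes the usual "name: source source" layout with a space after the colon.

```shell
# nsswitch_uses_nis FILE DATABASE
# Succeed if the DATABASE line in FILE lists "nis" as one of its sources.
nsswitch_uses_nis() {
    awk -v db="$2:" '
        $1 == db { for (i = 2; i <= NF; i++) if ($i == "nis") ok = 1 }
        END { exit (ok ? 0 : 1) }
    ' "$1"
}

# check_nis_nsswitch FILE
# Print every account database (passwd, shadow, group) that lacks "nis".
check_nis_nsswitch() {
    status=0
    for db in passwd shadow group; do
        if ! nsswitch_uses_nis "$1" "$db"; then
            echo "$db: nis is not listed in $1"
            status=1
        fi
    done
    return $status
}
```

Running check_nis_nsswitch /etc/nsswitch.conf on a client prints nothing when all three databases include nis.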
    

        

        
NFS Server Setup

    1. Install the package
    
        #yum -y install nfs-utils
        
    2. Enable the services at boot
    
        #chkconfig rpcbind on    // skip if NIS is already set up
        #chkconfig nfs on
        
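    3. Start the services. Always start rpcbind before nfs: nfs registers with rpcbind, and a restart of rpcbind discards every registration, so all services registered with it must then be restarted as well.
       
        #service rpcbind restart // skip if NIS is already set up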
        #service nfs restart
        
    4. Confirm that the NFS server started successfully
    
        #rpcinfo -p
    
    5. Edit the NFS configuration file to define the shares
        
        #cat /etc/exports    // or open /etc/exports in an editor and modify it
            
            /opt *(insecure,rw,async,no_root_squash)
            /home *(insecure,rw,async,no_root_squash)

    6. Restart rpcbind and nfs
        
        #service rpcbind restart // skip if NIS is already set up
        #service nfs restart
        
    7. Start netfs
    
        #service netfs start

    8. Enable netfs at boot
    
        #chkconfig netfs on
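
Before restarting nfs it can help to confirm that every path named in /etc/exports exists on the server. This is a small sketch under the assumption that export paths contain no spaces or escape sequences; exported_dirs and missing_export_dirs are hypothetical helper names.

```shell
# exported_dirs FILE
# Print the directory (first field) of every non-comment entry in an exports file.
exported_dirs() {
    awk '!/^[[:space:]]*#/ && NF { print $1 }' "$1"
}

# missing_export_dirs FILE
# Print exported paths that do not exist as directories on this host.
missing_export_dirs() {
    exported_dirs "$1" | while read -r dir; do
        [ -d "$dir" ] || echo "$dir"
    done
}
```

With the exports shown in step 5, missing_export_dirs /etc/exports should print nothing, since /opt and /home exist on a standard installation.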
        
        
    
    
NFS Client Setup
    
    1. Install the package
    
        #yum -y install nfs-utils
        
    2. Enable the services at boot
    
        #chkconfig rpcbind on    // skip if NIS is already set up
        #chkconfig nfs on
        
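    3. Start the services. Always start rpcbind before nfs: nfs registers with rpcbind, and a restart of rpcbind discards every registration, so all services registered with it must then be restarted as well.
       
        #service rpcbind restart // skip if NIS is already set up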
        #service nfs restart
        
    4. Start netfs
    
        #service netfs start

    5. Enable netfs at boot
    
        #chkconfig netfs on

    6. On compute1, list the directories that the server 192.168.35.22 exports
    
        #showmount -e 192.168.35.22

    7. Mount the shared NFS file systems
    
        #mount 192.168.35.22:/home /home
        #mount 192.168.35.22:/opt /opt

    8. Check that the mounts succeeded
    
        #mount | grep opt

    9. To mount the shares automatically at boot on the client, append the following to /etc/fstab as root
        
        #cat /etc/fstab

            192.168.35.22:/home /home nfs defaults 0 0
            192.168.35.22:/opt /opt nfs hard,intr,defaults 0 0
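
Appending to /etc/fstab by hand is easy to repeat by mistake, which leaves two entries for the same mount point. The sketch below appends an entry only when its mount point is not yet listed; fstab_has_mountpoint and ensure_fstab_entry are hypothetical helper names, and the functions edit only the file passed to them.

```shell
# fstab_has_mountpoint FILE DIR
# Succeed if a non-comment entry in FILE uses DIR as its mount point (2nd field).
fstab_has_mountpoint() {
    awk -v mp="$2" '
        !/^[[:space:]]*#/ && $2 == mp { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# ensure_fstab_entry FILE ENTRY
# Append the full fstab line ENTRY unless its mount point is already present.
ensure_fstab_entry() {
    mp=$(printf '%s\n' "$2" | awk '{ print $2 }')
    fstab_has_mountpoint "$1" "$mp" || printf '%s\n' "$2" >> "$1"
}
```

For example: ensure_fstab_entry /etc/fstab "192.168.35.22:/home /home nfs defaults 0 0"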
        
        
        
    
Installing and Configuring SGE

    1. Install the required packages on every cluster node
    
        #yum install vim dos2unix expect syslinux telnet lrzsz -y
    
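    2. To mount the shared storage automatically on the compute nodes, add the following to /etc/fstab
    
        #cat /etc/fstab   // skip if this was already done during the NFS client setup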
    
            192.168.35.22:/home /home nfs hard,intr,defaults 0 1
            192.168.35.22:/opt /opt nfs hard,intr,defaults 0 1
            
    3. Add the sgeadmin user
        
        #useradd sgeadmin
        #passwd sgeadmin
    
    4. Synchronize the new user through NIS
        
        #/usr/lib64/yp/ypinit -m
        #make -C /var/yp
        #make -C /var/yp passwd
    
    5. On the master, unpack the SGE installation archives with the following commands
    
        #cd /opt/
        #mkdir gridengine
        #tar -zxvf sge-6_2u5-bin-linux24-x64.tar.gz -C /opt/gridengine
        #tar -zxvf sge-6_2u5-common.tar.gz -C /opt/gridengine
        
    6. Set the permissions of the directory to 777

        #chmod 777 /opt/gridengine
    
    7. Edit the file /opt/gridengine/util/arch and find the following section:
        if [ $? -ne 0 ]; then
            unsupported="UNSUPPORTED-"
            lxrelease="${lxrelease}-GLIBC"
        else
            libc_version=`echo $libc_string | tr ' ,' '\n' | grep "2\." | cut -f 2 -d "."`
        if [ $libc_version -lt 2 ]; then
            unsupported="UNSUPPORTED-"
            lxrelease=24-GLIBC-2.${libc_version}
            
        Between the line libc_version=`echo $libc_string | tr ' ,' '\n' | grep "2\." | cut -f 2 -d "."` and the line if [ $libc_version -lt 2 ]; then, add the line libc_version=12 (CentOS 6.5 ships glibc 2.12).
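
The manual edit above can also be scripted. The following is a minimal sketch, assuming the arch file contains the libc_version=`echo ... line exactly as quoted; patch_arch is a hypothetical helper name. It keeps a copy of the original as FILE.orig, rewrites the file in place so that its executable permission is preserved, and does nothing if the line is already present.

```shell
# patch_arch FILE
# Insert "libc_version=12" right after the line that computes libc_version,
# using the same indentation as that line.
patch_arch() {
    file=$1
    # Already patched: nothing to do.
    if grep -q '^[[:space:]]*libc_version=12$' "$file"; then
        return 0
    fi
    # Keep a backup with the original mode and timestamps.
    cp -p "$file" "$file.orig" || return 1
    # Read from the backup and overwrite the original in place.
    awk '
        { print }
        /libc_version=`echo / {
            match($0, /^[ \t]*/)
            print substr($0, 1, RLENGTH) "libc_version=12"
        }
    ' "$file.orig" > "$file"
}
```

Usage would be: patch_arch /opt/gridengine/util/arch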
    
    8. Set SGE_ROOT
    
        #cat /etc/profile
        
            export SGE_ROOT="/opt/gridengine"
            
        #source /etc/profile    
    
    9. Run the qmaster installer on the master
    
        #cd /opt/gridengine
        #./install_qmaster
            --More--(100%)
            Do you agree with that license? (y/n) [n] >> Input: y
            Welcome to the Grid Engine installation
            ---------------------------------------

            Grid Engine qmaster host installation
            -------------------------------------

            Before you continue with the installation please read these hints:

               - Your terminal window should have a size of at least
                 80x24 characters

               - The INTR character is often bound to the key Ctrl-C.
                 The term >Ctrl-C< is used during the installation if you
                 have the possibility to abort the installation

            The qmaster installation procedure will take approximately 5-10 minutes.

            Hit <RETURN> to continue >> Input: Enter
            
            Choosing Grid Engine admin user account
            ---------------------------------------

            You may install Grid Engine that all files are created with the user id of an
            unprivileged user.

            This will make it possible to install and run Grid Engine in directories
            where user >root< has no permissions to create and write files and directories.

               - Grid Engine still has to be started by user >root<

               - this directory should be owned by the Grid Engine administrator

            Do you want to install Grid Engine
            under an user id other than >root< (y/n) [y] >> Input: Enter (accepts the default, y)
            
            Choosing a Grid Engine admin user name
            --------------------------------------

            Please enter a valid user name >> Input: sgeadmin
            
            Installing Grid Engine as admin user >sgeadmin<

            Hit <RETURN> to continue >> Input: Enter
            
            Checking $SGE_ROOT directory
            ----------------------------

            The Grid Engine root directory is not set!
            Please enter a correct path for SGE_ROOT.

            If this directory is not correct (e.g. it may contain an automounter
            prefix) enter the correct path to this directory or hit <RETURN>
            to use default [/opt/gridengine] >> Input: Enter
            
            Your $SGE_ROOT directory: /opt/gridengine

            Hit <RETURN> to continue >> Input: Enter
            
            Grid Engine TCP/IP communication service
            ----------------------------------------

            The port for sge_qmaster is currently set as service.

               sge_qmaster service set to port 6444

            Now you have the possibility to set/change the communication ports by using the
            >shell environment< or you may configure it via a network service, configured
            in local >/etc/service<, >NIS< or >NIS+<, adding an entry in the form

                sge_qmaster <port_number>/tcp

            to your services database and make sure to use an unused port number.

            How do you want to configure the Grid Engine communication ports?

            Using the >shell environment<:                           [1]

            Using a network service like >/etc/service<, >NIS/NIS+<: [2]

            (default: 2) >> Input: Enter
            
            Grid Engine TCP/IP service >sge_qmaster<
            ----------------------------------------

            Using the service

               sge_qmaster

            for communication with Grid Engine.

            Hit <RETURN> to continue >> Input: Enter
            
            Grid Engine TCP/IP communication service
            ----------------------------------------

            The port for sge_execd is currently set as service.

               sge_execd service set to port 6445

            Now you have the possibility to set/change the communication ports by using the
            >shell environment< or you may configure it via a network service, configured
            in local >/etc/service<, >NIS< or >NIS+<, adding an entry in the form

                sge_execd <port_number>/tcp

            to your services database and make sure to use an unused port number.

            How do you want to configure the Grid Engine communication ports?

            Using the >shell environment<:                           [1]

            Using a network service like >/etc/service<, >NIS/NIS+<: [2]

            (default: 2) >> Input: Enter
            
            Grid Engine TCP/IP communication service
            -----------------------------------------

            Using the service

               sge_execd

            for communication with Grid Engine.

            Hit <RETURN> to continue >> Input: Enter
            
            Grid Engine cells
            -----------------

            Grid Engine supports multiple cells.

            If you are not planning to run multiple Grid Engine clusters or if you don't
            know yet what is a Grid Engine cell it is safe to keep the default cell name

               default

            If you want to install multiple cells you can enter a cell name now.

            The environment variable

               $SGE_CELL=<your_cell_name>

            will be set for all further Grid Engine commands.

            Enter cell name [default] >> Input: Enter
            
            Using cell >default<.
            Hit <RETURN> to continue >> Input: Enter
            
            Unique cluster name
            -------------------

            The cluster name uniquely identifies a specific Sun Grid Engine cluster.
            The cluster name must be unique throughout your organization. The name
            is not related to the SGE cell.

            The cluster name must start with a letter ([A-Za-z]), followed by letters,
            digits ([0-9]), dashes (-) or underscores (_).

            Enter new cluster name or hit <RETURN>
            to use default [p6444] >> Input: Enter
            
            creating directory: /opt/gridengine/default/common

            Your $SGE_CLUSTER_NAME: p6444

            Hit <RETURN> to continue >> Input: Enter
            
            Grid Engine qmaster spool directory
            -----------------------------------

            The qmaster spool directory is the place where the qmaster daemon stores
            the configuration and the state of the queuing system.

            The admin user >sgeadmin< must have read/write access
            to the qmaster spool directory.

            If you will install shadow master hosts or if you want to be able to start
            the qmaster daemon on other hosts (see the corresponding section in the
            Grid Engine Installation and Administration Manual for details) the account
            on the shadow master hosts also needs read/write access to this directory.

            Enter a qmaster spool directory [/opt/gridengine/default/spool/qmaster] >> Input: Enter
            
            Using qmaster spool directory >/opt/gridengine/default/spool/qmaster<.
            Hit <RETURN> to continue >> Input: Enter
            
            Windows Execution Host Support
            ------------------------------

            Are you going to install Windows Execution Hosts? (y/n) [n] >> Input: Enter
            
            Verifying and setting file permissions
            --------------------------------------

            Did you install this version with >pkgadd< or did you already verify
            and set the file permissions of your distribution (enter: y) (y/n) [y] >> Input: Enter

            We do not verify file permissions. Hit <RETURN> to continue >> Input: Enter
            
            Select default Grid Engine hostname resolving method
            ----------------------------------------------------

            Are all hosts of your cluster in one DNS domain? If this is
            the case the hostnames

               >hostA< and >hostA.foo.com<

            would be treated as equal, because the DNS domain name >foo.com<
            is ignored when comparing hostnames.

            Are all hosts of your cluster in a single DNS domain (y/n) [y] >> Input: Enter
            
            Ignoring domain name when comparing hostnames.

            Hit <RETURN> to continue >> Input: Enter
            
            Grid Engine JMX MBean server
            ----------------------------

            In order to use the SGE Inspect or the Service Domain Manager (SDM)
            SGE adapter you need to configure a JMX server in qmaster. Qmaster
            will then load a Java Virtual Machine through a shared library.
            NOTE: Java 1.5 or later is required for the JMX MBean server.

            Do you want to enable the JMX MBean server (y/n) [y] >> Input: n, then Enter (you must answer n here)

            Making directories
            ------------------

            creating directory: /opt/gridengine/default/spool/qmaster
            creating directory: /opt/gridengine/default/spool/qmaster/job_scripts
            Hit <RETURN> to continue >> Input: Enter
            
            Setup spooling
            --------------
            Your SGE binaries are compiled to link the spooling libraries
            during runtime (dynamically). So you can choose between Berkeley DB
            spooling and Classic spooling method.
            Please choose a spooling method (berkeleydb|classic) [berkeleydb] >> Input: Enter
            
            The Berkeley DB spooling method provides two configurations!

            Local spooling:
            The Berkeley DB spools into a local directory on this host (qmaster host)
            This setup is faster, but you can't setup a shadow master host

            Berkeley DB Spooling Server:
            If you want to setup a shadow master host, you need to use
            Berkeley DB Spooling Server!
            In this case you have to choose a host with a configured RPC service.
            The qmaster host connects via RPC to the Berkeley DB. This setup is more
            failsafe, but results in a clear potential security hole. RPC communication
            (as used by Berkeley DB) can be easily compromised. Please only use this
            alternative if your site is secure or if you are not concerned about
            security. Check the installation guide for further advice on how to achieve
            failsafety without compromising security.

            Do you want to use a Berkeley DB Spooling Server? (y/n) [n] >> Input: Enter

            Hit <RETURN> to continue >> Input: Enter
            
            Berkeley Database spooling parameters
            -------------------------------------

            Please enter the database directory now, even if you want to spool locally,
            it is necessary to enter this database directory.

            Default: [/opt/gridengine/default/spool/spooldb] >> Input: Enter
            
            creating directory: /opt/gridengine/default/spool/spooldb
            Dumping bootstrapping information
            Initializing spooling database

            Hit <RETURN> to continue >> Input: Enter
            
            Grid Engine group id range
            --------------------------

            When jobs are started under the control of Grid Engine an additional group id
            is set on platforms which do not support jobs. This is done to provide maximum
            control for Grid Engine jobs.

            This additional UNIX group id range must be unused group id's in your system.
            Each job will be assigned a unique id during the time it is running.
            Therefore you need to provide a range of id's which will be assigned
            dynamically for jobs.

            The range must be big enough to provide enough numbers for the maximum number
            of Grid Engine jobs running at a single moment on a single host. E.g. a range
            like >20000-20100< means, that Grid Engine will use the group ids from
            20000-20100 and provides a range for 100 Grid Engine jobs at the same time
            on a single host.

            You can change at any time the group id range in your cluster configuration.

            Please enter a range [20000-20100] >> Input: 20000-21000, then Enter

            Using >20000-21000< as gid range. Hit <RETURN> to continue >> Input: Enter
            
            Grid Engine cluster configuration
            ---------------------------------

            Please give the basic configuration parameters of your Grid Engine
            installation:

               <execd_spool_dir>

            The pathname of the spool directory of the execution hosts. User >sgeadmin<
            must have the right to create this directory and to write into it.

            Default: [/opt/gridengine/default/spool] >> Input: Enter
            
            Grid Engine cluster configuration (continued)
            ---------------------------------------------

            <administrator_mail>

            The email address of the administrator to whom problem reports are sent.

            It is recommended to configure this parameter. You may use >none<
            if you do not wish to receive administrator mail.

            Please enter an email address in the form >user@foo.com<.

            Default: [none] >> Input: yongqian.liu@peraglobal.com, then Enter
            
            The following parameters for the cluster configuration were configured:

               execd_spool_dir        /opt/gridengine/default/spool
               administrator_mail     yongqian.liu@peraglobal.com

            Do you want to change the configuration parameters (y/n) [n] >> Input: Enter
            
            Creating local configuration
            ----------------------------
            Creating >act_qmaster< file
            Adding default complex attributes
            Adding default parallel environments (PE)
            Adding SGE default usersets
            Adding >sge_aliases< path aliases file
            Adding >qtask< qtcsh sample default request file
            Adding >sge_request< default submit options file
            Creating >sgemaster< script
            Creating >sgeexecd< script
            Creating settings files for >.profile/.cshrc<

            Hit <RETURN> to continue >> Input: Enter
            
            qmaster startup script
            ----------------------

            We can install the startup script that will
            start qmaster at machine boot (y/n) [y] >> Input: Enter
            
            
            Installing startup script /etc/rc.d/rc3.d/S95sgemaster.p6444 and /etc/rc.d/rc3.d/K03sgemaster.p6444

            Hit <RETURN> to continue >> Input: Enter
            
            Grid Engine qmaster startup
            ---------------------------

            Starting qmaster daemon. Please wait ...
               starting sge_qmaster
            Hit <RETURN> to continue >> Input: Enter
            
            Adding Grid Engine hosts
            ------------------------

            Please now add the list of hosts, where you will later install your execution
            daemons. These hosts will be also added as valid submit hosts.

            Please enter a blank separated list of your execution hosts. You may
            press <RETURN> if the line is getting too long. Once you are finished
            simply press <RETURN> without entering a name.

            You also may prepare a file with the hostnames of the machines where you plan
            to install Grid Engine. This may be convenient if you are installing Grid
            Engine on many hosts.

            Do you want to use a file which contains the list of hosts (y/n) [n] >> Input: Enter
            
            Adding admin and submit hosts
            -----------------------------

            Please enter a blank seperated list of hosts.

            Stop by entering <RETURN>. You may repeat this step until you are
            entering an empty list. You will see messages from Grid Engine
            when the hosts are added.

            Host(s): Input: master, then Enter
            
            adminhost "master" already exists
            master added to submit host list
            Hit <RETURN> to continue >> Input: Enter
            
            Adding admin and submit hosts
            -----------------------------

            Please enter a blank seperated list of hosts.

            Stop by entering <RETURN>. You may repeat this step until you are
            entering an empty list. You will see messages from Grid Engine
            when the hosts are added.

            Host(s): Input: compute1, then Enter
            
            compute1 added to administrative host list
            compute1 added to submit host list
            Hit <RETURN> to continue >> Input: Enter
            
            Adding admin and submit hosts
            -----------------------------

            Please enter a blank seperated list of hosts.

            Stop by entering <RETURN>. You may repeat this step until you are
            entering an empty list. You will see messages from Grid Engine
            when the hosts are added.

            Host(s): Input: compute2, then Enter
            
            compute2 added to administrative host list
            compute2 added to submit host list
            Hit <RETURN> to continue >> Input: Enter
            
            
            Adding admin and submit hosts
            -----------------------------

            Please enter a blank seperated list of hosts.

            Stop by entering <RETURN>. You may repeat this step until you are
            entering an empty list. You will see messages from Grid Engine
            when the hosts are added.

            Host(s): Input: Enter

            Finished adding hosts. Hit <RETURN> to continue >> Input: Enter
            
            If you want to use a shadow host, it is recommended to add this host
            to the list of administrative hosts.

            If you are not sure, it is also possible to add or remove hosts after the
            installation with <qconf -ah hostname> for adding and <qconf -dh hostname>
            for removing this host

            Attention: This is not the shadow host installation
            procedure.
            You still have to install the shadow host separately

            Do you want to add your shadow host(s) now? (y/n) [y] >> Input: n, then Enter
            
            Creating the default <all.q> queue and <allhosts> hostgroup
            -----------------------------------------------------------

            root@master added "@allhosts" to host group list
            root@master added "all.q" to cluster queue list

            Hit <RETURN> to continue >> Input: Enter
            
            Scheduler Tuning
            ----------------

            The details on the different options are described in the manual.

            Configurations
            --------------
            1) Normal
                      Fixed interval scheduling, report limited scheduling information,
                      actual + assumed load

            2) High
                      Fixed interval scheduling, report limited scheduling information,
                      actual load

            3) Max
                      Immediate Scheduling, report no scheduling information,
                      actual load

            Enter the number of your preferred configuration and hit <RETURN>!
            Default configuration is [1] >> Input: Enter
            
            We're configuring the scheduler with >Normal< settings!
            Do you agree? (y/n) [y] >> Input: Enter
            
            Using Grid Engine
            -----------------

            You should now enter the command:

               source /opt/gridengine/default/common/settings.csh

            if you are a csh/tcsh user or

               # . /opt/gridengine/default/common/settings.sh

            if you are a sh/ksh user.

            This will set or expand the following environment variables:

               - $SGE_ROOT         (always necessary)
               - $SGE_CELL         (if you are using a cell other than >default<)
               - $SGE_CLUSTER_NAME (always necessary)
               - $SGE_QMASTER_PORT (if you haven't added the service >sge_qmaster<)
               - $SGE_EXECD_PORT   (if you haven't added the service >sge_execd<)
               - $PATH/$path       (to find the Grid Engine binaries)
               - $MANPATH          (to access the manual pages)

            Hit <RETURN> to see where Grid Engine logs messages >> Input: Enter
            
            Grid Engine messages
            --------------------

            Grid Engine messages can be found at:

               /tmp/qmaster_messages (during qmaster startup)
               /tmp/execd_messages   (during execution daemon startup)

            After startup the daemons log their messages in their spool directories.

               Qmaster:     /opt/gridengine/default/spool/qmaster/messages
               Exec daemon: <execd_spool_dir>/<hostname>/messages


            Grid Engine startup scripts
            ---------------------------

            Grid Engine startup scripts can be found at:

               /opt/gridengine/default/common/sgemaster (qmaster)
               /opt/gridengine/default/common/sgeexecd (execd)

            Do you want to see previous screen about using Grid Engine again (y/n) [n] >> Input: Enter
            
            Your Grid Engine qmaster installation is now completed
            ------------------------------------------------------

            Please now login to all hosts where you want to run an execution daemon
            and start the execution host installation procedure.

            If you want to run an execution daemon on this host, please do not forget
            to make the execution host installation in this host as well.

            All execution hosts must be administrative hosts during the installation.
            All hosts which you added to the list of administrative hosts during this
            installation procedure can now be installed.

            You may verify your administrative hosts with the command

               # qconf -sh

            and you may add new administrative hosts with the command

               # qconf -ah <hostname>

            Please hit <RETURN> >> Input: Enter
        
    10. After the installation completes, add the following line to /etc/profile on every cluster node
    
        #cat /etc/profile
        
            . /opt/gridengine/default/common/settings.sh
        
        #source /etc/profile
        
    11. On the master node, list the administrative hosts

        #qconf -sh
        
    12. On compute1 and compute2, install the execution daemon
    
        #cd /opt/gridengine
        #./install_execd
        
        Welcome to the Grid Engine execution host installation
        ------------------------------------------------------

        If you haven't installed the Grid Engine qmaster host yet, you must execute
        this step (with >install_qmaster<) prior the execution host installation.

        For a sucessfull installation you need a running Grid Engine qmaster. It is
        also neccesary that this host is an administrative host.

        You can verify your current list of administrative hosts with
        the command:

           # qconf -sh

        You can add an administrative host with the command:

           # qconf -ah <hostname>

        The execution host installation will take approximately 5 minutes.

        Hit <RETURN> to continue >> Input: Enter


        Checking $SGE_ROOT directory
        ----------------------------

        The Grid Engine root directory is:

           $SGE_ROOT = /opt/gridengine

        If this directory is not correct (e.g. it may contain an automounter
        prefix) enter the correct path to this directory or hit <RETURN>
        to use default [/opt/gridengine] >> Input: Enter

        Your $SGE_ROOT directory: /opt/gridengine

        Hit <RETURN> to continue >> Input: Enter


        Grid Engine cells
        -----------------

        Please enter cell name which you used for the qmaster
        installation or press <RETURN> to use [default] >> Input: Enter

        Using cell: >default<

        Hit <RETURN> to continue >> Input: Enter



        Grid Engine TCP/IP communication service
        ----------------------------------------

        The port for sge_execd is currently set as service.

           sge_execd service set to port 6445

        Hit <RETURN> to continue >>

        Checking hostname resolving
        ---------------------------

        This hostname is known at qmaster as an administrative host.

        Hit <RETURN> to continue >>

        Execd spool directory configuration
        -----------------------------------

        You defined a global spool directory when you installed the master host.
        You can use that directory for spooling jobs from this execution host
        or you can define a different spool directory for this execution host.

        ATTENTION: For most operating systems, the spool directory does not have to
        be located on a local disk. The spool directory can be located on a
        network-accessible drive. However, using a local spool directory provides
        better performance.

        FOR WINDOWS USERS: On Windows systems, the spool directory MUST be located
        on a local disk. If you install an execution daemon on a Windows system
        without a local spool directory, the execution host is unusable.

        The spool directory is currently set to:
        <</opt/gridengine/default/spool/compute1>>

        Do you want to configure a different spool directory
        for this host (y/n) [n] >> Input: y, then press Enter

        Enter the spool directory now! >> Input: /home/sgeadmin/compute1, then press Enter (on the compute2 node, enter /home/sgeadmin/compute2 instead)
        Using execd spool directory [/home/sgeadmin/compute1]
        Hit <RETURN> to continue >> Input: press Enter

        creating directory: /home/sgeadmin/compute1

        Creating local configuration
        ----------------------------
        sgeadmin@compute1 added "compute1" to configuration list
        Local configuration for host >compute1< created.

        Hit <RETURN> to continue >> Input: press Enter

        execd startup script
        --------------------

        We can install the startup script that will
        start execd at machine boot (y/n) [y] >> Input: press Enter

        Installing startup script /etc/rc.d/rc3.d/S96sgeexecd.p6444 and /etc/rc.d/rc3.d/K02sgeexecd.p6444

        Hit <RETURN> to continue >> Input: press Enter

        Grid Engine execution daemon startup
        ------------------------------------

        Starting execution daemon. Please wait ...
           starting sge_execd

        Hit <RETURN> to continue >> Input: press Enter

        Adding a queue for this host
        ----------------------------

        We can now add a queue instance for this host:

           - it is added to the >allhosts< hostgroup
           - the queue provides 4 slot(s) for jobs in all queues
             referencing the >allhosts< hostgroup

        You do not need to add this host now, but before running jobs on this host
        it must be added to at least one queue.

        Do you want to add a default queue instance for this host (y/n) [y] >> Input: press Enter

        root@compute1 modified "@allhosts" in host group list
        root@compute1 modified "all.q" in cluster queue list

        Hit <RETURN> to continue >> Input: press Enter



        Using Grid Engine
        -----------------

        You should now enter the command:

           source /opt/gridengine/default/common/settings.csh

        if you are a csh/tcsh user or

           # . /opt/gridengine/default/common/settings.sh

        if you are a sh/ksh user.

        This will set or expand the following environment variables:

           - $SGE_ROOT         (always necessary)
           - $SGE_CELL         (if you are using a cell other than >default<)
           - $SGE_CLUSTER_NAME (always necessary)
           - $SGE_QMASTER_PORT (if you haven't added the service >sge_qmaster<)
           - $SGE_EXECD_PORT   (if you haven't added the service >sge_execd<)
           - $PATH/$path       (to find the Grid Engine binaries)
           - $MANPATH          (to access the manual pages)

        Hit <RETURN> to see where Grid Engine logs messages >> Input: press Enter
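
In an sh/bash session, loading the settings file described on this screen might look like the sketch below. The paths are the ones used in this installation (`/opt/gridengine`, cell `default`). The helper name `sge_settings_file` is an assumption made for illustration and is not provided by Grid Engine.

```shell
#!/bin/sh
# Hypothetical helper: build the settings.sh path from an SGE root and cell,
# defaulting to the values used in this installation.
sge_settings_file() {
    root=${1:-/opt/gridengine}
    cell=${2:-default}
    printf '%s/%s/common/settings.sh\n' "$root" "$cell"
}

settings=$(sge_settings_file)
if [ -r "$settings" ]; then
    # Load SGE_ROOT, SGE_CELL, PATH, MANPATH, ... into the current shell.
    . "$settings"
    echo "SGE_ROOT=$SGE_ROOT"
else
    echo "settings file not found: $settings"
fi
```

To have the variables available at every login, the same `. /opt/gridengine/default/common/settings.sh` line can be appended to the user's `~/.bashrc`.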

        Grid Engine messages
        --------------------

        Grid Engine messages can be found at:

           /tmp/qmaster_messages (during qmaster startup)
           /tmp/execd_messages   (during execution daemon startup)

        After startup the daemons log their messages in their spool directories.

           Qmaster:     /opt/gridengine/default/spool/qmaster/messages
           Exec daemon: <execd_spool_dir>/<hostname>/messages


        Grid Engine startup scripts
        ---------------------------

        Grid Engine startup scripts can be found at:

           /opt/gridengine/default/common/sgemaster (qmaster)
           /opt/gridengine/default/common/sgeexecd (execd)

        Do you want to see previous screen about using Grid Engine again (y/n) [n] >> Input: press Enter

        Your execution daemon installation is now completed.

        Installation complete.
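
Once the execution daemon is running, the result can be checked from a host that has the settings file loaded. This is a rough sketch: `qhost` and `qstat -f` are standard Grid Engine commands, while the helper `execd_messages_file` is hypothetical and simply follows the `<execd_spool_dir>/<hostname>/messages` pattern printed by the installer.

```shell
#!/bin/sh
# Hypothetical helper: path of the execd log, following the installer's
# <execd_spool_dir>/<hostname>/messages pattern.
execd_messages_file() {
    printf '%s/%s/messages\n' "$1" "$2"
}

if command -v qhost >/dev/null 2>&1; then
    qhost        # the new execution host should appear in this list
    qstat -f     # the all.q queue instance for the host should be shown
else
    echo "qhost not found; load settings.sh first"
fi

# Spool directory entered during this installation on compute1.
log=$(execd_messages_file /home/sgeadmin/compute1 compute1)
if [ -r "$log" ]; then
    tail -n 20 "$log"
fi
```

If the host is missing from the `qhost` output, the execd log and `/tmp/execd_messages` are the first places to look.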

posted @ 2019-04-17 16:24  阿谦