【RAC】RAC中的负载均衡和故障切换--TAF配置

涉及到的内容包括:





  

Oracle RAC 客户端连接负载均衡(Load Balance)

 

     实现负载均衡(Load Balance)是Oracle RAC最重要的特性之一,主要是把负载平均分配到集群中的各个节点,以提高系统的整体吞吐能力。通常情况下有两种方式来实现负载均衡,一个是基于客户端连接的负载均衡,一个是基于服务器端监听器(Listener)收集到的信息来将新的连接请求分配到连接数较少实例上的实现方式。本文主要讨论的是基于客户端连接的负载均衡,并给出演示。

        与负载均衡配置之前的监听配置请参考
              ORACLE RAC 监听配置 (listener.ora tnsnames.ora) 
              ORACLE RAC 下非缺省端口监听配置(listener.ora tnsnames.ora)

 

一、客户端的负载均衡
         客户端的负载均衡主要是通过为tnsnames.ora增加load_balance=yes条目来实现,下面看看oracle(Note:226880.1)的解释

        The client load balancing feature enables clients to randomize connection requests among the listeners. Oracle Net progresses through the list of
        protocol addresses in a random sequence, balancing the load on the various listeners. Without client load balancing, Oracle Net progresses through the
        list of protocol addresses sequentially until one succeeds.  This normally is referred to connect-time load balance.

        从上面的描述中可以得知,如果未开启load_balance=yes时,Oracle Net会根据地址列表按顺序来选择一个进行连接,直到连接成功为止。   
        如果第一个host主机连接失败,在有多个地址的情形下,接下来选择第二个地址连接,依此类推,直到连接成功为止。
        当开启了load_balance=yes时,则Oracle Net会从多个地址中随机地选择一个地址进行连接,直到连接成功为止。
        注意,此连接方式仅根据地址列表随机选择,并不考虑到各个实例上当前真正连接数量的多少,也即是没有考虑各个节点真实的连接负载情况。

 

二、服务器与客户端的配置情况

[sql] view plain copy  print?
1、服务器端监听器配置  
oracle@bo2dbp:~> more $ORACLE_HOME/network/admin/listener.ora  --#节点bo2dbp上的listener  
# listener.ora.bo2dbp Network Configuration File: /u01/oracle/db/network/admin/listener.ora.bo2dbp  
# Generated by Oracle configuration tools.  
  
LISTENER_NEW_BO2DBP =  
  (DESCRIPTION_LIST =  
    (DESCRIPTION =  
      (ADDRESS = (PROTOCOL = TCP)(HOST = bo2dbp-vip.2gotrade.com)(PORT = 1314)(IP = FIRST))  
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.51)(PORT = 1314)(IP = FIRST))  
    )  
  )  
...........................--#其余部分省略,注意此处的配置是使用了1314的非缺省监听器端口号     
   
oracle@bo2dbs:/u01/oracle/db/network/admin> more listener.ora  --#节点bo2dbs上的listener  
# listener.ora.bo2dbs Network Configuration File: /u01/oracle/db/network/admin/listener.ora.bo2dbs  
# Generated by Oracle configuration tools.  
  
LISTENER_NEW_BO2DBS =  
  (DESCRIPTION_LIST =  
    (DESCRIPTION =  
      (ADDRESS = (PROTOCOL = TCP)(HOST = bo2dbs-vip.2gotrade.com)(PORT = 1314)(IP = FIRST))  
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.52)(PORT = 1314)(IP = FIRST))  
    )  
  )  
...............  
  
2、参数配置  
-->instrance ora10g1上的参数配置  
SQL> show parameter instance_na  
  
NAME                                 TYPE        VALUE  
------------------------------------ ----------- ------------------------------  
instance_name                        string      ora10g1  
  
SQL> show parameter listener  
  
NAME                                 TYPE        VALUE  
------------------------------------ ----------- ------------------------------  
local_listener                       string      local_lsnr_ora10g1  
remote_listener                      string      remote_lsnr_ora10g  
  
-->instrance ora10g2上的参数配置  
SQL> show parameter instance_name  
  
NAME                                 TYPE        VALUE  
------------------------------------ ----------- ------------------------------  
instance_name                        string      ora10g2  
  
SQL> show parameter listener  
  
NAME                                 TYPE        VALUE  
------------------------------------ ----------- ------------------------------  
local_listener                       string      local_lsnr_ora10g2  
remote_listener                      string      remote_lsnr_ora10g  
  
-->instrance ora10g2上的监听信息  
SQL> ho ps -ef | grep lsnr  
oracle   17372     1  0 11:00 ?        00:00:00 /u01/oracle/db/bin/tnslsnr LISTENER_NEW_BO2DBS -inherit  
oracle   17502 24301  0 12:10 pts/0    00:00:00 /bin/bash -c ps -ef | grep lsnr  
oracle   17504 17502  0 12:10 pts/0    00:00:00 grep lsnr  
  
SQL> ho lsnrctl status LISTENER_NEW_BO2DBS  
Listening Endpoints Summary...  
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.62)(PORT=1314)))  
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.52)(PORT=1314)))  
Services Summary...  
Service "ora10g" has 2 instance(s).  
  Instance "ora10g1", status READY, has 1 handler(s) for this service...  
  Instance "ora10g2", status READY, has 2 handler(s) for this service...  
..........  
    
-->由于只是测试基于客户端的load balance,因此移出remote_listener参数  
SQL> alter system reset remote_listener scope=both sid='*';  
alter system reset remote_listener scope=both sid='*'  
*  
ERROR at line 1:  
ORA-32009: cannot reset the memory value for instance * from instance ora10g2  
  
SQL> alter system reset remote_listener scope=spfile sid='*';  
  
System altered.  
  
SQL> ho srvctl stop database -d ora10g  -->关闭数据库ora10g  
  
SQL> ho srvctl start database -d ora10g  -->启动数据库ora10g使得刚刚修改的remote_listener生效  
  
SQL> ho lsnrctl status LISTENER_NEW_BO2DBS  -->此时可以看到只有ora10g2注册到监听器  
  
Listening Endpoints Summary...  
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.62)(PORT=1314)))  
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.52)(PORT=1314)))  
Services Summary...  
Service "PLSExtProc" has 1 instance(s).  
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...  
Service "ora10g" has 1 instance(s).  
  Instance "ora10g2", status READY, has 1 handler(s) for this service...  
..........    
  
oracle@bo2dbp:~> lsnrctl status LISTENER_NEW_BO2DBP  #同样在节点bo2dbp也只有ora10g1注册到监听器  
  
Listening Endpoints Summary...  
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.61)(PORT=1314)))  
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.51)(PORT=1314)))  
Services Summary...  
Service "ora10g" has 1 instance(s).  
  Instance "ora10g1", status READY, has 1 handler(s) for this service...  
.........  
  
3、客户端配置  
SZDB:~ # ifconfig eth1 | grep "inet addr"|cut -d " " -f12|cut -d : -f2  #客户端主机的ip  
192.168.7.2  
  
SZDB:~ # su - oracle  
oracle@SZDB:~> cat /etc/hosts     --#客户端主机添加了RAC上的两个虚拟节点的ip信息  
192.168.7.61   bo2dbp-vip.2gotrade.com    bo2dbp-vip  
192.168.7.62   bo2dbs-vip.2gotrade.com    bo2dbs-vip  
  
oracle@SZDB:~> tail -12 $ORACLE_HOME/network/admin/tnsnames.ora  
  
ORA10G =  
  (DESCRIPTION =  
    (ADDRESS = (PROTOCOL = TCP)(HOST = bo2dbp-vip.2gotrade.com)(PORT = 1314))  
    (ADDRESS = (PROTOCOL = TCP)(HOST = bo2dbs-vip.2gotrade.com)(PORT = 1314))  
    (LOAD_BALANCE = yes)  
    (CONNECT_DATA =  
      (SERVER = DEDICATED)  
      (SERVICE_NAME = ora10g)  
    )  
  )  

  三、测试负载均衡(load balance)

[sql] view plain copy  print?
1、启用load balance的测试  
   
oracle@SZDB:~> more load_balance.sh   
#!/bin/bash  
for i in {1..100}  
do  
echo $i  
sqlplus -S system/oracle@ORA10G <
select instance_name from v\$instance;  
EOF  
sleep 1  
done  
exit 0    
  
# Author: Robinson Cheng  
# Blog :  http://blog.csdn.net/robinson_0612    
oracle@SZDB:~> ./load_balance.sh >load_bal.log  
oracle@SZDB:~> head -20 load_bal.log  
1  
  
INSTANCE_NAME  
----------------  
ora10g2  
  
2  
  
INSTANCE_NAME  
----------------  
ora10g1  
  
3  
  
INSTANCE_NAME  
----------------  
ora10g2  
  
4  
  
oracle@SZDB:~> grep ora10g1 load_bal.log |wc -l  
47  
oracle@SZDB:~> grep ora10g2 load_bal.log |wc -l  
53  
  
从上面的log日志中可以看出启用客户端的负载均衡基本上使得从客户端发起连接的能够保持均衡。  
    
2、未启用load balance的测试    
从客户端的tnsnames.ora中移出(LOAD_BALANCE = yes)选项,然后继续使用上面的脚本来测试  
oracle@SZDB:~> grep ora10g1 no_load_bal.log |wc -l  
100  
oracle@SZDB:~> grep ora10g2 no_load_bal.log |wc -l  
0  
  
从上面的日志中可以看出当移出LOAD_BALANCE = yes项后,所有的用户连接请求都被定为到ora10g1,这是因为连接请求从tnsnames.ora中选择  
列在ADDRESS项中排在第一行的位置。  
下面我们关闭实例ora10g1,再来测试连接情形  
  
oracle@bo2dbp:~> srvctl stop instance -d ora10g -i ora10g1  
  
oracle@SZDB:~> ./load_balance.sh >no_load_bal_new.log  
oracle@SZDB:~> grep ora10g1 no_load_bal_new.log |wc -l  
0  
oracle@SZDB:~> grep ora10g2 no_load_bal_new.log |wc -l  
100    
  
由于实例ora10g1已经关闭,因此所有的连接请求都被分配到ora10g2。  

 四、总结  
1、客户端的负载均衡配置较为简单,仅仅是在客户端的tnsnames.ora添加 LOAD_BALANCE = yes |on 
2、其连接分配原则是根据tnsnames.ora中连接标识符下的ADDRESS列表随机选择来进行与服务器之间的连接
3、如果选择列表中的某个节点listener或instance不可用,则再从剩余的ADDRESS列表随机选择,直到成功为止

 



 

 

Oracle RAC 服务器端连接负载均衡(Load Balance)

 

 

 

  Oracle RAC服务器端的负载均衡是根据RAC中各节点的连接负荷数情况,将新的连接请求分配到负荷最小的节点上去。当数据库处于运行时,RAC中各节点的PMON进程每3秒会将各自节点的连接负荷数更新到service_register。而对于节点中任意监听器故障或监听器意外失败时,PMON进程会每1秒钟检查当前节点上的监听是否重启,以获得最新的负载信息来及时调整负载均衡。本文主要演示suse 10 + oracle 10g rac下的服务器端的负载均衡。

 

有关客户端的负载均衡可参考
      Oracle RAC 客户端连接负载均衡(Load Balance)
      配置RAC负载均衡与故障转移
 
有关Oracle 网络配置相关基础以及概念性的问题请参考:
        配置ORACLE 客户端连接到数据库 
        配置非默认端口的动态服务注册 
        配置sqlnet.ora限制IP访问Oracle 
       Oracle 监听器日志配置与管理 
       设置 Oracle 监听器密码(LISTENER)
       Oracle RAC 监听配置
       ORACLE RAC 下非缺省端口监听配置(listener.ora tnsnames.ora)

 

一、服务器端负载均衡配置

[sql] view plain copy  print?
1、为tnsnames.ora 添加相应的网络服务名(每个节点配置)  
  oracle@bo2dbp:~> more $ORACLE_HOME/network/admin/tnsnames.ora  
  # tnsnames.ora Network Configuration File: /u01/oracle/db/network/admin/tnsnames.ora  
  # Generated by Oracle configuration tools.  
    
  remote_lsnr_gobo4 =  
    (ADDRESS_LIST =  
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.61)(PORT = 1521))  
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.62)(PORT = 1521))  
    )  
    
  local_lsnr_gobo4a =  
    (ADDRESS_LIST =  
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.61)(PORT = 1521))  
    )  
    
  local_lsnr_gobo4b =  
    (ADDRESS_LIST =  
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.62)(PORT = 1521))  
    )  
      
2、设置remote_listener参数  
  alter system set remote_listener='' scope=both sid='*';  
    
  SQL> show parameter instance_name  
    
  NAME                                 TYPE        VALUE  
  ------------------------------------ ----------- ------------------------------  
  instance_name                        string      GOBO4A  
  SQL> show parameter listener  
    
  NAME                                 TYPE        VALUE  
  ------------------------------------ ----------- ------------------------------  
  local_listener                       string  
  remote_listener                      string  
    
  SQL> alter system set remote_listener='remote_lsnr_gobo4' scope=both sid='*';  
    
  System altered.  
  
3、配置客户端tnsnames.ora  
  -->客户端为suse 10  
  SZDB:~ # ifconfig eth1 | grep "inet addr"|cut -d " " -f12|cut -d : -f2  #--客户端主机的ip  
  192.168.7.2  
  SZDB:~ # su - oracle  
  oracle@SZDB:~> tail -10 $ORACLE_HOME/network/admin/tnsnames.ora  
  GOBO4 =  
    (DESCRIPTION =  
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.61)(PORT = 1521))  
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.62)(PORT = 1521))  
      (LOAD_BALANCE = off)   #--由于仅仅测试寄予服务器端的负载均衡,因此关闭客户端负载均衡选项  
      (CONNECT_DATA =  
        (SERVER = DEDICATED)  
        (SERVICE_NAME = GOBO4)  
      )  
    )  
  #Author : Robinson Cheng  
  #Blog :   http://blog.csdn.net/robinson_0612    
   
4、检查监听情况  
  oracle@bo2dbp:~> lsnrctl status  
  Listening Endpoints Summary...  
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.61)(PORT=1521)))  
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.51)(PORT=1521)))  
  Services Summary...  
  Service "+ASM" has 1 instance(s).  
    Instance "+ASM1", status BLOCKED, has 1 handler(s) for this service...  
  Service "GOBO4" has 2 instance(s).  
    Instance "GOBO4A", status READY, has 2 handler(s) for this service...  
    Instance "GOBO4B", status READY, has 1 handler(s) for this service..  
  .......  
    
  oracle@bo2dbs:~> lsnrctl status  
  Listening Endpoints Summary...  
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.62)(PORT=1521)))  
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.52)(PORT=1521)))  
  Services Summary...  
  Service "+ASM" has 1 instance(s).  
    Instance "+ASM2", status BLOCKED, has 1 handler(s) for this service...  
  Service "GOBO4" has 2 instance(s).  
    Instance "GOBO4A", status READY, has 1 handler(s) for this service...  
    Instance "GOBO4B", status READY, has 2 handler(s) for this service...  
  ..........  
  
  #--如果监听或数据库需要重启异常请考虑重新启动监听器或数据库  
  #--下面清空监听日志以便于后续统计连接信息  
  oracle@bo2dbp:/u01/oracle/db/network/log> cat /dev/null>listener_bo2dbp.log  
  oracle@bo2dbs:/u01/oracle/db/network/log> cat /dev/null>listener_bo2dbs.log  

  

二、测试服务器端的负载均衡

[python] view plain copy 
 
 print?
  1. 1、从客户端建立连接  
  2.   oracle@SZDB:~> more load_balance.sh  
  3.   #!/bin/bash  
  4.   for i in {1..1000}  
  5.   do  
  6.   echo $i  
  7.   sqlplus -S system/oracle@GOBO4 <<eof  < span="">
  8.   select instance_name from v\$instance;  
  9.   EOF  
  10.   sleep 1  
  11.   done  
  12.   exit 0  
  13.   oracle@SZDB:~> ./load_balance.sh >srv_load_bal.log  
  14.   
  15. 2、分析监听日志  
  16.   oracle@bo2dbp:/u01/oracle/db/network/log> more listener_bo2dbp.log  
  17.   12-OCT-2012 12:00:10 * service_update * GOBO4B * 0   #节点bo2dbs上的实例GOBO4B的更新到bo2dbp上监听器的更新信息  
  18.   12-OCT-2012 12:00:35 * service_update * GOBO4B * 0  
  19.   12-OCT-2012 12:01:04 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=GOBO4)(CID=(PROGRAM=sqlplus@SZDB)(HOST=SZDB)  
  20.   (USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.2)(PORT=50322)) * establish * GOBO4 * 0  
  21.   12-OCT-2012 12:01:05 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=GOBO4)(CID=(PROGRAM=sqlplus@SZDB)(HOST=SZDB)  
  22.   (USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.2)(PORT=50325)) * establish * GOBO4 * 0  
  23.   12-OCT-2012 12:01:07 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=GOBO4)(CID=(PROGRAM=sqlplus@SZDB)(HOST=SZDB)  
  24.   (USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.2)(PORT=50328)) * establish * GOBO4 * 0  
  25.   12-OCT-2012 12:01:08 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=GOBO4)(CID=(PROGRAM=sqlplus@SZDB)(HOST=SZDB)  
  26.   (USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.2)(PORT=50330)) * establish * GOBO4 * 0  
  27.   
  28.   #上面的日志片断中可以看出全部是客户端发起的到bo2dbp节点上的建立连接的信息  
  29.     
  30.   #下面来查看bo2dbs上的监听日志  
  31.   oracle@bo2dbs:/u01/oracle/db/network/log> more listener_bo2dbs.log  
  32.   12-OCT-2012 12:00:10 * service_update * GOBO4B * 0  
  33.   12-OCT-2012 12:00:10 * service_update * GOBO4B * 0  
  34.   12-OCT-2012 12:00:14 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=bo2dbs)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)  
  35.   (SERVICE=LISTENER_BO2DBS)(VERSION=169870080)) * status * 0  
  36.   12-OCT-2012 12:00:35 * service_update * GOBO4B * 0  
  37.   12-OCT-2012 12:00:35 * service_update * GOBO4B * 0  
  38.   12-OCT-2012 12:01:04 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=GOBO4)(CID=(PROGRAM=sqlplus@SZDB)(HOST=SZDB)  
  39.   (USER=oracle))(INSTANCE_NAME=GOBO4B)) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.2)(PORT=61862)) * establish * GOBO4 * 0  
  40.   12-OCT-2012 12:01:07 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=GOBO4)(CID=(PROGRAM=sqlplus@SZDB)(HOST=SZDB)  
  41.   (USER=oracle))(INSTANCE_NAME=GOBO4B)) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.2)(PORT=61868)) * establish * GOBO4 * 0  
  42.   12-OCT-2012 12:01:09 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=GOBO4)(CID=(PROGRAM=sqlplus@SZDB)(HOST=SZDB)  
  43.   (USER=oracle))(INSTANCE_NAME=GOBO4B)) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.7.2)(PORT=61872)) * establish * GOBO4 * 0  
  44.     
  45.   #在12-OCT-2012 12:01:04时刻,连接信息中有INSTANCE_NAME=GOBO4B的连接信息,而节点bo2dbp上也有一条类似的信息,因此该条连接  
  46.   #日志是由节点bo2dbp转发过来而建立的连接请求。  
  47.   #同样在12-OCT-2012 12:01:07时刻,节点bo2dbp转发过来而建立的连接请求。  
  48.     
  49.   #小结一下,  
  50.   #对于直接连接,监听器日志中将出现establish,且不含有INSTANCE_NAME=GOBO4B 字样  
  51.   #而对于转发的连接,则转发节点与接收的节点同时存在连接信息,转发节点上存在连接信息的与普通的连接请求一样,  
  52.   #而接收的节点上存在INSTANCE_NAME=<instance_name> 信息  
  53.   
  54. 3、检查负载均衡结果  
  55.   oracle@SZDB:~> grep GOBO4A srv_load_bal.log |wc -l  
  56.   755  
  57.   oracle@SZDB:~> grep GOBO4B srv_load_bal.log |wc -l  
  58.   245  
  59.   #从上面的日志文件中可知总共有755个客户端连接到了gobo4a,有245各客户端连接到了gobo4b  
  60.     
  61.   #下面查看监听器日志来获得连接信息  
  62.   #下面的查询中在节点bo2dbp上总共有接受了1000个用户连接  
  63.   oracle@bo2dbp:/u01/oracle/db/network/log> grep establish listener_bo2dbp.log |wc -l  
  64.   1000  
  65.     
  66.   #下面的查询查看是否有从节点bo2dbs转发过来的连接,结果为0,说明没有任何连接请求从bo2dbs转发过来  
  67.   oracle@bo2dbp:/u01/oracle/db/network/log> grep INSTANCE_NAME=GOBO4 listener_bo2dbp.log |wc -l  
  68.   0  
  69.     
  70.   #接下来查看节点bo2dbs的监听日志,可以看出总共接受了245个连接请求  
  71.   oracle@bo2dbs:/u01/oracle/db/network/log> grep establish listener_bo2dbs.log |wc -l  
  72.   245  
  73.     
  74.   #下面的过滤情况也表明在节点bo2dbs上的连接是从bo2dbp上转发的连接,而非客户端直接到bo2dbs的请求连接  
  75.   oracle@bo2dbs:/u01/oracle/db/network/log> grep INSTANCE_NAME=GOBO4 listener_bo2dbs.log |wc -l  
  76.   245  
  77.     
  78.   #从监听器的日志检查可以,测试中的连接全部请求到节点bo2dbp,是由于tnsnames.ora中ADDRESS的第一个IP地址就是bo2dbp的IP  
  79.   #因此所有的连接都是请求到bo2dbp,而没有客户端发出到bo2dbs的连接请求  
  80.   #其次是尽管在bo2dbp有1000个连接请求,而真正建立连接的只有755个,有245转发到了节点bo2dbs  

三、总结
      1、服务器端的负载均衡需要配置remote_listener参数,而该参数的值依赖于tnsnames.ora的连接字符串
      2、对于基于服务器端的连接负载均衡,监听器会根据当前节点、实例上的连接负载情况进行转发到空闲的实例
      3、转发的依据仅仅是当前节点监听的连接数量的多少,而非当前实例的过度负载
      4、从上面的测试可以得出,各个节点的连接并不算均衡,是相对的均衡,因此应结合客户端连接负载协同工作
      5、对于当前实例的过度负载的情形,应结合配置service方法来实现负载均衡  



 

配置 RAC 负载均衡与故障转移

  Oracle负载均衡主要是指新会话连接到RAC数据库时,如何判定这个新的连接要连到哪个节点进行工作?通常情况下,负载均衡分为客户端负载
均衡与服务器端负载均衡。客户端负载均衡通常是在客户端的tnsnames.ora中多添加一个链接地址以及LOAD_BALANCE与failover参数。而服务器
端的负载均衡则相对复杂,下面具体描述服务器端负载均衡。
    
一、负载均衡
    注意这里的负载均衡指的是连接的负载均衡,即客户可以随机从不同的实例中连接到数据库
    1.配置tnsnames.ora使得该文件中包含如下全部内容:       

[sql] view plain copy 
 
 print?
  1. # LISTENERS_DEVDB DEVDB是数据库名,可以使用netmgr,netca编辑或直接使用Vim创建  
  2. LISTENERS_DEVDB =                                                             
  3.   (ADDRESS_LIST =                                                             
  4.     (ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip.robinson.com)(PORT = 1521))     
  5.     (ADDRESS = (PROTOCOL = TCP)(HOST = rac2-vip.robinson.com)(PORT = 1521))     
  6.   )                                                                           

    2.配置参数文件remote_listener          

[sql] view plain copy 
 
 print?
  1. SQL> alter system set remote_listener='LISTENERS_DEVDB' scope=both sid='*';   

    3.需要配置连接描述信息的两个IP地址、端口号、以及load_balance子项为yes (主要是load_balance子项)      

[sql] view plain copy 
 
 print?
  1. DEVDB =                                                                         
  2.   (DESCRIPTION =                                                                
  3.     (ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip.robinson.com)(PORT = 1521))       
  4.     (ADDRESS = (PROTOCOL = TCP)(HOST = rac2-vip.robinson.com)(PORT = 1521))       
  5.     (LOAD_BALANCE = yes)                                                          
  6.     (CONNECT_DATA =                                                               
  7.       (SERVER = DEDICATED)                                                        
  8.       (SERVICE_NAME = devdb.robinson.com)                                         
  9.     )                                                                             
  10.   )                                                                             

    4.查看侦听器的状态,从下面可以看到devdb.robinson.com服务中有两个实例为其提供服务     

[sql] view plain copy 
 
 print?
  1. [oracle@rac2 ~]$ lsnrctl status                                                  
  2.     Service "devdb.robinson.com" has 2 instance(s).                                
  3.       Instance "devdb1", status READY, has 1 handler(s) for this service...        
  4.       Instance "devdb2", status READY, has 2 handler(s) for this service...        

    5.测试负载均衡
    使用shell脚本来进行测试负载均衡      

[sql] view plain copy 
 
 print?
  1. --编辑TestLoadBalance.sh                                                                            
  2.     #!/bin/bash                                                                                       
  3.     #Usage: TestLoadBalance devdb 1000                                                                
  4.     count=0                                                                                           
  5.     while [ $count -lt $2 ]   # Set up a loop control                                                 
  6.     do                        # Begin the loop                                                        
  7.         count='expr $count + 1' # Increment the counter                                                 
  8.         sqlplus -s usr1/usr1pwd@$1 @TestLoadBalance.sql  # Connect instance and execute sql statement   
  9.         sleep 1                                                                                         
  10.     done                                                                                              
  11.                                                                                                     
  12. --TestLoadBalance.sql 脚本                                                                          
  13.     col instance_name format a30                                                                      
  14.     select instance_name from v$instance;                                                             
  15.                                                                                                     
  16. --实施测试                                                                                          
  17.     ./TestLoadBalance.sh devdb 1000                                                                   
  18.                                                                                                       
  19. --查看结果                                                                                          
  20.     SQL> select inst_id,count(1) from gv$instance group by inst_d;                                                    
  21.                                                                                                       
  22.     INST_ID       COUNT(1)                                                                            
  23.     ----------    ----------                                                                          
  24.     devdb1        446                                                                                 
  25.     devdb2        554                                                                                 

二、配置故障转移
    负载均衡是用于实现基于连接的负载均衡,但不能解决节点是否可用,一旦一个节点损坏,已成功连接的客户端并不能转移到其他正常服务的
    实例中。而故障转移功能则使得该功能得以实现。可以使用srvctl 和dbca来创建服务。下面使用dbca来创建一个新的服务,客户端连接到实
    例后,对故障实现透明切换。
    
    1.配置故障转移服务
        在节点rac1使用oracle帐户启动dbca工具,
            a.选择 Oracle Real Application Clusters database
            b.选择 Services Management
            c.集群数据库列表:单击 Next。 
            d.数据库服务:单击 Add。 
                添加服务:输入sales。 
                选择 devdb1 作为首选实例。 
                选择 devdb2 作为可用实例。 
                TAF 策略:选择 Basic。 
                单击 Finish。
            e.数据库配置助手:单击 No 退出。
        配置完毕后,在tnsnames.ora中多出了以下内容,注意是各个节点都会多出以下内容            

[sql] view plain copy 
 
 print?
  1. SALES =                                                                      
  2.   (DESCRIPTION =                                                             
  3.     (ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip.robinson.com)(PORT = 1521))    
  4.     (ADDRESS = (PROTOCOL = TCP)(HOST = rac2-vip.robinson.com)(PORT = 1521))    
  5.     (LOAD_BALANCE = yes)                                                       
  6.     (CONNECT_DATA =                                                            
  7.       (SERVER = DEDICATED)                                                     
  8.       (SERVICE_NAME = sales.robinson.com)                                      
  9.       (FAILOVER =      --failover_mode是实现故障转移的关键选项                 
  10.         (TYPE = SELECT)                                                          
  11.         (METHOD = BASIC)   --TAF 策略:此处当前为 Basic                          
  12.         (RETRIES = 180)                                                          
  13.         (DELAY = 5)                                                              
  14.       )                                                                        
  15.     )                                                                          
  16.   )                                                                          

        同时参数service_names会多出一个服务名,位于在配置数据库服务时选择的首选实例中      

[sql] view plain copy 
 
 print?
  1. SQL> select instance_name from v$instance;                                          
  2.                                                                                     
  3. INSTANCE_NAME                                                                       
  4. ----------------                                                                    
  5. devdb1                                                                              
  6.                                                                                     
  7. SQL> show parameter service_names                                                   
  8.                                                                                     
  9. NAME                                 TYPE        VALUE                              
  10. ------------------------------------ ----------- ------------------------------     
  11. service_names                        string      devdb.robinson.com, sales          
  12.                                                                                     
  13. SQL> select instance_name from v$instance;                                          
  14.                                                                                     
  15. INSTANCE_NAME                                                                       
  16. ----------------                                                                    
  17. devdb2                                                                              
  18.                                                                                     
  19. SQL> show parameter service                                                         
  20.                                                                                     
  21. NAME                                 TYPE        VALUE                              
  22. ------------------------------------ ----------- ------------------------------     
  23. service_names                        string      devdb.robinson.com                 

        使用srvctl工具也可以看到该服务已经正常开始提供服务        

[sql] view plain copy 
 
 print?
  1. SQL> ho srvctl status service -d devdb -s sales                              
  2. Service sales is running on instance(s) devdb1                               
  3.                                                                              
  4. SQL> ho lsnrctl status                                                       
  5.                                                                              
  6. Service "sales.robinson.com" has 1 instance(s).   --sales正常提供服务        
  7.   Instance "devdb1", status READY, has 2 handler(s) for this service...      

    2.实现故障转移
        下面使用帐户usr1,服务名sales从Windows客户端来登陆,注意要配置好客户端tnsnames,可以将服务器sales项内容全部复制到客户端tnsnames.ora中     

[sql] view plain copy 
 
 print?
  1. C:\>sqlplus usr1/usr1@sales                                                                         
  2. SQL> col host_name format a20                                                                       
  3. SQL> select instance_number ins_no,instance_name,host_name,status from v$instance;                  
  4.                                                                                                     
  5.     INS_NO INSTANCE_NAME    HOST_NAME            STATUS                                               
  6. ---------- ---------------- -------------------- ------------                                       
  7.          1 devdb1           rac1.robinson.com    OPEN                                                   
  8. SQL> select failover_type,failover_method,failed_over from v$session                                
  9.   2  where username='USR1';                                                                         
  10.                                                                                                     
  11. FAILOVER_TYPE FAILOVER_M FAI                                                                        
  12. ------------- ---------- ---                                                                        
  13. SELECT        BASIC      NO                                                                         
  14.                                                                                                     
  15. --从其它会话使用sys帐户登陆到crm 并关闭该实例                                                         
  16. SQL> show user;                                                                                     
  17. USER is "SYS"                                                                                       
  18. SQL> select instance_name from v$instance;                                                          
  19.                                                                                                     
  20. INSTANCE_NAME                                                                                       
  21. ----------------                                                                                    
  22. devdb1                                                                                              
  23.                                                                                                     
  24. SQL> shutdown abort                                                                                 
  25. ORACLE instance shut down.                                                                          
  26.                                                                                                     
  27. --从先前登陆到sales的会话中验证会话故障切换功能                                                     
  28. SQL> select instance_number ins_no,instance_name,host_name,status from v$instance;                  
  29.                                                                                                     
  30.     INS_NO INSTANCE_NAME    HOST_NAME            STATUS                                               
  31. ---------- ---------------- -------------------- ------------                                       
  32.                  2 devdb2           rac2.robinson.com    OPEN                                               
  33.                                                                                                     
  34. SQL> select failover_type,failover_method,failed_over from v$session                                
  35.   2  where username='USR1';  --第3列显示的为yes,也表明经过故障切换后提供的服务                     
  36.                                                                                                     
  37. FAILOVER_TYPE FAILOVER_M FAI                                                                        
  38. ------------- ---------- ---                                                                        
  39. SELECT        BASIC      YES                                                                            
  40.                                                                                                     
  41. --由下面的查询中可以看到服务名sales被添加到可用实例devdb2,节点rac2 的service_names参数中           
  42. SQL> select instance_name from v$instance;                                                          
  43.                                                                                                     
  44. INSTANCE_NAME                                                                                       
  45. ----------------                                                                                    
  46. devdb2                                                                                              
  47.                                                                                                     
  48. SQL> show parameter service                                                                         
  49.                                                                                                     
  50. NAME                                 TYPE        VALUE                                              
  51. ------------------------------------ ----------- ------------------------------                     
  52. service_names                        string      devdb.robinson.com, sales                            

    3.重新定位故障转移服务到首选实例
        对于首选实例从故障中恢复后,需要手动来重新定位到首选实例        

[sql] view plain copy 
 
 print?
  1. SQL> startup  --启动devdb1                                                           
  2. SQL> show parameter service_names                                                    
  3.                                                                                      
  4. NAME                                 TYPE        VALUE                               
  5. ------------------------------------ ----------- ------------------------------      
  6. service_names                        string      devdb.robinson.com                  
  7.                                                                                      
  8. SQL> ho srvctl relocate service -d devdb -s sales -i devdb2 -t devdb1                
  9.                                                                                      
  10. SQL> show parameter service_names                                                    
  11.                                                                                      
  12. NAME                                 TYPE        VALUE                               
  13. ------------------------------------ ----------- ------------------------------      
  14. service_names                        string      devdb.robinson.com, sales           

    4.DML故障转移(不同于DQL,因此单独列出)
        使用Windows客户端通过sales服务名登陆        

[sql] view plain copy 
 
 print?
  1. C:\>sqlplus usr1/usr1@sales                                                            
  2.                                                                                        
  3. SQL> show user;                                                                        
  4. USER is "USR1"                                                                         
  5. SQL> create table tb_temp (id int,ename varchar2(20)) tablespace tbs1;                 
  6.                                                                                          
  7. Table created.                                                                         
  8.                                                                                        
  9. SQL> insert into tb_temp                                                               
  10.   2  select 1,'Robinson' from dual                                                     
  11.   3  union all                                                                         
  12.   4  select 2,'Jackson'  from dual;                                                    
  13.                                                                                        
  14. rows created.                                                                        
  15.                                                                                        
  16. SQL> commit;                                                                           
  17.                                                                                        
  18. Commit complete.                                                                       
  19.                                                                                        
  20. SQL> select * from tb_temp;                                                            
  21.                                                                                        
  22.         ID ENAME                                                                           
  23. ---------- --------------------                                                        
  24.          1 Robinson                                                                        
  25.          2 Jackson                                                                         
  26.                                                                                        
  27. SQL> delete from tb_temp;                                                              
  28.                                                                                        
  29. rows deleted.                                                                        
  30.                                                                                        
  31. SQL> select * from tb_temp;                                                            
  32.                                                                                        
  33. no rows selected                                                                       

        从另一个会话中使用sysdba关闭devdb1(shutdown abort)
        再在刚刚执行表创建的会话中查询记录,收到提示,事务必须被回滚     

[sql] view plain copy 
 
 print?
  1. SQL> select * from tb_temp;                                                  
  2. select * from tb_temp                                                        
  3. *                                                                            
  4. ERROR at line 1:                                                             
  5. ORA-25402: transaction must roll back                                        
  6.                                                                              
  7. SQL> rollback;                                                               
  8.                                                                              
  9. Rollback complete.                                                           
  10. SQL> select * from tb_temp;                                                  
  11.                                                                              
  12.         ID ENAME                                                                 
  13. ---------- --------------------                                              
  14.          1 Robinson                                                              
  15.          2 Jackson                                                               
  16.                                                                              
  17. SQL> select failover_type,failover_method,failed_over from v$session         
  18.   2  where username='USR1';                                                  
  19.                                                                              
  20. FAILOVER_TYPE FAILOVER_M FAI                                                 
  21. ------------- ---------- ---                                                 
  22. SELECT        BASIC      YES                                                 

        总结:对于DML 操作在实现故障转移时,将严格按照ACID原则来执行,大部分情况需要回滚事务。    



 

TAF配置

 

11g RAC环境下客户端配置TAF 




TAF是Transparent Application Failover的英文缩写,顾名思义就是对应用透明的故障转移,举个例子,当应用连接某个oracle数据库的执行查询操作的时候,数据库服务器网络中断或者实例崩溃,在经过delay参数设置的值之后,将自动连接到其他可用的实例,继续进行查询。(前提是执行计划,输出结果集和输出的顺序不能发生变化)

TAF的特性:
1:TAF是ORACLE客户端提供的一项特性,使用TAF,对客户端的环境有一定的要求,比如JAVA的JDBC驱动、Oracle客户端的版本等(8i开始支持TAF);
2:大致上TAF可以分为2种,连接时的TAF和会话建立后TAF;
3:TAF本身与是否RAC环境无关,但一般都用在RAC环境,最小程度的减少最应用的影响,单实例环境下也可以使用TAF,这样使用PL/SQL developer连接数据库,即使数据库实例重启,也不需要重新连接;
4:RAC环境下,还可以把TAF配置在服务器端;
5:配置listener.ora文件的GLOBAL_DBNAME参数后将会禁用TAF。

配置如下:

1、查看服务端的service_name 和 RAC SCAN地址:

 

 

[oracle@rac122 tnslsnr]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6

#Public
10.1.252.121            rac121
10.1.252.122            rac122
#Private
192.168.123.1           rac121-priv
192.168.123.2           rac122-priv

#Virtual IP
10.1.252.75             rac121-vip
10.1.252.76             rac122-vip 10.1.252.68             rac-scan [oracle@rac122 tnslsnr]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Tue Jun 26 15:33:29 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> show parameter service_name

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
service_names                        string      racdb SQL>  

 

2、在客户端tnsnames.ora文件中添加如下内容:

 

racdb =  
  (DESCRIPTION =  
    (ADDRESS_LIST =  
      (ADDRESS = (PROTOCOL = TCP)(HOST = 10.1.252.68)(PORT = 1521))  
    )  
    (CONNECT_DATA =  
      (SERVICE_NAME = racdb)  
      (FAILOVER_MODE =  
       (TYPE = SELECT)  
       (METHOD = BASIC)  
       (RETRIES = 180)  
       (DELAY = 5))  
    )  
  ) 

         

    属性说明 :
            TYPE    type属性描述了故障转移的类型,可能的值如下:
                         SESSION  指定发生故障转换时,TAF应创建会话,不执行其他任何操作。
                         SELECT   指定执行故障转换时,除了创建会话,TAF 也应该重新启动故障转移期间运行的任何SELECT语句,
                                            当SELECT 语句执行时,ORACLE Net 将返回故障转移前没有返回给用户的行。
                          NONE     不执行TAF故障转移
             METHOD  method特征确定何时创建会话,可能的值如下 :
                         BASIC      在故障转移期间创建会话
                         PRECONNECT 与数据库的初始连接建立后,采用由BACKUP 特性指定的ORACLE Net别名创建故障转移会话。
             RETRIES retries特性指Oracle Net 返回ADDRESS_LIST 并尝试连接到幸存实例的次数。
             DELAY   delay特性指定每次重试之间等待的次数       

 

 

 

3、用客户端登录作如下操作:

 

C:\Documents and Settings\```>sqlplus /nolog
SQL*Plus: Release 10.2.0.1.0 - Production on 星期二 6月 26 15:00:56 2012
Copyright (c) 1982, 2005, Oracle.  All rights reserved.

SQL> conn sys/system@racdb as sysdba
已连接。
SQL> show parameter instance_name

NAME                      TYPE                    VALUE
------------------------- ----------------------  ---------------
instance_name             string racdb2 SQL> select sid from v$mystat where rownum=1;
       SID
----------
       163
SQL> select machine,failover_type,failover_method,failed_over from v$session where sid=163;
MACHINE           FAILOVER_TYPE       FAILOVER_METHOD       FAILED
----------------  ----------          --------------------  ------
AILK\TANWEI       SELECT              BASIC                 NO

SQL> create table taf_client_test(id int,vname varchar2(10));
表已创建。
SQL> insert into taf_client_test values(1,'taf');
已创建 1 行。
SQL> update taf_client_test set id=5;
已更新 1 行。
SQL> select * from taf_client_test;
        ID VNAME
---------- --------------------
         5 taf 

 

4、更新完毕不commit,停掉racdb2实例:

 

[oracle@rac122 tnslsnr]$ srvctl status instance -d racdb -i racdb2
Instance racdb2 is running on node rac122
[oracle@rac122 tnslsnr]$ srvctl stop instance -d racdb -i racdb2 -o abort
[oracle@rac122 tnslsnr]$ srvctl status instance -d racdb -i racdb2 Instance racdb2 is not running on node rac122 [oracle@rac122 tnslsnr]$ srvctl status instance -d racdb -i racdb1 Instance racdb1 is running on node rac121 --还有一个节点在工作

 


5、回到客户端作如下操作:

 

SQL> select * from taf_client_test;
select * from taf_client_test
*
第 1 行出现错误:
ORA-25402: 事务处理必须重新运行


SQL> rollback;


回退已完成。


SQL> select instance_name from v$instance;


INSTANCE_NAME
--------------------------------
racdb1


SQL> show parameter instance_name




NAME                TYPE            VALUE
----------------    -------------   -----------
instance_name       string          racdb1 --发现自动切换到节点1, SQL> select * from taf_client_test;
未选定行
SQL> select * from v$mystat where rownum=1;


       SID STATISTIC#      VALUE
---------- ---------- ----------
       166          0          0


SQL> select failover_type,failover_method,failed_over from v$session where sid=166;

FAILOVER_TYPE              FAILOVER_METHOD      FAILED
-------------------------- -------------------- ------
SELECT                     BASIC                YES 

 

到此,完成了TAF的配置与验证。






如何在RAC服务器端配置TAF

服务端TAF配置会覆盖客户端TNS连接串中配置的TAF。如果客户端没有配置TAF,在最小配置模式,failover类型必须设置为启用TAF。如果failover type是在服务端设置,那么faliover method默认为BASIC。DELAY和RETRIES参数是可选的,可分别指定。

1在RAC上添加一个service

注意:service name是唯一的,并且不能与默认创建服务名一致。默认情况下,Oracle RAC数据库会创建一个特殊的数据库服务。这个默认的服务在RAC环境中对所有实例总是可用,除非实例在限制模式。并且,DBA不能修改这个默认的服务及其属性。

添加service的语法:

srvctl add service -d orcl -s taf_orcl.oracle.com -r "orcl1,orcl2" -P BASIC

注:参数-d:数据库名;-s:服务名;-r:首选实例名;-P:TAF策略

2启动service

srvctl start service -d orcl -s taf_orcl.oracle.com

3检查service运行状态

srvctl config service -d orcl

4获取创建的service_id

select name,service_id from dba_services where name = 'taf_orcl.oracle.com';

5检查service的标准配置

col failover_method format a11 heading 'METHOD' 
col failover_type format a10 heading 'TYPE' 
col failover_retries format 9999999 heading 'RETRIES' 
col goal format a10 
col clb_goal format a8 
col AQ_HA_NOTIFICATIONS format a5 heading 'AQNOT' 

SQL>select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_notifications  
from dba_services where service_id = X

注意:默认情况下,method,type,retries是没有值的,这些值对于服务端的TAF是必须的。这是由于srvctl命令添加服务时,没有修改DBMS_SERVICE参数,在11.2之后,该bug解决。

6添加failover参数

--11.2之前:

SQL> execute dbms_service.modify_service (service_name => 'taf_orcl.oracle.com' - 
, aq_ha_notifications => true - 
, failover_method => dbms_service.failover_method_basic - 
, failover_type => dbms_service.failover_type_select - 
, failover_retries => 180 - 
, failover_delay => 5 - 
, clb_goal => dbms_service.clb_goal_long); 

PL/SQL procedure successfully completed.

注:由于TAF的method为BASIC,BASIC是当前支持的唯一值,不再支持PRECONNECT。

--11.2开始:

srvctl add service -d db_unique_name -s service_name -t edition_name{-r preferred_list [-a available_list]} | {-g server_pool[-c {UNIFORM | SINGLETON}] [-k net_number]}
[-P {BASIC | NONE}] [-l {[PRIMARY] | [PHYSICAL_STANDBY] | [LOGICAL_STANDBY] |
[SNAPSHOT_STANDBY]}] [-y {AUTOMATIC | MANUAL}] [-q {TRUE | FALSE}]
[-x {TRUE | FALSE}] [-j {SHORT | LONG}] [-B {NONE | SERVICE_TIME | THROUGHPUT}]
 [-e {NONE |SESSION | SELECT}] [-m {NONE | BASIC}] [-z failover_retries]
 [-w failover_delay]

7检查service及method、retries

SQL>select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_notifications 
from dba_services where service_id = X

8检查监听是否已注册新创建的service

lsnrctl services

当然,还可以通过DBCA或者Oracle OEM来配置RAC Server。本文不再详述。



配置Server Side TAF

 

实验环境:Oracle 11.2.0.4 RAC
参考MOS文档:
How To Configure Server Side Transparent Application Failover (文档 ID 460982.1)

1.为设置TAF在RAC集群上新建服务

eg: srvctl add service -d rac -s server_taf -r "rac1,rac2" -P BASIC

使用oracle用户在RAC集群上新建服务server_taf:

[oracle@jyrac1 ~]$ srvctl add service -d jyzhao -s server_taf -r "jyzhao1,jyzhao2" -P BASIC
[oracle@jyrac1 ~]$ 

注意不能使用grid用户操作,如果使用grid 用户执行的话,会报错:

[grid@jyrac1 ~]$ srvctl add service -d jyzhao -s server_taf -r "jyzhao1,jyzhao2" -P BASIC
PRCD-1288 : User is not authorized to create service server_taf for database jyzhao
PRKH-1014 : Current user "grid" is not the oracle owner user "oracle" of oracle home "/opt/app/oracle/product/11.2.0/dbhome_1"

2.启动server_taf服务

eg: srvctl start service -d rac -s server_taf

启动server_taf服务

[oracle@jyrac1 ~]$ srvctl start service -d jyzhao -s server_taf

3.检查确认服务正在运行

eg: srvctl config service -d rac

检查确认服务正在运行:

[oracle@jyrac1 ~]$ srvctl config service -d jyzhao
Service name: server_taf
Service is enabled
Server pool: jyzhao_server_taf Cardinality: 2 Disconnect: false Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false AQ HA notifications: false Failover type: NONE
Failover method: NONE
TAF failover retries: 0 TAF failover delay: 0 Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC Edition: Preferred instances: jyzhao1,jyzhao2
Available instances: 

4.找到刚创建服务的service_id

eg: select name,service_id from dba_services where name = 'server_taf';

找到刚创建服务的service_id

SQL> select name,service_id from dba_services where name = 'server_taf'; NAME SERVICE_ID
---------------------------------------------------------------- ----------
server_taf 7

5.根据service_id审查服务的信息

col name format a15
col failover_method format a11 heading 'METHOD'
col failover_type format a10 heading 'TYPE'
col failover_retries format 9999999 heading 'RETRIES'
col goal format a10
col clb_goal format a8
col AQ_HA_NOTIFICATIONS format a5 heading 'AQNOT'

select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_notifications
from dba_services where service_id = 7;

根据service_id审查服务的信息:

SQL> col name format a15  
SQL> col failover_method format a11 heading 'METHOD' SQL> col failover_type format a10 heading 'TYPE' SQL> col failover_retries format 9999999 heading 'RETRIES' SQL> col goal format a10 
SQL> col clb_goal format a8 
SQL> col AQ_HA_NOTIFICATIONS format a5 heading 'AQNOT' SQL> select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_notifications 2 from dba_services where service_id = 7 3 ; NAME METHOD TYPE RETRIES GOAL       CLB_GOAL AQNOT
--------------- ----------- ---------- -------- ---------- -------- -----
server_taf NONE NONE 0 NONE LONG     NO

SQL> 

6.给服务添加server side failover参数

execute dbms_service.modify_service (service_name => 'server_taf' -
, aq_ha_notifications => true -
, failover_method => dbms_service.failover_method_basic -
, failover_type => dbms_service.failover_type_select -
, failover_retries => 180 -
, failover_delay => 5 -
, clb_goal => dbms_service.clb_goal_long);

11.2版本可以使用srvctl 修改服务的信息:
srvctl modify service -d RAC -s server_taf -m BASIC -e SELECT -q TRUE -j LONG

给服务添加server side failover参数:

SQL> execute dbms_service.modify_service (service_name => 'server_taf' - 
> , aq_ha_notifications => true - 
> , failover_method => dbms_service.failover_method_basic - 
> , failover_type => dbms_service.failover_type_select - 
> , failover_retries => 180 - 
> , failover_delay => 5 - 
> , clb_goal => dbms_service.clb_goal_long); 

PL/SQL procedure successfully completed.

7.再次审查服务可以看到Method, Type和Retries值

select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_notifications
from dba_services where service_id = 7;

再次审查服务可以看到Method, Type和Retries值:

SQL> select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_notifications 2 from dba_services where service_id = 7; NAME METHOD TYPE RETRIES GOAL       CLB_GOAL AQNOT
--------------- ----------- ---------- -------- ---------- -------- -----
server_taf      BASIC SELECT 180 NONE LONG     YES

8.检查已注册的服务的监听信息

lsnrctl services

Service "server_taf.za.oracle.com" has 2 instance(s).
Instance "rac1", status READY, has 2 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=dell01)(PORT=1521))
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
Instance "rac2", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=dell02)(PORT=1521))

我这里版本差异,显示有区别,分别在不同节点显示自己的实例:

--node1: Service "server_taf" has 1 instance(s).
  Instance "jyzhao1", status READY, has 1 handler(s) for this service... Handler(s): "DEDICATED" established:0 refused:0 state:ready LOCAL SERVER The command completed successfully --node2: Service "server_taf" has 1 instance(s). Instance "jyzhao2", status READY, has 1 handler(s) for this service... Handler(s): "DEDICATED" established:0 refused:0 state:ready LOCAL SERVER The command completed successfully

9.创建网络服务名

SERVERTAF =
(DESCRIPTION =
(LOAD_BALANCE = yes)
(ADDRESS = (PROTOCOL = TCP)(HOST = dell01)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = dell02)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = server_taf.za.oracle.com)
)
)

服务端RAC所有节点配置tnsnames.ora,添加内容:

SERVERTAF =
  (DESCRIPTION =
    (LOAD_BALANCE = yes)
    (ADDRESS = (PROTOCOL = TCP)(HOST = jyrac1)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = jyrac2)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = server_taf)
    )
  )

sqlplus system/oracle@192.168.56.160/server_taf

10.测试TAF功能

select host_name,instance_name from v$instance;

SQL> select instance_name from V$instance;
INSTANCE_NAME
----------------
rac2

SQL> shutdown abort;
ORACLE instance shut down.

select host_name,instance_name from v$instance;

10.1 模拟客户端使用scanVIP测试能否实现TAF
sqlplus system/oracle@192.168.56.160/server_taf

[grid@jyrac1 ~]$ sqlplus system/oracle@192.168.56.160/server_taf

SQL*Plus: Release 11.2.0.4.0 Production on Fri Mar 10 02:59:53 2017 Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> select host_name,instance_name from v$instance;

HOST_NAME ---------------------------------------------------------------- INSTANCE_NAME ---------------- jyrac1
jyzhao1 --这里强制关掉jyzhao1实例。 SQL> /

HOST_NAME ---------------------------------------------------------------- INSTANCE_NAME ---------------- jyrac2
jyzhao2

10.1 结论: 可以实现TAF功能,相当于客户端不再需要配置,直接通过SCAN VIP连接。

10.2 模拟客户端使用Public IP测试能否实现TAF
sqlplus system/oracle@192.168.56.150/server_taf

SQL> select host_name,instance_name from v$instance;

HOST_NAME ---------------------------------------------------------------- INSTANCE_NAME ---------------- jyrac1
jyzhao1 --这里强制关掉jyzhao1实例。 SQL> / select host_name,instance_name from v$instance
*
ERROR at line 1:
ORA-12153: TNS:not connected
Process ID: 20116 Session ID: 24 Serial number: 7

如果客户端配置tnsnames.ora,将publicIP配置

TAF =
  (DESCRIPTION =
    (LOAD_BALANCE = yes)
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.150)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.152)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = server_taf)
    )
  )

再次测试:

[oracle@jyrac2 admin]$ sqlplus system/oracle@taf

SQL*Plus: Release 11.2.0.4.0 Production on Fri Mar 10 05:11:30 2017 Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> select host_name,instance_name from v$instance;

HOST_NAME ---------------------------------------------------------------- INSTANCE_NAME ---------------- jyrac2
jyzhao2 --这里强制关掉jyzhao2实例。 SQL> /

HOST_NAME ---------------------------------------------------------------- INSTANCE_NAME ---------------- jyrac1
jyzhao1

10.2 结论: 直接连接Public IP无法实现TAF功能。但客户端配置Public IP列表,可以实现。

10.3 模拟客户端使用VIP测试能否实现TAF
sqlplus system/oracle@192.168.56.151/server_taf

[grid@jyrac1 ~]$ sqlplus system/oracle@192.168.56.151/server_taf

SQL*Plus: Release 11.2.0.4.0 Production on Fri Mar 10 04:32:20 2017 Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> select host_name,instance_name from v$instance;

HOST_NAME
----------------------------------------------------------------
INSTANCE_NAME
----------------
jyrac1
jyzhao1


SQL> / select host_name,instance_name from v$instance
* ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 32459 Session ID: 159 Serial number: 3

如果客户端配置tnsnames.ora,可以通过sqlplus system/oracle@tafvip连接

TAFVIP =
  (DESCRIPTION =
    (LOAD_BALANCE = yes)
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.151)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.56.153)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = server_taf)
    )
  )

再次测试:

[oracle@jyrac2 admin]$ sqlplus system/oracle@tafvip

SQL*Plus: Release 11.2.0.4.0 Production on Fri Mar 10 05:15:32 2017 Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> select host_name,instance_name from v$instance;

HOST_NAME ---------------------------------------------------------------- INSTANCE_NAME ---------------- jyrac2
jyzhao2 --这里强制关掉jyzhao2实例。 SQL> /

HOST_NAME ---------------------------------------------------------------- INSTANCE_NAME ---------------- jyrac1
jyzhao1

10.3 结论: 直接连接VIP无法实现TAF功能。但客户端配置VIP列表,可以实现。



 

官方文档:http://docs.oracle.com/cd/E11882_01/network.112/e41945/advcfg.htm#NETAG455

About Transparent Application Failover

TAF is a client-side feature that allows clients to reconnect to surviving databases in the event of a failure of a database instance. Notifications are used by the server to trigger TAF callbacks on the client-side.

TAF is configured using either client-side specified Transparent Network Substrate (TNS) connect string or using server-side service attributes. If both methods are used to configure TAF, then the server-side service attributes supersede the client-side settings. Server-side service attributes are the preferred way to set up TAF.

TAF can operate in one of two modes, Session Failover and Select Failover. Session Failover re-creates lost connections and sessions. Select Failover replays queries that were in progress.

When there is a failure, callback functions are initiated on the client-side using Oracle Call Interface (OCI) callbacks. This works with standard OCI connections as well as Connection Pool and Session Pool connections.

TAF operates with Oracle Data Guard to provide automatic failover. TAF works with the following database configurations to effectively mask a database failure:

  • Oracle Real Application Clusters

  • Replicated systems

  • Standby databases

  • Single instance Oracle database

See Also:

What Transparent Application Failover Restores

TAF automatically restores some or all of the following elements associated with active database connections. Other elements may need to be embedded in the application code to enable TAF to recover the connection.

  • Client-server database connections: TAF automatically reestablishes the connection using the same connect string or an alternate connect string that you specify when configuring failover.

  • Users' database sessions: TAF automatically logs a user in with the same user ID as was used before the failure. If multiple users were using the connection, then TAF automatically logs them in as they attempt to process database commands. Unfortunately, TAF cannot automatically restore other session properties. These properties can be restored by invoking a callback function.

  • Completed commands: If a command was completed at the time of connection failure, and it changed the state of the database, then TAF does not resend the command. If TAF reconnects in response to a command that may have changed the database, then TAF issues an error message to the application.

  • Open cursors used for fetching: TAF allows applications that began fetching rows from a cursor before failover to continue fetching rows after failover. This is called select failover. It is accomplished by re-running a SELECT statement using the same snapshot, discarding those rows already fetched and retrieving those rows that were not fetched initially. TAF verifies that the discarded rows are those that were returned initially, or it returns an error message.

  • Active transactions: Any active transactions are rolled back at the time of failure because TAF cannot preserve active transactions after failover. The application instead receives an error message until a ROLLBACK is submitted.

  • Server-side program variables: Server-side program variables, such as PL/SQL package states, are lost during failures, and TAF cannot recover them. They can be initialized by making a call from the failover callback.

About FAILOVER_MODE Parameters

The FAILOVER_MODE parameter must be included in the CONNECT_DATA section of a connect descriptor. FAILOVER_MODE can contain the parameters described in Table 13-4.

Table 13-4 Additional Parameters of the FAILOVER_MODE Parameter

FAILOVER_MODE ParametersDescription

BACKUP

A different net service name for backup connections. A backup should be specified when usingpreconnect to pre-establish connections.

DELAY

The amount of time in seconds to wait between connect attempts. If RETRIES is specified, thenDELAY defaults to one second.

If a callback function is registered, then this parameter is ignored.

METHOD

Setting for fast failover from the primary node to the backup node:

  • basic: Set to establish connections at failover time. This option requires almost no work on the backup server until failover time.

  • preconnect: Set to pre-established connections. This provides faster failover but requires that the backup instance be able to support all connections from every supported instance.

RETRIES

The number of times to attempt to connect after a failover. If DELAY is specified, then RETRIESdefaults to five retry attempts.

If a callback function is registered, then this parameter is ignored.

TYPE

The type of failover. Three types of Oracle Net failover functionality are available by default to Oracle Call Interface (OCI) applications:

  • session: Set to failover the session. If a user's connection is lost, then a new session is automatically created for the user on the backup. This type of failover does not attempt to recover select operations.

  • select: Set to enable users with open cursors to continue fetching on them after failure. However, this mode involves overhead on the client side in normal select operations.

  • none: This is the default. No failover functionality is used. This can also be explicitly specified to prevent failover from happening.

Note:

Oracle Net Manager does not provide support for TAF parameters. These parameters must be set manually.

Implementing Transparent Application Failover

Important:

Do not set the GLOBAL_DBNAME parameter in the SID_LIST_listener_name section of the listener.ora file. A statically configured global database name disables TAF.

Depending on the FAILOVER_MODE parameters, you can implement TAF in several ways. Oracle recommends the following methods:

TAF with Connect-Time Failover and Client Load Balancing

Implement TAF with connect-time failover and client load balancing for multiple addresses. In the following example, Oracle Net connects randomly to one of the protocol addresses on sales1-serveror sales2-server. If the instance fails after the connection, then the TAF application fails over to the other node's listener, reserving any SELECT statements in progress.

sales.us.example.com=
 (DESCRIPTION= (LOAD_BALANCE=on) (FAILOVER=on)  (ADDRESS=
       (PROTOCOL=tcp)  
       (HOST=sales1-server)  
       (PORT=1521)) 
  (ADDRESS=
       (PROTOCOL=tcp)  
       (HOST=sales2-server)  
       (PORT=1521)) 
  (CONNECT_DATA=
     (SERVICE_NAME=sales.us.example.com) (FAILOVER_MODE= (TYPE=select)  (METHOD=basic))))

TAF Retrying a Connection

TAF also provides the ability to automatically retry connecting if the first connection attempt fails with the RETRIES and DELAY parameters. In the following example, Oracle Net tries to reconnect to the listener on sales1-server. If the failover connection fails, then Oracle Net waits 15 seconds before trying to reconnect again. Oracle Net attempts to reconnect up to 20 times.

sales.us.example.com=
 (DESCRIPTION=
  (ADDRESS=
       (PROTOCOL=tcp)  
       (HOST=sales1-server)  
       (PORT=1521)) 
  (CONNECT_DATA=
     (SERVICE_NAME=sales.us.example.com) (FAILOVER_MODE= (TYPE=select)  (METHOD=basic) (RETRIES=20) (DELAY=15))))

TAF Pre-establishing a Connection

A backup connection can be pre-established. The initial and backup connections must be explicitly specified. In the following example, clients that use net service name sales1.us.example.com to connect to the listener on sales1-server are also preconnected to sales2-server. If sales1-server fails after the connection, then Oracle Net fails over to sales2-server, preserving anySELECT statements in progress. Similarly, Oracle Net preconnects to sales1-server for those clients that use sales2.us.example.com to connect to the listener on sales2-server.

sales1.us.example.com=
 (DESCRIPTION=
  (ADDRESS=
       (PROTOCOL=tcp)  
       (HOST=sales1-server)  
       (PORT=1521)) 
  (CONNECT_DATA=
     (SERVICE_NAME=sales.us.example.com) 
     (INSTANCE_NAME=sales1) (FAILOVER_MODE= (BACKUP=sales2.us.example.com) (TYPE=select)  (METHOD=preconnect))))
sales2.us.example.com=
 (DESCRIPTION=
  (ADDRESS=
       (PROTOCOL=tcp)  
       (HOST=sales2-server)  
       (PORT=1521)) 
  (CONNECT_DATA=
     (SERVICE_NAME=sales.us.example.com) 
     (INSTANCE_NAME=sales2) (FAILOVER_MODE= (BACKUP=sales1.us.example.com) (TYPE=select)  (METHOD=preconnect))))

Verifying Transparent Application Failover

You can query the FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns in the V$SESSION view to verify that TAF is correctly configured. To view the columns, use a query similar to the following:

SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*)
FROM V$SESSION
GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;

The output before failover looks similar to the following:

MACHINE              FAILOVER_TYPE FAILOVER_M FAI   COUNT(*)
-------------------- ------------- ---------- --- ----------
sales1               NONE          NONE       NO          11
sales2               SELECT        PRECONNECT NO           1

The output after failover looks similar to the following:

MACHINE              FAILOVER_TYPE FAILOVER_M FAI   COUNT(*)
-------------------- ------------- ---------- --- ----------
sales2               NONE          NONE       NO          10
sales2               SELECT        PRECONNECT YES          1

Note:

You can monitor each step of TAF using an appropriately configured OCI TAF CALLBACK function.

See Also:

 



 

Oracle RAC failover 测试(Server TAF方式)

 

  Oracle RAC中,除了基于客户端的TAF方式之外,还有基于服务器端的TAF方式,可以把服务端的TAF方式看作是客户端TAF方式的一个升级版吧。服务器端的TAF,当然是需要在服务器端进行配置了,这个是通过Service来完成的。本文主要描述Oracle 10g rac 下通过service方式配置服务器端的TAF。

  下面是一些关于这方面的基础参考链接:
  有关负监听配置,载均衡(load balance)请参考
    ORACLE RAC 监听配置 (listener.ora tnsnames.ora) 
    ORACLE RAC 下非缺省端口监听配置(listener.ora tnsnames.ora) 
    Oracle RAC 客户端连接负载均衡(Load Balance)
    Oracle RAC 服务器端连接负载均衡(Load Balance)
    Oracle RAC 负载均衡测试(结合服务器端与客户端) 
    
  有关Oracle RAC failover 连接时故障转移请参考
    Oracle RAC failover 测试(连接时故障转移)
    Oracle RAC failover 测试(TAF方式)
    
  有关Services的创建请参考
    Services in Oracle Database 10g 
    再说 Oracle RAC services

[sql] view plain copy 
 
 print?
  1. 1、服务器端、客户端的环境  
  2.   #服务器端环境,host信息  
  3.   oracle@bo2dbp:~> cat /etc/hosts |grep vip  
  4.   192.168.7.61   bo2dbp-vip.2gotrade.com    bo2dbp-vip  
  5.   192.168.7.62   bo2dbs-vip.2gotrade.com    bo2dbs-vip  
  6.     
  7.   #服务器端环境,集群信息  
  8.   oracle@bo2dbp:~> ./crs_stat.sh   
  9.    Resource name                                Target     State               
  10.   --------------                                ------     -----               
  11.   ora.GOBO4.GOBO4A.inst                         ONLINE     ONLINE on bo2dbp    
  12.   ora.GOBO4.GOBO4B.inst                         ONLINE     ONLINE on bo2dbs    
  13.   ora.GOBO4.db                                  ONLINE     ONLINE on bo2dbp    
  14.   ora.bo2dbp.ASM1.asm                           ONLINE     ONLINE on bo2dbp    
  15.   ora.bo2dbp.LISTENER_BO2DBP.lsnr               ONLINE     ONLINE on bo2dbp    
  16.   ora.bo2dbp.LISTENER_ORA10G_BO2DBP.lsnr        ONLINE     ONLINE on bo2dbp    
  17.   ora.bo2dbp.gsd                                ONLINE     ONLINE on bo2dbp    
  18.   ora.bo2dbp.ons                                ONLINE     ONLINE on bo2dbp    
  19.   ora.bo2dbp.vip                                ONLINE     ONLINE on bo2dbp    
  20.   ora.bo2dbs.ASM2.asm                           ONLINE     ONLINE on bo2dbs    
  21.   ora.bo2dbs.LISTENER_BO2DBS.lsnr               ONLINE     ONLINE on bo2dbs    
  22.   ora.bo2dbs.LISTENER_ORA10G_BO2DBS.lsnr        ONLINE     ONLINE on bo2dbs    
  23.   ora.bo2dbs.gsd                                ONLINE     ONLINE on bo2dbs    
  24.   ora.bo2dbs.ons                                ONLINE     ONLINE on bo2dbs    
  25.   ora.bo2dbs.vip                                ONLINE     ONLINE on bo2dbs    
  26.   ora.ora10g.db                                 ONLINE     ONLINE on bo2dbp   
  27.   
  28.   #客户端环境  
  29.   robin@SZDB:~> cat /etc/issue  
  30.     
  31.   Welcome to SUSE Linux Enterprise Server 10 SP3 (x86_64) - Kernel \r (\l).  
  32.     
  33.   robin@SZDB:~> sqlplus -v  
  34.     
  35.   SQL*Plus: Release 10.2.0.3.0 - Production  
  36.     
  37.   #客户端tnsnames配置  
  38.   GOBO4 =  
  39.     (DESCRIPTION =  
  40.       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.61)(PORT = 1521))  
  41.       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.62)(PORT = 1521))  
  42.       (LOAD_BALANCE = yes)  
  43.       (CONNECT_DATA =  
  44.         (SERVER = DEDICATED)  
  45.         (SERVICE_NAME = TAF)  #注意我们客户端的SERVICE_NAME,我们设置为TAF   
  46.       )  
  47.     )  
  48.   
  49. 2、在服务器端配置service  
  50.   配置service有多种方式,如dbca,oem,srvctl命令行。下面直接以命令行方式配置  
  51.   关于什么是service以及如何使用srvctl命令行创建service,请参考: http://blog.csdn.net/robinson_0612/article/details/8124232  
  52.   oracle@bo2dbp:~> srvctl add service -d GOBO4 -s TAF -r GOBO4A -a GOBO4B -P basic  
  53.   oracle@bo2dbp:~> srvctl start service -d GOBO4 -s TAF  
  54.   oracle@bo2dbp:~> ./crs_stat.sh | grep TAF  
  55.   ora.GOBO4.TAF.GOBO4A.srv                      ONLINE     ONLINE on bo2dbp    
  56.   ora.GOBO4.TAF.cs                              ONLINE     ONLINE on bo2dbp              
  57.   oracle@bo2dbp:~> srvctl config service -d GOBO4 -a  
  58.   TAF PREF: GOBO4A AVAIL: GOBO4B TAF: basic  
  59.   oracle@bo2dbp:~> export ORACLE_SID=GOBO4A  
  60.   oracle@bo2dbp:~> sqlplus / as sysdba  
  61.     
  62.   SQL*Plus: Release 10.2.0.3.0 - Production on Mon Dec 17 14:55:02 2012  
  63.     
  64.   Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.  
  65.     
  66.   Connected to:  
  67.   Oracle Database 10g Release 10.2.0.3.0 - 64bit Production  
  68.   With the Real Application Clusters option  
  69.     
  70.   SQL> show parameter service  
  71.     
  72.   NAME                                 TYPE        VALUE  
  73.   ------------------------------------ ----------- ------------------------------  
  74.   service_names                        string      SYS$SYS.KUPC$S_1_2012102317304  
  75.                                                    4.GOBO4, SYS$SYS.KUPC$C_1_2012  
  76.                                                    1023173044.GOBO4, GOBO4, TAF  
  77.     
  78.   SQL> begin     
  79.     2  dbms_service.modify_service(     
  80.     3  service_name=>'TAF',     
  81.     4  failover_method =>dbms_service.failover_method_basic,     
  82.     5  failover_type =>dbms_service.failover_type_select,     
  83.     6  failover_retries =>180,     
  84.     7  failover_delay=>5);     
  85.     8  end;     
  86.     9  /  
  87.     
  88.   PL/SQL procedure successfully completed.  
  89.       
  90.   SQL> select name,failover_method,failover_type,goal,clb_goal from dba_services  
  91.     2  where name='TAF';  
  92.     
  93.   NAME                 FAILOVER_METHOD      FAILOVER_TYPE   GOAL         CLB_G  
  94.   -------------------- -------------------- --------------- ------------ -----  
  95.   TAF                  BASIC                SELECT                       LONG                                                     
  96.     
  97.   SQL> ho lsnrctl status  
  98.     ..........  
  99.   Service "TAF" has 1 instance(s).  
  100.     Instance "GOBO4A", status READY, has 2 handler(s) for this service...  
  101.   The command completed successfully  
  102.   
  103. 3、测试服务器端TAF  
  104.   robin@SZDB:~> sqlplus fail_over/fail@gobo4  
  105.   fail_over@GOBO4> get verify.sql  
  106.     1  REM the following query is for TAF connection verification  
  107.     2  col sid format 99999  
  108.     3  col serial# format 9999999  
  109.     4  col failover_type format a13  
  110.     5  col failover_method format a15  
  111.     6  col failed_over format a11  
  112.     7  Prompt  
  113.     8  Prompt Failover status for current user  
  114.     9  Prompt ============================================  
  115.    10  SELECT   sid,  
  116.    11   serial#,  
  117.    12   failover_type,  
  118.    13   failover_method,  
  119.    14   failed_over  
  120.    15   FROM   v$session  
  121.    16   WHERE   username = 'FAIL_OVER';  
  122.    17  REM the following query is for load balancing verification  
  123.    18  col host_name format a20  
  124.    19  Prompt  
  125.    20  Prompt Current instance name and host name  
  126.    21  Prompt ========================================  
  127.    22* SELECT   instance_name,host_name FROM v$instance;  
  128.    23    
  129.      
  130.   #下面的连接查询中表明客户端当前连接到了节点bo2dbp,其实例名为GOBO4A   
  131.   fail_over@GOBO4> @verify          
  132.     
  133.   Failover status for current user  
  134.   ============================================  
  135.     
  136.      SID  SERIAL# FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER  
  137.   ------ -------- ------------- --------------- -----------  
  138.     1073       48 SELECT        BASIC           NO  
  139.     
  140.     
  141.   Current instance name and host name  
  142.   ========================================  
  143.     
  144.   INSTANCE_NAME    HOST_NAME  
  145.   ---------------- --------------------  
  146.   GOBO4A           bo2dbp    
  147.     
  148.   #此时停止节点bo2dbp  
  149.   oracle@bo2dbp:~> srvctl stop instance -d GOBO4 -i GOBO4A  
  150.   #查看停止节点bo2dbp即实例GOBO4A后的结果  
  151.   oracle@bo2dbp:~> ./crs_stat.sh |grep inst  
  152.   ora.GOBO4.GOBO4A.inst                         OFFLINE    OFFLINE             
  153.   ora.GOBO4.GOBO4B.inst                         ONLINE     ONLINE on bo2dbs    
  154.     
  155.   #此时回到客户端再次执行查询,FAILED_OVER的值已经变成YES,即表明当前的session为failover过来的  
  156.   #同时实例名和节点名也发生了变化  
  157.   fail_over@GOBO4> set timing on;  
  158.   fail_over@GOBO4> @verify  
  159.     
  160.   Failover status for current user  
  161.   ============================================  
  162.     
  163.      SID  SERIAL# FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER  
  164.   ------ -------- ------------- --------------- -----------  
  165.     1082      396 SELECT        BASIC           YES  
  166.     
  167.   Elapsed: 00:00:04.19  
  168.     
  169.   Current instance name and host name  
  170.   ========================================  
  171.     
  172.   INSTANCE_NAME    HOST_NAME  
  173.   ---------------- --------------------  
  174.   GOBO4B           bo2dbs  
  175.     
  176.   Elapsed: 00:00:00.01  
  177.   
  178. 4、小结  
  179.   a、服务器端的TAF方式的failover通过在服务器端配置service来完成  
  180.   b、服务器端的TAF方式与客户端的TAF方式产生同样的效果  
  181.   b、一旦在服务端配置了基于服务器端的TAF,客户端再无需通过在客户端添加FAILOVER_MODE项  
  182.   c、该方式简化客户端配置,通过集中统一管理service实现failover  

 



Oracle RAC 负载均衡测试(结合服务器端与客户端)

 

      Oracle RAC 负载均衡使得从客户端发起的连接能够有效地分配到监听器负载较小的实例上。有两种方式实现客户端负载均衡,一是通过配置客户端的load_balance,一是通过配置服务器端的remote_listener参数。两种方式各有优劣,而且两者并不相互排斥,因此可以结合两种方式来更加有效的实现负载均衡。本文将描述两者结合的使用情况(oralce 10g rac)。

        有关客户端与服务端负载均衡的单独测试请参考:
              Oracle RAC 客户端连接负载均衡(Load Balance) 
              Oracle RAC 服务器端连接负载均衡(Load Balance)

        本文的测试将结合前篇文章使用的脚本与样例,是前两篇测试的一个总结。

 

一、配置需求

[sql] view plain copy 
 
 print?
  1. 1、服务器端各节点监听器正常提供服务,如果使用非缺省的1521端口,请参考 ORACLE RAC 下非缺省端口监听配置(listener.ora tnsnames.ora)   
  2.   oracle@bo2dbp:~> more $ORACLE_HOME/network/admin/listener.ora  
  3.   # listener.ora.bo2dbp Network Configuration File: /u01/oracle/db/network/admin/listener.ora.bo2dbp  
  4.   # Generated by Oracle configuration tools.  
  5.     
  6.   LISTENER_BO2DBP =  
  7.     (DESCRIPTION_LIST =  
  8.       (DESCRIPTION =  
  9.         (ADDRESS = (PROTOCOL = TCP)(HOST = bo2dbp-vip.2gotrade.com)(PORT = 1521)(IP = FIRST))  
  10.         (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.51)(PORT = 1521)(IP = FIRST))  
  11.       )  
  12.     )  
  13.     
  14.   SID_LIST_LISTENER_BO2DBP =  
  15.     (SID_LIST =  
  16.       (SID_DESC =  
  17.         (SID_NAME = PLSExtProc)  
  18.         (ORACLE_HOME = /u01/oracle/db)  
  19.         (PROGRAM = extproc)  
  20.       )  
  21.     )  
  22.     
  23.   oracle@bo2dbp:~> lsnrctl status  
  24.   Service "GOBO4" has 2 instance(s).  
  25.     Instance "GOBO4A", status READY, has 2 handler(s) for this service...  
  26.     Instance "GOBO4B", status READY, has 1 handler(s) for this service...  
  27.       
  28. 2、服务器端的remote_listener参数设置  
  29.   要求remote_listener参数的连接标识符在服务器端的tnsnames.ora中有对应的条目  
  30.     
  31.   SQL> show parameter listener  
  32.     
  33.   NAME                                 TYPE        VALUE  
  34.   ------------------------------------ ----------- ------------------------------  
  35.   local_listener                       string  
  36.   remote_listener                      string      remote_lsnr_gobo4  
  37.   
  38.   oracle@bo2dbp:~> more $ORACLE_HOME/network/admin/tnsnames.ora  
  39.   # tnsnames.ora Network Configuration File: /u01/oracle/db/network/admin/tnsnames.ora  
  40.   # Generated by Oracle configuration tools.  
  41.   remote_lsnr_gobo4 =  
  42.     (ADDRESS_LIST =  
  43.       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.61)(PORT = 1521))  
  44.       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.62)(PORT = 1521))  
  45.     )  
  46.   
  47. 3、客户端tnsnames.ora中启用load_balance  
  48.   oracle@SZDB:~> tail -11 $ORACLE_HOME/network/admin/tnsnames.ora  
  49.     
  50.   GOBO4 =  
  51.     (DESCRIPTION =  
  52.       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.61)(PORT = 1521))  
  53.       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.62)(PORT = 1521))  
  54.       (LOAD_BALANCE = yes)  
  55.       (CONNECT_DATA =  
  56.         (SERVER = DEDICATED)  
  57.         (SERVICE_NAME = GOBO4)  
  58.       )  
  59.     )  

二、测试Load Balance

[python] view plain copy 
 
 print?
  1. #还是使用之前的脚步来进行测试  
  2. #Author : Robinson  
  3. #Blog : http://blog.csdn.net/robinson_0612  
  4. oracle@SZDB:~> more load_balance.sh   
  5. #!/bin/bash  
  6. for i in {1..1000}  
  7. do  
  8. echo $i  
  9. sqlplus -S system/oracle@GOBO4 <<eof  < span="">
  10. select instance_name from v\$instance;  
  11. EOF  
  12. sleep 1  
  13. done  
  14. exit 0  
  15.   
  16. oracle@SZDB:~> ./load_balance.sh >load_bal.log     
  17.   
  18. #查看日志  
  19. oracle@SZDB:~> grep GOBO4A load_bal.log |wc -l  
  20. 750  
  21. oracle@SZDB:~> grep GOBO4B load_bal.log |wc -l  
  22. 250  
  23.   
  24. #查看监听器的日志  
  25. oracle@bo2dbp:/u01/oracle/db/network/log> grep establish listener_bo2dbp.log |wc -l  
  26. 894  
  27.   
  28. oracle@bo2dbp:/u01/oracle/db/network/log> grep INSTANCE_NAME=GOBO4 listener_bo2dbp.log |wc -l  
  29. 415  
  30.   
  31. #从上面的查询中可以得知,节点bo2dbp总共接受了894个连接请求,而有415连接请求是由bo2dbs转发过来的  
  32. #因此,实际上从客户端发起到bo2dbp的实际连接请求数为894-415=479  
  33.   
  34. #下面来看在节点bo2dbs上的监听日志  
  35. oracle@bo2dbs:/u01/oracle/db/network/log> grep establish listener_bo2dbs.log |wc -l  
  36. 665  
  37.   
  38. oracle@bo2dbs:/u01/oracle/db/network/log> grep INSTANCE_NAME=GOBO4 listener_bo2dbs.log |wc -l   
  39. 144  
  40. #从上面的查询中可知,节点bo2dbs总共接受了665个连接请求,而有144个连接请求是由bo2dbp转发过来的  
  41. #因此,实际上从客户端发起到bo2dbs的实际连接请求数为655-144=511  
  42.   
  43. #从上面的结果可知,  
  44. #基于客户端的连接请求数为节点bo2dbp为479,节点bo2dbs为511  
  45. #监听器路由到本地实例数目为,节点bo2dbp,479-144=335,节点bo2dbs,511-415=96  
  46. #远程监听器路由道本地实例的数据为,节点bo2dpb为415,节点bo2dbs为144  
  47. #监听器路由的概念是指基于服务器端的负载均衡  
  48. #即服务器端的监听器根据自身以及远程监听器的负载情况来确定将当前的连接请求转发到本地或远程,此即为路由。  

 



 

 

Oracle RAC failover 测试(连接时故障转移)

 

 

  Oracle RAC 集群最突出的表现就是高可用性,这些内容主要包括load balance以及failover,通过这些技术使得单点故障不影响客户端端应用程序对数据库的正常访问,以及通过创建service实现节点间负载均衡。本文主要描述Oracle 10g rac环境下的Oracle failover测试。
  
  下面是一些关于这方面的基础参考或相关链接:
  有关负监听配置,载均衡(load balance)以及Oracle service请参考
    ORACLE RAC 监听配置 (listener.ora tnsnames.ora) 
    ORACLE RAC 下非缺省端口监听配置(listener.ora tnsnames.ora) 
    Oracle RAC 客户端连接负载均衡(Load Balance)
    Oracle RAC 服务器端连接负载均衡(Load Balance)
    Oracle RAC 负载均衡测试(结合服务器端与客户端)
    Oracle RAC failover 测试(TAF方式) 
    Oracle RAC failover 测试(Server TAF方式)

[python] view plain copy 
 
 print?
  1. 1、Oracle failover的几种方式  
  2.   Oracle failover也叫故障转移,从Oracle 10g开始,分为3种方式:  
  3.   a. Client-Side Connect time Failover  
  4.     客户端连接failover模式,此方式较为简单,只要安装了rac集群,缺省情况下即被启用。  
  5.   b. TAF  
  6.     透明故障转移,此方式同样基于客户端完成,需要配置客户端tnsnames.ora,连接故障发生时,无须重新连接  
  7.   c. Service-Side TAF  
  8.     服务器端透明故障转移,通过配置service来实现,客户端无须任何配置。  
  9.     
  10.   本文主要演示第一种情形,即客户端在发起连接请求时如何实现故障转移      
  11.   注意事项: 不能在listener.ora 文件中设置GLOBAL_NAME  
  12.          该参数会禁用Connect-time Failover 和 Transparent Application Failover  
  13.   
  14. 2、Client-Side Connect time Failover  
  15.   下面关于Client-Side Connect time Failover来自Oracle 的官方描述 ID 453293.1  
  16.   The connect-time failover enables clients to connect to another listener if the initial connection to the first   
  17.   listener fails. The number of listener protocol addresses determines how many listeners are tried. Without   
  18.   connect-time failover, Oracle Net attempts a connection with only one listener. The default is on.   
  19.   
  20.   Tnsnames Parameter: FAILOVER   
  21.     
  22.   (failover=on) is default for ADDRESS_LISTs, DESCRIPTION_LISTs, and a set of   
  23.   DESCRIPTIONs., therefore, you do not have to specify it explicitly.   
  24.   
  25.   基于客户端的failover比较好理解。对于在客户端tnsnames.ora有多个VIP的情形,客户端会首先请求定位到第一个VIP,如果第一个VIP不  
  26.   可达,则继续尝试使用下一个VIP,直到成功建立连接,如果所有的VIP无法连接将收到错误消息。  
  27.   通常情况下,我们使用vip作为tnsnames.ora中的连接地址  
  28.   
  29. 3、服务器端、客户端的环境  
  30.   #服务器端环境,host信息  
  31.   oracle@bo2dbp:~> cat /etc/hosts |grep vip  
  32.   192.168.7.61   bo2dbp-vip.2gotrade.com    bo2dbp-vip  
  33.   192.168.7.62   bo2dbs-vip.2gotrade.com    bo2dbs-vip  
  34.     
  35.   #服务器端环境,集群信息  
  36.   oracle@bo2dbp:~> ./crs_stat.sh   
  37.    Resource name                                Target     State               
  38.   --------------                                ------     -----               
  39.   ora.GOBO4.GOBO4A.inst                         ONLINE     ONLINE on bo2dbp    
  40.   ora.GOBO4.GOBO4B.inst                         ONLINE     ONLINE on bo2dbs    
  41.   ora.GOBO4.db                                  ONLINE     ONLINE on bo2dbp    
  42.   ora.bo2dbp.ASM1.asm                           ONLINE     ONLINE on bo2dbp    
  43.   ora.bo2dbp.LISTENER_BO2DBP.lsnr               ONLINE     ONLINE on bo2dbp    
  44.   ora.bo2dbp.LISTENER_ORA10G_BO2DBP.lsnr        ONLINE     ONLINE on bo2dbp    
  45.   ora.bo2dbp.gsd                                ONLINE     ONLINE on bo2dbp    
  46.   ora.bo2dbp.ons                                ONLINE     ONLINE on bo2dbp    
  47.   ora.bo2dbp.vip                                ONLINE     ONLINE on bo2dbp    
  48.   ora.bo2dbs.ASM2.asm                           ONLINE     ONLINE on bo2dbs    
  49.   ora.bo2dbs.LISTENER_BO2DBS.lsnr               ONLINE     ONLINE on bo2dbs    
  50.   ora.bo2dbs.LISTENER_ORA10G_BO2DBS.lsnr        ONLINE     ONLINE on bo2dbs    
  51.   ora.bo2dbs.gsd                                ONLINE     ONLINE on bo2dbs    
  52.   ora.bo2dbs.ons                                ONLINE     ONLINE on bo2dbs    
  53.   ora.bo2dbs.vip                                ONLINE     ONLINE on bo2dbs    
  54.   ora.ora10g.db                                 ONLINE     ONLINE on bo2dbp   
  55.   
  56.   #客户端环境  
  57.   robin@SZDB:~> cat /etc/issue  
  58.     
  59.   Welcome to SUSE Linux Enterprise Server 10 SP3 (x86_64) - Kernel \r (\l).  
  60.     
  61.   robin@SZDB:~> sqlplus -v  
  62.     
  63.   SQL*Plus: Release 10.2.0.3.0 - Production  
  64.     
  65.   #客户端tnsnames配置  
  66.   GOBO4 =  
  67.     (DESCRIPTION =  
  68.       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.61)(PORT = 1521))  
  69.       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.62)(PORT = 1521))  
  70.       (LOAD_BALANCE = yes)  
  71.       (CONNECT_DATA =  
  72.         (SERVER = DEDICATED)  
  73.         (SERVICE_NAME = GOBO4)  
  74.       )  
  75.     )  
  76.   
  77. 4、连接测试    
  78.   #首次建立连接,此时客户端从tnsnames配置的第一个IP建立连接,即192.168.7.61  
  79.   #VIP 192.168.7.61对应的hostname以及instance_name分别为bo2dbp,GOBO4A,所以我们获得如下返回结果  
  80.   robin@SZDB:~> sqlplus fail_over/fail@gobo4  
  81.   
  82.   fail_over@GOBO4> get verify.sql  
  83.     1  REM the following query is for TAF connection verification  
  84.     2  col sid format 99999  
  85.     3  col serial# format 9999999  
  86.     4  col failover_type format a13  
  87.     5  col failover_method format a15  
  88.     6  col failed_over format a11  
  89.     7  Prompt  
  90.     8  Prompt Failover status for current user  
  91.     9  Prompt ============================================  
  92.    10  SELECT   sid,  
  93.    11   serial#,  
  94.    12   failover_type,  
  95.    13   failover_method,  
  96.    14   failed_over  
  97.    15   FROM   v$session  
  98.    16   WHERE   username = 'FAIL_OVER';  
  99.    17  REM the following query is for load balancing verification  
  100.    18  col host_name format a20  
  101.    19  Prompt  
  102.    20  Prompt Current instance name and host name  
  103.    21  Prompt ========================================  
  104.    22* SELECT   instance_name,host_name FROM v$instance;  
  105.    23    
  106.      
  107.   fail_over@GOBO4> @verify.sql  
  108.     
  109.   Failover status for current user  
  110.   ============================================  
  111.      SID  SERIAL# FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER  
  112.   ------ -------- ------------- --------------- -----------  
  113.     1071      249 NONE          NONE            NO  
  114.     
  115.     
  116.   Current instance name and host name  
  117.   ========================================  
  118.   INSTANCE_NAME    HOST_NAME  
  119.   ---------------- --------------------  
  120.   GOBO4A           bo2dbp  
  121.   
  122.   #停止集群数据库的第一个instance,即GOBO4A  
  123.   oracle@bo2dbp:~> srvctl stop instance -d GOBO4 -i GOBO4A  
  124.   #校验结果  
  125.   oracle@bo2dbp:~> ./crs_stat.sh | grep inst  
  126.   ora.GOBO4.GOBO4A.inst                         OFFLINE    OFFLINE             
  127.   ora.GOBO4.GOBO4B.inst                         ONLINE     ONLINE on bo2dbs   
  128.     
  129.   #回到客户端原来的session,此时出现ORA-03114  
  130.   fail_over@GOBO4> /  
  131.   SELECT   instance_name,host_name FROM v$instance  
  132.   *  
  133.   ERROR at line 1:  
  134.   ORA-03113: end-of-file on communication channel  
  135.     
  136.   fail_over@GOBO4> /  
  137.   ERROR:  
  138.   ORA-03114: not connected to ORACLE  
  139.     
  140.   ERROR:  
  141.   ORA-03114: not connected to ORACLE  
  142.   
  143.   #下面尝试重新建立连接  
  144.   fail_over@GOBO4> conn fail_over/fail@gobo4  
  145.   Connected.  
  146.   fail_over@GOBO4> @verify  
  147.     
  148.   Failover status for current user  
  149.   ============================================  
  150.      SID  SERIAL# FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER  
  151.   ------ -------- ------------- --------------- -----------  
  152.     1062       94 NONE          NONE            NO  
  153.     
  154.     
  155.   Current instance name and host name  
  156.   ========================================  
  157.   INSTANCE_NAME    HOST_NAME  
  158.   ---------------- --------------------  
  159.   GOBO4B           bo2dbs  
  160.     
  161.   #Author : Robinson  
  162.   #Blog   : http://blog.csdn.net/robinson_0612  
  163.     
  164.   #从上面的查询可知,当前的session已经连接到第二个实例。且FAILOVER_TYPE,FAILOVER_METHOD,FAILED_OVER几个至没有发生任何变化  
  165.   #因为这几个参数主要是针对TAF。  
  166.   
  167. 5、小结  
  168.   a、客户端连接时的故障转移,服务器端和客户端无需任何配置,缺省情况下即被开启,即failover=on  
  169.   b、只要集群环境存在(非单节点RAC),客户端的连接请求会逐个尝试列出的VIP,直到连接成功为止,如果所有不可连接,返回错误  
  170.   c、客户端已经建立后,服务器端实例或节点故障,都将导致客户端必须重新发起新的连接请求  

 

 



Oracle RAC failover 测试(TAF方式)

 

 Oracle RAC 客户端故障转移(failover),当采用TAF方式时,对于已经建立连接的客户端,在连接的实例或节点出现故障时,客户端无需再次发出连接请求,仍然可以继续之前的数据库操作,此称之为透明故障转移。本文描述基于Oracle 10g rac,客户端TAF方式的故障转移并给出示例。

  下面是一些关于这方面的基础参考链接:
  有关负监听配置,载均衡(load balance)以及Oracle service请参考
    ORACLE RAC 监听配置 (listener.ora tnsnames.ora) 
    ORACLE RAC 下非缺省端口监听配置(listener.ora tnsnames.ora) 
    Oracle RAC 客户端连接负载均衡(Load Balance)
    Oracle RAC 服务器端连接负载均衡(Load Balance)
    Oracle RAC 负载均衡测试(结合服务器端与客户端) 
    
  有关Oracle RAC failover 连接时故障转移请参考
    Oracle RAC failover 测试(连接时故障转移)
    Oracle RAC failover 测试(Server TAF方式)

[python] view plain copy 
 
 print?
  1. 1、TAF描述      
  2.   #下面关于TAF来自Oracle 的官方描述 ID 453293.1  
  3.   Transparent Application Failover (TAF) is a feature of the Oracle Call Interface (OCI) driver at client side.   
  4.   It enables the application to automatically reconnect to a database, if the database instance to which the connection is   
  5.   made fails. In this case, the active transactions roll back.   
  6.   Tnsnames Parameter: FAILOVER_MODE   
  7.   
  8.      When an instance to which a connection is established fails or is shutdown, the connection on the client side becomes   
  9.   stale and would throw exceptions to the caller trying to use it. TAF enables the application to transparently reconnect   
  10.   to a preconfigured secondary instance creating a fresh connection, but identical to the connection that was established   
  11.   on the first original instance.  
  12.     
  13.   #简单一点来说,就是说对于那些已经成功连接到特定实例的客户端,如果该实例或节点异常宕机,客户端会自动重新发出到剩余实例的连  
  14.   #接请求。使得客户端感觉不到它所连接的实例或节点已经出现故障,这个就称之为透明转移。但其间的活动事务将被回滚。  
  15.   #通过在客户端的tnsnames.ora中配置FAILOVER_MODE项实现TAF  
  16.   
  17. 2、服务器端、客户端的环境  
  18.   #服务器端环境,host信息  
  19.   oracle@bo2dbp:~> cat /etc/hosts |grep vip  
  20.   192.168.7.61   bo2dbp-vip.2gotrade.com    bo2dbp-vip  
  21.   192.168.7.62   bo2dbs-vip.2gotrade.com    bo2dbs-vip  
  22.     
  23.   #服务器端环境,集群信息  
  24.   oracle@bo2dbp:~> ./crs_stat.sh   
  25.    Resource name                                Target     State               
  26.   --------------                                ------     -----               
  27.   ora.GOBO4.GOBO4A.inst                         OFFLINE    OFFLINE on bo2dbp  #此时节点1上的实例被关闭  
  28.   ora.GOBO4.GOBO4B.inst                         ONLINE     ONLINE on bo2dbs    
  29.   ora.GOBO4.db                                  ONLINE     ONLINE on bo2dbp    
  30.   ora.bo2dbp.ASM1.asm                           ONLINE     ONLINE on bo2dbp    
  31.   ora.bo2dbp.LISTENER_BO2DBP.lsnr               ONLINE     ONLINE on bo2dbp    
  32.   ora.bo2dbp.LISTENER_ORA10G_BO2DBP.lsnr        ONLINE     ONLINE on bo2dbp    
  33.   ora.bo2dbp.gsd                                ONLINE     ONLINE on bo2dbp    
  34.   ora.bo2dbp.ons                                ONLINE     ONLINE on bo2dbp    
  35.   ora.bo2dbp.vip                                ONLINE     ONLINE on bo2dbp    
  36.   ora.bo2dbs.ASM2.asm                           ONLINE     ONLINE on bo2dbs    
  37.   ora.bo2dbs.LISTENER_BO2DBS.lsnr               ONLINE     ONLINE on bo2dbs    
  38.   ora.bo2dbs.LISTENER_ORA10G_BO2DBS.lsnr        ONLINE     ONLINE on bo2dbs    
  39.   ora.bo2dbs.gsd                                ONLINE     ONLINE on bo2dbs    
  40.   ora.bo2dbs.ons                                ONLINE     ONLINE on bo2dbs    
  41.   ora.bo2dbs.vip                                ONLINE     ONLINE on bo2dbs    
  42.   ora.ora10g.db                                 ONLINE     ONLINE on bo2dbp   
  43.   
  44.   #客户端环境  
  45.   robin@SZDB:~> cat /etc/issue  
  46.     
  47.   Welcome to SUSE Linux Enterprise Server 10 SP3 (x86_64) - Kernel \r (\l).  
  48.     
  49.   robin@SZDB:~> sqlplus -v  
  50.     
  51.   SQL*Plus: Release 10.2.0.3.0 - Production  
  52.     
  53.   #客户端tnsnames配置    
  54.   GOBO4_TAF =  
  55.    (DESCRIPTION =  
  56.      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.61)(PORT = 1521))  
  57.      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.7.62)(PORT = 1521))  
  58.      (LOAD_BALANCE = yes)  
  59.      (CONNECT_DATA =  
  60.        (SERVER = DEDICATED)  
  61.        (SERVICE_NAME = GOBO4)  
  62.       (FAILOVER_MODE =       #FAILOVER_MODE项参数   
  63.        (TYPE = session)  
  64.        (METHOD = basic)  
  65.        (RETRIES = 180  
  66.        (DELAY = 5)  
  67.       )  
  68.      )  
  69.    )  
  70.     
  71. 3、FAILOVER_MODE项分析  
  72.   FAILOVER_MODE项是实现TAF的主要配置内容,下面对其进行描述.  
  73.     
  74.   METHOD: 用户定义何时创建到其实例的连接,有BASIC 和 PRECONNECT 两种可选值  
  75.     BASIC: 客户端通过地址列表成功建立连接后,即仅当客户端感知到节点故障时才创建到其他实例的连接  
  76.     PRECONNECT: 预连接模式,是在最初建立连接时就同时建立到所有实例的连接,当发生故障时,立刻就可以切换到其他链路上  
  77.       
  78.     上述两种方式各有优劣,前者建立连接的开销相对较小,但failover时会产生延迟,而后者正好与前者相反  
  79.       
  80.   TYPE: 用于定义发生故障时对完成的SQL 语句如何处理,其中有2种类型:session 和select  
  81.     select:使用select方式,Oracle net会跟踪事务期间的所有select语句,并跟踪每一个与当前select相关的游标已返回多少行给客户  
  82.       端。此时,假定select查询已返回500行,客户端当前连接的节点出现故障,Oracle Net自动建立连接到幸存的实例上并继续返回  
  83.       剩余的行数给客户端。假定总行数为1500,行,则1000行从剩余节点返回。  
  84.     session: 使用session方式,所有select查询相关的结果在重新建立新的连接后将全部丢失,需要重新发布select命令。  
  85.       
  86.     上述两种方式适用于不同的情形,对于select方式,通常使用与OLAP数据库,而对于session方式则使用与OLTP数据库。因为select   
  87.     方式,Oracle 必须为每个session保存更多的内容,包括游标,用户上下文等,需要更多的资源。  
  88.       
  89.     其次,两种方式期间所有未提交的DML事务将自动回滚且必须重启启动。alter session语句不会failover。  
  90.     临时对象不会failover也不能被重新启动。  
  91.       
  92.   RETRIES: 表示重试的次数  
  93.   DELAY:表示重试的间隔时间  
  94.     
  95. 4、测试TAF      
  96.   #首次建立连接,此时客户端从tnsnames配置的第一个IP建立连接,由于第一个VIP所在的实例已经关闭,故连接到192.168.7.62  
  97.   #VIP 192.168.7.62对应的hostname以及instance_name分别为bo2dbs,GOBO4B,所以我们获得如下返回结果  
  98.   #其次我们可以看到当前session failover的相关参数  
  99.   robin@SZDB:~> sqlplus <a href="mailto:fail_over/fail@gobo4_taf"
  100.   
  101.   fail_over@GOBO4> get verify.sql  
  102.     1  REM the following query is for TAF connection verification  
  103.     2  col sid format 99999  
  104.     3  col serial# format 9999999  
  105.     4  col failover_type format a13  
  106.     5  col failover_method format a15  
  107.     6  col failed_over format a11  
  108.     7  Prompt  
  109.     8  Prompt Failover status for current user  
  110.     9  Prompt ============================================  
  111.    10  SELECT   sid,  
  112.    11   serial#,  
  113.    12   failover_type,  
  114.    13   failover_method,  
  115.    14   failed_over  
  116.    15   FROM   v$session  
  117.    16   WHERE   username = 'FAIL_OVER';  
  118.    17  REM the following query is for load balancing verification  
  119.    18  col host_name format a20  
  120.    19  Prompt  
  121.    20  Prompt Current instance name and host name  
  122.    21  Prompt ========================================  
  123.    22* SELECT   instance_name,host_name FROM v$instance;  
  124.    23    
  125.      
  126.   fail_over@GOBO4> @verify  
  127.     
  128.   Failover status for current user  
  129.   ============================================  
  130.     
  131.      SID  SERIAL# FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER  
  132.   ------ -------- ------------- --------------- -----------  
  133.     1063     1175 SESSION       BASIC           NO  
  134.     
  135.     
  136.   Current instance name and host name  
  137.   ========================================  
  138.     
  139.   INSTANCE_NAME    HOST_NAME  
  140.   ---------------- --------------------  
  141.   GOBO4B           bo2dbs  
  142.     
  143.   #此时启动第一个实例GOBO4A,并停止第二个实例  
  144.   oracle@bo2dbp:~> srvctl start instance -d GOBO4 -i GOBO4A    
  145.   oracle@bo2dbp:~> srvctl stop instance -d GOBO4 -i GOBO4B  
  146.   #查看两个实例的状态  
  147.   oracle@bo2dbp:~> ./crs_stat.sh | grep inst  
  148.   ora.GOBO4.GOBO4A.inst                         ONLINE     ONLINE on bo2dbp    
  149.   ora.GOBO4.GOBO4B.inst                         OFFLINE    OFFLINE   
  150.     
  151.   #在客户端的session再次检查连接状态,即执行查询,结果如下,我们收到了ORA-25408  
  152.   fail_over@GOBO4> @verify  
  153.     
  154.   Failover status for current user  
  155.   ============================================  
  156.   SELECT   sid,  
  157.   *  
  158.   ERROR at line 1:  
  159.   ORA-25408: can not safely replay call  
  160.     
  161.   Current instance name and host name  
  162.   ========================================  
  163.     
  164.   INSTANCE_NAME    HOST_NAME  
  165.   ---------------- --------------------  
  166.   GOBO4A           bo2dbp  
  167.     
  168.   #再次执行查询,此时客户端已经自动实现了重新连接,从查询返回得到的INSTANCE_NAME与HOST_NAME可知。  
  169.   #最重要的一个FAILED_OVER值为YES,表明当前的session是一个failover来的session。  
  170.   #关于METHOD使用PRECONNECT与TYPE使用SELECT的方式在此不作演示  
  171.   fail_over@GOBO4> @verify  
  172.     
  173.   Failover status for current user  
  174.   ============================================  
  175.     
  176.      SID  SERIAL# FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER  
  177.   ------ -------- ------------- --------------- -----------  
  178.     1073       29 SESSION       BASIC           YES  
  179.     
  180.     
  181.   Current instance name and host name  
  182.   ========================================  
  183.     
  184.   INSTANCE_NAME    HOST_NAME  
  185.   ---------------- --------------------  
  186.   GOBO4A           bo2dbp   
  187.   
  188.   #Author : Robinson  
  189.   #Blog   : http://blog.csdn.net/robinson_0612  
  190.     
  191. 5、小结:  
  192.   a、客户端TAF方式实现了Oracle客户端到服务器透明故障转移  
  193.   b、主要在客户端tnsnames.ora配置FAILOVER_MODE来实现基于客户端的TAF  
  194.   c、FAILOVER_MODE中基于连接方式(METHOD)可以分为BASIC与PRECONNECT两种方式,后者开销更大,延迟小,与前者相反  
  195.   d、FAILOVER_MODE中TYPE可以分为select与session两种方式,两者所有未提交的事务全部回滚,select方式会failover查询,  
  196.      session方式不会。select方式多用在OLAP类型数据库,而session多用在OLTP类型数据库  
  197.   e、一旦所在的实例发生故障,会自动failover,无需手动重新连接,这就是与连接时故障转移所不同的。     

 



出处:http://www.cnblogs.com/lhrbest

posted @ 2018-03-18 19:21  dion至君  阅读(1177)  评论(0编辑  收藏  举报