Loading a relational database in parallel with Spark
Method 1: parallel load partitioned on the integer column ECI, with a parallelism of 3
import java.util.Properties;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

SparkConf sparkConf = new SparkConf();
sparkConf.setAppName("jdbc").setMaster("local[4]");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
SQLContext sc = new SQLContext(jsc);
String url = "jdbc:sqlserver://192.168.1.101;DatabaseName=database;user=user;password=123456";
String tableName = "tb_city";
Properties connectionProperties = new Properties();
connectionProperties.put("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver");
// Arguments: partition column "ECI", lowerBound, upperBound, numPartitions = 3
DataFrame table = sc.read()
        .jdbc(url, tableName, "ECI", 125883650, 263780907, 3, connectionProperties)
        .select("CityID", "IMSI", "ECI");
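To make the role of the bounds clearer, here is a simplified sketch of how a numeric range can be divided into one filter per partition. It is an illustration only, not Spark's actual code: the class name RangePredicates and the method build are hypothetical, NULL handling is left out, and upperBound is assumed to be greater than lowerBound. Note that the first and last filters are open-ended, which matches the documented behavior that lowerBound and upperBound only decide the partition stride and do not filter rows out of the table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a simplified illustration of splitting a numeric
// range into per-partition filters. Not Spark's actual implementation.
class RangePredicates {

    static List<String> build(String column, long lowerBound, long upperBound, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be at least 1");
        }
        // Width of each partition's value range.
        long stride = upperBound / numPartitions - lowerBound / numPartitions;
        List<String> clauses = new ArrayList<>();
        long current = lowerBound;
        for (int i = 0; i < numPartitions; i++) {
            // The first partition has no lower limit, the last has no upper limit,
            // so values outside [lowerBound, upperBound] are still loaded.
            String lower = (i == 0) ? null : column + " >= " + current;
            current += stride;
            String upper = (i == numPartitions - 1) ? null : column + " < " + current;
            if (lower != null && upper != null) {
                clauses.add(lower + " AND " + upper);
            } else if (lower != null) {
                clauses.add(lower);
            } else if (upper != null) {
                clauses.add(upper);
            } else {
                clauses.add("1 = 1"); // single partition: no filter
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Same column, bounds, and partition count as in Method 1.
        for (String clause : build("ECI", 125883650L, 263780907L, 3)) {
            System.out.println(clause);
        }
    }
}
```

If the ECI values are unevenly distributed between the two bounds, the three partitions will hold very different numbers of rows, so the bounds are best chosen close to the real minimum and maximum of the column.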
Method 2: parallel load partitioned on the varchar column IMSI, with a parallelism of 3
import java.util.Properties;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

SparkConf sparkConf = new SparkConf();
sparkConf.setAppName("jdbc").setMaster("local[4]");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
SQLContext sc = new SQLContext(jsc);
String url = "jdbc:sqlserver://192.168.1.101;DatabaseName=database;user=user;password=123456";
String tableName = "tb_city";
Properties connectionProperties = new Properties();
connectionProperties.put("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver");
// One predicate per partition; each is used as the WHERE clause of that partition's query.
String[] predicates = new String[]{
        "IMSI >='105156335255615' AND IMSI <='145437785776944'",
        "IMSI >='145441560321876' AND IMSI <='145441636521493'",
        "IMSI >'145441636521493' AND IMSI <='145464988025176'",
};
DataFrame table = sc.read()
        .jdbc(url, tableName, predicates, connectionProperties)
        .select("CityID", "IMSI");
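The predicates array holds the filter conditions. Each condition becomes the WHERE clause of one partition's query, so three conditions produce three partitions. Spark does not check the conditions against each other: a row that matches none of them is not loaded, and a row that matches more than one is loaded more than once, so the ranges should cover the data without gaps or overlaps.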
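Hand-written ranges can easily leave gaps or overlaps. One way to avoid this is to derive the conditions from a sorted list of split points, so that neighboring ranges always share a boundary. The following is a minimal sketch under that assumption; the class name SplitPointPredicates and its methods are hypothetical, and the two split points are taken from the IMSI values above purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds contiguous, non-overlapping range predicates
// from sorted split points. N split points yield N + 1 predicates.
class SplitPointPredicates {

    static String[] fromSplitPoints(String column, List<String> sortedSplitPoints) {
        List<String> predicates = new ArrayList<>();
        String previous = null;
        for (String point : sortedSplitPoints) {
            String upper = column + " < " + quote(point);
            // The first range is open at the bottom; later ranges start where the previous one ended.
            predicates.add(previous == null
                    ? upper
                    : column + " >= " + quote(previous) + " AND " + upper);
            previous = point;
        }
        // The last range is open at the top so no large values are missed.
        predicates.add(previous == null ? "1 = 1" : column + " >= " + quote(previous));
        return predicates.toArray(new String[0]);
    }

    // Wraps a value in single quotes for SQL, doubling any embedded quote.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        String[] predicates = fromSplitPoints(
                "IMSI", Arrays.asList("145441560321876", "145441636521493"));
        for (String predicate : predicates) {
            System.out.println(predicate);
        }
    }
}
```

The resulting array can be passed to sc.read().jdbc(url, tableName, predicates, connectionProperties) in place of the hand-written one. Rows whose IMSI is NULL match none of these conditions; if such rows exist, an extra "IMSI IS NULL" predicate would be needed to load them.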