阿飞飞飞

Learn, and regularly practice what you have learned


Several Ways to Use Regular Expressions in Scala

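A regular expression describes a pattern over strings. Its main uses are matching, splitting, replacing, and extracting text. Regular expressions are used frequently in Scala, where calling .r on a string (for example "regex".r) turns it into a scala.util.matching.Regex.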

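1. Matching

Scala offers several ways to apply a regular expression, mainly these three:

  1.   The String.matches() method
  2.   Pattern matching with a regular expression
  3.   The scala.util.matching.Regex API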

//String.matches

val a = "studying83"
println(a.matches("[a-z0-9]+")) //true
println(a.matches("[a-z0-9]{4}")) //false: matches() must cover the whole string, which has 10 characters, not 4
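
Because String.matches() only succeeds when the pattern covers the entire string, it can be useful to contrast it with a partial search. The following is a minimal sketch; the value names (s, wholeLetters, wholeMixed, firstLetters) are made up for illustration, and findFirstIn is the Regex method that returns the first matching substring as an Option[String].

```scala
val s = "studying83"

// matches(): the pattern must cover the entire string.
val wholeLetters = s.matches("[a-z]+")        // the trailing digits are not covered
val wholeMixed   = s.matches("[a-z]+[0-9]+")  // letters followed by digits covers everything

// findFirstIn(): a match anywhere in the string is enough.
val firstLetters = "[a-z]+".r.findFirstIn(s)

println(wholeLetters)  // false
println(wholeMixed)    // true
println(firstLetters)  // Some(studying)
```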

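// Pattern matching with a regular expression

val b = """([a-z0-9]+)""".r
"studying83" match {
    // b(_) runs the regex as an extractor; a bare `case b` would only bind a new variable and match anything
    case b(_) => println("match succeeded")
    case _ => println("match failed")
}
// match succeeded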
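
When a Regex is used as an extractor, each capture group can be bound to a variable, and by default the whole input has to match. The sketch below illustrates both points and also shows unanchored, which relaxes the whole-input requirement. The pattern and the helper names (keyValue, describe, describeAnywhere) are assumptions chosen for the example, not part of the original post.

```scala
val keyValue = """([a-z]+)=([0-9]+)""".r

// Anchored (default): the entire string must match the pattern.
def describe(s: String): String = s match {
  case keyValue(k, v) => s"key=$k, value=$v"
  case _              => "no match"
}

// Unanchored: the pattern may match anywhere inside the string.
val anywhere = keyValue.unanchored
def describeAnywhere(s: String): String = s match {
  case anywhere(k, v) => s"key=$k, value=$v"
  case _              => "no match"
}

println(describe("port=8080"))                  // key=port, value=8080
println(describe("set port=8080 now"))          // no match
println(describeAnywhere("set port=8080 now"))  // key=port, value=8080
```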

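//scala.util.matching.Regex API

The API offers three ways to search:

  findFirstMatchIn() returns the first match, as an Option[Match]

  findAllMatchIn() returns all matches, as an Iterator[Match]

  findAllIn() returns all matched substrings, as a MatchIterator (an Iterator[String])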

//findFirstMatchIn()
val reg = "[0-9]".r
reg.findFirstMatchIn("abc3d2gf") match { 
    case Some(x) => println(x)  
    case None => println("no")
}   //3

//findAllMatchIn()
val reg = "[0-9]".r
println(reg.findAllMatchIn("abc3d2gf").toList)
//List(3, 2)
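
The third method in the list, findAllIn(), has no example above. A minimal sketch follows, reusing the same input string; the names digit, found, and positions are illustrative. It also shows what a Match object adds over a plain String, namely position information such as start.

```scala
val digit = "[0-9]".r

// findAllIn yields the matched substrings themselves.
val found: List[String] = digit.findAllIn("abc3d2gf").toList

// findAllMatchIn yields Match objects, which also carry positions.
val positions: List[(String, Int)] =
  digit.findAllMatchIn("abc3d2gf").map(m => (m.matched, m.start)).toList

println(found)      // List(3, 2)
println(positions)  // List((3,3), (2,5))
```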

2. Capture groups

val str =  "{\"id\":\"123456\",\"friends\":{\"name\":\"zs\",\"age\":\"40\"}}"
val reg = "\\{\"id\":\"([0-9]+)\",\"friends\":\\{\"name\":\"([a-z]+)\",\"age\":\"([0-9]+)\"}}".r
// group(n) returns the n-th capture group; pass println an explicit tuple instead of relying on argument auto-tupling
reg.findAllMatchIn(str).foreach(x => println((x.group(1), x.group(2), x.group(3))))
//(123456,zs,40)

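val input = "name:Jason,age:19,weight:100"
// the hyphen sits at the end of each character class so that it is read as a literal, not as a range
val studentPattern = "([0-9a-zA-Z#() -]+):([0-9a-zA-Z#() -]+)".r
studentPattern.findAllMatchIn(input).foreach(x => println((x.group(1), x.group(2))))
//(name,Jason)
//(age,19)
//(weight,100)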
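
Printing the groups is mostly useful for inspection. In practice the captured pairs are often collected into a Map for lookup. The following is a sketch under that assumption; the simplified pattern and the names pair, fields, and age are invented for the example.

```scala
val pair = "([a-zA-Z]+):([0-9a-zA-Z]+)".r

// Turn every key:value match into a Map entry (group 1 -> group 2).
val fields: Map[String, String] =
  pair.findAllMatchIn("name:Jason,age:19,weight:100")
    .map(m => m.group(1) -> m.group(2))
    .toMap

// Look a field up and convert it; a missing key yields None.
val age: Option[Int] = fields.get("age").map(_.toInt)

println(fields("name"))  // Jason
println(age)             // Some(19)
```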

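// A practical case: parse a log file (located at "path") whose lines look like
// INFO 2000-01-07 requestURI:/c?app=0&p=1
import scala.io.Source
val source = Source.fromFile("path", "UTF-8")
val lines = try source.getLines().toArray finally source.close()  // read every line, then release the file
val reg = """([A-Z]+) ([0-9]{4}-[0-9]{2}-[0-9]{1,2}) requestURI:(.*)""".r
// Approach 1: findAllMatchIn plus explicit group access
lines.map(line => reg.findAllMatchIn(line).toList.map(x => (x.group(1), x.group(2), x.group(3)))).foreach(println)
//List((INFO,2000-01-07,/c?app=0&p=1))
// Approach 2: use the regex as an extractor; collect skips lines that do not match instead of throwing a MatchError
lines.collect { case reg(le, ld, ad) => (le, ld, ad) }
// Array[(String, String, String)] = Array((INFO,2000-01-07,/c?app=0&p=1))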
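
For anything beyond a quick script, it may be clearer to parse each line into a named structure and to make the "line does not match" case explicit. The sketch below does this on in-memory sample lines rather than a file; LogEntry, logLine, parseLine, and the sample data are hypothetical names introduced here, while the pattern is the one used above.

```scala
// Hypothetical record type for one parsed log line.
final case class LogEntry(level: String, date: String, uri: String)

val logLine = """([A-Z]+) ([0-9]{4}-[0-9]{2}-[0-9]{1,2}) requestURI:(.*)""".r

// Returns None for lines that do not have the expected shape.
def parseLine(line: String): Option[LogEntry] = line match {
  case logLine(level, date, uri) => Some(LogEntry(level, date, uri))
  case _                         => None
}

val sample = List(
  "INFO 2000-01-07 requestURI:/c?app=0&p=1",
  "not a log line"
)

// flatMap drops the None results and unwraps the Some results.
val entries: List[LogEntry] = sample.flatMap(line => parseLine(line))

entries.foreach(println)  // LogEntry(INFO,2000-01-07,/c?app=0&p=1)
```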

 

3. Replacing

//replaceFirstIn
val a = """([0-9]+)""".r
a.replaceFirstIn("123,go! 666","run")
// run,go! 666

//replaceAllIn
val a = """([0-9]+)""".r
a.replaceAllIn("123 you are the best!","come on!")
//come on! you are the best!
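
The replacement does not have to be a fixed string. As a sketch of what else replaceAllIn supports: the replacement may refer to capture groups with $1, $2, and so on, it may be computed per match by a function, and Regex.quoteReplacement can be used when the replacement text itself contains a literal $. The patterns and value names below are assumptions made for the example.

```scala
import scala.util.matching.Regex

// $1 and $2 in the replacement refer to the capture groups.
val range = """([0-9]+)-([0-9]+)""".r
val swapped = range.replaceAllIn("10-20 and 3-4", "$2-$1")

// A function computes the replacement from each Match.
val number = "[0-9]+".r
val doubled = number.replaceAllIn("a1b22", (m: Regex.Match) => (m.matched.toInt * 2).toString)

// quoteReplacement escapes $ so that it is inserted literally.
val priced = number.replaceAllIn("cost: 5", Regex.quoteReplacement("$5"))

println(swapped)  // 20-10 and 4-3
println(doubled)  // a2b44
println(priced)   // cost: $5
```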

 

4. Extracting

val date = """([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})""".r
// _* ignores all remaining groups; a single _ ignores exactly one group
"2020-5-18" match {case date(year, _*) => println(year)}
//2020
"2020-5-18" match {case date(_,mon,_*) => println(mon)}
//5
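
The two matches above have no fallback case, so an input that is not a date would throw a MatchError. A minimal sketch of a safer variant follows: it extracts all three groups at once, converts them to Int, and returns None for anything else. The names isoDate and parseDate are hypothetical.

```scala
val isoDate = """([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})""".r

// Extract year, month, and day in one go; None when the input has another shape.
def parseDate(s: String): Option[(Int, Int, Int)] = s match {
  case isoDate(y, m, d) => Some((y.toInt, m.toInt, d.toInt))
  case _                => None
}

println(parseDate("2020-5-18"))   // Some((2020,5,18))
println(parseDate("18/05/2020"))  // None
```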

 

posted on 2020-09-18 13:06  阿飞飞飞