Chapter 13: Strings
String manipulation is arguably one of the most common activities in computer programming.
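13.1 Immutable Strings
String objects are immutable. Every method in the String class that appears to modify a String value actually creates a brand new String object that contains the modified contents. The object the original reference points to stays in a single physical location and is left completely untouched.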
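A minimal sketch of this behavior, assuming a hypothetical helper named upcase() (the class and method names are illustrative, not taken from the notes): the method returns a new String while the caller's original stays as it was.

```java
class ImmutableDemo {
  // Returns a new upper-cased String; the argument itself is never modified.
  static String upcase(String s) {
    return s.toUpperCase();
  }

  public static void main(String[] args) {
    String q = "howdy";
    String qq = upcase(q);
    System.out.println(q);   // howdy
    System.out.println(qq);  // HOWDY
  }
}
```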
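13.2 Overloading "+" vs. StringBuilder
Because String objects are immutable, you can alias a String as many times as you like. A String is read-only, so no reference to it can change its value, and therefore one reference cannot affect the others.
The "+" operator overloaded for String objects is one example. Overloading means that an operator is given a special meaning when it is applied to a particular class. The "+" and "+=" operators for String are the only overloaded operators in Java, and Java does not allow programmers to overload any others.
When you write a toString() method for a class, you can rely on the compiler to build the result sensibly as long as the string operations are simple. If toString() contains a loop, however, it is better to create a StringBuilder yourself and use it to build the final result, since concatenation inside a loop typically builds a new intermediate object on every pass.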
Commonly used StringBuilder methods:
- append()
- toString()
- delete()
- insert()
- replace()
- reverse()
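The methods listed above can be tried out with a short sketch. The class name BuilderDemo and the helper join() are made up for illustration; the point is that a single StringBuilder is reused across the whole loop.

```java
class BuilderDemo {
  // Builds "[0, 1, ..., n-1]" with one StringBuilder instead of repeated "+".
  static String join(int n) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < n; i++) {
      sb.append(i).append(", ");
    }
    if (n > 0) {
      sb.delete(sb.length() - 2, sb.length()); // drop the trailing ", "
    }
    sb.append("]");
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(join(5));      // [0, 1, 2, 3, 4]

    StringBuilder sb = new StringBuilder("abc");
    sb.insert(0, "x");                // xabc
    sb.replace(1, 2, "A");            // xAbc
    sb.reverse();                     // cbAx
    System.out.println(sb);
  }
}
```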
Exercise 1
/* Analyze SprinklerSystem.toString() in reusing/SprinklerSystem.java to discover
 * whether writing the toString() with an explicit StringBuilder will save any
 * StringBuilder creations. */
class WaterSource {
  private String s;
  WaterSource() {
    System.out.println("WaterSource()");
    s = "Constructed";
  }
  public String toString() { return s; }
}

public class SprinklerSystem1 {
  private String valve1, valve2, valve3, valve4;
  private WaterSource source = new WaterSource();
  private int i;
  private float f;
  // appears to create only one StringBuilder: (using javap -c)
  public String toString() {
    return
      "valve1 = " + valve1 + " " +
      "valve2 = " + valve2 + " " +
      "valve3 = " + valve3 + " " +
      "valve4 = " + valve4 + " " +
      "i = " + i + " " + "f = " + f + " " +
      "source = " + source;
  }
  public static void main(String[] args) {
    SprinklerSystem1 sprinklers = new SprinklerSystem1();
    System.out.println(sprinklers);
  }
}
=======================================================================
WaterSource()
valve1 = null valve2 = null valve3 = null valve4 = null i = 0 f = 0.0 source = Constructed
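13.3 Unintended Recursion
Every class in Java ultimately inherits from Object, and the standard container classes are no exception. Containers therefore all have a toString() method, which they override so that the resulting String represents the container itself as well as the objects it holds.
If you want to print an object's memory address (strictly, the class name followed by the hash code), call Object.toString(), which is the method responsible for that job. In other words, call super.toString() rather than concatenating this, which would call toString() recursively.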
Exercise 2
// Repair InfiniteRecursion.java
import java.util.*;

public class InfiniteRecursion2 {
  public String toString() {
    return " InfiniteRecursion address: " + super.toString() + "\n";
  }
  public static void main(String[] args) {
    List<InfiniteRecursion2> v = new ArrayList<InfiniteRecursion2>();
    for(int i = 0; i < 10; i++)
      v.add(new InfiniteRecursion2());
    System.out.println(v);
  }
}
==================================================
[ InfiniteRecursion address: 第十三章字符串.InfiniteRecursion2@74a14482]
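13.4 Operations on Strings
When a String method needs to change the contents of the string, it returns a new String object. If the contents do not change, the method simply returns a reference to the original object, which saves storage and avoids extra overhead.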
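A small sketch of the "same reference" behavior, using three methods whose Javadoc documents it: trim(), replace(char, char) and concat(). The class name SameReferenceDemo and its helper are illustrative only, and == is used here on purpose to compare references rather than contents.

```java
class SameReferenceDemo {
  // True when trim() had nothing to remove and handed back the very same object.
  static boolean sameAfterTrim(String s) {
    return s.trim() == s;
  }

  public static void main(String[] args) {
    String s = "hello";
    System.out.println(sameAfterTrim(s));          // true: no whitespace to trim
    System.out.println(s.replace('x', 'y') == s);  // true: 'x' does not occur
    System.out.println(s.concat("") == s);         // true: nothing to append
    System.out.println(s.toUpperCase() == s);      // false: contents changed, new object
    System.out.println(sameAfterTrim(" hello "));  // false: trimming creates a new String
  }
}
```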
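13.5 Formatted Output
In Java, all the new formatting functionality is handled by the java.util.Formatter class. You can think of a Formatter as a translator that converts your format string and data into the desired result.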
Exercise 3
// Modify Turtle.java so that it sends all output to System.err.
import java.io.*;
import java.util.*;

public class Turtle3 {
  private String name;
  private Formatter f;
  public Turtle3(String name, Formatter f) {
    this.name = name;
    this.f = f;
  }
  public void move(int x, int y) {
    f.format("%s The Turtle is at (%d,%d)\n", name, x, y);
  }
  public static void main(String[] args) {
    PrintStream outAlias = System.err;
    Turtle3 tommy = new Turtle3("Tommy", new Formatter(System.err));
    Turtle3 terry = new Turtle3("Terry", new Formatter(outAlias));
    tommy.move(0,0);
    terry.move(4,8);
    tommy.move(3,4);
    terry.move(2,5);
    tommy.move(3,3);
    terry.move(3,3);
  }
}
====================================================================
Tommy The Turtle is at (0,0)
Terry The Turtle is at (4,8)
Tommy The Turtle is at (3,4)
Terry The Turtle is at (2,5)
Tommy The Turtle is at (3,3)
Terry The Turtle is at (3,3)
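13.5.1 Format Specifiers
%[argument_index$][flags][width][.precision]conversion
Data is right-aligned by default; the "-" flag switches to left alignment.
- width: the minimum size of a field; it applies to every kind of data conversion
- precision: the maximum size; for a String it is the maximum number of characters to output, for a floating-point number it is the number of decimal places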
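The pieces of the specifier syntax above can be seen one at a time in the following sketch. The class FormatSpecDemo and its cell() helper are hypothetical; Locale.ROOT is passed only so the decimal separator does not depend on the machine's locale.

```java
import java.util.Locale;

class FormatSpecDemo {
  // Thin wrapper around String.format with a fixed locale.
  static String cell(String fmt, Object... args) {
    return String.format(Locale.ROOT, fmt, args);
  }

  public static void main(String[] args) {
    System.out.println(cell("[%10s]", "abc"));         // width 10, right-aligned
    System.out.println(cell("[%-10s]", "abc"));        // "-" flag: left-aligned
    System.out.println(cell("[%.2s]", "abcdef"));      // precision 2 on a String: [ab]
    System.out.println(cell("[%8.3f]", 3.14159));      // width 8, 3 decimal places
    System.out.println(cell("%2$s %1$s", "a", "b"));   // argument_index: b a
  }
}
```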
Exercise 4
import java.util.*;

public class Receipt4 {
  private double total = 0;
  private Formatter f = new Formatter(System.out);
  private static final int W1 = 15;
  private static final int W2 = 5;
  private static final int W3 = 10;
  private String s1 = "%-" + W1 + "s %" + W2 + "s %" + W3 + "s\n";
  private String s2 = "%-" + W1 + ".15s %" + W2 + "d %" + W3 + ".2f\n";
  private String s3 = "%-" + W1 + "s %" + W2 + "s %" + W3 + ".2f\n";
  public void printTitle() {
    f.format(s1, "Item", "Qty", "Price");
    f.format(s1, "----", "---", "-----");
  }
  public void print(String name, int qty, double price) {
    f.format(s2, name, qty, price);
    total += price;
  }
  public void printTotal() {
    f.format(s3, "Tax", "", total * 0.06);
    f.format(s1, "", "", "-----");
    f.format(s3, "Total", "", total * 1.06);
  }
  public static void main(String[] args) {
    Receipt4 receipt = new Receipt4();
    receipt.printTitle();
    receipt.print("Jack's Magic Beans", 4, 4.25);
    receipt.print("Princess Peas", 3, 5.1);
    receipt.print("Three Bears Porridge", 1, 14.29);
    receipt.printTotal();
  }
}
=======================================================================
Item              Qty      Price
----              ---      -----
Jack's Magic Be     4       4.25
Princess Peas       3       5.10
Three Bears Por     1      14.29
Tax                         1.42
                           -----
Total                      25.06
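13.5.5 Formatter Conversions
The b conversion can be applied to any variable in a program. For an argument that is not a boolean or Boolean, the result is true as long as the argument is not null.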
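As a quick illustration of that rule, here is a sketch with a hypothetical helper b() that applies %b to an arbitrary object. Note that the number 0 still formats as true, which may surprise readers coming from other languages.

```java
class BooleanConversionDemo {
  // Formats any object with the %b conversion.
  static String b(Object o) {
    return String.format("%b", o);
  }

  public static void main(String[] args) {
    System.out.println(b(false));  // false: a real boolean keeps its value
    System.out.println(b(0));      // true: non-null, non-boolean argument
    System.out.println(b(""));     // true: even an empty String is non-null
    System.out.println(b(null));   // false: null is the only other way to get false
  }
}
```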
Exercise 5
/* For each of the basic conversion types in the above table, write the
 * most complex formatting expression possible. That is, use all the possible
 * format specifiers available for that conversion type. */
import java.math.*;
import java.util.*;

public class Ex5 {
  public static void main(String[] args) {
    Formatter f = new Formatter(System.out);

    char u = 'a';
    System.out.println("char u = \'a\'");
    f.format("%-2s%-2S%-2c%-2C%-5b%-5B%-3h%-3H%%\n", u,u,u,u,u,u,u,u);

    int v = 121;
    System.out.println("int v = 121");
    f.format("%-4s%-4S%-4d%-4c%-4C%-5b%-5B%-4x%-4X%-4h%-4H%%\n",
      v,v,v,v,v,v,v,v,v,v,v);

    BigInteger w = new BigInteger("50000000000000");
    System.out.println("BigInteger w = 50000000000000");
    f.format("%-15s%-15S%-5b%-5B%-15x%-15X%-5h%-5H%%\n", w,w,w,w,w,w,w,w);

    double x = 179.543;
    System.out.println("double x = 179.543");
    f.format("%-8s%-8S%-5b%-5B%-15f%-15e%-15E%-12h%-12H%%\n",
      x,x,x,x,x,x,x,x,x);

    boolean z = false;
    System.out.println("boolean z = false");
    f.format("%-7s%-7S%-7b%-7B%-7h%-7H%%\n", z,z,z,z,z,z);
  }
}
=======================================================================
char u = 'a'
a A a A true TRUE 61 61 %
int v = 121
121 121 121 y Y true TRUE 79 79 79 79 %
BigInteger w = 50000000000000
50000000000000 50000000000000 true TRUE 2d79883d2000 2D79883D2000 8842a1a78842A1A7%
double x = 179.543
179.543 179.543 true TRUE 179.543000 1.795430e+02 1.795430E+02 1ef462c 1EF462C %
boolean z = false
false FALSE false FALSE 4d5 4D5 %
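13.5.6 String.format
String.format() is a static method that takes the same arguments as Formatter.format() but returns the formatted result as a String object.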
Exercise 6
/* Create a class that contains int, long, float, and double fields. Create
 * a toString() method for this class that uses String.format(), and demonstrate
 * that your class works correctly. */
import java.util.*;

public class Ex6 {
  int i = 0;
  long l = 0;
  float f = 0.0f;
  double d = 0.0;
  Ex6(int i, long l, float f, double d) {
    this.i = i;
    this.l = l;
    this.f = f;
    this.d = d;
  }
  public String toString() {
    return String.format("i = %d\nl = %d\nf = %.16g\nd = %.16g\n", i, l, f, d);
  }
  public static void main(String[] args) {
    Ex6 x = new Ex6(2, 45L, 1.2f, 2.7182818289);
    Ex6 ex = new Ex6(-2147483648, -9223372036854775808L,
      1.1754943508222875E-38f, 2.2250738585072014E-308);
    Ex6 exMax = new Ex6(2147483647, 9223372036854775807L,
      3.4028234663852886E38f, 1.7976931348623157E308);
    System.out.println(x);
    System.out.println(ex);
    System.out.println(exMax);
  }
}
==============================================================
i = 2
l = 45
f = 1.200000047683716
d = 2.718281828900000

i = -2147483648
l = -9223372036854775808
f = 1.175494350822288e-38
d = 2.225073858507201e-308

i = 2147483647
l = 9223372036854775807
f = 3.402823466385289e+38
d = 1.797693134862316e+308
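13.6 Regular Expressions
Regular expressions are a powerful and flexible text-processing tool. With them you can programmatically build complex text patterns and search input strings for those patterns. They form a compact, dynamic language that can handle all kinds of string-processing problems: matching, selection, editing, and validation.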
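13.6.1 Basics
A regular expression is a way to describe strings in general terms.
The regular expression tools built into the String class are:
- matches()
- split()
- replaceFirst() and replaceAll() (plain replace() treats its argument literally, not as a regular expression)
Some basic patterns, written as they appear inside a Java string literal where a regex backslash must be doubled:
- -? : an optional leading minus sign
- \\d : a single digit
- -?\\d+ : an optional minus sign followed by one or more digits
- (-|\\+)? : a plus sign, a minus sign, or neither
- \\W : a non-word character
- \\w : a word character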
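A short sketch that exercises these patterns through the String methods. The class RegexBasicsDemo and the helper isInteger() are illustrative names, not part of the notes.

```java
import java.util.Arrays;

class RegexBasicsDemo {
  // True if s is an integer with an optional leading '+' or '-'.
  static boolean isInteger(String s) {
    return s.matches("(-|\\+)?\\d+");
  }

  public static void main(String[] args) {
    System.out.println("-1234".matches("-?\\d+"));  // true
    System.out.println("+911".matches("-?\\d+"));   // false: '+' is not allowed here
    System.out.println(isInteger("+911"));          // true
    System.out.println(isInteger("12a"));           // false

    // Split on runs of non-word characters: prints [Then, when, you]
    System.out.println(Arrays.toString("Then, when you".split("\\W+")));
  }
}
```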
Exercise 7
/* Using the documentation for java.util.regex.Pattern as a resource,
 * write and test a regular expression that checks a sentence to see
 * that it begins with a capital letter and ends with a period. */
import java.util.regex.*;

public class Sentence7 {
  public static void main(String[] args) {
    // starts with any capital A through Z
    // then zero or more of any char except endline
    // ends with .
    String sen = "^[A-Z].*[\\.]$";
    String s1 = "Once upon a time.";
    String s2 = "abcd.";
    String s3 = "Abcd?";
    String s4 = "An easy way out.";
    String s5 = "Zorro.";
    String s6 = "X.";
    System.out.println(s1.matches(sen));
    System.out.println(s2.matches(sen));
    System.out.println(s3.matches(sen));
    System.out.println(s4.matches(sen));
    System.out.println(s5.matches(sen));
    System.out.println(s6.matches(sen));
  }
}
=====================================================
true
false
false
true
true
true
Exercise 8
// Split the string Splitting.knights on the words "the" or "you."
import java.util.*;

public class Splitting8 {
  public static String knights =
    "Then, when you have found the shrubbery, you must " +
    "cut down the mightiest tree in the forest... " +
    "with... a herring!";
  public static void split(String regex) {
    System.out.println(Arrays.toString(knights.split(regex)));
  }
  public static void main(String[] args) {
    split("the|you");
  }
}
===============================================================
[Then, when , have found , shrubbery, , must cut down , mightiest tree in , forest... with... a herring!]
Exercise 9
public class Replacing9 {
  public static String knights =
    "Then, when you have found the shrubbery, you must " +
    "cut down the mightiest tree in the forest... " +
    "with... a herring!";
  public static void main(String[] args) {
    System.out.println(knights.replaceAll("[aeiouAEIOU]", "_"));
  }
}
======================================================
Th_n, wh_n y__ h_v_ f__nd th_ shr_bb_ry, y__ m_st c_t d_wn th_ m_ght__st tr__ _n th_ f_r_st... w_th... _ h_rr_ng!
13.6.2 Creating Regular Expressions
For the complete list of regular expression constructs, see the JDK documentation for java.util.regex.Pattern.
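13.6.3 Quantifiers
A quantifier describes the way a pattern absorbs input text:
- Greedy: a greedy expression finds as much matching input as it possibly can. A common mistake is to assume a pattern will match only the first possible group of characters when it is actually greedy and keeps going until it has matched the largest possible string.
- Reluctant: specified with a question mark, this quantifier matches the minimum number of characters needed to satisfy the pattern (minimal matching).
- Possessive: according to the book, available only in Java at the time of writing. When a regular expression is applied to a string it generates many intermediate states so that it can backtrack if the match fails; a possessive quantifier does not keep those states and so prevents backtracking.
The expression X often needs to be surrounded by parentheses for a quantifier to apply to all of it:
(abc)+ matches one or more repetitions of the sequence abc
abc+ matches ab followed by one or more occurrences of c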
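The three quantifier types can be compared side by side on one input. This is a sketch only; the class QuantifierDemo and the helper firstMatch() are hypothetical names, and the sample input is chosen so that each type gives a different answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
  // Returns the first substring of input matched by regex, or null if there is none.
  static String firstMatch(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    return m.find() ? m.group() : null;
  }

  public static void main(String[] args) {
    String input = "xfooxxxxxxfoo";
    System.out.println(firstMatch(".*foo", input));   // greedy:     xfooxxxxxxfoo
    System.out.println(firstMatch(".*?foo", input));  // reluctant:  xfoo
    System.out.println(firstMatch(".*+foo", input));  // possessive: null, .*+ never gives back "foo"

    System.out.println("abcabc".matches("(abc)+"));   // true: the group repeats
    System.out.println("abccc".matches("abc+"));      // true: only c repeats
    System.out.println("abcabc".matches("abc+"));     // false
  }
}
```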
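CharSequence
The CharSequence interface provides a generalized definition of a character sequence:
interface CharSequence {
  char charAt(int i);
  int length();
  CharSequence subSequence(int start, int end);
  String toString();
}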
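13.6.4 Pattern and Matcher
These are more powerful regular expression objects than the limited tools built into String. They live in the java.util.regex package.
The static Pattern.compile() method compiles a regular expression: it produces a Pattern object from the regular expression given as a String.
Pass the string you want to search to the Pattern object's matcher() method to produce a Matcher object, which offers many operations:
- boolean matches() : whether the entire input string matches the pattern
- boolean lookingAt() : whether the beginning of the input string matches the pattern
- boolean find()
- boolean find(int start)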
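To make the difference between the four methods concrete, here is a minimal sketch. MatcherDemo and its helpers are made-up names; each helper builds a fresh Matcher, because find() continues from the end of the previous successful match unless the matcher is reset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherDemo {
  static final String INPUT = "Java now has regular expressions";

  // Fresh Matcher for every call so earlier matches do not influence the result.
  static Matcher matcher(String regex) {
    return Pattern.compile(regex).matcher(INPUT);
  }

  // Index of the first match at or after 'from', or -1 if there is none.
  static int findFrom(String regex, int from) {
    Matcher m = matcher(regex);
    return m.find(from) ? m.start() : -1;
  }

  public static void main(String[] args) {
    System.out.println(matcher("Java").matches());    // false: must cover the whole input
    System.out.println(matcher("Java").lookingAt());  // true: input starts with "Java"
    System.out.println(matcher("regular").find());    // true: found somewhere in the input
    System.out.println(findFrom("a", 0));             // 1
    System.out.println(findFrom("a", 2));             // 3: search restarted at index 2
  }
}
```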
Exercise 10
/* For the phrase "Java now has regular expressions" evaluate whether the following
 * expressions will find a match:
 * ^Java
 * \Berg.*
 * n.w\s+h(a|i)s
 * S?
 * S+
 * s{4}
 * s{1}.
 * s{0,3}
 */
// Use args: "Java now has regular expressions", "^Java", "\Berg.*", "n.w\s+h(a|i)s",
// "s?", "s+", "s{4}", "s{1}.", "s{0,3}"
import java.util.regex.*;

public class TestRegularExpression10 {
  public static void main(String[] args) {
    if(args.length < 2) {
      System.out.println("Usage:\njava TestRegularExpression " +
        "characterSequence regularExpression+");
      System.exit(0);
    }
    // println (not print) so that each report ends up on its own line
    System.out.println("Input: \"" + args[0] + "\"");
    for(String arg : args) {
      System.out.println("Regular expression: \"" + arg + "\"");
      Pattern p = Pattern.compile(arg);
      Matcher m = p.matcher(args[0]);
      if(!m.find())
        System.out.println("No match found for " + "\"" + arg + "\"");
      m.reset();
      while(m.find()) {
        System.out.println("Match \"" + m.group() + "\" at position " +
          m.start() + ((m.end() - m.start() < 2) ? "" : ("-" + (m.end() - 1))));
      }
    }
  }
}
==============================================================================
Usage:
java TestRegularExpression characterSequence regularExpression+
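Matcher.find() can be used to discover multiple pattern matches in a CharSequence.
find() moves forward through the input like an iterator. The overload find(int start) takes an integer giving the character position at which the search should begin, so the search position is reset every time it is called.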
Groups
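Groups are regular expressions set off by parentheses, and they can be referred to later by group number. Group 0 is the entire expression, group 1 is the first parenthesized group, and so on. For example: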
A(B(C))D
0:ABCD
1:BC
2:C
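The group numbering above can be checked directly with Matcher.group(int). The class GroupDemo and its helper are illustrative; the pattern is the one from the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
  // Returns group n of the first match of regex in input, or null if nothing matches.
  static String group(String regex, String input, int n) {
    Matcher m = Pattern.compile(regex).matcher(input);
    return m.find() ? m.group(n) : null;
  }

  public static void main(String[] args) {
    String regex = "A(B(C))D";
    System.out.println(group(regex, "ABCD", 0));  // ABCD
    System.out.println(group(regex, "ABCD", 1));  // BC
    System.out.println(group(regex, "ABCD", 2));  // C
    // groupCount() does not include group 0
    System.out.println(Pattern.compile(regex).matcher("ABCD").groupCount());  // 2
  }
}
```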
Exercise 12
// TIJ4 Chapter Strings, Exercise 12, page 536
/* Modify Groups.java to count all of the unique words that do not start with a
 * capital letter. */
import java.util.regex.*;
import java.util.*;

public class Groups12 {
  static public final String POEM =
    "Twas brillig, and the slithy toves\n" +
    "Did gyre and gimble in the wabe.\n" +
    "All mimsy were the borogoves,\n" +
    "And the mome raths outgrabe.\n\n" +
    "Beware the Jabberwock, my son,\n" +
    "The jaws that bite, the claws that catch,\n" +
    "Beware the Jubjub bird, and shun\n" +
    "The frumious Bandersnatch.";
  public static void main(String[] args) {
    // \b anchors the match at a word boundary, so the tail of a capitalized
    // word (e.g. "was" in "Twas") is not counted as a lower-case word
    Matcher m = Pattern.compile("\\b[a-z]\\w*").matcher(POEM);
    Set<String> words = new TreeSet<String>();
    while(m.find()) {
      words.add(m.group());
    }
    System.out.println("Number of unique non-cap words = " + words.size());
    System.out.print(words);
  }
}
=======================================================================
Number of unique non-cap words = 25
[and, bird, bite, borogoves, brillig, catch, claws, frumious, gimble, gyre, in, jaws, mimsy, mome, my, outgrabe, raths, shun, slithy, son, that, the, toves, wabe, were]
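start() and end()
After a successful match, start() returns the start index of the previous match, and end() returns the index of the last matched character plus one.
group() returns only the matched text.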
Exercise 13
// Modify StartEnd.java so that it uses Groups.POEM as input, but still produces positive
// outputs for find(), lookingAt() and matches().
import java.util.regex.*;

public class StartEnd13 {
  public static String input =
    "Twas brillig, and the slithy toves\n" +
    "Did gyre and gimble in the wabe.\n" +
    "All mimsy were the borogoves,\n" +
    "And the mome raths outgrabe.\n\n" +
    "Beware the Jabberwock, my son,\n" +
    "The jaws that bite, the claws that catch,\n" +
    "Beware the Jubjub bird, and shun\n" +
    "The frumious Bandersnatch.";
  private static class Display {
    private boolean regexPrinted = false;
    private String regex;
    Display(String regex) { this.regex = regex; }
    void display(String message) {
      if(!regexPrinted) {
        System.out.print(regex);
        regexPrinted = true;
      }
      System.out.println(message);
    }
  }
  static void examine(String s, String regex) {
    Display d = new Display(regex);
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(s);
    while(m.find())
      d.display("find() '" + m.group() +
        "' start = " + m.start() + " end = " + m.end());
    if(m.lookingAt()) // No reset() necessary
      d.display("lookingAt() start = " + m.start() + " end = " + m.end());
    if(m.matches()) // No reset() necessary
      d.display("matches() start = " + m.start() + " end = " + m.end());
  }
  public static void main(String[] args) {
    for(String in : input.split("\n")) {
      System.out.println("input : " + in);
      for(String regex : new String[]{"\\w*are\\w*", "A\\w*", "T\\w+", "Did.*"})
        examine(in, regex);
    }
  }
}
===========================================================================
input : Twas brillig, and the slithy toves
T\w+find() 'Twas' start = 0 end = 4
lookingAt() start = 0 end = 4
input : Did gyre and gimble in the wabe.
Did.*find() 'Did gyre and gimble in the wabe.' start = 0 end = 32
lookingAt() start = 0 end = 32
matches() start = 0 end = 32
input : All mimsy were the borogoves,
A\w*find() 'All' start = 0 end = 3
lookingAt() start = 0 end = 3
input : And the mome raths outgrabe.
A\w*find() 'And' start = 0 end = 3
lookingAt() start = 0 end = 3
input :
input : Beware the Jabberwock, my son,
\w*are\w*find() 'Beware' start = 0 end = 6
lookingAt() start = 0 end = 6
input : The jaws that bite, the claws that catch,
T\w+find() 'The' start = 0 end = 3
lookingAt() start = 0 end = 3
input : Beware the Jubjub bird, and shun
\w*are\w*find() 'Beware' start = 0 end = 6
lookingAt() start = 0 end = 6
input : The frumious Bandersnatch.
T\w+find() 'The' start = 0 end = 3
lookingAt() start = 0 end = 3
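Pattern Flags
An overloaded version of compile() accepts a flag argument that adjusts matching behavior:
Pattern Pattern.compile(String regex, int flag)
The flag value is taken from the constants defined in the Pattern class, such as Pattern.CASE_INSENSITIVE and Pattern.MULTILINE.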
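A brief sketch of how flags change the result, combining two of the Pattern constants with the bitwise OR operator. The class FlagDemo, the helper count() and the sample text are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagDemo {
  // Counts how many times regex matches in input under the given flags.
  static int count(String regex, int flags, String input) {
    Matcher m = Pattern.compile(regex, flags).matcher(input);
    int n = 0;
    while (m.find()) {
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    String text = "java has regex\nJava has regex\nJAVA has regex";
    // No flags: ^ matches only at the very start, and case matters.
    System.out.println(count("^java", 0, text));  // 1
    // MULTILINE lets ^ match after each line break; CASE_INSENSITIVE ignores case.
    System.out.println(count("^java",
        Pattern.CASE_INSENSITIVE | Pattern.MULTILINE, text));  // 3
  }
}
```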
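13.6.5 split()
The split() method divides an input string into an array of String objects, with the boundaries determined by the regular expression. Pattern offers two forms:
- String[] split(CharSequence input)
- String[] split(CharSequence input, int limit) : limits the number of resulting strings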
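The effect of the limit argument is easiest to see on a small input. In this sketch the class SplitDemo, the helper split() and the sample sentence are illustrative choices.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
  // Splits input around matches of regex; limit 0 means "no limit".
  static String[] split(String regex, String input, int limit) {
    return Pattern.compile(regex).split(input, limit);
  }

  public static void main(String[] args) {
    String input = "This!!unusual use!!of exclamation!!points";
    // Prints [This, unusual use, of exclamation, points]
    System.out.println(Arrays.toString(split("!!", input, 0)));
    // At most 3 pieces; the rest stays unsplit:
    // prints [This, unusual use, of exclamation!!points]
    System.out.println(Arrays.toString(split("!!", input, 3)));
  }
}
```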
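13.6.6 Replace Operations
replaceFirst(String replacement) : replaces the first match with replacement
replaceAll(String replacement) : replaces every match with replacement
appendReplacement(StringBuffer sbuf, String replacement) : performs the replacement step by step, which lets you call methods or do other processing to produce the replacement string while the replacement is in progress.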
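As a sketch of step-by-step replacement, the following example computes each replacement string from the matched text, something replaceAll() with a fixed replacement cannot do. The class AppendReplacementDemo and the helper upcaseVowels() are hypothetical names; appendTail() copies whatever follows the last match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementDemo {
  // Upper-cases every lower-case vowel, one match at a time.
  static String upcaseVowels(String input) {
    Matcher m = Pattern.compile("[aeiou]").matcher(input);
    StringBuffer sbuf = new StringBuffer();
    while (m.find()) {
      // The replacement is computed per match from the matched text.
      m.appendReplacement(sbuf, m.group().toUpperCase());
    }
    m.appendTail(sbuf);  // copy the remainder after the last match
    return sbuf.toString();
  }

  public static void main(String[] args) {
    System.out.println(upcaseVowels("regular expressions"));  // rEgUlAr ExprEssIOns
  }
}
```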
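13.6.7 reset()
reset(CharSequence) applies an existing Matcher object to a new character sequence. reset() without arguments sets the Matcher back to the beginning of the current sequence.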
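Both forms of reset() appear in the sketch below, which reuses one Matcher for two inputs. The class ResetDemo, the helper allMatches() and the sample sentences are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetDemo {
  // Collects every remaining match from the matcher's current position.
  static List<String> allMatches(Matcher m) {
    List<String> out = new ArrayList<>();
    while (m.find()) {
      out.add(m.group());
    }
    return out;
  }

  public static void main(String[] args) {
    Matcher m = Pattern.compile("[frb][aiu][gx]").matcher("fix the rug with bags");
    System.out.println(allMatches(m));  // [fix, rug, bag]
    m.reset("fix the rig with rags");   // same Matcher, new input
    System.out.println(allMatches(m));  // [fix, rig, rag]
    m.reset();                          // back to the start of the current input
    System.out.println(allMatches(m));  // [fix, rig, rag]
  }
}
```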
13.6.8 Regular Expressions and Java I/O
Regular expressions can also be applied to search for matches in a file.
Exercise 15
// Modify JGrep.java to accept flags as arguments (e.g., Pattern.CASE_INSENSITIVE,
// Pattern.MULTILINE).
// {Args: JGrep15.java "\b[Ssct]\w+", Pattern.CASE_INSENSITIVE}
import java.util.regex.*;
import net.mindview.util.*; // TextFile comes from the book's utility library

public class JGrep15 {
  public static void main(String[] args) throws Exception {
    if(args.length < 3) {
      System.out.println("Usage: java JGrep file regex flag");
      System.exit(0);
    }
    // Map the flag name given on the command line to the Pattern constant
    int flag = 0;
    if(args[2].equals("Pattern.CASE_INSENSITIVE")) flag = Pattern.CASE_INSENSITIVE;
    else if(args[2].equals("Pattern.CANON_EQ")) flag = Pattern.CANON_EQ;
    else if(args[2].equals("Pattern.COMMENTS")) flag = Pattern.COMMENTS;
    else if(args[2].equals("Pattern.DOTALL")) flag = Pattern.DOTALL;
    else if(args[2].equals("Pattern.LITERAL")) flag = Pattern.LITERAL;
    else if(args[2].equals("Pattern.MULTILINE")) flag = Pattern.MULTILINE;
    else if(args[2].equals("Pattern.UNICODE_CASE")) flag = Pattern.UNICODE_CASE;
    else if(args[2].equals("Pattern.UNIX_LINES")) flag = Pattern.UNIX_LINES;
    Pattern p = Pattern.compile(args[1], flag);
    // Iterate through the lines of the input file:
    int index = 0;
    Matcher m = p.matcher(""); // creates empty Matcher object
    for(String line : new TextFile(args[0])) {
      m.reset(line);
      while(m.find())
        System.out.println(index++ + ": " + m.group() + ": " + m.start());
    }
  }
}
Exercise 17
/* Write a program that reads a Java source-code file (you provide the
 * file name on the command line) and displays all the comments. */
// {Args: fileName}
import java.util.regex.*;
import java.io.*;
import net.mindview.util.*; // TextFile comes from the book's utility library

public class Ex17 {
  public static void main(String[] args) throws Exception {
    if(args.length < 1) {
      System.out.println("Usage: fileName");
      System.exit(0);
    }
    Pattern p = Pattern.compile("(//\\s.+)|(/\\*\\s+.+)|(\\*\\s+.+)");
    // Iterate through the lines of the input file:
    int index = 0;
    Matcher m = p.matcher(""); // creates empty Matcher object
    System.out.println(args[0] + " comments: ");
    for(String line : new TextFile(args[0])) {
      m.reset(line);
      while(m.find())
        System.out.println(index++ + ": " + m.group());
    }
  }
}
Exercise 18
/* Write a program that reads a Java source-code file (you provide the
 * file name on the command line) and displays all the string literals
 * in the code. */
// {Args: fileName}
import java.util.regex.*;
import net.mindview.util.*;
import java.io.*;

public class Ex18 {
  public static void main(String[] args) throws Exception {
    if(args.length < 1) {
      System.out.println("Usage: fileName");
      System.exit(0);
    }
    Pattern p = Pattern.compile("\".*\"");
    System.out.println(args[0] + " string literals:");
    // Iterate through the lines of the input file:
    int index = 0;
    Matcher m = p.matcher(""); // creates empty Matcher object
    for(String line : new TextFile(args[0])) {
      m.reset(line);
      while(m.find())
        System.out.println(index++ + ": " + m.group());
    }
  }
}
Exercise 19
/* Building on the previous two exercises, write a program that examines
 * Java source-code and produces all the class names used in a particular
 * program. */
// {Args: fileName}
import java.util.regex.*;
import net.mindview.util.*;
import java.io.*;

public class Ex19 {
  public static void main(String[] args) throws Exception {
    if(args.length < 1) {
      System.out.println("Usage: fileName");
      System.exit(0);
    }
    // we want all class names:
    Pattern p = Pattern.compile("class \\w+\\s+");
    // not including those in comment lines:
    Pattern q = Pattern.compile("^(//|/\\*|\\*)");
    System.out.println("classes in " + args[0] + ":");
    // Iterate through the lines of the input file:
    int index = 0;
    Matcher m = p.matcher(""); // creates empty Matcher object
    Matcher n = q.matcher("");
    for(String line : new TextFile(args[0])) {
      m.reset(line);
      n.reset(line);
      while(m.find() && !n.find())
        System.out.println(index++ + ": " + m.group());
    }
  }
}
Exercise 20
/* Create a class that contains int, long, float and double and String fields.
 * Create a constructor for this class that takes a single String argument, and
 * scans that string into the various fields. Add a toString() method and
 * demonstrate that your class works correctly. */
import java.util.*;

public class Scanner20 {
  int i;
  long L;
  float f;
  double d;
  String s;
  Scanner20(String s) {
    Scanner sc = new Scanner(s);
    i = sc.nextInt();
    L = sc.nextLong();
    f = sc.nextFloat();
    d = sc.nextDouble();
    this.s = sc.next();
  }
  public String toString() {
    return i + " " + L + " " + f + " " + d + " " + s;
  }
  public static void main(String[] args) {
    Scanner20 s20 = new Scanner20("17 56789 2.7 3.61412 hello");
    System.out.println(s20);
  }
}