Thinking in Java: Notes (13)

Strings


Immutable Strings

  • Objects of the String class are immutable.
  • Every method in the class that appears to modify a String actually creates and returns a brand new String object containing the modification.
  • To the reader of the code, an argument usually looks like a piece of information provided to the method, not something to be modified.
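
A minimal sketch of the points above (the class and method names are illustrative, not taken from the notes): upcase( ) looks as if it might change its argument, but the caller's String is left untouched.

```java
public class ImmutableDemo {
    // Returns a new String; the argument itself is never modified.
    static String upcase(String s) {
        return s.toUpperCase();
    }

    public static void main(String[] args) {
        String q = "howdy";
        String qq = upcase(q);
        System.out.println(qq); // HOWDY
        System.out.println(q);  // howdy
    }
}
```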

Overloading ‘+’ vs. StringBuilder

  • Because a String is read-only, there’s no possibility that one reference will change something that will affect the other references.
  • Even when the source code never mentions StringBuilder, the compiler uses it to implement ‘+’ concatenation, because it is much more efficient.
  • If you use ‘+’ or ‘+=’ inside a loop, you get a new StringBuilder object every time you pass through the loop.
  • If looping is involved, you should explicitly use a StringBuilder in your toString( ).
  • Prior to Java SE5, Java used StringBuffer, which ensured thread safety and so was significantly more expensive.
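
The two approaches can be compared with a small sketch (ConcatDemo and its method names are made up for illustration). Both produce the same text; the difference is only in how many intermediate objects are created, and the exact translation of ‘+’ depends on the compiler version.

```java
public class ConcatDemo {
    // Implicit concatenation: each pass through the loop may build a new
    // intermediate object behind the scenes (compiler-dependent).
    static String implicit(String[] fields) {
        String result = "";
        for (String field : fields) {
            result += field;
        }
        return result;
    }

    // Explicit StringBuilder: one builder is reused for the whole loop.
    static String explicit(String[] fields) {
        StringBuilder result = new StringBuilder();
        for (String field : fields) {
            result.append(field);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String[] fields = {"a", "b", "c"};
        System.out.println(implicit(fields)); // abc
        System.out.println(explicit(fields)); // abc
    }
}
```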

Unintended recursion

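  • Using this in a String concatenation inside toString( ) makes the compiler convert this to a String by calling toString( ), which produces a recursive call.
  • If you really do want to print the address of the object, the solution is to call the Object toString( ) method: say super.toString( ) instead of this.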
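
A minimal sketch of the fix, with a hypothetical class AddressDemo. Note that Object's toString( ) typically prints the class name, an ‘@’, and the hash code in hex rather than a raw memory address.

```java
public class AddressDemo {
    @Override
    public String toString() {
        // Writing "AddressDemo at " + this here would call toString()
        // again and recurse until the stack overflows.
        return "AddressDemo at " + super.toString();
    }

    public static void main(String[] args) {
        System.out.println(new AddressDemo());
    }
}
```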

Operations on Strings

  • Every String method carefully returns a new String object when it’s necessary to change the contents.
  • If the contents don’t need changing, the method will just return a reference to the original String.
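
As a rough illustration (the helper names are invented here), concat( ) and trim( ) are two methods documented to return the original String when there is nothing to change. The reference comparison with == is used deliberately, to check object identity rather than contents.

```java
public class SameReferenceDemo {
    // True when concat() handed back the very same object.
    static boolean concatReturnsSame(String s, String suffix) {
        return s.concat(suffix) == s;
    }

    // True when trim() handed back the very same object.
    static boolean trimReturnsSame(String s) {
        return s.trim() == s;
    }

    public static void main(String[] args) {
        String s = "hello";
        System.out.println(concatReturnsSame(s, ""));   // true
        System.out.println(concatReturnsSame(s, "!"));  // false
        System.out.println(trimReturnsSame(s));         // true
        System.out.println(trimReturnsSame(" hello ")); // false
    }
}
```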

Formatting output

printf()

  • Format specifiers: in addition to telling where to insert a value, they also tell what kind of variable is to be inserted and how to format it.

System.out.format()

  • Java SE5 introduced the format( ) method, available to PrintStream or PrintWriter objects, which includes System.out.
  • The format( ) method is modeled after C’s printf( ).
  • format( ) and printf( ) are equivalent.
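
The equivalence can be checked with a small sketch. The class name and the render( ) helper are assumptions for illustration; output goes to an in-memory PrintStream so the two results can be compared.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class FormatEquivalenceDemo {
    // Writes one line with either format() or printf() and returns the text.
    static String render(boolean usePrintf) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, true);
        int x = 5;
        double y = 5.332542;
        if (usePrintf) {
            out.printf("Row 1: [%d %f]%n", x, y);
        } else {
            out.format("Row 1: [%d %f]%n", x, y);
        }
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(false));
        System.out.print(render(true));
        System.out.println(render(false).equals(render(true))); // true
    }
}
```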

The Formatter class

  • You can think of Formatter as a translator that converts your format string and data into the desired result.
  • The Formatter constructor is overloaded to take a range of output destinations; the most useful are PrintStreams, OutputStreams, and Files.
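
A short sketch of two of those destinations (names such as position( ) are illustrative). An in-memory OutputStream stands in for a real stream so the result is easy to inspect.

```java
import java.io.ByteArrayOutputStream;
import java.util.Formatter;

public class FormatterDemo {
    // Formats into an OutputStream destination and returns what was written.
    static String position(String name, int x, int y) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the Formatter flushes it and closes the stream.
        try (Formatter f = new Formatter(buffer)) {
            f.format("%s is at (%d,%d)", name, x, y);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // PrintStream destination: flush rather than close, so that
        // System.out stays usable afterwards.
        Formatter console = new Formatter(System.out);
        console.format("%s is at (%d,%d)%n", "Tommy", 4, 7);
        console.flush();
        System.out.println(position("Terry", 3, 3)); // Terry is at (3,3)
    }
}
```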

Format specifiers

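  • Specifying a width controls the minimum size of a field.
  • Precision specifies a maximum: for Strings, the maximum number of characters to output; for floating point numbers, the number of decimal places to display. It cannot be applied to integers.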
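
Width and precision can be tried out with a small sketch. The fmt( ) helper is hypothetical, and Locale.ROOT is passed only so the decimal separator is the same on every machine.

```java
import java.util.Locale;

public class WidthPrecisionDemo {
    // Formats a single argument with a fixed locale.
    static String fmt(String pattern, Object arg) {
        return String.format(Locale.ROOT, pattern, arg);
    }

    public static void main(String[] args) {
        System.out.println(fmt("[%10s]", "abc"));    // [       abc]  width 10, right-justified
        System.out.println(fmt("[%-10s]", "abc"));   // [abc       ]  '-' left-justifies
        System.out.println(fmt("[%.2s]", "abcdef")); // [ab]          at most 2 characters
        System.out.println(fmt("[%8.2f]", 3.14159)); // [    3.14]    2 decimal places, width 8
    }
}
```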

Formatter conversions

  • The ‘b’ conversion is valid for any argument type, but it might not behave as you’d expect.
  • For a boolean or Boolean argument, the result is the corresponding true or false. For any other argument, as long as the argument is not null the result is always true, even for zero.
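
A quick sketch of that behavior (asBoolean( ) is an illustrative helper, not part of any library):

```java
public class BooleanConversionDemo {
    // Applies the 'b' conversion to any argument.
    static String asBoolean(Object arg) {
        return String.format("%b", arg);
    }

    public static void main(String[] args) {
        System.out.println(asBoolean(false));   // false
        System.out.println(asBoolean(0));       // true: non-null, even though it is zero
        System.out.println(asBoolean("false")); // true: a non-null String
        System.out.println(asBoolean(null));    // false
    }
}
```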

String.format()

  • String.format( ) is a static method which takes all the same arguments as Formatter’s format( ) but returns a String.
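
For instance, it is handy when a formatted String is needed only once, such as for a message. The method and parameter names below are hypothetical.

```java
public class StringFormatDemo {
    // Builds a message without creating a Formatter by hand.
    static String describe(int transactionId, int queryId, String message) {
        return String.format("(t%d, q%d) %s", transactionId, queryId, message);
    }

    public static void main(String[] args) {
        System.out.println(describe(3, 7, "Write failed")); // (t3, q7) Write failed
    }
}
```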

Regular expressions

  • They allow you to specify, programmatically, complex patterns of text that can be discovered in an input string.
  • Although the syntax of regular expressions can be intimidating at first, they provide a compact and dynamic language that can be employed to solve all sorts of string processing, matching and selection, editing, and verification problems in a completely general way.

Basics

  • In Java, '\\' means "I’m inserting a regular expression backslash, so that the following character has special meaning." For example, the regular expression for a digit, \d, is written as the String "\\d".
  • The simplest way to use regular expressions is to use the functionality built into the String class.
  • A regular expression doesn’t have to contain special characters.
  • The non-String regular expressions have more powerful replacement tools.
  • Non-String regular expressions are also significantly more efficient if you need to use the regular expression more than once.
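
A small sketch of the regular expression support built into String (matches( ), split( ), replaceFirst( ), replaceAll( )). The helper names and sample inputs are made up for illustration.

```java
import java.util.Arrays;

public class StringRegexDemo {
    // An optional sign followed by one or more digits.
    static boolean isInteger(String s) {
        return s.matches("(-|\\+)?\\d+");
    }

    // Replaces every run of digits with a dash.
    static String maskDigits(String s) {
        return s.replaceAll("\\d+", "-");
    }

    public static void main(String[] args) {
        System.out.println(isInteger("-1234")); // true
        System.out.println(isInteger("+911"));  // true
        System.out.println(isInteger("12a"));   // false
        // No special characters needed: split on the literal text " the ".
        System.out.println(Arrays.toString("over the hill".split(" the "))); // [over, hill]
        System.out.println(maskDigits("a1b22c"));                // a-b-c
        System.out.println("a1b22c".replaceFirst("\\d+", "-"));  // a-b22c
    }
}
```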

Creating regular expressions

  • Once you start writing regular expressions, you’ll often use your code as a reference when writing new regular expressions.

Quantifiers

  • Greedy: A greedy expression finds as many possible matches for the pattern as possible.
  • Reluctant: this quantifier matches the minimum number of characters necessary to satisfy the pattern.
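  • Possessive: normally a regular expression generates many states so that it can backtrack if the match fails; a possessive quantifier does not keep those intermediate states, and thus prevents backtracking.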
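
The three kinds of quantifier can be compared on one input. This is only a sketch, and firstMatch( ) is an illustrative helper.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* grabs everything, then backs off just enough for "foo".
        System.out.println(firstMatch(".*foo", input));  // xfooxxxxxxfoo
        // Reluctant: .*? takes as little as possible before "foo".
        System.out.println(firstMatch(".*?foo", input)); // xfoo
        // Possessive: .*+ grabs everything and never gives it back.
        System.out.println(firstMatch(".*+foo", input)); // null
    }
}
```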

Pattern and Matcher

  • In general, you’ll compile regular expression objects rather than using the fairly limited String utilities.
  • A Pattern object represents the compiled version of a regular expression.
  • You can use the matcher( ) method and the input string to produce a Matcher object from the compiled Pattern object.
  • The Matcher object is then used to access the results, using methods to evaluate the success or failure of different types of matches.
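
The compile-then-match sequence might look like the following sketch (the INTEGER constant and isInteger( ) are names chosen for this example):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatcherDemo {
    // Compiled once, reused for every input.
    private static final Pattern INTEGER = Pattern.compile("-?\\d+");

    static boolean isInteger(CharSequence input) {
        Matcher m = INTEGER.matcher(input);
        return m.matches();
    }

    public static void main(String[] args) {
        System.out.println(isInteger("-42"));   // true
        System.out.println(isInteger("42abc")); // false

        // A Matcher can also be pointed at new input with reset().
        Matcher m = INTEGER.matcher("7");
        System.out.println(m.matches());        // true
        m.reset("seven");
        System.out.println(m.matches());        // false
    }
}
```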

find()

  • Matcher.find( ) can be used to discover multiple pattern matches in the CharSequence to which it is applied.
  • find( ) can be given an integer argument that tells it the character position for the beginning of the search.
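
Both forms of find( ) in a small sketch, with illustrative helper names. Note that find(int) resets the matcher before searching from the given index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindDemo {
    // Collects every run of word characters, one find() call per match.
    static List<String> words(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile("\\w+").matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    // Start index of the first match at or after 'from', or -1 if none.
    static int firstStartFrom(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(from) ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(words("Evening is full"));            // [Evening, is, full]
        System.out.println(firstStartFrom("abc", "abc abc", 0)); // 0
        System.out.println(firstStartFrom("abc", "abc abc", 1)); // 4
    }
}
```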

Groups

  • Groups are regular expressions set off by parentheses that can be called up later with their group number.
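
A minimal sketch of group numbering, using a deliberately simplistic pattern invented for this example. Group 0 is the whole match, and groups 1 and 2 are the parenthesized parts counted from the left.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Simplified on purpose: word characters, '@', word characters, ".com".
    private static final Pattern ADDRESS = Pattern.compile("(\\w+)@(\\w+)\\.com");

    // Returns {whole match, group 1, group 2}, or an empty array if no match.
    static String[] parts(String input) {
        Matcher m = ADDRESS.matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("joe@example.com")));
        // [joe@example.com, joe, example]
        System.out.println(ADDRESS.matcher("").groupCount()); // 2
    }
}
```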

start() and end()

  • Following a successful matching operation, start( ) returns the start index of the previous match, and end( ) returns the index of the last character matched, plus one.
  • While matches( ) only succeeds if the entire input matches the regular expression, lookingAt( ) succeeds if only the first part of the input matches.
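
The difference between find( ), lookingAt( ), and matches( ), together with start( ) and end( ), can be sketched as follows. The helper names and sample text are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StartEndDemo {
    // Returns {start, end} of the first match found, or null if there is none.
    static int[] span(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? new int[] { m.start(), m.end() } : null;
    }

    static boolean startsWithMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    static boolean wholeInputMatches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "Java now has regular expressions";
        int[] s = span("now", input);
        System.out.println(s[0] + " to " + s[1]);                   // 5 to 8
        System.out.println(startsWithMatch("Java", input));         // true
        System.out.println(startsWithMatch("now", input));          // false
        System.out.println(wholeInputMatches("Java", input));       // false
        System.out.println(wholeInputMatches("Java.*", input));     // true
    }
}
```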

Scanning input

  • The usual solution is to read in a line of text, tokenize it, and then use the various parse methods of Integer, Double, etc., to parse the data.
  • With Scanner, the input, tokenizing, and parsing are all ensconced in various different kinds of "next" methods.
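
A short sketch of the Scanner approach, reading from a String instead of a file or console. The sample input and the read( ) helper are made up; the locale is set explicitly only so that "1.61803" parses the same way on every machine.

```java
import java.util.Locale;
import java.util.Scanner;

public class ScannerDemo {
    // Reads a name line, then an int and a double from the next line.
    static String read(String input) {
        try (Scanner in = new Scanner(input)) {
            in.useLocale(Locale.US);
            String name = in.nextLine();
            int age = in.nextInt();
            double favorite = in.nextDouble();
            return name + ", " + age + ", " + favorite;
        }
    }

    public static void main(String[] args) {
        System.out.println(read("Sir Robin of Camelot\n22 1.61803"));
        // Sir Robin of Camelot, 22, 1.61803
    }
}
```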

Scanner delimiters

  • You can also specify your own delimiter pattern in the form of a regular expression.
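
For example, a comma surrounded by optional whitespace can serve as the delimiter. This is a sketch; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DelimiterDemo {
    // Splits on commas with optional surrounding whitespace.
    static List<Integer> parse(String input) {
        List<Integer> numbers = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter("\\s*,\\s*");
            while (scanner.hasNextInt()) {
                numbers.add(scanner.nextInt());
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(parse("12, 42, 78, 99, 42")); // [12, 42, 78, 99, 42]
    }
}
```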

Scanning with regular expressions

  • You can also scan for your own user-defined patterns, which is helpful when scanning more complex data.
  • The pattern is matched against the next input token only, so if your pattern contains a delimiter it will never be matched.
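
A sketch of pattern-based scanning, assuming log-style input in which each whitespace-separated token has the form address@date. The data and names are invented for illustration. Because the pattern is applied to one token at a time, it contains no whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.MatchResult;

public class PatternScanDemo {
    private static final String ENTRY =
        "(\\d+[.]\\d+[.]\\d+[.]\\d+)@(\\d{2}/\\d{2}/\\d{4})";

    // Reads tokens while they match ENTRY and reports the captured groups.
    static List<String> scan(String data) {
        List<String> report = new ArrayList<>();
        try (Scanner scanner = new Scanner(data)) {
            while (scanner.hasNext(ENTRY)) {
                scanner.next(ENTRY);
                MatchResult match = scanner.match(); // result of the last next()
                report.add(match.group(2) + " from " + match.group(1));
            }
        }
        return report;
    }

    public static void main(String[] args) {
        String data = "58.27.82.161@02/10/2005\n"
                    + "204.45.234.40@02/11/2005\n"
                    + "[Next log section]";
        System.out.println(scan(data));
        // [02/10/2005 from 58.27.82.161, 02/11/2005 from 204.45.234.40]
    }
}
```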

Posted 2016-12-13 17:05 by 玄天强