Reading Notes -- Core Java (11th Edition, Volume II), Chapter 2: Input and Output

2.1 Input/Output Streams

These I/O streams are unrelated to the java.util.stream API.

2.1.1-2.1.3 Reading and Writing Bytes

1) The easiest approach is to use the static methods of the java.nio.file.Files class:

Path path = Path.of(filenameString);            // preferred over Paths.get(), which simply calls Path.of() (Java 11+)
InputStream in = Files.newInputStream(path);
OutputStream out = Files.newOutputStream(path);

2) Get an input stream from any URL:

URL url = new URL("http://horstmann.com/index.html");
InputStream in = url.openStream();

3) Get an input stream from a byte[] array or write to a byte[] array:

// get an input stream from a byte[] array
byte[] bytes = ...;
InputStream in = new ByteArrayInputStream(bytes);

// Conversely, you can write to a ByteArrayOutputStream and then collect the bytes:
ByteArrayOutputStream out = new ByteArrayOutputStream();
// ... write to out ...
byte[] result = out.toByteArray();

4) The read method returns a single byte (as an int) or -1 at the end of input:

InputStream in = ...;
int b = in.read();
if (b != -1) { byte value = (byte) b; ... }

It is more common to read bytes in bulk:

byte[] bytes = ...;
int len = in.read(bytes);    // number of bytes actually read, or -1 at the end of input

5) Before Java 9 there was no method for reading all bytes from a stream. Here is one solution:

ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] bytes = new byte[1024];
int len;
while ((len = in.read(bytes)) != -1) { out.write(bytes, 0, len); }   // -1: end of the input stream; write(bytes, 0, len) writes only the len bytes just read
bytes = out.toByteArray();

For files, just call:

byte[] bytes = Files.readAllBytes(path);    // available since Java 7

6) You can write one byte or bytes from an array:

OutputStream out = ...;
int b = ...;
out.write(b);   // one byte

byte[] bytes = ...;
out.write(bytes);    // bytes from an array
out.write(bytes, start, length);

7) When writing to a stream, close it when you are done:

out.close();

Or better, use a try-with-resources block (resource will be automatically closed):

try (OutputStream out = ...) {
    out.write(bytes);
}

8) To save an input stream to a file, call:

Files.copy(in, path, StandardCopyOption.REPLACE_EXISTING);
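
Items 6) through 8) can be tried together in a small sketch. It assumes a temporary scratch file; the class and method names are invented for illustration and are not from the book.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class ByteRoundTrip {
    // Writes data to a scratch file, overwrites it from a stream, and reads it back.
    static byte[] roundTrip(byte[] data) throws IOException {
        Path path = Files.createTempFile("bytes", ".bin"); // hypothetical scratch file
        try {
            try (OutputStream out = Files.newOutputStream(path)) {
                out.write(data);                            // out is closed automatically
            }
            try (InputStream in = new ByteArrayInputStream(data)) {
                Files.copy(in, path, StandardCopyOption.REPLACE_EXISTING);
            }
            return Files.readAllBytes(path);
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(new byte[] {1, 2, 3}).length);
    }
}
```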

New features in Java 9/10:

1. There is finally a method to read all bytes from an input stream (which removes the limitation described in item 5 above): byte[] bytes = url.openStream().readAllBytes();
There is also readNBytes.
2. InputStream.transferTo(OutputStream) transfers all bytes from an input stream to an output stream.
3. Java 10: Reader.transferTo(Writer)
4. Java 10: Character sets in PrintWriter, Scanner, etc. can be specified as a Charset instead of a String: new Scanner(path, StandardCharsets.UTF_8)
5. Scanner.tokens gets a stream of tokens, similar to Pattern.splitAsStream from Java 8: Stream<String> tokens = new Scanner(path).useDelimiter("\\s*,\\s*").tokens();
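
A short sketch of three of these additions, using in-memory data so that no file or network access is needed. The class and method names are made up for the example; it needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;

class Java9StreamFeatures {
    // InputStream.readAllBytes (Java 9)
    static byte[] readAll(byte[] source) throws IOException {
        try (InputStream in = new ByteArrayInputStream(source)) {
            return in.readAllBytes();
        }
    }

    // InputStream.transferTo (Java 9)
    static byte[] transfer(byte[] source) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(source)) {
            in.transferTo(out);
        }
        return out.toByteArray();
    }

    // Scanner.tokens (Java 9), splitting on commas with optional whitespace
    static List<String> tokens(String text) {
        try (Scanner in = new Scanner(text)) {
            return in.useDelimiter("\\s*,\\s*").tokens().collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(new byte[] {1, 2}).length + transfer(new byte[] {3}).length);
        System.out.println(tokens("a, b ,c"));
    }
}
```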

2.1.4 Reading and Writing Text Files

1. Summary:

InputStream/OutputStream process bytes.
Text files contain characters.
Java uses Unicode for characters.
Readers/Writers convert between bytes and characters.
Always specify the character encoding. Use StandardCharsets.UTF_8 for Charset parameters, "UTF-8" for string parameters.

2. You can obtain a Reader for any input stream:

InputStream inStream = ...;
Reader in = new InputStreamReader(inStream, charset);

The read method reads a single char value, which is too low-level for most purposes.

1) You can read a short file into a string:

String content = new String(Files.readAllBytes(path), charset);            // Files.readAllBytes(path) returns a byte[]; new String(...) decodes it with the given charset

2) You can get all lines as a list or stream:

List<String> lines = Files.readAllLines(path, charset);

try (Stream<String> lines = Files.lines(path, charset)) {
    ...
}
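
The two ways of getting lines can be compared in a small sketch. The sample file and the class name are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

class ReadTextDemo {
    // Reads the whole file eagerly; fine for short files.
    static List<String> allLines(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    // Streams the file lazily; closing the stream releases the file.
    static long countNonBlankLines(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("notes", ".txt"); // hypothetical sample file
        Files.write(path, "alpha\n\nbeta\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(allLines(path) + " " + countNonBlankLines(path));
        Files.delete(path);
    }
}
```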

3. Use a Scanner to split input into numbers, words, and so on:

Scanner in = new Scanner(path, "UTF-8");
while (in.hasNextDouble()) {
    double value = in.nextDouble();
    ...
}

// To read words, set the delimiter to any sequence of non-letters (sample in textFile\ScannerTest.java):
// method 1: in.useDelimiter
in.useDelimiter("\\PL+");
while (in.hasNext()) {
    String word = in.next();
    ...
}

// method 2: in.tokens()
Stream<String> words = in.tokens();
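
A minimal sketch of the word-splitting loop, reading from a string rather than a file so that it is self-contained. The class name and the sample sentence are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerWords {
    // Splits text into words by treating every run of non-letters as a delimiter.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            in.useDelimiter("\\PL+");
            while (in.hasNext()) {
                result.add(in.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world! 42 times"));
    }
}
```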

4. To write to a file, construct a PrintWriter in one of the following ways, then call out.print, out.println, or out.printf to produce output.

PrintWriter out = new PrintWriter(Files.newBufferedWriter(path, charset));

PrintWriter out = new PrintWriter(filenameString, charsetString);

// write data to file
out.println(data);

Remember to close the file: try (PrintWriter out = ... ) {...}

If you already have the entire output in a string, or a collection of lines, call:

Files.write(path, contentString.getBytes(charset));
Files.write(path, lines, charset);

You can also append output to a file:

Files.write(path, lines, charset, StandardOpenOption.APPEND);
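
The following sketch combines a PrintWriter inside try-with-resources with a later append. The file is a temporary one and the class name is hypothetical; List.of requires Java 9 or later.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class WriteTextDemo {
    // Writes two lines with a PrintWriter, appends a third, and returns the file's lines.
    static List<String> writeThenAppend(Path path) throws IOException {
        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(path, StandardCharsets.UTF_8))) {
            out.println("first");
            out.printf("%d + %d = %d%n", 1, 2, 3);
        }
        Files.write(path, List.of("appended"), StandardCharsets.UTF_8, StandardOpenOption.APPEND);
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("out", ".txt"); // hypothetical output file
        System.out.println(writeThenAppend(path));
        Files.delete(path);
    }
}
```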

5. Sometimes, a library method wants a Writer object. Example:

Throwable.printStackTrace(PrintWriter out)

If you want to capture the output in a string, not a file, use a StringWriter:

StringWriter writer = new StringWriter();            // StringWriter collects characters into a string instead of a disk file. It has no print methods of its own, so wrap it in a PrintWriter
throwable.printStackTrace(new PrintWriter(writer));

Now you can process the stack trace as a string:

String stackTrace = writer.toString();
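
Wrapped in a helper, the StringWriter technique might look like this. The class and method names are hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class StackTraceText {
    // Captures the stack trace of t as a string instead of printing it.
    static String stackTraceOf(Throwable t) {
        StringWriter writer = new StringWriter();
        try (PrintWriter out = new PrintWriter(writer)) {
            t.printStackTrace(out);
        }
        return writer.toString();
    }

    public static void main(String[] args) {
        System.out.println(stackTraceOf(new IllegalStateException("boom")));
    }
}
```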

The following methods are suitable for text files of moderate length:

Files.readAllBytes(), Files.readString(), Files.readAllLines(), Files.writeString(), Files.write()

The following are suitable for large files or binary files:

InputStream in = Files.newInputStream(path);
OutputStream out = Files.newOutputStream(path);

Reader in = Files.newBufferedReader(path, charset);    // returns a BufferedReader, which extends Reader
Writer out = Files.newBufferedWriter(path, charset);

2.2/2.5 Reading and Writing Binary Data

1. Processing binary files

The DataInput / DataOutput interfaces have methods readInt / writeInt, readDouble / writeDouble, and so on.

You can wrap any stream into a DataInputStream / DataOutputStream:

DataInput in = new DataInputStream(new FileInputStream(path));
DataOutput out = new DataOutputStream(new FileOutputStream(path));
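
A minimal round-trip sketch of this wrapping, using in-memory byte array streams instead of files so that it runs anywhere. The class name, the record layout, and the use of writeUTF / readUTF for the string are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class BinaryRecordDemo {
    // Encodes one record as an int, a double, and a length-prefixed string.
    static byte[] encode(int id, double salary, String name) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);
            out.writeDouble(salary);
            out.writeUTF(name);
        }
        return bytes.toByteArray();
    }

    // Reads the fields back in exactly the order in which they were written.
    static String decode(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            double salary = in.readDouble();
            String name = in.readUTF();
            return id + ":" + salary + ":" + name;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decode(encode(7, 1.5, "Ann")));
    }
}
```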

　　Reading / writing stream data is sequential.

2. Random-access files

2.1 Approach 1: RandomAccessFile (section 2.2.2)

"Random access file": You can jump to any file position and start reading or writing. Open with "r" for reading only or "rw" for reading and writing:

RandomAccessFile file = new RandomAccessFile(filenameString, "rw");

The getFilePointer method yields the current position (as a long).

The seek method moves to a new position. Example: Increment an integer that you just read:

int value = file.readInt();
file.seek(file.getFilePointer() - 4);   // readInt advanced the position by 4 bytes (the size of an int); seeking back by 4 returns to the start of that int
file.writeInt(value + 1);
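
The increment-in-place idiom can be tried on a scratch file as follows. The class name and the file contents are assumptions made for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

class IncrementInPlace {
    // Increments the int stored at the start of the file and returns the new value.
    static int incrementFirstInt(Path path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
            int value = file.readInt();
            file.seek(file.getFilePointer() - Integer.BYTES); // back to the start of the int
            file.writeInt(value + 1);
            file.seek(0);
            return file.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("counter", ".bin"); // hypothetical data file
        Files.write(path, ByteBuffer.allocate(Integer.BYTES).putInt(41).array());
        System.out.println(incrementFirstInt(path));
        Files.delete(path);
    }
}
```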

2.2 Approach 2: Memory-Mapped Files (section 2.5)

A memory-mapped file provides very efficient random access for large files. (It uses the operating system's virtual memory mechanism.)

// step 1: Get a channel for the file:
FileChannel channel = FileChannel.open(path, StandardOpenOption.READ, StandardOpenOption.WRITE);

// step 2: Map an area of the file (or all of it) into memory:
ByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());

// step 3: Use the methods get, getInt, getDouble, and so on to read, and the equivalent put methods to write:
int position = ...;
int value = buffer.getInt(position);
buffer.putInt(position, value + 1);

The file is updated at some point, and certainly when the channel is closed (you can use try-with-resources).
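
The three steps might be put together as in this sketch. The class name and the sample file layout are assumptions; the call to force() is an addition that asks for the change to be written out right away rather than at some later point.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedIncrement {
    // Increments the int stored at the given byte position through a memory mapping.
    static int increment(Path path, int position) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());
            int value = buffer.getInt(position);
            buffer.putInt(position, value + 1);
            buffer.force();                       // flush the change to the file now
            return buffer.getInt(position);
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("mapped", ".bin"); // hypothetical data file
        path.toFile().deleteOnExit();
        Files.write(path, ByteBuffer.allocate(8).putInt(0).putInt(41).array());
        System.out.println(increment(path, 4));
    }
}
```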

2.4 Working with Files (creating, accessing, and deleting files and directories): Path, Files

1. Working with Path

Path objects specify abstract path names (which may not currently exist on disk). A path is a sequence of directory names, optionally followed by a file name. The first component may be a root component such as / or C:\.

Use Paths.get / Path.of to create paths:

Path absolute = Paths.get("/", "home", "cay");     // start with root
Path relative = Paths.get("myapp", "conf", "user.properties");

The path separator / or \ is supplied for the default file system. If you know which platform your program is running on, you can provide a string with separators:

Path homeDirectory = Paths.get("/home/cay");

1.1. The call p.resolve(q) computes "p then q". If q is absolute, that's just q; otherwise, first follow p, then follow q:

Path workPath = homeDirectory.resolve("myapp/work");

1.2. The opposite of resolve is relativize, yielding "how to get from p to q".

Paths.get("/home/cay").relativize(Paths.get("/home/fred/myapp"))
// yields "../fred/myapp"

1.3. normalize removes . or directory/../ and other redundancies.

1.4. toAbsolutePath makes a path absolute.
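
A small sketch of resolve, relativize, and normalize. The class and method names are illustrative; the printed separators depend on the platform, so the paths shown assume a Unix-like system.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathOps {
    // "home then myapp/work"
    static Path resolveWork(Path home) {
        return home.resolve("myapp/work");
    }

    // How to get from one path to another
    static Path howToReach(Path from, Path to) {
        return from.relativize(to);
    }

    // Removes . and dir/.. redundancies
    static Path tidy(String messy) {
        return Paths.get(messy).normalize();
    }

    public static void main(String[] args) {
        Path home = Paths.get("/home/cay");
        System.out.println(resolveWork(home));
        System.out.println(howToReach(home, Paths.get("/home/fred/myapp")));
        System.out.println(tidy("/home/cay/../fred/./myapp"));
        System.out.println(Paths.get("myapp").toAbsolutePath().isAbsolute());
    }
}
```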

2. Taking Paths Apart

Utility methods to get at the most important parts:

Path p = Paths.get("/home", "cay", "myapp.properties");

Path parent = p.getParent();                   // The path /home/cay
Path file = p.getFileName();                   // The last element, myapp.properties
Path root = p.getRoot();                       // The initial segment / (null for a relative path)
Path first = p.getName(0);                     // The first element, home
Path dir = p.subpath(1, p.getNameCount());     // All but the first element, cay/myapp.properties

You can iterate over the components:

for (Path component : path) {
    ...
}

To interoperate with the legacy File class, use:

File file = path.toFile();
Path path = file.toPath();
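
Iterating over a path yields its name elements without the root, as this sketch shows. The class name is hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class PathParts {
    // Collects the name elements of a path; the root component is not included.
    static List<String> components(Path path) {
        List<String> names = new ArrayList<>();
        for (Path component : path) {
            names.add(component.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        Path p = Paths.get("/home", "cay", "myapp.properties");
        System.out.println(components(p));
        System.out.println(p.getFileName() + " has " + p.getNameCount() + " name elements");
    }
}
```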

3. Files

2.4.3 To create a new directory, call:

Files.createDirectory(path);      // All but the last component must exist; only the final directory level is created
Files.createDirectories(path);    // Missing components are created, so intermediate directories are made as needed

You can create an empty file. If the file already exists, an exception occurs. The check and the creation are atomic.

Files.createFile(path);

Convenience methods for creating temporary files:

Path tempFile = Files.createTempFile(dir, prefix, suffix);
Path tempFile = Files.createTempFile(prefix, suffix);

Path tempDir = Files.createTempDirectory(dir, prefix);
Path tempDir = Files.createTempDirectory(prefix);

Files.createTempFile(null, ".txt") might return a path such as /tmp/1234405522364837194.txt.

Files.exists(path) checks whether a path currently exists.

Use Files.isDirectory(path), Files.isRegularFile(path), Files.isSymbolicLink(path) to find out whether the path is a directory, regular file, or symlink. For more information, see isHidden, isExecutable, isReadable, isWritable of the Files class.

Files.size(path) reports the file size as a long value.

2.4.4 Use the copy or move method:

Files.copy(fromPath, toPath);
Files.move(fromPath, toPath);

You can define the behavior with copy options:

Files.copy(fromPath, toPath, StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.COPY_ATTRIBUTES);
Files.move(fromPath, toPath, StandardCopyOption.ATOMIC_MOVE);

Delete a file like this:

Files.delete(path);    // throws exception if path doesn't exist
boolean deleted = Files.deleteIfExists(path);
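
The calls from 2.4.3 and 2.4.4 can be exercised together inside a temporary directory. This is only a sketch; the class name, file names, and directory layout are invented for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class FileOpsDemo {
    // Creates, copies, moves, and deletes files; returns true if every check held.
    static boolean run() throws IOException {
        Path dir = Files.createTempDirectory("fileops");
        Path nested = dir.resolve("a/b");
        Files.createDirectories(nested);                       // creates a and a/b
        Path original = Files.createFile(nested.resolve("data.txt"));
        Files.write(original, "hello".getBytes(StandardCharsets.UTF_8));

        Path copy = dir.resolve("copy.txt");
        Files.copy(original, copy, StandardCopyOption.REPLACE_EXISTING);
        Path moved = dir.resolve("moved.txt");
        Files.move(copy, moved);

        boolean ok = Files.exists(moved) && !Files.exists(copy)
                && Files.size(moved) == 5 && Files.isDirectory(nested);

        // Clean up, children before parents.
        Files.delete(moved);
        Files.delete(original);
        Files.delete(nested);
        Files.delete(dir.resolve("a"));
        return ok && Files.deleteIfExists(dir) && !Files.deleteIfExists(dir);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run());
    }
}
```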

2.4.6 Files.list(dirpath) yields a Stream<Path> of the directory entries. The directory is read lazily -- efficient for huge directories. Be sure to close the stream. (Files.list does not descend into subdirectories.)

try (Stream<Path> entries = Files.list(pathToDirectory)) {...}

Call Files.walk(dirpath) to visit all descendants in subdirectories as well. Descendants are visited in depth-first order.

try (Stream<Path> entries = Files.walk(pathToRoot)) {
    entries.forEach(System.out::println);
}

If you filter results by file attributes (size, creation time, and so on), use find instead of walk for greater efficiency:

Files.find(path, maxDepth, (p, attrs) -> attrs.size() > 10000)

Use Files.walk to copy a directory tree (the JDK currently provides no method for copying a directory tree):

Files.walk(source).forEach(p -> {
    try {
        Path q = target.resolve(source.relativize(p));
        if (Files.isDirectory(p)) Files.createDirectory(q);
        else Files.copy(p, q);
    } catch (IOException ex) {
        throw new UncheckedIOException(ex);
    }
});

Unfortunately, this approach doesn't work for deleting a directory tree: you need to visit the children before deleting the parent. Use a FileVisitor instead:

// Delete the directory tree starting at root
Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        Files.delete(file);
        return FileVisitResult.CONTINUE;
    }
    public FileVisitResult postVisitDirectory(Path dir, IOException ex) throws IOException {
        if (ex != null) throw ex;
        Files.delete(dir);
        return FileVisitResult.CONTINUE;
    }
});
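
The copy, find, and delete operations can be run end to end on a throwaway tree. In this sketch the class name and the file layout are assumptions, the walk stream is closed with try-with-resources, and createDirectories is used instead of createDirectory so that an already existing target root is tolerated.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TreeOps {
    // Copies source and everything below it to target; parents are visited first.
    static void copyTree(Path source, Path target) throws IOException {
        try (Stream<Path> paths = Files.walk(source)) {
            paths.forEach(p -> {
                try {
                    Path q = target.resolve(source.relativize(p));
                    if (Files.isDirectory(p)) Files.createDirectories(q);
                    else Files.copy(p, q);
                } catch (IOException ex) {
                    throw new UncheckedIOException(ex);
                }
            });
        }
    }

    // Regular files below root that are larger than minSize bytes, relative to root.
    static List<Path> largerThan(Path root, long minSize) throws IOException {
        try (Stream<Path> found = Files.find(root, Integer.MAX_VALUE,
                (p, attrs) -> attrs.isRegularFile() && attrs.size() > minSize)) {
            return found.map(root::relativize).sorted().collect(Collectors.toList());
        }
    }

    // Deletes files first, then each directory once its children are gone.
    static void deleteTree(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException ex) throws IOException {
                if (ex != null) throw ex;
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempDirectory("src");      // hypothetical sample tree
        Files.createDirectories(source.resolve("sub"));
        Files.write(source.resolve("sub/big.bin"), new byte[100]);
        Files.write(source.resolve("small.txt"), new byte[1]);
        Path target = source.resolveSibling(source.getFileName() + "-copy");
        copyTree(source, target);
        System.out.println(largerThan(target, 10));
        deleteTree(source);
        deleteTree(target);
    }
}
```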

Paths class looks up paths in the default file system.

4. ZIP file system

You can also obtain a file system for the files in a ZIP archive:

FileSystem zipfs = FileSystems.newFileSystem(Paths.get(zipname), (ClassLoader) null);

Copy out a file if you know its name:

Files.copy(zipfs.getPath(sourceName), targetPath);

To list all files in an archive, walk the file tree:

Files.walk(zipfs.getPath("/")).forEach(p -> { /* process p */ });

Here is the magic incantation for creating a zip file:

Path zipPath = Paths.get("myfile.zip");
URI uri = new URI("jar", zipPath.toUri().toString(), null);        // uri: jar:file:///C:/Users/xxxxxx/IdeaProjects/trunk/lessonlearn_coreJava/1.zip
    // Constructs the URI jar:file://myfile.zip
try (FileSystem zipfs = FileSystems.newFileSystem(uri, Collections.singletonMap("create", "true"))) {
    // To add files, copy them into the ZIP file system
    Files.copy(sourcePath, zipfs.getPath("/").resolve(targetPath));
}
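
A round-trip sketch: create a ZIP archive with the incantation above, then open it again and read the entry back. The class name and entry name are hypothetical, and the archive path must be absolute and must not exist yet.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

class ZipFsDemo {
    // Creates zipPath with a single text entry, then reopens it and returns the entry's text.
    static String zipAndReadBack(Path zipPath, String entryName, String content)
            throws IOException, URISyntaxException {
        URI uri = new URI("jar", zipPath.toUri().toString(), null);
        try (FileSystem zipfs = FileSystems.newFileSystem(uri,
                Collections.singletonMap("create", "true"))) {
            Files.write(zipfs.getPath("/").resolve(entryName),
                    content.getBytes(StandardCharsets.UTF_8));
        }
        try (FileSystem zipfs = FileSystems.newFileSystem(zipPath, (ClassLoader) null)) {
            return new String(Files.readAllBytes(zipfs.getPath(entryName)), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("zipdemo");     // hypothetical scratch directory
        Path zipPath = dir.resolve("demo.zip");
        System.out.println(zipAndReadBack(zipPath, "hello.txt", "hi there"));
        Files.delete(zipPath);
        Files.delete(dir);
    }
}
```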

5. New features in Java 11

String.lines yields a stream of all lines in a string;
String.strip trims Unicode whitespace;
Path.of does the same as Paths.get -- more consistent and shorter;
Files.readString reads a file into a string;
OutputStream nullOutputStream() provides a null stream;
Analogous methods for InputStream, Reader, Writer;
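
A brief sketch of several of these Java 11 additions. The class and method names are invented, and the sample text uses an em space (\u2003) to show that strip handles Unicode whitespace.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

class Java11Features {
    // String.lines plus String.strip
    static List<String> strippedLines(String text) {
        return text.lines().map(String::strip).collect(Collectors.toList());
    }

    // Files.writeString / Files.readString, which use UTF-8 by default
    static String writeAndRead(Path path, String content) throws IOException {
        Files.writeString(path, content);
        return Files.readString(path);
    }

    // A null output stream silently discards everything written to it.
    static void discard(byte[] data) throws IOException {
        try (OutputStream out = OutputStream.nullOutputStream()) {
            out.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(strippedLines("  a \n\u2003b\u2003\n"));
        Path path = Path.of(Files.createTempFile("j11", ".txt").toString());
        System.out.println(writeAndRead(path, "hello"));
        discard(new byte[] {1, 2, 3});
        Files.delete(path);
    }
}
```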

2.x Processing Data from the Internet

You can read data from a given URL. That gets you the contents of the URL (from the GET request).

URL url = new URL("http://horstmann.com/index.html");
InputStream in = url.openStream();

Sometimes, you need to use the URLConnection class for more complex cases:

Making a POST request
Setting request headers
Reading response headers

// 1. Get a URLConnection object:
    URLConnection connection = url.openConnection();

// 2. Set request properties:
    connection.setRequestProperty("Accept-Charset", "UTF-8, ISO-8859-1");

// 3. Send data to the server:
    connection.setDoOutput(true);
    try (OutputStream out = connection.getOutputStream()) { /* write to out */ }

// 4. Read the response headers:
    connection.connect();     // If you skipped step 3
    Map<String, List<String>> headers = connection.getHeaderFields();

// 5. Read the response:
    try (InputStream in = connection.getInputStream()) { /* read from in */ }

When writing to an HttpURLConnection, the default encoding is application/x-www-form-urlencoded. But you still need to encode the name/value pairs.

Suppose the POST data are given in a map:

URLConnection connection = url.openConnection();
connection.setDoOutput(true);
try (Writer out = new OutputStreamWriter(connection.getOutputStream(), StandardCharsets.UTF_8)) {
    boolean first = true;
    for (Map.Entry<String, String> entry : postData.entrySet()) {
        if (first) first = false;
        else out.write("&");
        out.write(URLEncoder.encode(entry.getKey(), "UTF-8"));
        out.write("=");
        out.write(URLEncoder.encode(entry.getValue(), "UTF-8"));
    }
}
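
The encoding loop can be checked without a network connection by building the form body in memory. This sketch assumes Java 10 or later for the URLEncoder.encode overload that takes a Charset; the class name and sample data are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class FormEncoder {
    // Builds an application/x-www-form-urlencoded body from name/value pairs.
    static String encode(Map<String, String> postData) {
        StringBuilder out = new StringBuilder();
        boolean first = true;
        for (Map.Entry<String, String> entry : postData.entrySet()) {
            if (first) first = false;
            else out.append('&');
            out.append(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8));
            out.append('=');
            out.append(URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> postData = new LinkedHashMap<>(); // keeps insertion order
        postData.put("name", "Cay Horstmann");
        postData.put("q", "a&b=c");
        System.out.println(encode(postData));
    }
}
```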

Java 9 HttpClient (an incubator module in Java 9 and 10; standardized as java.net.http in Java 11):

// Build a client:
HttpClient client = HttpClient.newBuilder()
    .followRedirects(HttpClient.Redirect.ALWAYS)
    .build();

// Build a request:
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create("http://horstmann.com"))
    .GET()
    .build();

// Get and handle response:
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

// Asynchronous processing:
client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
    .thenApply(HttpResponse::body)
    .completeOnTimeout("<html></html>", 10, TimeUnit.SECONDS)
    .thenAccept(body -> { /* process body */ });

2.7 Regular Expressions

2.7.1 Basic Syntax

Regular expressions (regex) specify string patterns.

The regex [Jj]e?a.+ matches Java and jealous but not Jim or ja
Special characters . * + ? { | ( ) [ \ ^ $
. matches any character, * is 0 or more, + 1 or more, ? 0 or 1 repetition
Use braces for other multiplicities such as {2,4}
| denotes alternatives: (Java|Scala)
() are used for grouping
[...] delimit character classes, such as [A-Za-z]
Useful predefined character classes such as \s (whitespace), \pL (Unicode letters), and their complements (which match the opposite) \S, \PL
^ and $ match the beginning and end of input
Escape special characters with \ to match them literally
Caution: Must double-escape \ in Java strings

Two principal ways to use a regex:

Use 1: Find all matches within a string;
Use 2: Find whether the entire string matches

Use 1: This loop iterates over all matches of a regex in a string:

Pattern pattern = Pattern.compile(regexString);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    String match = matcher.group();
    ...
}

Use matcher.start(), matcher.end() to get the position of the current match in the string.

Use 2: Use the matches method to check whether a string matches a regex:

String regex = "[12]?[0-9]:[0-5][0-9][ap]m";
if (Pattern.matches(regex, input)) { ... }

Compile the regex if you need it repeatedly:

Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) ...
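
Both ways of using a regex can be tried in a short sketch. The class name, the "match@start-end" result format, and the sample inputs are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexBasics {
    private static final Pattern TIME = Pattern.compile("[12]?[0-9]:[0-5][0-9][ap]m");

    // Finds every match and records it together with its start and end offsets.
    static List<String> findAll(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.group() + "@" + matcher.start() + "-" + matcher.end());
        }
        return result;
    }

    // Checks whether the entire input matches the precompiled pattern.
    static boolean isTime(String input) {
        return TIME.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(findAll("[0-9]+", "ab12cd345"));
        System.out.println(isTime("9:45am") + " " + isTime("9:75am"));
        System.out.println(Pattern.matches("[Jj]e?a.+", "jealous"));
    }
}
```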

You can turn the pattern into a predicate (it tests whether a string contains a match):

Stream<String> result = streamOfStrings.filter(pattern.asPredicate());

Use groups to match subexpressions. Group index values start with 1.

// Example: Match records such as: Blackwell Toaster USD29.95

// 1. Regex with groups:
// step 1: note: \p{Alnum} is a predefined character class, equivalent to [A-Za-z0-9]
(\p{Alnum}+(\s+\p{Alnum}+)*)\s+([A-Z]{3})([0-9.]*)

// step 2: Use the group method to get at each group:
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
    item = matcher.group(1);      // Blackwell Toaster
    currency = matcher.group(3);  // USD
    price = matcher.group(4);     // 29.95
}

// 2. Clearer with named groups:
(?<item>\p{Alnum}+(\s+\p{Alnum}+)*)\s+(?<currency>[A-Z]{3})(?<price>[0-9.]*)
// then you retrieve items by name:
item = matcher.group("item");
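
The named-group regex, written as a Java string literal with doubled backslashes, could be used like this. The class name and the null-on-mismatch convention are assumptions made for the sketch.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordParser {
    private static final Pattern RECORD = Pattern.compile(
        "(?<item>\\p{Alnum}+(\\s+\\p{Alnum}+)*)\\s+(?<currency>[A-Z]{3})(?<price>[0-9.]*)");

    // Returns {item, currency, price}, or null if the input is not a valid record.
    static String[] parse(String input) {
        Matcher matcher = RECORD.matcher(input);
        if (!matcher.matches()) return null;
        return new String[] {
            matcher.group("item"), matcher.group("currency"), matcher.group("price")
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("Blackwell Toaster USD29.95")));
    }
}
```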

2.7.4 Splitting on Delimiters

// Specify the delimiter as a regex:
Pattern commas = Pattern.compile("\\s*,\\s*");
String[] tokens = commas.split(input);     // String "1, 2, 3" turns into array ["1", "2", "3"]

// Fetch result lazily for large inputs:
Stream<String> tokens = commas.splitAsStream(input);

// If you don't care about efficiency, just use the String.split method:
String[] tokens = input.split("\\s*,\\s*");
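
A compact sketch of the eager and lazy variants. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SplitDemo {
    private static final Pattern COMMAS = Pattern.compile("\\s*,\\s*");

    // Eager split into an array
    static String[] split(String input) {
        return COMMAS.split(input);
    }

    // Lazy split; the tokens are produced only as the stream is consumed
    static List<String> splitLazily(String input) {
        return COMMAS.splitAsStream(input).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("1, 2, 3")));
        System.out.println(splitLazily("1 ,2,3"));
    }
}
```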

2.7.5 Replacing Matches

// To replace all matches, call replaceAll on the matcher
Matcher matcher = commas.matcher(input);
String result = matcher.replaceAll(", ");

// If you don't care about efficiency, just use the String.replaceAll method:
String result = input.replaceAll("\\s*,\\s*", ", ");

// Group numbers $n or names ${name} are replaced with the captured group:
String result = "3:45".replaceAll(
    "(\\d{1,2}):(?<minutes>\\d{2})",
    "$1 hours and ${minutes} minutes");

Regex improvements in Java 9/10:

1) Matcher.results and Scanner.findAll return a stream of match results:

Pattern pattern = Pattern.compile("[^,]");
Stream<String> matches = pattern.matcher(str).results().map(MatchResult::group);

matches = new Scanner(path).findAll(pattern).map(MatchResult::group);

2) Matcher.replaceFirst / replaceAll now have a version with a replacement function:

String result = Pattern.compile("\\pL{4,}")
    .matcher("Mary had a little lamb")
    .replaceAll(m -> m.group().toUpperCase());
    // yields "MARY had a LITTLE LAMB"
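
The replacement calls from 2.7.5 and the two Java 9 additions can be exercised side by side. In this sketch the class and method names are invented, and the pattern "[^,]+" (with a trailing +) is an assumption chosen so that each match is a whole comma-separated field rather than a single character.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ReplaceDemo {
    // Matcher.replaceAll with a fixed replacement string
    static String normalizeCommas(String input) {
        return Pattern.compile("\\s*,\\s*").matcher(input).replaceAll(", ");
    }

    // $1 and ${minutes} refer to captured groups
    static String describeTime(String time) {
        return time.replaceAll("(\\d{1,2}):(?<minutes>\\d{2})", "$1 hours and ${minutes} minutes");
    }

    // Matcher.results (Java 9): a stream of MatchResult objects
    static List<String> fields(String csv) {
        return Pattern.compile("[^,]+").matcher(csv).results()
                .map(MatchResult::group).collect(Collectors.toList());
    }

    // Matcher.replaceAll with a replacement function (Java 9)
    static String shout(String text) {
        return Pattern.compile("\\pL{4,}").matcher(text)
                .replaceAll(m -> m.group().toUpperCase());
    }

    public static void main(String[] args) {
        System.out.println(normalizeCommas("1 ,2,   3"));
        System.out.println(describeTime("3:45"));
        System.out.println(fields("a,bb,,c"));
        System.out.println(shout("Mary had a little lamb"));
    }
}
```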

2.3 Serialization

Ways to store data in practice:

Data of the same type => use a fixed-length record format (see the sample randomAccess\Employee.java, which must define fixed-length fields)
Objects => serialization (see the sample objectStream\Employee.java, which must implement Serializable)

Serialization: an object -> a sequence of bytes. Deserialization: a sequence of bytes -> an object.

Useful for sending objects to a different computer and short-term storage (e.g. cache). Not intended for long-term storage.

Participating classes implement the Serializable marker interface:

public class Employee implements Serializable { ... }

// 1. Output stream
// 1.1 Construct an ObjectOutputStream object:
ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path));

// 1.2 Call the writeObject method:
Employee peter = new Employee("Peter", 90000);
Employee paul = new Manager("Paul", 180000);
out.writeObject(peter);
out.writeObject(paul);

// 2. Input stream
// 2.1 Construct an ObjectInputStream object:
ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path));     // Employee contains a string and a floating-point number, both of which are serializable

// 2.2 Retrieve the objects in the same order as they were saved:
Employee e1 = (Employee) in.readObject();
Employee e2 = (Employee) in.readObject();

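For writeObject to work on these objects, two conditions must be met:

1. The class must implement the Serializable interface;
2. All instance variables of the class must also be serializable.

// Consider this network of objects, in which one object is shared by several others. The whole network must be saved:
Employee peter = new Employee("Peter", 40000);
Manager paul = new Manager("Paul", 105000);
Manager mary = new Manager("Mary", 180000);
paul.setAdmin(peter);
mary.setAdmin(peter);
ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path));
out.writeObject(peter);
out.writeObject(paul);
out.writeObject(mary);

The object serialization algorithm works as follows:

1) When saving:

Associate a serial number with each object reference that is encountered;
When an object is encountered for the first time, save its data to the output stream;
If an object has been saved before, just write "same as the previously saved object with serial number x".

2) When reading back:

When an object's serial number is encountered for the first time in the object input stream, construct the object, initialize it with the data from the stream, and record the association between the serial number and the new object;
When the marker "same as the previously saved object with serial number x" is encountered, retrieve the object reference associated with that serial number.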
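
The effect of the serial numbers can be observed directly: after a round trip through one object stream, two managers still share a single admin object. The classes below are simplified stand-ins for the book's Employee and Manager (no salary, a public field instead of setAdmin), and the byte array stream replaces the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SharedReferenceDemo {
    static class Employee implements Serializable {
        final String name;
        Employee(String name) { this.name = name; }
    }

    static class Manager extends Employee {
        Employee admin;
        Manager(String name) { super(name); }
    }

    // Returns true if both deserialized managers refer to the same new admin object.
    static boolean adminStillShared() throws IOException, ClassNotFoundException {
        Employee peter = new Employee("Peter");
        Manager paul = new Manager("Paul");
        Manager mary = new Manager("Mary");
        paul.admin = peter;
        mary.admin = peter;

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(paul);   // peter's data is saved here, on first encounter
            out.writeObject(mary);   // only a back-reference to peter is written here
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Manager paulCopy = (Manager) in.readObject();
            Manager maryCopy = (Manager) in.readObject();
            return paulCopy.admin == maryCopy.admin && paulCopy.admin != peter;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(adminStillShared());
    }
}
```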

Declare fields that shouldn't be serialized with the transient modifier.

You can take over serialization of fields by implementing the readObject / writeObject methods. (Useful for saving instances of non-serializable classes.)

You can delegate serialization and deserialization to a proxy by implementing the readResolve/writeReplace methods. (Useful in rare cases when object identity needs to be preserved.)

You can support multiple versions of a serialized class.

The default serialVersionUID is obtained by hashing field names and types.
If the serialVersionUID changes, readObject throws an exception.
You can declare your own version ID and implement deserialization to consider multiple versions. private static final long serialVersionUID = 2L; // Version 2
Complex and rarely useful.
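
A small sketch showing a transient field and an explicitly declared serialVersionUID. The Session class and its fields are hypothetical; the transient field is skipped during serialization and comes back as null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    static class Session implements Serializable {
        private static final long serialVersionUID = 2L; // explicitly declared version ID
        String user;
        transient String password;                       // never written to the stream

        Session(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    // Serializes the session to memory and reads a copy back.
    static Session copyViaSerialization(Session session) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(session);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Session) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Session copy = copyViaSerialization(new Session("cay", "secret"));
        System.out.println(copy.user + " " + copy.password);
    }
}
```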

Further reading on serialization: https://www.cnblogs.com/bruce-he/p/17098132.html

posted on 2023-01-17 13:46 bruce_he
