Reading Notes -- Core Java (11th Edition, Volume II), Chapter 2: Input and Output

2.1 Input/Output Streams

These streams are unrelated to java.util.stream.

2.1.1-2.1.3 Reading and Writing Bytes

1) The easiest approach is to use the static methods of the java.nio.file.Files class:

Path path = Path.of(filenameString);            // preferred over Paths.get(); in fact, Paths.get() simply calls Path.of()
InputStream in = Files.newInputStream(path);
OutputStream out = Files.newOutputStream(path);

2) Get an input stream from any URL:

URL url = new URL("http://horstmann.com/index.html");
InputStream in = url.openStream();

3) Get an input stream from a byte[] array, or write to a byte[] array:

// Get an input stream from a byte[] array
byte[] bytes = ...;
InputStream in = new ByteArrayInputStream(bytes);

// Conversely, write to a ByteArrayOutputStream and then collect the bytes:
ByteArrayOutputStream out = new ByteArrayOutputStream();
// ... write to out ...
byte[] result = out.toByteArray();

4) The read method returns a single byte (as an int), or -1 at the end of input:

InputStream in = ...;
int b = in.read();
if (b != -1) { byte value = (byte) b; ... }

   It is more common to read bytes in bulk:

byte[] bytes = ...;
int len = in.read(bytes);   // number of bytes actually read, or -1 at the end of input

5) Before Java 9, InputStream had no method for reading all bytes from a stream. Here is one solution:

ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] bytes = new byte[1024];
int len;
while ((len = in.read(bytes)) != -1) { out.write(bytes, 0, len); }   // -1: end of the input stream; each pass writes only the len bytes that were just read
bytes = out.toByteArray();

  For files, just call:

byte[] bytes = Files.readAllBytes(path);    // available since Java 7
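
As a minimal sketch of item 5), the loop can be wrapped in a helper and tried on an in-memory stream. The class name ReadAllBytesDemo and the method readAll are illustrative names, not taken from the book's samples.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadAllBytesDemo {
    // Hypothetical helper: drains the stream into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int len;
        while ((len = in.read(buffer)) != -1) { // -1 signals the end of input
            out.write(buffer, 0, len);          // write only the bytes actually read
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] copy = readAll(new ByteArrayInputStream(data));
        System.out.println(new String(copy, StandardCharsets.UTF_8));
    }
}
```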

6) You can write one byte or the bytes from an array:

OutputStream out = ...;
int b = ...;
out.write(b);   // one byte
byte[] bytes = ...;
out.write(bytes);   // all bytes from an array
out.write(bytes, start, length);   // a range of the array

7) When you are done writing to a stream, close it:

out.close();

    Or better, use a try-with-resources block (the resource is closed automatically):

try (OutputStream out = ...) {
    out.write(bytes);
}

8) To save an input stream to a file, call:

Files.copy(in, path, StandardCopyOption.REPLACE_EXISTING);

 

New features in Java 9/10:

  • 1. There is finally a method to read all bytes from an input stream (removing the limitation in item 5) above): byte[] bytes = url.openStream().readAllBytes();
  •      There is also readNBytes.
  • 2. InputStream.transferTo(OutputStream) transfers all bytes from an input stream to an output stream.
  • 3. Java 10: Reader.transferTo(Writer)
  • 4. Java 10: Character sets in PrintWriter, Scanner, etc. can be specified as a Charset instead of a String: new Scanner(path, StandardCharsets.UTF_8)
  • 5. Scanner.tokens gets a stream of tokens, similar to Pattern.splitAsStream from Java 8: Stream<String> tokens = new Scanner(path).useDelimiter("\\s*,\\s*").tokens();
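
A short sketch of two of these additions, using in-memory data so that it runs without files or network access. It assumes Java 9 or later; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;

class Java9StreamDemo {
    // Copies everything from in into a byte array with InputStream.transferTo (Java 9).
    static byte[] copy(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.transferTo(out);
        return out.toByteArray();
    }

    // Splits comma-separated text into tokens with Scanner.tokens (Java 9).
    static List<String> tokens(String text) {
        try (Scanner scanner = new Scanner(text).useDelimiter("\\s*,\\s*")) {
            return scanner.tokens().collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "a, b ,c".getBytes(StandardCharsets.UTF_8);
        // readAllBytes (Java 9) replaces the manual read loop
        byte[] all = new ByteArrayInputStream(data).readAllBytes();
        System.out.println(all.length == copy(new ByteArrayInputStream(data)).length);
        System.out.println(tokens("a, b ,c"));
    }
}
```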

   

2.1.4 Reading and Writing Text Files

1. Summary:

  • InputStream/OutputStream process bytes.
  • Text files contain characters.
  • Java uses Unicode for characters.
  • Readers/Writers convert between bytes and characters.
  • Always specify the character encoding. Use StandardCharsets.UTF_8 for Charset parameters, "UTF-8" for string parameters.

 

2. You can obtain a Reader for any input stream:

InputStream inStream = ...;
Reader in = new InputStreamReader(inStream, charset);

  The read method reads one char value at a time, which is too low-level for most purposes.

  1) You can read a short file into a string:

String content = new String(Files.readAllBytes(path), charset);            // Files.readAllBytes(path) returns a byte[], which new String() decodes into a String

  2) You can get all lines as a list or a stream:

List<String> lines = Files.readAllLines(path, charset);

try (Stream<String> lines = Files.lines(path, charset)) {
    ...
}

 

3. Use a Scanner to split input into numbers, words, and so on:

Scanner in = new Scanner(path, "UTF-8");
while (in.hasNextDouble()) {
    double value = in.nextDouble();
    ...
}

// To read words, set the delimiter to any sequence of non-letters (sample in textFile\ScannerTest.java):
// Method 1: in.useDelimiter
in.useDelimiter("\\PL+");
while (in.hasNext()) {
    String word = in.next();
    ...
}

// Method 2: in.tokens()
Stream<String> words = in.tokens();
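
The word-splitting loop can be tried on a plain string instead of a file. This is only a sketch; the class name ScannerWordsDemo and the method words are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerWordsDemo {
    // Hypothetical helper: returns the words of text, where a word is a run of Unicode letters.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            in.useDelimiter("\\PL+");   // one or more non-letters separate the words
            while (in.hasNext()) {
                result.add(in.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world! 42 times"));
    }
}
```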

 

4. To write to a file, construct a PrintWriter in one of the following ways, then call out.print, out.println, or out.printf to produce output.

PrintWriter out = new PrintWriter(Files.newBufferedWriter(path, charset));

PrintWriter out = new PrintWriter(filenameString, charsetString);

// Write data to the file
out.println(data);

  Remember to close the file: try (PrintWriter out = ... ) {...}

  If you already have the entire output in a string, or a collection of lines, call:

Files.write(path, contentString.getBytes(charset));
Files.write(path, lines, charset);

  You can also append output to a file:

Files.write(path, lines, charset, StandardOpenOption.APPEND);

 

5. Sometimes, a library method wants a Writer object. Example:

Throwable.printStackTrace(PrintWriter out)

   If you want to capture the output in a string rather than a file, use a StringWriter:

StringWriter writer = new StringWriter();            // A StringWriter sends characters to a string instead of a disk file. It has no print methods of its own, so wrap it in a PrintWriter
throwable.printStackTrace(new PrintWriter(writer));

   Now you can process the stack trace as a string:

String stackTrace = writer.toString();
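
Put together, the StringWriter technique might look like the following sketch. The helper name stackTraceOf is hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class StackTraceDemo {
    // Hypothetical helper: renders the stack trace of t as a String.
    static String stackTraceOf(Throwable t) {
        StringWriter writer = new StringWriter();
        try (PrintWriter out = new PrintWriter(writer)) {
            t.printStackTrace(out);   // PrintWriter supplies the print methods StringWriter lacks
        }
        return writer.toString();
    }

    public static void main(String[] args) {
        System.out.println(stackTraceOf(new IllegalStateException("boom")));
    }
}
```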

 

The following methods are suitable for text files of moderate length:

  • Files.readAllBytes(), Files.readString(), Files.readAllLines(), Files.writeString(), Files.write()

The following are suitable for large files or binary files:

InputStream in = Files.newInputStream(path);
OutputStream out = Files.newOutputStream(path);

Reader in = Files.newBufferedReader(path, charset);    // returns a BufferedReader, which extends Reader
Writer out = Files.newBufferedWriter(path, charset);

 


2.2/2.5 Reading and Writing Binary Data

1. Working with binary files

  The DataInput / DataOutput interfaces have methods readInt / writeInt, readDouble / writeDouble, and so on.

  You can wrap any stream in a DataInputStream / DataOutputStream:

DataInput in = new DataInputStream(Files.newInputStream(path));
DataOutput out = new DataOutputStream(Files.newOutputStream(path));

  Reading / writing stream data is sequential.
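
A small round trip shows the sequential nature: values must be read back in the order they were written. The sketch below wraps in-memory streams rather than files; the class name and the id/salary record layout are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataStreamDemo {
    // Writes an int followed by a double (4 + 8 = 12 bytes).
    static byte[] encode(int id, double salary) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);
            out.writeDouble(salary);
        }
        return bytes.toByteArray();
    }

    // Reads the values back in the same order they were written.
    static String decode(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            double salary = in.readDouble();
            return id + ":" + salary;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decode(encode(7, 1.5)));
    }
}
```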

 

2. Random-access files

2.1 Approach 1: RandomAccessFile (section 2.2.2)

  "Random access file": You can jump to any file position and start reading or writing. Open with "r" for read-only access or "rw" for reading and writing:

RandomAccessFile file = new RandomAccessFile(filenameString, "rw");

  The getFilePointer method yields the current position (as a long).

  The seek method moves to a new position. Example: Increment an integer that you just read:

int value = file.readInt();
file.seek(file.getFilePointer() - 4);   // Reading the int advanced the position by 4 bytes (the size of an int); seeking to the current position minus 4 moves back to where the int starts
file.writeInt(value + 1);
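
The increment example can be run end to end against a temporary file. This is a sketch under the assumption that the file starts with a 4-byte big-endian int; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

class RandomAccessDemo {
    // Increments the int stored at the start of the file and returns the new value.
    static int incrementFirstInt(Path path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
            int value = file.readInt();
            file.seek(file.getFilePointer() - Integer.BYTES);   // step back over the int just read
            file.writeInt(value + 1);
            file.seek(0);
            return file.readInt();
        }
    }

    // Creates a temporary file holding the int 41, increments it, and cleans up.
    static int demo() throws IOException {
        Path path = Files.createTempFile("counter", ".bin");
        try {
            Files.write(path, ByteBuffer.allocate(Integer.BYTES).putInt(41).array());
            return incrementFirstInt(path);
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```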

 

2.2 Approach 2: Memory-Mapped Files (section 2.5)

  A memory-mapped file provides very efficient random access for large files. (It uses the operating system mechanism for virtual memory.)

// Step 1: Get a channel for the file:
FileChannel channel = FileChannel.open(path, StandardOpenOption.READ, StandardOpenOption.WRITE);

// Step 2: Map an area of the file (or all of it) into memory:
ByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());

// Step 3: Use the methods get, getInt, getDouble, and so on to read, and the equivalent put methods to write:
int position = ...;
int value = buffer.getInt(position);
buffer.putInt(position, value + 1);

The file is updated at some point, and certainly when the channel is closed (you can open the channel in a try-with-resources block).
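
The three steps might be combined as in the sketch below, which maps a small temporary file and increments one int in place. The explicit force() call is an addition of this sketch to flush changes before reading the file back; the class name, method names, and file layout are assumptions.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedFileDemo {
    // Increments the int stored at the given byte position of the file.
    static void incrementMapped(Path path, int position) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());
            int value = buffer.getInt(position);
            buffer.putInt(position, value + 1);
            buffer.force();   // ask the OS to write the changes back to the file now
        }
    }

    // Writes the ints 10 and 20 to a temporary file, increments the second, and reads it back.
    static int demo() throws IOException {
        Path path = Files.createTempFile("mapped", ".bin");
        path.toFile().deleteOnExit();
        Files.write(path, ByteBuffer.allocate(8).putInt(10).putInt(20).array());
        incrementMapped(path, 4);
        return ByteBuffer.wrap(Files.readAllBytes(path)).getInt(4);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```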

 


2.4 Working with Files (creating, accessing, and deleting files and directories): Path, Files

1.  Working with Path

  Path objects specify abstract path names (which may not currently exist on disk). A path is a sequence of directory names, optionally followed by a file name. The first component may be a root component such as / or C:\.

  Use Paths.get / Path.of to create paths:

Path absolute = Paths.get("/", "home", "cay");     // starts with a root component
Path relative = Paths.get("myapp", "conf", "user.properties");

  The path separator (/ or \) is supplied for the default file system. If you know which platform your program runs on, you can provide a string with separators:

Path homeDirectory = Paths.get("/home/cay");

  1.1. The call p.resolve(q) computes "p then q". If q is absolute, the result is just q; otherwise, first follow p, then follow q:

Path workPath = homeDirectory.resolve("myapp/work");

  1.2. The opposite of resolve is relativize, yielding "how to get from p to q".

Paths.get("/home/cay").relativize(Paths.get("/home/fred/myapp"))
// yields "../fred/myapp"

  1.3. normalize removes . or directory/../ and other redundancies.

  1.4. toAbsolutePath makes a path absolute.
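
The path operations above can be checked without touching the disk, since Path objects are only abstract names. The sketch assumes Java 11 (for Path.of) and Unix-style paths; the class and method names are hypothetical.

```java
import java.nio.file.Path;

class PathOperationsDemo {
    // "p then q" for a relative q
    static Path resolved() {
        return Path.of("/home/cay").resolve("myapp/work");
    }

    // How to get from /home/cay to /home/fred/myapp
    static Path relativized() {
        return Path.of("/home/cay").relativize(Path.of("/home/fred/myapp"));
    }

    // Removes the redundant "cay/.." and "." components
    static Path normalized() {
        return Path.of("/home/cay/../fred/./myapp").normalize();
    }

    public static void main(String[] args) {
        System.out.println(resolved());
        System.out.println(relativized());
        System.out.println(normalized());
    }
}
```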

 

2. Taking Paths Apart

  Utility methods to get at the most important parts:

Path p = Paths.get("/home", "cay", "myapp.properties");
Path parent = p.getParent();                   // The path /home/cay
Path file = p.getFileName();                   // The last element, myapp.properties
Path root = p.getRoot();                       // The initial segment / (null for a relative path)
Path first = p.getName(0);                     // The first element, home
Path dir = p.subpath(1, p.getNameCount());     // All but the first element, cay/myapp.properties

  You can iterate over the components:

for (Path component : path) {
    ...
}

  To interoperate with the legacy File class, use:

File file = path.toFile();
Path path = file.toPath();

 

3. Files

  2.4.3 To create a new directory, call:

Files.createDirectory(path);      // All but the last component must exist; creates only the final directory
Files.createDirectories(path);    // Missing intermediate components are created as well, so nested directories can be made in one call

  You can create an empty file. If the file already exists, an exception is thrown. The check and the creation are atomic.

Files.createFile(path);

  Convenience methods for creating temporary files and directories:

Path tempFile = Files.createTempFile(dir, prefix, suffix);
Path tempFile = Files.createTempFile(prefix, suffix);

Path tempDir = Files.createTempDirectory(dir, prefix);
Path tempDir = Files.createTempDirectory(prefix);

  Files.createTempFile(null, ".txt") might return a path such as /tmp/1234405522364837194.txt.

  Files.exists(path) checks whether a path currently exists.

  Use Files.isDirectory(path), Files.isRegularFile(path), Files.isSymbolicLink(path) to find out whether the path is a directory, a regular file, or a symbolic link. For more information, see isHidden, isExecutable, isReadable, and isWritable of the Files class.

  Files.size(path) reports the file size as a long value.

 

  2.4.4 Use the copy or move method:

Files.copy(fromPath, toPath);
Files.move(fromPath, toPath);

  You can define the behavior with copy options:

Files.copy(fromPath, toPath, StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.COPY_ATTRIBUTES);
Files.move(fromPath, toPath, StandardCopyOption.ATOMIC_MOVE);

  Delete a file like this:

Files.delete(path);    // throws an exception if path doesn't exist
boolean deleted = Files.deleteIfExists(path);

 

  2.4.6 Files.list(dirpath) yields a Stream<Path> of the directory entries. The directory is read lazily, which is efficient for huge directories. Be sure to close the stream. (Files.list does not descend into subdirectories.)

try (Stream<Path> entries = Files.list(pathToDirectory)) {...}

  Call Files.walk(dirpath) to visit all descendants, including those in subdirectories. Descendants are visited in depth-first order.

try (Stream<Path> entries = Files.walk(pathToRoot)) {
    entries.forEach(System.out::println);
}

   If you filter results by file attributes (size, creation time, and so on), use find instead of walk for greater efficiency:

Files.find(path, maxDepth, (p, attr) -> attr.size() > 10000)

  Use Files.walk to copy a directory tree:  // The JDK currently provides no method for copying a directory

Files.walk(source).forEach(p -> {
    try {
        Path q = target.resolve(source.relativize(p));
        if (Files.isDirectory(p)) Files.createDirectory(q);
        else Files.copy(p, q);
    } catch (IOException ex) {
        throw new UncheckedIOException(ex);
    }
});

  Unfortunately, this approach doesn't work for deleting a directory tree, because the children must be visited before the parent is deleted. => Use a FileVisitor instead:

// Delete the directory tree starting at root
Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        Files.delete(file);
        return FileVisitResult.CONTINUE;
    }
    public FileVisitResult postVisitDirectory(Path dir, IOException ex) throws IOException {
        if (ex != null) throw ex;
        Files.delete(dir);
        return FileVisitResult.CONTINUE;
    }
});
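
The copy and delete techniques can be exercised together on temporary directories. The following is a sketch, not the book's sample: the class and method names are hypothetical, it assumes Java 11 (Files.writeString / readString), it closes the walk stream with try-with-resources, and it uses createDirectories so that the copy also works when the target root does not exist yet.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.stream.Stream;

class DirectoryTreeDemo {
    // Copies the tree rooted at source to target; parents are visited before children.
    static void copyTree(Path source, Path target) throws IOException {
        try (Stream<Path> entries = Files.walk(source)) {
            entries.forEach(p -> {
                try {
                    Path q = target.resolve(source.relativize(p));
                    if (Files.isDirectory(p)) Files.createDirectories(q);
                    else Files.copy(p, q);
                } catch (IOException ex) {
                    throw new UncheckedIOException(ex);
                }
            });
        }
    }

    // Deletes the tree rooted at root; each directory is deleted after its children.
    static void deleteTree(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException ex) throws IOException {
                if (ex != null) throw ex;
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    // Builds a small tree, copies it, checks the copy, then removes both trees.
    static boolean demo() throws IOException {
        Path source = Files.createTempDirectory("src");
        Files.createDirectories(source.resolve("a/b"));
        Files.writeString(source.resolve("a/b/file.txt"), "hi");
        Path target = Files.createTempDirectory("dst").resolve("copy");
        copyTree(source, target);
        boolean copied = Files.readString(target.resolve("a/b/file.txt")).equals("hi");
        deleteTree(source);
        deleteTree(target.getParent());
        return copied && Files.notExists(source) && Files.notExists(target);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```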

Paths class looks up paths in the default file system.

 

4. ZIP file system

  You can create a file system for the files in a ZIP archive:

FileSystem zipfs = FileSystems.newFileSystem(Paths.get(zipname), (ClassLoader) null);

  Copy out a file if you know its name:

Files.copy(zipfs.getPath(sourceName), targetPath);

  To list all files in an archive, walk the file tree:

Files.walk(zipfs.getPath("/")).forEach(p -> { /* process p */ });

  Here is the magic incantation for creating a ZIP file:

Path zipPath = Paths.get("myfile.zip");
URI uri = new URI("jar", zipPath.toUri().toString(), null);        // e.g. jar:file:///C:/Users/xxxxxx/IdeaProjects/trunk/lessonlearn_coreJava/1.zip
    // Constructs the URI jar:file://myfile.zip
try (FileSystem zipfs = FileSystems.newFileSystem(uri, Collections.singletonMap("create", "true"))) {
    // To add files, copy them into the ZIP file system
    Files.copy(sourcePath, zipfs.getPath("/").resolve(targetPath));
}
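
A possible round trip through the ZIP file system: create an archive, write one entry, reopen the archive, and read the entry back. This is a sketch assuming Java 11 and the JDK's built-in ZIP file system provider; the class name, the archive name demo.zip, and the entry name /hello.txt are made up.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

class ZipFsDemo {
    static String roundTrip() throws IOException {
        Path zipPath = Files.createTempDirectory("zipdemo").resolve("demo.zip");
        zipPath.toFile().deleteOnExit();
        URI uri = URI.create("jar:" + zipPath.toUri());   // jar:file:///.../demo.zip

        // "create" = "true" makes the provider create the archive if it is missing
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
            Files.writeString(zipfs.getPath("/hello.txt"), "hello zip");
        }

        // Reopen the archive, which now exists, and read the entry back
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, Map.of())) {
            return Files.readString(zipfs.getPath("/hello.txt"));
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip());
    }
}
```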

 

5. New in Java 11

  • String.lines yields a stream of all lines in a string;
  • String.strip trims Unicode whitespace;
  • Path.of does the same as Paths.get, with a shorter and more consistent name;
  • Files.readString reads a file into a string;
  • OutputStream nullOutputStream() provides a null stream;
  • Analogous methods for InputStream, Reader, Writer;

 


2.x Working with Data from the Internet

  You can read data from a given URL. That gets you the contents of the URL (from the GET request).

URL url = new URL("http://horstmann.com/index.html");
InputStream in = url.openStream();

  Sometimes, you need to use the URLConnection class for more complex cases:

  • Making a POST request
  • Setting request headers
  • Reading response headers

// 1. Get a URLConnection object:
    URLConnection connection = url.openConnection();

// 2. Set request properties:
    connection.setRequestProperty("Accept-Charset", "UTF-8, ISO-8859-1");

// 3. Send data to the server:
    connection.setDoOutput(true);
    try (OutputStream out = connection.getOutputStream()) { /* write to out */ }

// 4. Read the response headers:
    connection.connect();     // only needed if you skipped step 3
    Map<String, List<String>> headers = connection.getHeaderFields();

// 5. Read the response:
    try (InputStream in = connection.getInputStream()) { /* read from in */ }

 

  When writing to an HttpURLConnection, the default encoding is application/x-www-form-urlencoded, but you still need to encode the name/value pairs yourself.

  Suppose the POST data are given in a map:

URLConnection connection = url.openConnection();
connection.setDoOutput(true);
try (Writer out = new OutputStreamWriter(connection.getOutputStream(), StandardCharsets.UTF_8)) {
    boolean first = true;
    for (Map.Entry<String, String> entry : postData.entrySet()) {
        if (first) first = false;
        else out.write("&");
        out.write(URLEncoder.encode(entry.getKey(), "UTF-8"));
        out.write("=");
        out.write(URLEncoder.encode(entry.getValue(), "UTF-8"));
    }
}
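
The encoding step can be separated from the network call and checked on its own. The sketch below builds the request body as a string; it assumes Java 10 or later for URLEncoder.encode(String, Charset), and the class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class FormEncoderDemo {
    // Encodes name/value pairs as application/x-www-form-urlencoded text.
    static String encode(Map<String, String> postData) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> entry : postData.entrySet()) {
            body.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> postData = new LinkedHashMap<>();   // keeps insertion order
        postData.put("name", "Cay Horstmann");
        postData.put("q", "a&b=c");
        System.out.println(encode(postData));
    }
}
```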

 

HttpClient (introduced as an incubator module in Java 9, standardized in Java 11):

// Build a client:
HttpClient client = HttpClient.newBuilder()
    .followRedirects(HttpClient.Redirect.ALWAYS)
    .build();

// Build a request:
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create("http://horstmann.com"))
    .GET()
    .build();

// Get and handle the response:
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

// Asynchronous processing:
client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
    .thenApply(HttpResponse::body)
    .completeOnTimeout("<html></html>", 10, TimeUnit.SECONDS)
    .thenAccept(body -> { /* process body */ });

 


2.7 Regular Expressions

2.7.1 Basic Syntax

  Regular expressions (regex) specify string patterns.

  • The regex [Jj]e?a.+ matches Java and jealous but not Jim or ja
  • Special characters: . * + ? { | ( ) [ \ ^ $
  • . matches any character, * is 0 or more, + 1 or more, ? 0 or 1 repetition
  • Use braces for other multiplicities such as {2,4}
  • | denotes alternatives: (Java|Scala)
  • () are used for grouping
  • [...] delimit character classes, such as [A-Za-z]
  • Useful predefined character classes such as \s (whitespace), \pL (Unicode letters), and their complements (the opposite sets) \S, \PL
  • ^ and $ match the beginning and end of input
  • Escape special characters with \ to match them literally
  • Caution: You must double each \ in a Java string literal

  Two principal ways to use a regex:

  • Use 1: Find all matches within a string;
  • Use 2: Find whether the entire string matches

Use 1: This loop iterates over all matches of a regex in a string:

Pattern pattern = Pattern.compile(regexString);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    String match = matcher.group();
    ...
}

    Use matcher.start(), matcher.end() to get the position of the current match in the string.

Use 2: Use the matches method to check whether a string matches a regex:

String regex = "[12]?[0-9]:[0-5][0-9][ap]m";
if (Pattern.matches(regex, input)) { ... }

Compile the regex if you need it repeatedly:

Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) ...

You can turn the pattern into a predicate (asPredicate tests whether the pattern is found anywhere in the string; Java 11 adds asMatchPredicate for whole-string matches):

Stream<String> result = streamOfStrings.filter(pattern.asPredicate());

 

Use groups to match subexpressions. Group index values start with 1.

// Example: Match records such as: Blackwell Toaster USD29.95

// 1. Regex with groups:
// Step 1: Note: \p{Alnum} is a predefined character class, equivalent to [A-Za-z0-9]
(\p{Alnum}+(\s+\p{Alnum}+)*)\s+([A-Z]{3})([0-9.]*)

// Step 2: Use the group method to get at each group:
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
    item = matcher.group(1);      // Blackwell Toaster
    currency = matcher.group(3);  // USD
    price = matcher.group(4);     // 29.95
}

// 2. Clearer with named groups:
(?<item>\p{Alnum}+(\s+\p{Alnum}+)*)\s+(?<currency>[A-Z]{3})(?<price>[0-9.]*)
// Then you retrieve items by name:
item = matcher.group("item");
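
As a runnable sketch of the named-group variant, the regex is written below as a Java string literal (note the doubled backslashes). The class name and the parse helper are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexGroupDemo {
    private static final Pattern RECORD = Pattern.compile(
        "(?<item>\\p{Alnum}+(\\s+\\p{Alnum}+)*)\\s+(?<currency>[A-Z]{3})(?<price>[0-9.]*)");

    // Returns {item, currency, price}, or an empty array if the input does not match.
    static String[] parse(String input) {
        Matcher matcher = RECORD.matcher(input);
        if (!matcher.matches()) return new String[0];
        return new String[] {
            matcher.group("item"), matcher.group("currency"), matcher.group("price")
        };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", parse("Blackwell Toaster USD29.95")));
    }
}
```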

 

2.7.4 Splitting along Delimiters

// Specify the delimiter as a regex:
Pattern commas = Pattern.compile("\\s*,\\s*");
String[] tokens = commas.split(input);     // String "1, 2, 3" turns into array ["1", "2", "3"]

// Fetch results lazily for large inputs:
Stream<String> tokens = commas.splitAsStream(input);

// If you don't care about efficiency, just use the String.split method:
String[] tokens = input.split("\\s*,\\s*");
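
A compact, runnable comparison of the eager and lazy variants might look like this. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SplitDemo {
    private static final Pattern COMMAS = Pattern.compile("\\s*,\\s*");

    // Eager: the whole array is built at once.
    static List<String> split(String input) {
        return Arrays.asList(COMMAS.split(input));
    }

    // Lazy: tokens are produced as the stream is consumed.
    static List<String> splitLazily(String input) {
        return COMMAS.splitAsStream(input).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(split("1, 2, 3"));
        System.out.println(splitLazily("1 ,2,  3"));
    }
}
```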

 

2.7.5 Replacing Matches

// To replace all matches, call replaceAll on the matcher:
Matcher matcher = commas.matcher(input);
String result = matcher.replaceAll(", ");

// If you don't care about efficiency, just use the String.replaceAll method:
String result = input.replaceAll("\\s*,\\s*", ", ");

// Group numbers $n or names ${name} are replaced with the captured group:
String result = "3:45".replaceAll(
    "(\\d{1,2}):(?<minutes>\\d{2})",
    "$1 hours and ${minutes} minutes");

 

Regex improvements in Java 9/10:

1) Matcher.results and Scanner.findAll get a stream of match results:

Pattern pattern = Pattern.compile("[^,]");
Stream<String> matches = pattern.matcher(str).results().map(MatchResult::group);

matches = new Scanner(path).findAll(pattern).map(MatchResult::group);

2) Matcher.replaceFirst / replaceAll now have a version with a replacement function:

String result = Pattern.compile("\\pL{4,}")
    .matcher("Mary had a little lamb")
    .replaceAll(m -> m.group().toUpperCase());
    // yields "MARY had a LITTLE LAMB"
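
The replacement techniques from 2.7.5 and the Java 9 additions can be tried together in one sketch. It assumes Java 9 or later; the class and method names are hypothetical, and the regex "[^,]+" in main is chosen for the example.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class RegexReplaceDemo {
    // Group references: $1 is the first group, ${minutes} the named group.
    static String describeTime(String time) {
        return time.replaceAll("(\\d{1,2}):(?<minutes>\\d{2})",
            "$1 hours and ${minutes} minutes");
    }

    // Java 9: collect all matches via Matcher.results.
    static List<String> allMatches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).results()
            .map(MatchResult::group)
            .collect(Collectors.toList());
    }

    // Java 9: replaceAll with a replacement function.
    static String upperCaseLongWords(String input) {
        return Pattern.compile("\\pL{4,}").matcher(input)
            .replaceAll(m -> m.group().toUpperCase());
    }

    public static void main(String[] args) {
        System.out.println(describeTime("3:45"));
        System.out.println(allMatches("[^,]+", "a,bc,d"));
        System.out.println(upperCaseLongWords("Mary had a little lamb"));
    }
}
```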

 


2.3 Serialization

In practice, there are two ways to store data:

  • Data of the same type => use a fixed-length record format (see the sample randomAccess\Employee.java, which has to define fixed-length fields)
  • Objects => serialization (see the sample objectStream\Employee.java, which has to implement Serializable)

Serialization: an object -> a sequence of bytes. Deserialization: a sequence of bytes -> an object.

Useful for sending objects to a different computer and for short-term storage (e.g. a cache). Not intended for long-term storage.

Participating classes implement the Serializable marker interface:

public class Employee implements Serializable { ... }

// 1. Output stream
// 1.1 Construct an ObjectOutputStream object:
ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path));

// 1.2 Call the writeObject method:
Employee peter = new Employee("Peter", 90000);
Employee paul = new Manager("Paul", 180000);
out.writeObject(peter);
out.writeObject(paul);

// 2. Input stream
// 2.1 Construct an ObjectInputStream object:
ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path));     // The Employee class contains strings and floating-point numbers, all of which are serializable

// 2.2 Retrieve the objects in the same order as they were saved:
Employee e1 = (Employee) in.readObject();
Employee e2 = (Employee) in.readObject();

For writeObject to work on these objects, two conditions must be met:

  • 1. The class must implement the Serializable interface;
  • 2. All instance variables of the class must be serializable as well;

// Consider this network of objects, in which one object is shared by several others; the whole network must be saved:
Employee peter = new Employee("Peter", 40000);
Manager paul = new Manager("Paul", 105000);
Manager mary = new Manager("Mary", 180000);
paul.setAdmin(peter);
mary.setAdmin(peter);
ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path));
out.writeObject(peter);
out.writeObject(paul);
out.writeObject(mary);

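The object serialization algorithm works as follows:

1) When saving:

  • Associate a serial number with each object reference that is encountered;
  • When an object is encountered for the first time, save its object data to the output stream;
  • If an object has been saved before, just write "same as the previously saved object with serial number x"

2) When reading back:

  • When an object's serial number is encountered for the first time in the object input stream, construct the object, initialize it with the data from the stream, and record the association between the serial number and the new object;
  • When the marker "same as the previously saved object with serial number x" is encountered, retrieve the object reference associated with that serial number;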
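
The effect of this algorithm can be observed directly: after a round trip, two managers that shared one admin still share a single (new) admin object. The sketch below uses simplified, hypothetical Employee and Manager classes rather than the book's, and serializes to memory instead of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationDemo {
    // Simplified stand-ins for the book's classes (assumptions of this sketch).
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Employee(String name) { this.name = name; }
    }

    static class Manager extends Employee {
        private static final long serialVersionUID = 1L;
        Employee admin;
        Manager(String name) { super(name); }
    }

    // Returns true if the shared admin is still one shared object after deserialization.
    static boolean sharedAdminSurvives() throws IOException, ClassNotFoundException {
        Employee peter = new Employee("Peter");
        Manager paul = new Manager("Paul");
        Manager mary = new Manager("Mary");
        paul.admin = peter;
        mary.admin = peter;

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(paul);   // peter's data is saved here, on first encounter
            out.writeObject(mary);   // only a back-reference to peter is written
        }

        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Manager paul2 = (Manager) in.readObject();
            Manager mary2 = (Manager) in.readObject();
            // Same restored object for both managers, distinct from the original peter
            return paul2.admin == mary2.admin && paul2.admin != peter;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sharedAdminSurvives());
    }
}
```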

 

Declare fields that shouldn't be serialized with the transient modifier.

You can take over serialization of fields by implementing the readObject / writeObject methods. (Useful for saving instances of non-serializable classes.)

You can delegate serialization and deserialization to a proxy by implementing the readResolve/writeReplace methods. (Useful in rare cases when object identity needs to be preserved.)

You can support multiple versions of a serialized class.

  • The default serialVersionUID is a hash computed from the class's structure, including its field names and types.
  • If the serialVersionUID changes, readObject throws an exception.
  • You can declare your own version ID and implement deserialization to consider multiple versions.  private static final long serialVersionUID = 2L;    // Version 2
  • Complex and rarely useful.

 

Further reading on serialization: https://www.cnblogs.com/bruce-he/p/17098132.html

 

posted on 2023-01-17 13:46  bruce_he