Tokenization

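Tokenization is the process of splitting a sequence of bytes or characters into logical chunks such as words. Java provides the StreamTokenizer class for this, which is used as follows: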

import java.io.*;

public class token1 {
    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            FileReader fr = new FileReader(args[0]);
            BufferedReader br = new BufferedReader(fr);
            StreamTokenizer st = new StreamTokenizer(br);
            // Clear the default syntax table, then mark only a-z as word characters.
            st.resetSyntax();
            st.wordChars('a', 'z');
            int tok;
            while ((tok = st.nextToken()) != StreamTokenizer.TT_EOF) {
                if (tok == StreamTokenizer.TT_WORD) {
                    // st.sval holds the text of the token
                }
            }
            br.close();
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

This example tokenizes lowercase words (the letters a-z). If you implement the equivalent functionality yourself, it might look like this:

import java.io.*;

public class token2 {
    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("missing filename");
            System.exit(1);
        }
        try {
            FileReader fr = new FileReader(args[0]);
            BufferedReader br = new BufferedReader(fr);
            int maxlen = 256;
            int currlen = 0;
            char[] wordbuf = new char[maxlen];
            int c;
            do {
                c = br.read();
                if (c >= 'a' && c <= 'z') {
                    if (currlen == maxlen) {
                        // Buffer is full: grow it by 50% and copy the word so far.
                        // The compound assignment narrows the double result back to int.
                        maxlen *= 1.5;
                        char[] xbuf = new char[maxlen];
                        System.arraycopy(wordbuf, 0, xbuf, 0, currlen);
                        wordbuf = xbuf;
                    }
                    wordbuf[currlen++] = (char) c;
                } else if (currlen > 0) {
                    // A non-letter (or end of input, -1) terminates the current word.
                    String s = new String(wordbuf, 0, currlen);
                    // do something with s
                    currlen = 0;
                }
            } while (c != -1);
            br.close();
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

The second program runs about 20% faster than the first, at the cost of writing some tricky low-level code.
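The speed difference is likely to vary with the JVM and the input, so it is worth measuring on your own setup. The following is a minimal sketch and not the benchmark behind the figure above: the class `TokenCompare` and its method names are hypothetical, the input is an in-memory string rather than a file, and the hand-written scanner uses a `StringBuilder` instead of the manual char array for brevity. It checks that both approaches produce the same words and prints a crude single-run timing.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical comparison harness; names are illustrative only.
class TokenCompare {

    // Collects lowercase words with StreamTokenizer, configured as in token1.
    static List<String> wordsWithStreamTokenizer(Reader in) {
        StreamTokenizer st = new StreamTokenizer(in);
        st.resetSyntax();
        st.wordChars('a', 'z');
        List<String> words = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    words.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    // Collects lowercase words with a manual scan, following token2's loop.
    static List<String> wordsByHand(Reader in) {
        List<String> words = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        try {
            int c;
            do {
                c = in.read();
                if (c >= 'a' && c <= 'z') {
                    word.append((char) c);
                } else if (word.length() > 0) {
                    // Any other character, or end of input (-1), ends the word.
                    words.add(word.toString());
                    word.setLength(0);
                }
            } while (c != -1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // Build an ASCII-only test input in memory (assumed workload).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200_000; i++) {
            sb.append("alpha beta, gamma 42 delta\n");
        }
        String text = sb.toString();

        long t0 = System.nanoTime();
        List<String> a = wordsWithStreamTokenizer(new StringReader(text));
        long t1 = System.nanoTime();
        List<String> b = wordsByHand(new StringReader(text));
        long t2 = System.nanoTime();

        System.out.println("same words: " + a.equals(b));
        System.out.println("StreamTokenizer: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("hand-written:    " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

A single run like this includes JIT warm-up, so treat the numbers as a rough indication only; a dedicated harness such as JMH gives more reliable measurements.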

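StreamTokenizer is something of a hybrid class: it reads from character streams (such as BufferedReader), yet it operates in terms of bytes, treating every character whose value needs two bytes (greater than 0xff) as an alphabetic character.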
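One practical consequence is that characters above 0xff end up inside word tokens even when only a-z has been registered with wordChars. The sketch below illustrates this under the assumption that the JDK in use behaves like the OpenJDK implementation of StreamTokenizer; the class `WideCharDemo` and the method `lowercaseWords` are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original article.
class WideCharDemo {

    // Tokenizes text with the same setup as token1 and returns the word tokens.
    static List<String> lowercaseWords(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.resetSyntax();
        st.wordChars('a', 'z');
        List<String> words = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    words.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // U+00E9 is below 0x100, so it stays an ordinary character and splits the word.
        System.out.println(lowercaseWords("ab\u00e9cd"));
        // U+4E2D is above 0xff, so it is treated as alphabetic and joins the word.
        System.out.println(lowercaseWords("ab\u4e2dcd"));
        // Two characters above 0xff on their own also form a word token.
        System.out.println(lowercaseWords("ab \u4e2d\u6587 cd"));
    }
}
```

With this setup the first call yields two words, `ab` and `cd`, while the second yields a single word that contains the wide character. A hand-written scanner like token2 would skip such characters, so the two programs only agree on input limited to characters at or below 0xff.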

posted @ 2018-08-06 23:42 Borter
