NetEase Mail Collector (5)
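Deduplication: fetch the mid of the newest message and save it. On every later run, compare the mid of each fetched message with the stored mid. If they differ, the message is new, so collect and save it. If they are equal, that message and every message after it have already been collected, and the current run can stop right there.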
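The rule above can be sketched in isolation, without any mail or file handling. This is only an illustration: the class MidDedup and the method newIds are hypothetical names that do not appear in the collector, and the sketch assumes the ids arrive newest first, as the inbox listing does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the original collector.
class MidDedup {
    /**
     * Returns the ids that are newer than storedMid.
     * Assumes ids are ordered newest first.
     */
    static List<String> newIds(List<String> ids, String storedMid) {
        List<String> fresh = new ArrayList<>();
        for (String id : ids) {
            if (id.equals(storedMid)) {
                // This message and everything after it was collected earlier.
                break;
            }
            fresh.add(id);
        }
        return fresh;
    }

    public static void main(String[] args) {
        List<String> inbox = Arrays.asList("m5", "m4", "m3", "m2", "m1");
        System.out.println(newIds(inbox, "m3")); // prints [m5, m4]
        System.out.println(newIds(inbox, ""));   // nothing stored yet: prints all five ids
    }
}
```

One consequence of this scheme: if the message whose mid was stored no longer shows up in the listing (for example because it was deleted), no id ever matches and every message is treated as new.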
(1) Get the latest mid
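// Elements and Element appear to be jsoup types
// (org.jsoup.select.Elements, org.jsoup.nodes.Element).
int i = 0;
Elements links = d.select("[name=\"id\"]");
for (Element link : links) {
    MailGet mg = new MailGet(link.text());
    ls.add(mg);
    // The listing is newest first, so the first element carries the latest mid.
    if (i == 0) {
        mid = mg.getId();
    }
    i++; // without this, every iteration would overwrite mid
}
i = 0; // step (2) reuses i to count new messages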
(2) Read the mid stored in the file, compare mids, and print new messages to the console
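// Imports assumed: java.io.File, java.io.FileReader, java.io.FileWriter,
// and a JSON class providing toJSONString (fastjson's com.alibaba.fastjson.JSON is a likely candidate).
for (MailGet mg : ls) {
    try {
        // Make sure the mid file exists, then read the stored mid.
        File fa = new File(Constants.midPath, Constants.midFileName);
        if (!fa.exists()) {
            fa.createNewFile();
        }
        String str = "";
        // try-with-resources closes the reader even if read() throws.
        try (FileReader fr = new FileReader(fa)) {
            char[] a = new char[1024];
            int j;
            while ((j = fr.read(a)) > 0) {
                str += new String(a, 0, j);
            }
        }
        // System.out.println(str + mg.getId());
        if (str.equals(mg.getId())) {
            // This message and all later ones were collected in an earlier run.
            break;
        } else {
            // New message: print it, fetch its details, and save it as JSON.
            System.out.println(mg);
            mailDetail(mg);
            File fi = new File(Constants.youjianPath,
                    mg.getReceivedDate().replace(":", "").replace("-", "") + ".json");
            if (!fi.exists()) {
                fi.createNewFile();
            }
            try (FileWriter fw = new FileWriter(fi)) {
                fw.write(JSON.toJSONString(mg));
            }
            i++;
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}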
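The loop in step (2) re-reads the mid file for every message, although the stored value does not change during a run. A possible alternative is to read it once before the loop. The following is a minimal sketch under that assumption; the class StoredMid and its load method are hypothetical names, and it uses java.nio.file.Files in place of the FileReader loop.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of the original collector.
class StoredMid {
    /** Reads the stored mid, or returns "" when the file does not exist yet. */
    static String load(Path midFile) throws IOException {
        if (!Files.exists(midFile)) {
            return "";
        }
        byte[] bytes = Files.readAllBytes(midFile);
        // trim() guards against a stray newline added by an editor.
        return new String(bytes, StandardCharsets.UTF_8).trim();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mid", ".txt");
        Files.write(tmp, "abc123\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(load(tmp)); // prints abc123
        Files.delete(tmp);
    }
}
```

With a helper like this, the loop body would only need to compare mg.getId() against the value loaded once up front.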
(3) Save the mid again
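try {
    File fl = new File(Constants.midPath, Constants.midFileName);
    // Skip the write when no mid was found, so the stored value is not wiped.
    if (mid != null) {
        // FileWriter truncates the file, so the old mid is replaced by the new one.
        try (FileWriter fwm = new FileWriter(fl)) {
            fwm.write(mid);
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}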
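If the program is interrupted while the mid file is being rewritten, the file can be left empty, and the next run would then collect every message again. One way to reduce that risk, sketched here only as an assumption about what might be useful, is to write the new mid to a temporary file in the same directory and then move it over the old one. The class MidSaver and its save method are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of the original collector.
class MidSaver {
    /** Stores mid in midFile; returns false and keeps the old value when mid is null or empty. */
    static boolean save(Path midFile, String mid) throws IOException {
        if (mid == null || mid.isEmpty()) {
            return false;
        }
        Path dir = midFile.toAbsolutePath().getParent();
        // Write to a temporary file first, then replace the real file in one step.
        Path tmp = Files.createTempFile(dir, "mid", ".tmp");
        Files.write(tmp, mid.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, midFile, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("middemo");
        Path midFile = dir.resolve("mid.txt");
        System.out.println(save(midFile, "m5"));  // prints true
        System.out.println(save(midFile, null));  // prints false, file still holds m5
        System.out.println(new String(Files.readAllBytes(midFile), StandardCharsets.UTF_8)); // prints m5
    }
}
```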