January 2025 Post Archive - 雨愈

20250131

Summary: Studied the basics of JavaWeb development, focusing on Servlets and JSP.

    @WebServlet("/HotWordsServlet")
    public class HotWordsServlet extends HttpServlet {
        protected void doGet(HttpServletReque...

posted @ 2025-01-31 20:57 雨愈 Views(2) Comments(0) Recommended(0)

20250130

Summary: Practiced exporting Hive analysis results to MySQL and verified that the data is consistent.

    -- Query the data in MySQL
    SELECT * FROM hot_words_analysis LIMIT 10;
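
One way to make the consistency check concrete is to compare the rows read from both sides. The sketch below is only an illustration: `compare_exports` is a hypothetical helper, and it assumes each side has already been fetched as a list of `(word, total_frequency)` tuples; the post itself only shows the MySQL query.

```python
def compare_exports(hive_rows, mysql_rows):
    """Compare two lists of (word, total_frequency) rows and report differences."""
    hive = dict(hive_rows)
    mysql = dict(mysql_rows)
    shared = set(hive) & set(mysql)
    return {
        # Same number of rows on both sides?
        "row_count_match": len(hive_rows) == len(mysql_rows),
        # Words present in Hive but absent from MySQL, and the reverse.
        "missing_in_mysql": sorted(set(hive) - set(mysql)),
        "extra_in_mysql": sorted(set(mysql) - set(hive)),
        # Words present on both sides whose totals differ.
        "value_mismatch": sorted(w for w in shared if hive[w] != mysql[w]),
    }
```

An export can be treated as consistent when `row_count_match` is true and the three lists are empty.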

posted @ 2025-01-30 20:51 雨愈 Views(3) Comments(0) Recommended(0)

20250129

Summary: Learned how to install and configure Sqoop and how to use it to export Hive data to MySQL.

    # Export data to MySQL
    sqoop export --connect jdbc:mysql://localhost:3306/hot_words_db \
        --username root --password ro...

posted @ 2025-01-29 19:57 雨愈 Views(3) Comments(0) Recommended(0)

20250128

Summary: Practiced offline data analysis with Hive and stored the analysis results.

    -- Create the analysis table
    CREATE TABLE IF NOT EXISTS hot_words_analysis AS
    SELECT word, SUM(frequency) AS total_frequency
    FROM hot_words
    GROU...
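
The aggregation can be tried locally without a Hive installation. The sketch below uses Python's built-in `sqlite3` as a stand-in for Hive, which is a substitution and not what the post runs. The column types and the `GROUP BY word` clause are assumptions, since the excerpt is cut off before the end of the statement.

```python
import sqlite3


def build_analysis_table(rows):
    """Sum frequencies per word, mirroring the CREATE TABLE ... AS SELECT above."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE hot_words (word TEXT, frequency INTEGER)")
    conn.executemany("INSERT INTO hot_words VALUES (?, ?)", rows)
    # Assumed completion of the truncated statement: group by word.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hot_words_analysis AS "
        "SELECT word, SUM(frequency) AS total_frequency "
        "FROM hot_words GROUP BY word"
    )
    result = conn.execute(
        "SELECT word, total_frequency FROM hot_words_analysis "
        "ORDER BY total_frequency DESC, word"
    ).fetchall()
    conn.close()
    return result
```

Calling `build_analysis_table([("AI", 3), ("cloud", 2), ("AI", 4)])` returns one row per distinct word with its summed frequency.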

posted @ 2025-01-28 22:51 雨愈 Views(46) Comments(0) Recommended(0)

20250127

Summary: Learned how to install and configure Hive, and studied Hive SQL syntax.

    -- Create the database
    CREATE DATABASE IF NOT EXISTS hot_words_db;
    -- Create the table
    USE hot_words_db;
    CREATE TABLE IF NOT EXISTS hot_words (
        word...

posted @ 2025-01-27 18:56 雨愈 Views(4) Comments(0) Recommended(0)

20250126

Summary: Studied Hadoop cluster setup and the MapReduce programming model.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import...
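
The MapReduce model itself can be illustrated in a few lines of plain Python, independent of Hadoop. The sketch below runs the three phases (map, shuffle, reduce) in a single process for a word count. Word count is an assumed example, since the excerpt only shows the Java imports, and all function names here are illustrative.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real Hadoop job the map and reduce steps run on different nodes and the framework performs the shuffle; here everything runs in order in one process.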

posted @ 2025-01-26 20:50 雨愈 Views(4) Comments(0) Recommended(0)

20250125

Summary: Practiced generating a complete hot-word report containing hot words, explanations, and reference links.

    def save_full_report(hot_words, explanations, references, file_path):
        doc = Document()
        doc.add_heading("信息领域热词报告", level=1...

posted @ 2025-01-25 20:49 雨愈 Views(5) Comments(0) Recommended(0)

20250124

Summary: Learned to generate Word documents with Python's python-docx library, and studied document structure and style settings.

    from docx import Document
    def create_word_report(hot_words, explanations, file_path):
        doc = Document()
        d...

posted @ 2025-01-24 15:56 雨愈 Views(3) Comments(0) Recommended(0)

20250123

Summary: Practiced generating a hot-word word cloud and drawing a hot-word relationship graph.

    import networkx as nx
    def draw_word_relationship(words):
        G = nx.Graph()
        for word, freq in words.items():
            G.add_node(word)
            G.add_edg...

posted @ 2025-01-23 23:56 雨愈 Views(4) Comments(0) Recommended(0)

20250122

Summary: Learned to generate word clouds with Python's wordcloud library and to draw relationship graphs with networkx.

    from wordcloud import WordCloud
    import matplotlib.pyplot as plt
    def generate_word_cloud(words):
        wordclo...

posted @ 2025-01-22 20:50 雨愈 Views(3) Comments(0) Recommended(0)

20250121

Summary: Practiced generating reference links for hot words and storing them in a local file.

    def save_references(hot_words, file_path):
        references = {}
        for word in hot_words:
            references[word] = search_word_references(...

posted @ 2025-01-21 23:56 雨愈 Views(4) Comments(0) Recommended(0)

20250120

Summary: Learned to look up hot-word references through a search engine API (such as Baidu Search) and how to extract links from search results.

    def search_word_references(word):
        url = f"https://www.baidu.com/s?wd={word}"
        response = requests.get(url)
        so...
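
The link-extraction step can be sketched without a network call. The example below uses the standard library's `html.parser` in place of whatever parser the post uses, and it works on any HTML string. `LinkExtractor`, `extract_links`, and the `limit` parameter are hypothetical names for illustration, and nothing here reflects the actual markup of a Baidu results page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip missing hrefs, in-page anchors, and javascript: pseudo-links.
        if href and not href.startswith(("#", "javascript:")):
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url, limit=5):
    """Return up to `limit` distinct links in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    unique = list(dict.fromkeys(parser.links))  # de-duplicate, keep order
    return unique[:limit]
```

Passing the text of a fetched response as `html` and the request URL as `base_url` turns relative links into absolute ones.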

posted @ 2025-01-20 23:56 雨愈 Views(2) Comments(0) Recommended(0)

20250119

Summary: Practiced generating explanations for hot words and storing them in a local file.

    def save_explanations(hot_words, file_path):
        explanations = {}
        for word in hot_words:
            explanations[word] = get_word_explanation(...

posted @ 2025-01-19 20:49 雨愈 Views(3) Comments(0) Recommended(0)

20250118

Summary: Learned to call an API (such as Baidu Baike) from Python to fetch hot-word explanations, and studied API requests and responses.

    import requests
    def get_word_explanation(word):
        url = f"https://baike.baidu.com/item/{word}"
        response = requ...
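
When a hot word is placed directly into the URL path, it can help to percent-encode it first so that reserved characters such as `/` or `+` inside the term do not alter the path. The sketch below shows only that URL-building step, using the standard library. `build_baike_url` is a hypothetical helper, not part of the post's code.

```python
from urllib.parse import quote

BAIKE_ITEM_URL = "https://baike.baidu.com/item/"


def build_baike_url(word):
    """Build an item URL with the word percent-encoded as one path segment."""
    # safe="" also encodes "/", so a term like "C/C++" stays a single segment.
    return BAIKE_ITEM_URL + quote(word.strip(), safe="")
```

The returned string can be passed to `requests.get` in place of the f-string URL.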

posted @ 2025-01-18 20:49 雨愈 Views(3) Comments(0) Recommended(0)

20250117

Summary: Practiced automatically classifying hot words and storing the classification results.

    def classify_and_save(hot_words, rules, file_path):
        classified_words = {category: [] for category in rules.keys()}
        for word in ho...
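
A possible completion of this step is sketched below. The excerpt stops inside the loop, so the rest is assumed: the JSON output format, the `default` bucket for unmatched words, and the return value are illustrative choices, not the post's confirmed behavior.

```python
import json


def classify_and_save(hot_words, rules, file_path, default="other"):
    """Group words by the first matching rule and write the result as JSON."""
    classified = {category: [] for category in rules}
    classified.setdefault(default, [])  # assumed bucket for unmatched words
    for word in hot_words:
        # First category with any keyword contained in the word wins.
        category = next(
            (c for c, keywords in rules.items() if any(k in word for k in keywords)),
            default,
        )
        classified[category].append(word)
    with open(file_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Chinese text readable in the output file.
        json.dump(classified, f, ensure_ascii=False, indent=2)
    return classified
```

Writing with `encoding="utf-8"` and `ensure_ascii=False` keeps the saved file human-readable when the hot words are Chinese.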

posted @ 2025-01-17 23:56 雨愈 Views(5) Comments(0) Recommended(0)

20250116

Summary: Studied rule-based text classification and implemented a simple classifier in Python.

    def classify_word(word, rules):
        for category, keywords in rules.items():
            if any(keyword in word for keyword in keyword...
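
A runnable version of the rule-based classifier might look like the sketch below. The loop follows the excerpt; the `default` return value and the sample `RULES` dictionary are assumptions added for illustration.

```python
def classify_word(word, rules, default="other"):
    """Return the first category whose keyword list has a keyword inside `word`."""
    for category, keywords in rules.items():
        if any(keyword in word for keyword in keywords):
            return category
    return default  # assumed fallback when no rule matches


# Hypothetical rule set: category name -> substrings that indicate it.
RULES = {
    "AI": ["智能", "学习", "模型"],
    "Cloud": ["云", "容器"],
}
```

Because dictionaries preserve insertion order, a word that matches several categories is assigned to the one listed first in `rules`.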

posted @ 2025-01-16 23:55 雨愈 Views(3) Comments(0) Recommended(0)

20250115

Summary: Practiced cleaning the crawled hot-word data and storing the cleaned data in a local file.

    def save_cleaned_data(hot_words, file_path):
        cleaned_words = [clean_text(word) for word in hot_words]
        with open(file_path,...
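
The storage step could be completed along the lines below. This is a sketch under several assumptions: whitespace normalization stands in for the post's `clean_text`, empty and duplicate entries are dropped (the excerpt does not show this), and the file format is one word per line in UTF-8.

```python
def save_cleaned_data(hot_words, file_path):
    """Normalize whitespace, drop empties and duplicates, write one word per line."""
    seen = set()
    cleaned_words = []
    for word in hot_words:
        word = " ".join(word.split())  # stand-in for clean_text()
        if word and word not in seen:  # assumed: skip blanks and repeats
            seen.add(word)
            cleaned_words.append(word)
    with open(file_path, "w", encoding="utf-8") as f:
        f.write("\n".join(cleaned_words))
    return cleaned_words
```

Returning the cleaned list lets the caller reuse it without reading the file back.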

posted @ 2025-01-15 18:57 雨愈 Views(7) Comments(0) Recommended(0)

20250114

Summary: Learned the basics of regular expressions and used Python for text cleaning.

    import re
    def clean_text(text):
        text = re.sub(r'\s+', ' ', text)  # collapse extra whitespace
        text = re.sub(r'[^\w\s]', '', text)  # remove punctuation
        r...
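
A complete version of this cleaning function might look like the sketch below. The two substitutions come from the excerpt; the final `strip()` is an assumption, because the excerpt is cut off at the return statement.

```python
import re


def clean_text(text):
    """Collapse whitespace runs and remove punctuation from a string."""
    text = re.sub(r"\s+", " ", text)     # any run of whitespace becomes one space
    text = re.sub(r"[^\w\s]", "", text)  # drop everything except word chars and whitespace
    return text.strip()                  # assumed: trim leading and trailing spaces
```

In Python 3, `\w` matches Unicode word characters, so Chinese text is kept while both ASCII and full-width punctuation are removed. Because punctuation is removed after whitespace is collapsed, an input like `"a , b"` ends up with two spaces between the letters; swapping the two substitutions would avoid that.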

posted @ 2025-01-14 19:51 雨愈 Views(10) Comments(0) Recommended(0)

20250113

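Summary: Practiced crawling hot words from information-technology websites and learned how to set up scheduled tasks (cron).

    # Set up a scheduled task with the schedule library
    import schedule
    import time
    def job():
        print("Fetching hot words...")
        hot_words = fetch_hot_words...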

posted @ 2025-01-13 19:22 雨愈 Views(10) Comments(0) Recommended(0)

20250112

Summary: Learned to crawl web page data with Python's requests and BeautifulSoup libraries, and studied HTTP requests and HTML parsing.

    import requests
    from bs4 import BeautifulSoup
    def fetch_hot_words(url):
        response = requests...

posted @ 2025-01-12 18:59 雨愈 Views(16) Comments(0) Recommended(0)

20250111

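Summary: Worked out the project requirements and planned the project architecture. Set up the development environment (Python, Java, MySQL, Hadoop, etc.).

    # Install Python
    sudo apt-get update
    sudo apt-get install python3 python3-pip
    # Install Java
    sudo...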

posted @ 2025-01-11 18:25 雨愈 Views(4) Comments(0) Recommended(0)
