[Linux] Processing Data Files

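With a large amount of data, it is often hard to work through it and extract the useful parts. Linux provides a set of command-line tools for processing such data.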

1. Sorting data

Linux:/usr/local/sbin # cat file2
1
0.3
2015
100
290
10
Linux:/usr/local/sbin # sort file2
0.3
1
10
100
2015
290
Linux:/usr/local/sbin #

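As the output shows, sort did not order the values numerically. By default, the sort command treats digits as ordinary characters and performs a standard character sort. To sort the lines as numbers, add the -n option.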

Linux:/usr/local/sbin # sort -n file2
0.3
1
10
100
290
2015
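
The difference can be reproduced without a file. The following is a minimal sketch that feeds the same values through a pipe; setting LC_ALL=C is an assumption added here so that the character order does not depend on the system locale.

```shell
# Same values as file2 above, fed through a pipe instead of a file
values=$(printf '%s\n' 1 0.3 2015 100 290 10)

# Default: character-by-character comparison, so "2015" sorts before "290"
# because the second characters compare as '0' < '9'.
# LC_ALL=C pins the collation order so the result does not depend on the locale.
lexical=$(printf '%s\n' "$values" | LC_ALL=C sort)

# -n: compare the lines as numbers
numeric=$(printf '%s\n' "$values" | LC_ALL=C sort -n)

echo "character sort:"
echo "$lexical"
echo "numeric sort:"
echo "$numeric"
```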

1.1 sort command options

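-b Ignore leading blanks when sorting

-C Do not sort; like -c, but do not report the first out-of-order line (only the exit status shows the result)

-c Do not sort; check whether the input is already sorted and report the first out-of-order line if it is not

-d Consider only blanks and alphanumeric characters; ignore special characters

-f Ignore case; without this option uppercase letters may sort first (for example in the C locale)

-g Sort by general numeric value (floating point, including scientific notation)

-i Ignore nonprinting characters when sorting

-k Sort on a key that starts at POS1 and, if POS2 is given, ends at POS2

-M Sort by month, using three-character month names

-m Merge data files that are already sorted

-n Sort by string numeric value

-o Write the sorted result to the specified file

-R Sort by a random hash of the keys

-r Reverse the sort order

-S Specify the size of the memory buffer to use

-s Stabilize the sort by disabling the last-resort comparison

-T Specify a directory for temporary files

-t Specify the character that separates fields when locating keys

-u With -c, check for strict ordering; without -c, output only the first line of each run of equal lines

-z Use the NUL character as the line terminator instead of a newline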
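
Several of these options are usually combined. The following is a minimal sketch, assuming GNU sort and a hypothetical colon-separated file named scores.txt (the file name and its contents are made up for illustration and are not part of the original article). It sorts on the second field, removes lines with duplicate keys, and checks whether the file is already in order.

```shell
# Create a hypothetical sample file (name and contents are assumptions)
tmpdir=$(mktemp -d)
cat > "$tmpdir/scores.txt" <<'EOF'
alice:90
bob:7
carol:100
dave:7
EOF

# -t ':' sets the field separator, -k 2,2 restricts the key to field 2,
# -n compares that key numerically, -r reverses the order
by_score=$(sort -t ':' -k 2,2 -n -r "$tmpdir/scores.txt")
echo "$by_score"

# -u keeps only the first line of each run whose keys compare equal
unique_scores=$(sort -t ':' -k 2,2 -n -u "$tmpdir/scores.txt")
echo "$unique_scores"

# -c sorts nothing; it exits non-zero if the input is not already in order
if sort -c -t ':' -k 2,2 -n "$tmpdir/scores.txt" 2>/dev/null; then
    sorted_status="sorted"
else
    sorted_status="unsorted"
fi
echo "$sorted_status"

rm -rf "$tmpdir"
```

Writing the key as -k 2,2 rather than -k 2 keeps the key from extending to the end of the line, which matters once a file has more than two fields.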

1.2 Example

Linux:/usr/local/sbin # du -sh * | sort -nr
4.0K    third.sh
4.0K    test2.sh
4.0K    test1.sh
4.0K    sum.sh
4.0K    second.sh
4.0K    param_v.sh
4.0K    out1.txt
4.0K    out.txt
4.0K    input_param_sum.sh
4.0K    first.sh
4.0K    file2
4.0K    file1
0    test_two
0    test_one
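
Note that -n reads only the leading number and ignores unit suffixes such as K, M, or G. The listing above looks ordered only because every entry is either 4.0K or 0. A minimal sketch of the difference, assuming GNU sort (which provides -h for human-readable sizes); the size values below are made up for illustration:

```shell
# Hypothetical sizes in the format printed by du -sh (values are assumptions)
sizes=$(printf '%s\n' '4.0K' '1.5M' '12K' '2.0G' '0')

# -n compares only the leading number and ignores the unit suffix
numeric_order=$(printf '%s\n' "$sizes" | sort -nr)

# -h (GNU sort) takes the K/M/G suffix into account
human_order=$(printf '%s\n' "$sizes" | sort -hr)

echo "with -nr:"
echo "$numeric_order"
echo "with -hr:"
echo "$human_order"
```

On real data the equivalent command would be du -sh * | sort -hr.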

2. Searching data

2.1 Searching with grep

Linux:/usr/local/sbin # cat file1
one
two
three
four
five
six
Linux:/usr/local/sbin # grep three file1
three

To invert the search, add the -v option (it prints the lines that do not match the pattern)

Linux:/usr/local/sbin # grep -v three file1
one
two
four
five
six

Use the -n option to show the line number of each matching line

Linux:/usr/local/sbin # grep -n three file1
3:three

Use the -c option to show only the number of matching lines

Linux:/usr/local/sbin # grep -c three file1
1

Use -e to specify multiple patterns

Linux:/usr/local/sbin # grep -e three -e two file1
two
three
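
These options can also be combined. A minimal sketch that recreates file1 in a temporary directory (the temporary path is an assumption for illustration), counts the lines that do not match, and prints line numbers for lines matching either of two patterns:

```shell
# Recreate the sample file from this section in a temporary directory
tmpdir=$(mktemp -d)
printf '%s\n' one two three four five six > "$tmpdir/file1"

# -v and -c together: count the lines that do NOT match the pattern
non_matching=$(grep -vc three "$tmpdir/file1")
echo "$non_matching"

# -n with several -e patterns: line numbers for lines matching either pattern
numbered=$(grep -n -e three -e two "$tmpdir/file1")
echo "$numbered"

rm -rf "$tmpdir"
```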

posted @ 2018-12-18 16:20 OLIVER_QIN
