Examples of sed capture groups (back-references) on Linux

 

1. Direct test

root@PC1:/home/test# ls
a.fna
root@PC1:/home/test# cat a.fna
>NC_019458.2 Ovis aries breed Texel chromosome 1, Oar_v4.0, [whole genome shotgun sequence] ## goal: use sed to keep only the accession ID and the bracketed description (highlighted in red in the original post)
gataaaaaataaatagaaacaaaatcactgaagaaCCAGTGTGCCTGCTCAGGTCAGATGAAGCCAGAGGGCTGCCAGAG
GGCAAGCGAGCTGCGTTGCCTGGAAAAAGTTAAACACACAGAGAGCATGGTGGCTCTGATACTTTCTAGAAGGATTAAAG
TCACTTTCCCAGTCTTTATGAGAATTGGGCCGAAGCTTAGCTGGTGCAACGAATTTAGAAATGAATGCACTTGCATTTGA
>NC_019458.2 Ovis aries breed Texel chromosome 1, Oar_v4.0, [whole genome shotgun sequence]
AGATGATGTGTCTTTGCCTTGAgctaaaaattttagaataatctgaACGTCATCTGAGGAACCTGCTTCTGGCGTGGTTT
TGGTGTCAGCATCTTCTCACCCTCTCTAGTAATTTTCAGTATGCATTTCTATTTTCGTGTAGTTATTTACAGGAGCATTT
TATGGAAAACCGGCTCAAATCTTTTTGGGTGCAGGGGTAGTTCAAATGCACTGAGACCCTCAGTTTCACTTGCTAATCTC
>NC_019458.2 Ovis aries breed Texel chromosome 1, Oar_v4.0, [whole genome shotgun sequence]
CTCCAGAAACCCTGTTCTCCTCGAGTGACAAGGTCAGCAGGGCAGCACGTGTGTTCCTGTCACTGCCAACTCAAGAATAT
GAAGTTTAAAGAGTTTCACCATCAAATGCAGTGTCGTGGACTGCCCCTGAACAGGTGTTTATAATCACGTGTGCAAGTGA
AGCAAGCACAAATCCTCAGTGGAAAACGGGCAGAGGACACGAGCagacaattctttttaaaaactgcacaaATTAGCACA
>NC_019458.2 Ovis aries breed Texel chromosome 1, Oar_v4.0, [whole genome shotgun sequence]
CTAGGCACGGATGAGCGTGCCTACCGTGTTGCATGGAGGTAACAGATGCCAGAGCCCGGAGGAGGCGCAAAGCTCACAAA
CAGATGCGGACCGCAGGAAGCCGGGACGGCCTTCCTCCCCTGAAGCAGGAGGACGCGCCCTACAGAAAGCCGCTCGATCC
TCCAGGCATTTGTTGTGAGCACTTAATCATCATTCGATCATTTGACGTGTACTCACTAGTAAAAGGCAGGACTGTGTCCC
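root@PC1:/home/test# sed 's/\(>[^[:space:]]\+\)\s.*\(\[.*\]\)/\1 \2/' a.fna  ## regular expression with two groups: the text matched by the first \( \) is saved in \1, the second in \2 (\+ and \s are GNU sed extensions)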
>NC_019458.2 [whole genome shotgun sequence]
gataaaaaataaatagaaacaaaatcactgaagaaCCAGTGTGCCTGCTCAGGTCAGATGAAGCCAGAGGGCTGCCAGAG
GGCAAGCGAGCTGCGTTGCCTGGAAAAAGTTAAACACACAGAGAGCATGGTGGCTCTGATACTTTCTAGAAGGATTAAAG
TCACTTTCCCAGTCTTTATGAGAATTGGGCCGAAGCTTAGCTGGTGCAACGAATTTAGAAATGAATGCACTTGCATTTGA
>NC_019458.2 [whole genome shotgun sequence]
AGATGATGTGTCTTTGCCTTGAgctaaaaattttagaataatctgaACGTCATCTGAGGAACCTGCTTCTGGCGTGGTTT
TGGTGTCAGCATCTTCTCACCCTCTCTAGTAATTTTCAGTATGCATTTCTATTTTCGTGTAGTTATTTACAGGAGCATTT
TATGGAAAACCGGCTCAAATCTTTTTGGGTGCAGGGGTAGTTCAAATGCACTGAGACCCTCAGTTTCACTTGCTAATCTC
>NC_019458.2 [whole genome shotgun sequence]
CTCCAGAAACCCTGTTCTCCTCGAGTGACAAGGTCAGCAGGGCAGCACGTGTGTTCCTGTCACTGCCAACTCAAGAATAT
GAAGTTTAAAGAGTTTCACCATCAAATGCAGTGTCGTGGACTGCCCCTGAACAGGTGTTTATAATCACGTGTGCAAGTGA
AGCAAGCACAAATCCTCAGTGGAAAACGGGCAGAGGACACGAGCagacaattctttttaaaaactgcacaaATTAGCACA
>NC_019458.2 [whole genome shotgun sequence]
CTAGGCACGGATGAGCGTGCCTACCGTGTTGCATGGAGGTAACAGATGCCAGAGCCCGGAGGAGGCGCAAAGCTCACAAA
CAGATGCGGACCGCAGGAAGCCGGGACGGCCTTCCTCCCCTGAAGCAGGAGGACGCGCCCTACAGAAAGCCGCTCGATCC
TCCAGGCATTTGTTGTGAGCACTTAATCATCATTCGATCATTTGACGTGTACTCACTAGTAAAAGGCAGGACTGTGTCCC
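The same header cleanup can also be written with extended regular expressions, where the groups need no backslashes. This is only a minimal sketch: the function name clean_fasta_headers is made up for illustration, and it assumes a sed that accepts -E (GNU sed and BSD sed both do).

```shell
# Hypothetical helper: keep only the accession ID and the bracketed note
# of each FASTA header line; sequence lines do not match and pass through.
# -E enables extended regular expressions, so ( ) and + are written bare.
clean_fasta_headers() {
    sed -E 's/^(>[^[:space:]]+)[[:space:]].*(\[.*\])$/\1 \2/' "$@"
}

printf '%s\n' \
    '>NC_019458.2 Ovis aries breed Texel chromosome 1, Oar_v4.0, [whole genome shotgun sequence]' \
    'gataaaaaataaatagaaac' | clean_fasta_headers
# prints:
# >NC_019458.2 [whole genome shotgun sequence]
# gataaaaaataaatagaaac
```

Called with a file name, for example clean_fasta_headers a.fna, it should give the same result as the command above.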

 

2. A simple example

root@PC1:/home/test# echo "hello world" | sed 's/\(hello\).*/world \1/'  ## hello is stored in \1; the pattern matches the whole string "hello world", so it is replaced with "world hello"
world hello
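In the command above only hello is captured, and world is typed literally in the replacement. A possible generalisation is to capture both words and swap them. The sketch below assumes the input is exactly two whitespace-separated words; swap_words is a hypothetical name.

```shell
# Hypothetical helper: swap the two words on a line.
# \1 holds the first word and \2 the second; the replacement emits them reversed.
swap_words() {
    sed -E 's/^([^[:space:]]+)[[:space:]]+([^[:space:]]+)$/\2 \1/'
}

echo "hello world" | swap_words
# prints: world hello
```

Lines that do not consist of exactly two words are left unchanged, because the anchored pattern does not match them.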

 

3. Example 3

root@PC1:/home/test# cat mysed.txt    ## goal: append 2008 after the first and the fourth Beijing
Beijing Beijing Beijing Beijing
London London London London
root@PC1:/home/test# sed 's/\(^Beijing\)\(.*\)\(Beijing$\)/\1 2008\2\3 2008/' mysed.txt 
Beijing 2008 Beijing Beijing Beijing 2008
London London London London
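To see why the replacement \1 2008\2\3 2008 gives the right spacing, it can help to print what each group captured. The sketch below uses the same three groups in extended syntax; show_groups is a hypothetical name, and the square brackets in the output are only markers.

```shell
# Hypothetical helper: show the contents of the three capture groups.
# \2 is the greedy .* in the middle, so it keeps the spaces on both sides.
show_groups() {
    sed -E 's/^(Beijing)(.*)(Beijing)$/1=[\1] 2=[\2] 3=[\3]/'
}

echo "Beijing Beijing Beijing Beijing" | show_groups
# prints: 1=[Beijing] 2=[ Beijing Beijing ] 3=[Beijing]
```

Since \2 already begins and ends with a space, the replacement only needs a space before each 2008.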

 

Reference: http://c.biancheng.net/linux/sed.html

 
