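Boost Library: String Processing (Part 3)

1. Case Conversion

The Boost string algorithms library, Boost.StringAlgorithms, provides many functions for manipulating strings. The strings can be of type std::string, std::wstring, or any other instantiation of the class template std::basic_string. The functions are grouped into separate header files; the case conversion functions, for example, are defined in boost/algorithm/string/case_conv.hpp. For convenience, the header boost/algorithm/string.hpp includes all of the others, and the string algorithm examples below all use it.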

#include <iostream>
#include <boost/algorithm/string.hpp>

int main() {
    std::string s = "Hello World";
    std::cout << boost::algorithm::to_upper_copy(s) << std::endl;
    std::cout << boost::algorithm::to_lower_copy(s) << std::endl;

    return 0;
}

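boost::algorithm::to_upper_copy returns a copy of the string converted to upper case, and boost::algorithm::to_lower_copy returns a copy converted to lower case. The original string is left unchanged.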

Output:

HELLO WORLD
hello world

2. Erasing from Strings

#include <iostream>
#include <boost/algorithm/string.hpp>

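int main() {
    std::string s = "Hello World";
    /* Erase the first occurrence of "l" */
    std::cout << "Erase first match: " << boost::algorithm::erase_first_copy(s, "l") << std::endl;
    /* Erase the n-th occurrence of "e" (the index is zero-based, so 0 means the first) */
    std::cout << "Erase n-th match: " << boost::algorithm::erase_nth_copy(s, "e", 0) << std::endl;
    /* Erase the last occurrence of "l" */
    std::cout << "Erase last match: " << boost::algorithm::erase_last_copy(s, "l") << std::endl;
    /* Erase every occurrence of "l" */
    std::cout << "Erase all matches: " << boost::algorithm::erase_all_copy(s, "l") << std::endl;
    /* Erase the first 2 characters */
    std::cout << "Erase 2 characters from the head: " << boost::algorithm::erase_head_copy(s, 2) << std::endl;
    /* Erase the last 2 characters */
    std::cout << "Erase 2 characters from the tail: " << boost::algorithm::erase_tail_copy(s, 2) << std::endl;

    return 0;
}

Output:

Erase first match: Helo World
Erase n-th match: Hllo World
Erase last match: Hello Word
Erase all matches: Heo Word
Erase 2 characters from the head: llo World
Erase 2 characters from the tail: Hello Wor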

3. Searching in Strings

#include <boost/algorithm/string.hpp>
#include <iostream>

int main() {
    std::string input = "abc123def456";
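    /* Find the first occurrence of the substring; returns a range over the match, or an empty range if there is none */
    std::cout << boost::algorithm::find_first(input, "abc")  << std::endl;
    /* Find the last occurrence of the substring; returns a range over the match, or an empty range if there is none */
    std::cout << boost::algorithm::find_last(input, "abc") << std::endl;
    /* Find the n-th occurrence of the substring (zero-based index); returns an empty range if there is none */
    std::cout << boost::algorithm::find_nth(input, "abc", 0) << std::endl;
    /* Return a range over the first 2 characters */
    std::cout << boost::algorithm::find_head(input, 2) << std::endl;
    /* Return a range over the last 3 characters */
    std::cout << boost::algorithm::find_tail(input, 3) << std::endl;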
    
    return 0;
}

Output:

abc
abc
abc
ab
456

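4. Joining Strings

boost::algorithm::join concatenates the elements of a container into a single string, inserting a separator between adjacent elements.

Parameters:

range: the container or range whose elements are joined, for example a std::vector, std::list, or std::set of strings.
separator: the separator placed between elements.

Below is an example that uses boost::algorithm::join: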

#include <iostream>
#include <vector>
#include <string>
#include <boost/algorithm/string.hpp>

int main() {
    std::vector<std::string> words = { "Hello", "world", "Boost", "C++", "Library" };
    std::string result = boost::algorithm::join(words, ", ");

    std::cout << result << std::endl;

    return 0;
}

Output:

Hello, world, Boost, C++, Library

5. Replacing in Strings

#include <boost/algorithm/string.hpp>
#include <iostream> 

int main() {
    std::string input = "ABrBs Schaling";
    /* Replace the first occurrence of "B" with "D" */
    std::cout << boost::algorithm::replace_first_copy(input, "B", "D") << std::endl;
    /* Replace the n-th occurrence of "A" with "D" (zero-based index, so 0 means the first) */
    std::cout << boost::algorithm::replace_nth_copy(input, "A", 0, "D") << std::endl;
    /* Replace the last occurrence of "S" with "D" */
    std::cout << boost::algorithm::replace_last_copy(input, "S", "D") << std::endl;
    /* Replace every occurrence of "B" with "D" */
    std::cout << boost::algorithm::replace_all_copy(input, "B", "D") << std::endl;
    /* Replace the first 5 characters of input with "Dorie" */
    std::cout << boost::algorithm::replace_head_copy(input, 5, "Dorie") << std::endl;
    /* Replace the last 8 characters of input with "Becker" */
    std::cout << boost::algorithm::replace_tail_copy(input, 8, "Becker") << std::endl;

    return 0;
}

Output:

ADrBs Schaling
DBrBs Schaling
ABrBs Dchaling
ADrDs Schaling
Dorie Schaling
ABrBs Becker

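6. Trimming Strings

The trimming functions boost::algorithm::trim_left_copy(), boost::algorithm::trim_right_copy(), and boost::algorithm::trim_copy() remove whitespace from the beginning of a string, from its end, or from both. Which characters count as whitespace depends on the global locale.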

#include <boost/algorithm/string.hpp>
#include <iostream> 

int main() {
    std::string s = "  Boris Schaling \t";
    std::cout << "." << boost::algorithm::trim_left_copy(s) << "." << std::endl;
    std::cout << "." << boost::algorithm::trim_right_copy(s) << "." << std::endl;
    std::cout << "." << boost::algorithm::trim_copy(s) << "." << std::endl;

    return 0;
}

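Output (\t stands for the tab character that trim_left_copy leaves at the end):

.Boris Schaling \t.
.  Boris Schaling.
.Boris Schaling.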

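Many functions in Boost.StringAlgorithms also come in a variant that takes an additional predicate, which decides which characters the function acts on. The predicate versions of the trimming functions are named boost::algorithm::trim_left_copy_if(), boost::algorithm::trim_right_copy_if(), and boost::algorithm::trim_copy_if().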

#include <boost/algorithm/string.hpp>
#include <iostream> 

int main() {
    std::string input = "--Boris Schaling--";
    std::cout << "." << boost::algorithm::trim_left_copy_if(input, boost::algorithm::is_any_of("-")) << "." << std::endl;
    std::cout << "." << boost::algorithm::trim_right_copy_if(input, boost::algorithm::is_any_of("-")) << "." << std::endl;
    std::cout << "." << boost::algorithm::trim_copy_if(input, boost::algorithm::is_any_of("-")) << "." << std::endl;

    return 0;
}

Output:

.Boris Schaling--.
.--Boris Schaling.
.Boris Schaling.

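Boost.StringAlgorithms also provides a number of helper functions that return commonly used predicates. The predicate returned by boost::algorithm::is_digit() evaluates to true when a character is a digit. The helpers that test for upper-case and lower-case characters are boost::algorithm::is_upper() and boost::algorithm::is_lower(). All of these functions use the global locale by default unless a different locale is passed as an argument.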

#include <boost/algorithm/string.hpp> 
#include <iostream> 

int main() {
    std::string input = "123456789Boris Schaling123456789";
    std::cout <<  boost::algorithm::trim_left_copy_if(input, boost::algorithm::is_digit()) <<  std::endl;
    std::cout <<  boost::algorithm::trim_right_copy_if(input, boost::algorithm::is_digit())  << std::endl;
    std::cout <<  boost::algorithm::trim_copy_if(input, boost::algorithm::is_digit()) <<  std::endl;

    return 0;
}

Output:

Boris Schaling123456789
123456789Boris Schaling
Boris Schaling

7. String Predicates and Comparison

#include <boost/algorithm/string.hpp> 
#include <iostream> 

int main() {
    std::string input = "Boris Schaling";
    /* Check whether the string starts with the given prefix */
    std::cout << boost::algorithm::starts_with(input, "Boris") << std::endl;
    /* Check whether the string ends with the given suffix */
    std::cout << boost::algorithm::ends_with(input, "Schaling") << std::endl;
    /* Check whether the string contains the given substring */
    std::cout << boost::algorithm::contains(input, "is") << std::endl;
    
    return 0;
}

Output:

1
1
1

In addition, boost::algorithm::lexicographical_compare compares two ranges lexicographically.

It is declared as follows:

template<typename Range1T, typename Range2T>
bool lexicographical_compare(const Range1T& range1, const Range2T& range2);

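Parameters:

range1: the first range, which can be a container (such as std::vector or std::list) or a pair of iterators wrapped as a range.
range2: the second range, which can likewise be a container or a pair of iterators wrapped as a range.

The function returns a bool: true means range1 is lexicographically less than range2, and false means range1 is greater than or equal to range2.

The ranges are compared element by element, and the first position at which they differ decides the result: the function returns true if the element of range1 at that position is the smaller one. If no position differs, the shorter range is considered less, and two identical ranges yield false.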

To use this function, include the header <boost/algorithm/string.hpp>.

Below is an example that uses boost::algorithm::lexicographical_compare:

#include <iostream>
#include <vector>
#include <boost/algorithm/string.hpp>

int main() {
    std::vector<int> range1 = { 1, 2, 3 };
    std::vector<int> range2 = { 1, 2, 4 };

    bool result = boost::algorithm::lexicographical_compare(range1, range2);

    if (result) {
        std::cout << "range1 is less than range2" << std::endl;
    }
    else {
        std::cout << "range1 is greater than or equal to range2" << std::endl;
    }

    return 0;
}

Output:

range1 is less than range2

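8. Splitting Strings

Given a delimiter, boost::algorithm::split() splits a string into a container of strings. Its third parameter is a predicate that is tested against each character to decide where the string is split. The example below uses the helper function boost::algorithm::is_space() to create a predicate that splits the string at every whitespace character.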

#include <boost/algorithm/string.hpp> 
#include <iostream> 
#include <vector> 

int main() {
    std::string s = "Boris Schaling";
    std::vector<std::string> v;
    boost::algorithm::split(v, s, boost::algorithm::is_space());
    std::cout << v.size() << std::endl;
    for (auto it = v.begin(); it != v.end(); ++it) {
        std::cout << *it << std::endl;
    }
}

Output:

2
Boris
Schaling

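9. The Tokenizer Library Boost.Tokenizer

boost::tokenizer is a class template that breaks a string into tokens. It splits the string according to the given separator and exposes the resulting substrings through iterators.

boost::tokenizer is defined in the header <boost/tokenizer.hpp> and takes the separator type as a template parameter. A commonly used separator type is boost::char_separator.

boost::char_separator is a class template that specifies character delimiters. It takes a set of characters (as a string) and defines how those delimiters are treated.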

#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>

int main() {
    std::string str = "Hello, world! Boost C++ Library";
    boost::char_separator<char> sep(" ,!");
    boost::tokenizer<boost::char_separator<char>> tokens(str, sep);

    for (const auto& token : tokens) {
        std::cout << token << std::endl;
    }

    return 0;
}

Output:

Hello
world
Boost
C++
Library

Besides boost::char_separator, Boost.Tokenizer provides two more separator classes that recognize tokens in other ways.

#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>

int main()
{
    typedef boost::tokenizer<boost::escaped_list_separator<char> > tokenizer;
    std::string s = "Boost,\"C++ libraries\"";
    tokenizer tok(s);
    for (tokenizer::iterator it = tok.begin(); it != tok.end(); ++it) {
        std::cout << *it << std::endl;
    }     
}

The boost::escaped_list_separator class reads multiple values separated by commas. It also handles double quotes and escape sequences.

Output:

Boost
C++ libraries

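The other one is the boost::offset_separator class, which is best explained with an example. An object of this class must be passed as the second argument to the constructor of boost::tokenizer.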

#include <boost/tokenizer.hpp> 
#include <string> 
#include <iostream> 

int main()
{
    typedef boost::tokenizer<boost::offset_separator> tokenizer;
    std::string s = "Boost C++ libraries";
    int offsets[] = { 5, 5, 9 };
    boost::offset_separator sep(offsets, offsets + 3);
    tokenizer tok(s, sep);
    for (tokenizer::iterator it = tok.begin(); it != tok.end(); ++it) {
        std::cout << *it << std::endl;
    }      
}

Output (the second token keeps the space on either side of it):

Boost
 C++ 
libraries

Posted 2023-06-14 16:54 by TechNomad
