Google Protocol Buffers: Commonly Used Serialization and Deserialization Functions

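　　Protocol buffers (protobuf) are a lightweight, efficient format for storing structured data, used to serialize that data. Compared with XML and JSON, the encoded data is more compact, and the schema language is easy to read. Protobuf is a language-neutral, platform-neutral, extensible way of serializing structured data, well suited to communication protocols, data storage, and RPC data exchange, and it is often used together with Google's gRPC framework. Official APIs are available for C++, Java, and Python, as well as for a number of other languages such as Go and C#.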

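　　There is plenty of introductory material on protobuf online; for the basics, see the DeveloperWorks article and the Chinese translation of the official documentation.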

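The most commonly used construct is the message. For example, an order can be described with the following message (definitions like this are written in a .proto file; the example uses proto2 syntax):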

message Order
{
    required uint64 uid = 1;
    required float cost = 2;
    optional string tag = 3;
}

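When the protobuf compiler turns this into C++ code, it generates a matching XXX.pb.h and XXX.pb.cc. Each message becomes a class that holds the corresponding data members, the functions that operate on them, and the functions that serialize and parse the message.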

There are two things to know about data stored in message format:

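1. The tag (field number) assigned at the end of each field: the tag identifies the field in the serialized binary data. Every tag must be unique within its message, and it must not be changed once the message is in use, otherwise existing data can no longer be parsed correctly.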

2. The modifier in front of the data type:

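required: a value must be supplied; otherwise the message is considered "uninitialized". In Java, building an uninitialized message throws a RuntimeException and parsing one throws an IOException. In C++, serializing an uninitialized message triggers an assertion failure when libprotobuf is built in debug mode, and parsing one fails by returning false. Apart from this, a required field behaves exactly like an optional field. Note that required exists only in proto2; it was removed in proto3.
optional: the field may or may not be set. If it is not set, reading it returns the default value.
repeated: the field may be repeated any number of times, including zero. The order of the repeated values is preserved in the protocol buffer; think of the field as an array that resizes itself.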
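
To make the role of the tag concrete, here is a minimal sketch of how a field number ends up in the binary data, based on the published protobuf encoding rules: each value is preceded by a key, (field number << 3) | wire type, written as a base-128 varint. The helper names (AppendVarint, AppendKey, EncodeUid) are made up for this illustration and are not part of the protobuf library.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Wire types needed for the Order message (values from the encoding spec).
enum WireType : std::uint32_t { kVarint = 0, kLengthDelimited = 2, kFixed32 = 5 };

// Append `value` as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last one.
void AppendVarint(std::uint64_t value, std::vector<std::uint8_t>* out) {
    while (value >= 0x80) {
        out->push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out->push_back(static_cast<std::uint8_t>(value));
}

// The key written in front of every field value: (field number << 3) | wire type.
void AppendKey(std::uint32_t field_number, WireType type,
               std::vector<std::uint8_t>* out) {
    AppendVarint((static_cast<std::uint64_t>(field_number) << 3) | type, out);
}

// Hypothetical helper: encodes only the uid field (field number 1, varint).
std::vector<std::uint8_t> EncodeUid(std::uint64_t uid) {
    std::vector<std::uint8_t> bytes;
    AppendKey(1, kVarint, &bytes);
    AppendVarint(uid, &bytes);
    return bytes;
}
```

With this sketch, EncodeUid(150) yields the three bytes 08 96 01. Because the field number is baked into the key byte, changing a tag in the .proto file changes what the parser looks for, which is why old data stops decoding correctly.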

For a message, the C++ code that protobuf generates automatically contains the following:

//xxx.proto
message Order
{
    required uint64 uid = 1;
    required float cost = 2;
    optional string tag = 3;
}
 
//xxx.pb.h
class Order : public ::google::protobuf::Message {
 public:
  ...
  // accessors -------------------------------------------------------
 
  // required uint64 uid = 1;
  inline bool has_uid() const;
  inline void clear_uid();
  static const int kUidFieldNumber = 1;
  inline ::google::protobuf::uint64 uid() const;
  inline void set_uid(::google::protobuf::uint64 value);
 
  // required float cost = 2;
  inline bool has_cost() const;
  inline void clear_cost();
  static const int kCostFieldNumber = 2;
  inline float cost() const;
  inline void set_cost(float value);
 
  // optional string tag = 3;
  inline bool has_tag() const;
  inline void clear_tag();
  static const int kTagFieldNumber = 3;
  inline const ::std::string& tag() const;
  inline void set_tag(const ::std::string& value);
  inline void set_tag(const char* value);
  inline void set_tag(const char* value, size_t size);
  inline ::std::string* mutable_tag();
  inline ::std::string* release_tag();
  inline void set_allocated_tag(::std::string* tag);
 
  // @@protoc_insertion_point(class_scope:Order)
 private:
  inline void set_has_uid();
  inline void clear_has_uid();
  inline void set_has_cost();
  inline void clear_has_cost();
  inline void set_has_tag();
  inline void clear_has_tag();
 
  ::google::protobuf::uint32 _has_bits_[1];
 
  ::google::protobuf::uint64 uid_;
  ::std::string* tag_;
  float cost_;
};

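For every data member of a message, protobuf automatically generates the related accessor functions. The main ones for each field are has_uid(), clear_uid(), uid() and set_uid(), which respectively test whether the field has been set, reset the field and clear its "set" flag, read the field, and set the field. For the uid field in the example, these functions are implemented as follows: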

//xxx.pb.h
 
// required uint64 uid = 1;
inline bool Order::has_uid() const {
  return (_has_bits_[0] & 0x00000001u) != 0;
}
inline void Order::set_has_uid() {
  _has_bits_[0] |= 0x00000001u;
}
inline void Order::clear_has_uid() {
  _has_bits_[0] &= ~0x00000001u;
}
inline void Order::clear_uid() {
  uid_ = GOOGLE_ULONGLONG(0);
  clear_has_uid();
}
inline ::google::protobuf::uint64 Order::uid() const {
  // @@protoc_insertion_point(field_get:Order.uid)
  return uid_;
}
inline void Order::set_uid(::google::protobuf::uint64 value) {
  set_has_uid();
  uid_ = value;
  // @@protoc_insertion_point(field_set:Order.uid)
}
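
The presence tracking above can be tried out without protoc. The following is a hand-written stand-in, not generated code: OrderSketch, all_required_set() and the bit assignments (uid = bit 0, cost = bit 1) are assumptions made for illustration. It shows how one bitmask records which fields were set and how a "both required fields are set" check, in the spirit of IsInitialized(), reduces to a mask comparison.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit layout: one bit per field, in declaration order.
constexpr std::uint32_t kUidBit = 0x1u;
constexpr std::uint32_t kCostBit = 0x2u;
constexpr std::uint32_t kRequiredMask = kUidBit | kCostBit;

// Hand-written stand-in for the generated Order class (uid and cost only).
class OrderSketch {
 public:
    bool has_uid() const { return (has_bits_ & kUidBit) != 0; }
    std::uint64_t uid() const { return uid_; }
    void set_uid(std::uint64_t value) { has_bits_ |= kUidBit; uid_ = value; }
    void clear_uid() { uid_ = 0; has_bits_ &= ~kUidBit; }

    bool has_cost() const { return (has_bits_ & kCostBit) != 0; }
    float cost() const { return cost_; }
    void set_cost(float value) { has_bits_ |= kCostBit; cost_ = value; }
    void clear_cost() { cost_ = 0.0f; has_bits_ &= ~kCostBit; }

    // True only when every required field has been set explicitly.
    bool all_required_set() const {
        return (has_bits_ & kRequiredMask) == kRequiredMask;
    }

 private:
    std::uint32_t has_bits_ = 0;
    std::uint64_t uid_ = 0;
    float cost_ = 0.0f;
};
```

One consequence worth noticing: calling set_uid(0) still marks uid as set, even though 0 is also the default value. The flag records that the setter was called, not that the value differs from the default.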

Now for the most basic, core part of protobuf: the functions that serialize and deserialize structured data.

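　　After the basics above, we have in effect defined our own protocol in a .proto file and compiled it with the protoc compiler, so we can start using the serialization and deserialization APIs that protobuf provides.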

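// Serialization to and deserialization from a C array
bool ParseFromArray(const void* data, int size); // deserialize
bool SerializeToArray(void* data, int size) const; // serialize

// Assumed to be declared elsewhere as globals: wp and rap (objects of the
// generated message class), parray (a char[256] buffer) and psize (an int).
// Serialize
void set_people()             
{
    wp.set_name("sealyao");   
    wp.set_id(123456);        
    wp.set_email("sealyaog@gmail.com");
    psize = wp.ByteSize();               // encoded size in bytes (newer releases use ByteSizeLong())
    wp.SerializeToArray(parray, psize);  // serialize wp into parray; psize must not exceed 256
}
// Deserialize
void get_people()             
{
    // Pass the encoded size, not the buffer capacity: the bytes after psize
    // are not part of the message.
    if (!rap.ParseFromArray(parray, psize)) {
        cerr << "ParseFromArray failed" << endl;
        return;
    }
    cout << "Get People from Array:" << endl;
    cout << "\t Name : " << rap.name() << endl;
    cout << "\t Id : " << rap.id() << endl;
    cout << "\t email : " << rap.email() << endl;
}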
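
The size argument matters because the encoded bytes do not mark their own end. The toy reader below is a sketch written for this article, NOT protobuf's actual parser: it understands only varint fields, and the names ReadVarint and ParseVarintFields are made up. It shows why handing the reader the full buffer capacity instead of the encoded length goes wrong: the unused zero bytes get interpreted as a key with field number 0, which is never valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read one base-128 varint starting at *pos; advances *pos past it.
// Returns false if the buffer ends first or the varint is too long.
bool ReadVarint(const std::uint8_t* data, std::size_t size,
                std::size_t* pos, std::uint64_t* value) {
    std::uint64_t result = 0;
    for (int shift = 0; shift < 64 && *pos < size; shift += 7) {
        std::uint8_t byte = data[(*pos)++];
        result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            *value = result;
            return true;
        }
    }
    return false;
}

// Returns true only if exactly `size` bytes form a sequence of varint fields.
bool ParseVarintFields(const std::uint8_t* data, std::size_t size) {
    std::size_t pos = 0;
    while (pos < size) {
        std::uint64_t key = 0;
        std::uint64_t value = 0;
        if (!ReadVarint(data, size, &pos, &key)) return false;
        if ((key >> 3) == 0) return false;   // field number 0 is never valid
        if ((key & 0x7) != 0) return false;  // this toy handles wire type 0 only
        if (!ReadVarint(data, size, &pos, &value)) return false;
    }
    return true;
}
```

Given a 256-byte zero-filled buffer whose first three bytes are 08 96 01 (field 1 = 150), ParseVarintFields(buf, 3) succeeds, while ParseVarintFields(buf, 256) fails on the padding and ParseVarintFields(buf, 2) fails because the value is cut short.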

Besides serializing to and from C arrays, protobuf provides several other APIs:

// C++ string serialization and deserialization API
bool SerializeToString(string* output) const;
bool ParseFromString(const string& data);
// Serialize (wp, rsp and the std::string pstring are assumed globals)
void set_people()             
{
    wp.set_name("sealyao");   
    wp.set_id(123456);        
    wp.set_email("sealyaog@gmail.com");
    wp.SerializeToString(&pstring);  // pstring now holds the binary encoding
}
// Deserialize
void get_people()             
{
    if (!rsp.ParseFromString(pstring)) {
        cerr << "ParseFromString failed" << endl;
        return;
    }
    cout << "Get People from String:" << endl;
    cout << "\t Name : " << rsp.name() << endl;
    cout << "\t Id : " << rsp.id() << endl;
    cout << "\t email : " << rsp.email() << endl;
}

// File descriptor serialization and deserialization API
// (the examples need <fcntl.h> and <unistd.h>; fd, path, wp and rp are assumed globals)
bool SerializeToFileDescriptor(int file_descriptor) const;
bool ParseFromFileDescriptor(int file_descriptor);
// Serialize
void set_people()
{
    fd = open(path,O_CREAT|O_TRUNC|O_RDWR,0644);
    if(fd < 0){          // open() returns -1 on failure; 0 is a valid descriptor
        perror("open");
        exit(1);         // non-zero exit status signals the error
    }   
    wp.set_name("sealyaog");
    wp.set_id(123456);
    wp.set_email("sealyaog@gmail.com");
    wp.SerializeToFileDescriptor(fd);   
    close(fd);
}
// Deserialize
void get_people()
{
    fd = open(path,O_RDONLY);
    if(fd < 0){
        perror("open");
        exit(1);
    }
    if(!rp.ParseFromFileDescriptor(fd)){
        cerr << "ParseFromFileDescriptor failed" << endl;
    }else{
        cout << "Get People from FD:" << endl;
        cout << "\t Name : " << rp.name() << endl;
        cout << "\t Id : " << rp.id() << endl;
        cout << "\t email : " << rp.email() << endl;
    }
    close(fd);
}

// C++ stream serialization and deserialization API
// (the examples need <fstream>; path, wp and rp are assumed globals)
bool SerializeToOstream(ostream* output) const;
bool ParseFromIstream(istream* input);
 
// Serialize
void set_people()
{
    fstream fs(path,ios::out|ios::trunc|ios::binary);  // binary mode: the output is not text
    wp.set_name("sealyaog");
    wp.set_id(123456);
    wp.set_email("sealyaog@gmail.com");
    wp.SerializeToOstream(&fs);    
    fs.close();
}
// Deserialize
void get_people()
{
    fstream fs(path,ios::in|ios::binary);
    if(!rp.ParseFromIstream(&fs)){
        cerr << "ParseFromIstream failed" << endl;
        return;
    }
    cout << "\t Name : " << rp.name() << endl;
    cout << "\t Id : " << rp.id() << endl; 
    cout << "\t email : " << rp.email() << endl;   
    fs.close();
}

posted @ 2020-07-22 14:27 会打架的程序员不是好客服