«Flexible» protocol buffer implementation.

Everybody knows about protobuf from the Google team. It is a very useful and efficient format for data exchange.

«Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.»

For my project I need to implement a router for messages from different clients. The router (receiver) must be able to read (parse) the body of a message to get the information needed for routing. But I cannot predict every protobuf message format at design time, so the router has to analyze the message format at runtime.

The Google team suggests a special technique, «Self-describing Messages»: add an extra field to the message that carries the full protocol description. As a result, every message carries information about its own structure. It is a good trick, but the messages become much larger, and small message size is the main advantage of protobuf in my project. So I decided to keep a full catalogue of message formats on my router and update this catalogue at runtime.

The plain-text .proto file must first be compiled into a binary descriptor: the protoc utility serializes it as a FileDescriptorSet message, whose schema is defined in descriptor.proto.
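To make the catalogue idea concrete, here is a minimal sketch that assumes the catalogue is simply a directory of compiled descriptor files which the router re-reads on demand. The class name FormatCatalogue, its methods and the *.descriptor naming convention are hypothetical, introduced only for illustration. The sketch stores raw bytes and leaves the protobuf parsing to the caller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical format catalogue: descriptor file name -> raw descriptor bytes. */
final class FormatCatalogue {
    // ConcurrentHashMap lets routing threads read while a reload is in progress.
    private final Map<String, byte[]> descriptors = new ConcurrentHashMap<>();

    /**
     * Reads every *.descriptor file in the directory and adds or replaces
     * its entry. Entries for deleted files are not removed in this sketch.
     * Returns the number of files loaded.
     */
    int reload(Path dir) {
        int loaded = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.descriptor")) {
            for (Path file : files) {
                descriptors.put(file.getFileName().toString(), Files.readAllBytes(file));
                loaded++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return loaded;
    }

    /** Returns a copy of the stored descriptor bytes, or null if the name is unknown. */
    byte[] find(String fileName) {
        byte[] bytes = descriptors.get(fileName);
        return bytes == null ? null : bytes.clone();
    }
}
```

The router could call reload whenever a new descriptor file is dropped into the directory and then pass the bytes returned by find to DescriptorProtos.FileDescriptorSet.parseFrom.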

The code below shows how the router reads the message format from the file system and parses an incoming message.

We will use addressbook.proto from the protobuf examples:

package tutorial;

option java_package = "com.example.tutorial";
option java_outer_classname = "AddressBookProtos";

message Person {
  required string name = 1;
  required int32 id = 2;        // Unique ID number for this person.
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

// Our address book file is just one of these.

message AddressBook {
  repeated Person person = 1;
}

Clients (senders) can produce messages the classical way, by generating Java classes with protoc:

protoc --java_out=. addressbook.proto

Here is an example of building a message and writing it to a file:

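    import java.io.FileOutputStream;
    import com.google.protobuf.CodedOutputStream;
    import com.example.tutorial.AddressBookProtos.Person;

    Person.Builder person = Person.newBuilder();
    person.setId(42);
    person.setEmail("test_email@gmail.com");
    person.setName("Viktor Villari");
    Person p = person.build();
    // try-with-resources closes the file even if writing fails
    try (FileOutputStream fstream = new FileOutputStream(messagePath)) {
        CodedOutputStream outStream = CodedOutputStream.newInstance(fstream);
        p.writeTo(outStream);
        // CodedOutputStream buffers internally, so flush before the file is closed
        outStream.flush();
    }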

As a result, we have a file containing the serialized message (messagePath).
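To show how compact that file is, the following sketch encodes the same three fields by hand according to the protobuf wire format (a tag byte, then a varint or a length-prefixed string). It is only an illustration and not how the library is implemented. The class PersonWireSketch and its methods are hypothetical, and the varint writer assumes non-negative values.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hand-rolled encoder for the three scalar Person fields; illustration only. */
final class PersonWireSketch {
    /** Writes a varint: 7 bits per byte, high bit set on every byte except the last. */
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /** Writes a length-delimited string field (wire type 2). */
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (fieldNumber << 3) | 2); // tag = field number + wire type
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Encodes name (field 1), id (field 2, wire type 0) and email (field 3). */
    static byte[] encodePerson(String name, int id, String email) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, name);
        writeVarint(out, (2 << 3) | 0);
        writeVarint(out, id); // assumption: id is non-negative
        writeString(out, 3, email);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodePerson("Viktor Villari", 42, "test_email@gmail.com");
        System.out.println(bytes.length + " bytes"); // prints "40 bytes"
    }
}
```

Only 6 of the 40 bytes are overhead (a tag for each field and a length for each string); the rest is the payload itself. Field names never appear in the encoding, which is why the receiver needs the descriptor to interpret it.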

Next, we generate a descriptor for our addressbook.proto:

protoc --descriptor_set_out=address.proto.descriptor  addressbook.proto

I copy address.proto.descriptor to the receiver's format catalogue, and the following code lets me parse the message:

    import java.io.FileInputStream;
    import com.google.protobuf.DescriptorProtos;
    import com.google.protobuf.Descriptors.Descriptor;
    import com.google.protobuf.Descriptors.FieldDescriptor;
    import com.google.protobuf.Descriptors.FileDescriptor;
    import com.google.protobuf.DynamicMessage;

    // read the protocol descriptor
    FileInputStream input = new FileInputStream(protoDescriptor);
    DescriptorProtos.FileDescriptorSet fdsProto = DescriptorProtos.FileDescriptorSet.parseFrom(input);
    input.close();
    // point to a specific file in the FileDescriptorSet
    System.out.println("File name = " + fdsProto.getFile(0).getName());
    // an empty dependency array is enough here because addressbook.proto imports nothing
    FileDescriptor fileDescr = FileDescriptor.buildFrom(fdsProto.getFile(0), new FileDescriptor[0]);
    // point to a specific message type in the FileDescriptor
    System.out.println("Message type = " + fileDescr.getMessageTypes().get(0).getName());
    Descriptor messageType = fileDescr.getMessageTypes().get(0);
    // read and parse the incoming message
    input = new FileInputStream(messagePath);
    DynamicMessage dm = DynamicMessage.parseFrom(messageType, input);
    input.close();
    // output the fields of the message
    for (FieldDescriptor field : messageType.getFields()) {
        System.out.println(messageType.getName() + "." + field.getName() + " value="
            + dm.getField(field) + " type=" + field.getJavaType());
    }

The output of the code:

File name = addressbook.proto
Message type = Person
Person.name value=Viktor Villari type=STRING
Person.id value=42 type=INT
Person.email value=test_email@gmail.com type=STRING
Person.phone value= type=MESSAGE
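
For contrast, here is a rough sketch of what the receiver could recover from the same bytes without any descriptor: only field numbers and wire types, with no names or Java types. The class WireScanSketch is hypothetical. It handles only the varint, fixed-width and length-delimited wire types, and assumes well-formed, non-truncated input.

```java
import java.util.ArrayList;
import java.util.List;

/** Schema-free walk over protobuf wire bytes; illustration only. */
final class WireScanSketch {
    /** Returns one "field=N wireType=T" entry per top-level field. */
    static List<String> scan(byte[] data) {
        List<String> fields = new ArrayList<>();
        int[] pos = {0}; // single-element array so readVarint can advance the cursor
        while (pos[0] < data.length) {
            int tag = (int) readVarint(data, pos);
            int wireType = tag & 7;
            fields.add("field=" + (tag >>> 3) + " wireType=" + wireType);
            switch (wireType) {
                case 0: // varint value
                    readVarint(data, pos);
                    break;
                case 1: // fixed 64-bit value
                    pos[0] += 8;
                    break;
                case 2: // length-delimited: string, bytes or nested message
                    int length = (int) readVarint(data, pos);
                    pos[0] += length;
                    break;
                case 5: // fixed 32-bit value
                    pos[0] += 4;
                    break;
                default: // groups (wire types 3 and 4) are not handled in this sketch
                    throw new IllegalArgumentException("unsupported wire type " + wireType);
            }
        }
        return fields;
    }

    /** Reads a varint starting at pos[0] and moves pos[0] past it. */
    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("malformed varint");
    }

    public static void main(String[] args) {
        // field 1 = "Al" (length-delimited), field 2 = 42 (varint)
        byte[] sample = {0x0A, 2, 'A', 'l', 0x10, 42};
        System.out.println(scan(sample)); // [field=1 wireType=2, field=2 wireType=0]
    }
}
```

A scan like this cannot tell a string from a nested message, nor say that field 2 is called id. That missing information is exactly what the descriptor from the catalogue supplies.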

Summary: with this technique you can build dynamic message parsing without increasing the size of each message.