                    The OpenPegasus Binary Protocol FAQ
                    ===================================

This FAQ (Frequently Asked Questions) addresses questions you might have about
the OpenPegasus Binary Protocol. If your question is not addressed here,
please ask it on the OpenPegasus mailing list and request that the answer be
included in this document.

What is the Binary Protocol?
============================

The binary protocol is a fast protocol for client-server communication. It
allows local clients to send binary messages to the OpenPegasus server and
receive binary responses. This protocol is much faster than the default
XML protocol. When the binary protocol is enabled, local clients use it to
communicate with the OpenPegasus server. Examples of local clients include:
(1) out-of-process providers making up-calls, (2) any provider making a local
connection with the CIMClient class, and (3) any local process making a local
connection with the CIMClient class.

Why use the Binary Protocol?
============================

The main reason to use the binary protocol is to improve performance of the
server and clients. Some operations execute as much as 4 times faster with
the binary protocol.

How big are binary messages?
============================

Binary messages are slightly larger than XML messages for two reasons:
(1) Strings are transmitted using 2-byte characters and (2) Objects are
aligned on 8-byte boundaries.

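The arithmetic behind those two reasons can be sketched as follows. This is
only an illustration: the function names are hypothetical, and the string
layout (a 4-byte length padded to 8 bytes, followed by 2-byte characters that
are padded up to the next 8-byte boundary) is an assumption based on the
encoding rules given in this FAQ, not a copy of the OpenPegasus code.

```cpp
#include <cstddef>

// Round n up to the next multiple of 8.
constexpr std::size_t alignUp8(std::size_t n)
{
    return (n + 7) & ~static_cast<std::size_t>(7);
}

// Assumed size of one encoded string: 4-byte length plus 4 padding bytes,
// then 2 bytes per character, padded up to the next 8-byte boundary.
constexpr std::size_t binaryStringSize(std::size_t numChars)
{
    return 8 + alignUp8(2 * numChars);
}

// Size of the same text as ASCII characters inside an XML element
// (the surrounding XML markup is not counted here).
constexpr std::size_t asciiTextSize(std::size_t numChars)
{
    return numChars;
}
```

Under these assumptions a 10-character string occupies 8 + 24 = 32 bytes in a
binary message, against 10 bytes of character data in XML. The XML figure
leaves out the element tags, which is one reason the overall difference
between the two formats is small.
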
What are the basic rules for encoding messages?
===============================================

The encoding rules were designed for speed rather than size. The basic rules
are:

    (1) All basic types are aligned on 8-byte boundaries. For example, a uint32
        starts on an 8-byte boundary and is followed by 4 padding bytes so that
        the next type is aligned on an 8-byte boundary. This alignment has 2
        advantages: (1) it allows any basic type to be dereferenced directly in
        the buffer without having to relocate it and (2) it is inexpensive to
        calculate the alignment of the next type.

        There are 2 other data alignment techniques we could have chosen:

            (*) Don't align at all. Just pack types into the buffer end-to-end.
                This technique yields smaller messages but sacrifices
                performance since types cannot be assigned directly to or
                from the data buffer (some operating systems generate data
                alignment errors).

            (*) Align types on their "natural boundaries". This means that
                a type should be aligned on boundaries divisible by its size.
                For example, a 2-byte integer should be aligned on a 2-byte
                boundary, and a 4-byte integer should be aligned on a 4-byte
                boundary. This alignment technique allows types to be directly
                assigned to or from the data buffer. However, aligning the data
                buffer for the next type is slightly more expensive since it
                requires a few more instructions to compute the alignment
                boundary than our approach does.

    (2) Types are serialized into the message in their native representations.
        We do not change the representation to either big endian or little
        endian. Instead, the recipient of the message is responsible for
        converting the data into its native representation. If both processes
        have the same native representation, then the reordering of bytes can
        be avoided (this is always the case with local processes). This policy
        has been referred to as "reader makes right". That is, the reader is
        responsible for making the incoming data into the "right"
        representation.

    (3) All arrays are represented by their size followed by their elements.
        The size is always 4 bytes with 4 extra bytes of padding. In this way,
        the elements always begin on an 8-byte boundary. The elements are packed
        end to end with no padding.

    (4) Strings are represented like arrays (size plus elements).

    (5) Booleans are represented as a single byte, either 0 or 1.

    (6) Complex objects that have optional elements or boolean elements often
        employ a single 4-byte bit mask that indicates which flags are true
        and which elements are present in the network buffer. For example,
        the CIMProperty representation has a bit mask that indicates whether
        the property:

            (*) is an array.
            (*) is propagated.
            (*) has qualifiers.
            (*) has a non-empty reference class.
            (*) has a non-empty class origin.

        This saves considerable space. For example, if there are no qualifiers,
        then we save 8 bytes that would be needed to represent an empty
        qualifier array.

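The rules above can be illustrated with a small writer that encodes a
simplified property. This is a sketch only: SketchBuffer, SketchProperty,
putProperty and the flag bit values are hypothetical names and numbers chosen
for the example, not the CIMBuffer interface or the real CIMProperty layout.
It covers rules (1), (2), (3), (4) and (6), and assumes padding is written
right after each item.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical writer used for illustration; not the real CIMBuffer.
class SketchBuffer
{
public:
    // Rule (1): a uint32 takes 4 bytes followed by 4 padding bytes.
    // Rule (2): the value is copied in the writer's native byte order.
    void putUint32(std::uint32_t x)
    {
        putRaw(&x, sizeof(x));
        pad();
    }

    // Rules (3) and (4): padded size first, then 2-byte characters packed
    // end to end, then padding so the next item starts on an 8-byte boundary.
    void putString(const std::u16string& s)
    {
        putUint32(static_cast<std::uint32_t>(s.size()));
        putRaw(s.data(), s.size() * sizeof(char16_t));
        pad();
    }

    std::size_t size() const { return _data.size(); }

private:
    void putRaw(const void* p, std::size_t n)
    {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        _data.insert(_data.end(), b, b + n);
    }

    // Zero-fill up to the next multiple of 8.
    void pad()
    {
        _data.resize((_data.size() + 7) & ~static_cast<std::size_t>(7), 0);
    }

    std::vector<unsigned char> _data;
};

// Hypothetical flag bits for rule (6); the real values are not given here.
enum PropertyFlags : std::uint32_t
{
    FLAG_IS_PROPAGATED = 1u << 0,
    FLAG_HAS_CLASS_ORIGIN = 1u << 1
};

// Simplified stand-in for a property: a name plus two optional aspects.
struct SketchProperty
{
    std::u16string name;
    bool propagated;
    std::u16string classOrigin; // empty means "not present"
};

// Write the bit mask first, then only the elements the mask says are present.
void putProperty(SketchBuffer& out, const SketchProperty& p)
{
    std::uint32_t flags = 0;
    if (p.propagated)
        flags |= FLAG_IS_PROPAGATED;
    if (!p.classOrigin.empty())
        flags |= FLAG_HAS_CLASS_ORIGIN;

    out.putUint32(flags);
    out.putString(p.name);
    if (flags & FLAG_HAS_CLASS_ORIGIN)
        out.putString(p.classOrigin);
}
```

In this sketch a property named "Name" with no class origin encodes to 24
bytes (8 for the mask, 8 for the string size, 8 for four 2-byte characters).
Writing an empty class origin unconditionally would add 8 more bytes for its
size field, which is the kind of saving the bit mask provides.
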
What is the layout of a binary message?
=======================================

Binary messages consist of a header followed by a body. The header has
the following elements:

    (1) Magic number - contains 0xF00DFACE.
    (2) Version number - 1 for the first version.
    (3) Flags - flags used to represent boolean options of the message.
    (4) Message ID - same as the message ID in a CIM message.
    (5) Operation - an integer representing the CIM operation, given as follows:

        (*) Invalid = 1
        (*) GetClass = 2
        (*) GetInstance = 3
        (*) IndicationDelivery = 4 (binary version not implemented)
        (*) DeleteClass = 5
        (*) DeleteInstance = 6
        (*) CreateClass = 7
        (*) CreateInstance = 8
        (*) ModifyClass = 9
        (*) ModifyInstance = 10
        (*) EnumerateClasses = 11
        (*) EnumerateClassNames = 12
        (*) EnumerateInstances = 13
        (*) EnumerateInstanceNames = 14
        (*) ExecQuery = 15
        (*) Associators = 16
        (*) AssociatorNames = 17
        (*) References = 18
        (*) ReferenceNames = 19
        (*) GetProperty = 20
        (*) SetProperty = 21
        (*) GetQualifier = 22
        (*) SetQualifier = 23
        (*) DeleteQualifier = 24
        (*) EnumerateQualifiers = 25
        (*) InvokeMethod = 26

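This FAQ does not say how a reader learns the byte order of the sender. One
plausible approach, shown here purely as an assumption, is to read the magic
number as a 32-bit integer and check whether it arrives as 0xF00DFACE or as
its byte-swapped form. The names kMagic, swap32, ByteOrder, classifyMagic and
makeRight are hypothetical and are not taken from the OpenPegasus sources.

```cpp
#include <cstdint>

// Magic number from the header description above.
constexpr std::uint32_t kMagic = 0xF00DFACEu;

// Reverse the four bytes of a 32-bit value.
constexpr std::uint32_t swap32(std::uint32_t x)
{
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x & 0x00FF0000u) >> 8) | ((x & 0xFF000000u) >> 24);
}

enum class ByteOrder { Same, Swapped, Unknown };

// Assumption: the magic number is read from the buffer as a native uint32.
constexpr ByteOrder classifyMagic(std::uint32_t magicAsRead)
{
    return magicAsRead == kMagic ? ByteOrder::Same
         : swap32(magicAsRead) == kMagic ? ByteOrder::Swapped
         : ByteOrder::Unknown;
}

// "Reader makes right": swap a field only when the sender's order differs.
constexpr std::uint32_t makeRight(std::uint32_t value, ByteOrder order)
{
    return order == ByteOrder::Swapped ? swap32(value) : value;
}
```

With this scheme a local reader sees ByteOrder::Same and never swaps, which
matches the statement that byte reordering is always avoided between local
processes.
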
Does the binary protocol use HTTP?
==================================

Yes. The binary protocol uses the existing OpenPegasus HTTP infrastructure.
It preserves the same headers as the conventional protocol.

Does the binary protocol define new HTTP headers?
=================================================

Yes. It uses the following two headers, both carrying the new media type
application/x-openpegasus:

    Content-Type: application/x-openpegasus
    Accept: application/x-openpegasus

The first header is borne by both binary requests and binary responses.
It indicates that the content (payload) contains an OpenPegasus binary message.

The second header is sent with a request and indicates that the client can
handle OpenPegasus binary responses.

The client can combine these headers to achieve 4 different behaviors:

    (1) Binary request/Binary response:

            Content-Type: application/x-openpegasus
            Accept: application/x-openpegasus

    (2) Binary request/XML response:

            Content-Type: application/x-openpegasus

    (3) XML request/binary response:

            Accept: application/x-openpegasus

    (4) XML request/XML response:

            (omit both headers)

Only combinations (1) and (4) can be achieved without minor code changes to
OpenPegasus.

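The four combinations reduce to two independent choices, which the following
sketch makes explicit. The function name binaryProtocolHeaders is
hypothetical; the sketch only builds the header lines listed above and does
not reflect how OpenPegasus assembles its HTTP requests.

```cpp
#include <string>
#include <vector>

// Return the extra header lines for a request, given whether the request
// body is binary and whether the client accepts a binary response.
std::vector<std::string> binaryProtocolHeaders(
    bool binaryRequest, bool binaryResponse)
{
    std::vector<std::string> headers;

    if (binaryRequest)
        headers.push_back("Content-Type: application/x-openpegasus");

    if (binaryResponse)
        headers.push_back("Accept: application/x-openpegasus");

    return headers;
}
```

Calling binaryProtocolHeaders(true, true) yields both lines (combination 1),
and binaryProtocolHeaders(false, false) yields none (combination 4).
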
How does protocol versioning work?
==================================

Each binary message carries a version number in its header. This number will
be used to support backwards compatibility with clients. The server must never
be modified to send version N+1 messages to version N clients.

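One simple way to honor that constraint is for the server to answer with the
lower of its own highest version and the version found in the request header.
This is a sketch of that idea under the assumption that the request's version
field identifies what the client understands; chooseResponseVersion is a
hypothetical name, not an OpenPegasus function.

```cpp
#include <algorithm>
#include <cstdint>

// Pick the version for a response so that a version N client never
// receives a version N+1 message.
std::uint32_t chooseResponseVersion(
    std::uint32_t clientVersion, std::uint32_t serverMaxVersion)
{
    return std::min(clientVersion, serverMaxVersion);
}
```
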
188           Does the binary protocol support remote communication?
189           ======================================================
190 mike  1.1 
191           Yes, although there is no official SDK interface for enabling it. To enable
192           it, one must obtain the CIMClientRep from the CIMClient instance and set
193           the following data members to true.
194           
195               CIMClientRep::_binaryRequest
196               CIMClientRep::_binaryResponse
197           
198           The following code fragment shows how one might do this in a program.
199           
200               static void _SetBinaryRequest(CIMClient& client, Boolean flag)
201               {
202                   CIMClientRep* rep = *(reinterpret_cast<CIMClientRep**>(&client));
203                   rep->setBinaryRequest(flag);
204               }
205           
206               static void _SetBinaryResponse(CIMClient& client, Boolean flag)
207               {
208                   CIMClientRep* rep = *(reinterpret_cast<CIMClientRep**>(&client));
209                   rep->setBinaryResponse(flag);
210               }
211 mike  1.1 
212               ...
213           
214               CIMClient client;
215               _SetBinaryRequest(client, true);
216               _SetBinaryResponse(client, true);
217           
218               client.connect("localhost", 22000, String(), String());
219           
220           This forces a remote binary connection.
221           
What is on-demand de-serialization?
===================================

The binary protocol supports a feature we call "on-demand de-serialization".
When using out-of-process providers, data may be de-serialized unnecessarily.
Consider the following sequence of events.

    (1) The client sends an EnumerateInstances request to the server.
    (2) The server de-serializes the request.
    (3) The server serializes the request for the provider agent.
    (4) The provider agent de-serializes the request.
    (5) The provider agent obtains the response.
    (6) The provider agent serializes the response for the server.
    (7) The server de-serializes the response.
    (8) The server serializes the response for the client.
    (9) The client de-serializes the response.

The on-demand de-serialization feature eliminates the de-serialization of the
returned instances in step 7; the server keeps them in serialized form in a
data buffer instead. Then in step 8, that data buffer is sent to the client
as is. This optimization avoids one de-serialization and one serialization.
For the EnumerateInstances operation, this optimization alone doubles the
speed of servicing the operation.

On-demand de-serialization is implemented for the following operations:

    (1) EnumerateInstances
    (2) GetInstance

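The idea can be sketched as a response object that holds the encoded bytes and
decodes them only if somebody asks for the objects. Everything here is
hypothetical: LazyResponse and SketchInstance are illustration names, and the
toy decoder (NUL-terminated byte runs) stands in for the real binary decoding.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a decoded CIM instance.
struct SketchInstance
{
    std::string text;
};

// Holds a response in encoded form and decodes it only on demand.
class LazyResponse
{
public:
    explicit LazyResponse(std::vector<unsigned char> encoded)
        : _encoded(std::move(encoded)), _decoded(false)
    {
    }

    // Forwarding path: hand the bytes on without decoding them.
    const std::vector<unsigned char>& encoded() const { return _encoded; }

    // Decoding path: runs the decoder the first time it is needed.
    const std::vector<SketchInstance>& instances()
    {
        if (!_decoded)
        {
            _instances = decode(_encoded);
            _decoded = true;
        }
        return _instances;
    }

    bool wasDecoded() const { return _decoded; }

private:
    // Toy decoder: every NUL-terminated run of bytes is one "instance".
    static std::vector<SketchInstance> decode(
        const std::vector<unsigned char>& bytes)
    {
        std::vector<SketchInstance> out;
        std::string current;
        for (unsigned char c : bytes)
        {
            if (c == 0)
            {
                out.push_back(SketchInstance{current});
                current.clear();
            }
            else
            {
                current.push_back(static_cast<char>(c));
            }
        }
        return out;
    }

    std::vector<unsigned char> _encoded;
    std::vector<SketchInstance> _instances;
    bool _decoded;
};
```

A server that only relays the response would call encoded() and never pay for
decoding; a component that needs to inspect the instances would call
instances() and trigger the decode once.
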
Where is the binary encoding/decoding code located?
===================================================

The source code for encoding and decoding binary requests and responses is
all included in a single module. The source files are located here:

    pegasus/src/Pegasus/Common/BinaryCodec.h
    pegasus/src/Pegasus/Common/BinaryCodec.cpp

The source code that implements the encoding of objects themselves is located
here:

    pegasus/src/Pegasus/Common/CIMBuffer.h
    pegasus/src/Pegasus/Common/CIMBuffer.cpp

How do you build OpenPegasus with binary protocol support?
==========================================================

To build OpenPegasus with binary protocol support, define the following
environment variable first:

    $ export PEGASUS_ENABLE_PROTOCOL_BINARY=true

Are there further improvements that could be made to the protocol?
==================================================================

Yes, here are a few.

    (1) Excessive copying is required to convert CIMBuffer objects into
        Buffer objects. CIMBuffer could be reimplemented to use Buffer
        as its representation. Then it would be possible to "swap" their
        implementations rather than copying one to the other.

    (2) It might have been better to align objects on their natural boundaries
        rather than on 8-byte boundaries. This would probably reduce the
        message size by 25% or so.

    (3) The on-demand de-serialization should probably be extended to operations
        other than just GetInstance and EnumerateInstances. For example, the
        following operations would benefit the most:

            (*) EnumerateInstanceNames
            (*) ExecQuery
            (*) Associators
            (*) AssociatorNames
            (*) References
            (*) ReferenceNames
            (*) InvokeMethod

    (4) The binary protocol should be extended to support the pull operations
        whenever they are implemented for XML.
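
The "swap" idea in item (1) can be sketched with two classes that share the
same underlying storage type. This is an illustration only: EncodingBuffer and
PlainBuffer are hypothetical stand-ins for CIMBuffer and Buffer, and
std::vector<char> is assumed as the common representation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Buffer.
class PlainBuffer
{
public:
    std::size_t size() const { return _rep.size(); }
    const char* data() const { return _rep.data(); }

    // Exchange storage with another representation in constant time.
    void swapRep(std::vector<char>& other) { _rep.swap(other); }

private:
    std::vector<char> _rep;
};

// Hypothetical stand-in for CIMBuffer, using the same representation.
class EncodingBuffer
{
public:
    void putByte(char c) { _rep.push_back(c); }
    std::size_t size() const { return _rep.size(); }
    const char* data() const { return _rep.data(); }

    // Hand the encoded bytes to a PlainBuffer without copying them.
    void swap(PlainBuffer& target) { target.swapRep(_rep); }

private:
    std::vector<char> _rep;
};
```

After e.swap(p) the PlainBuffer owns the very same block of memory the
EncodingBuffer filled, so no bytes are copied regardless of message size.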
