The OpenPegasus Binary Protocol FAQ
===================================

This FAQ (Frequently Asked Questions) addresses questions you might have
about the OpenPegasus Binary Protocol. If your question is not addressed,
please ask it on the OpenPegasus mailing list and request that the answer
be included in this document.

What is the Binary Protocol?
============================

The binary protocol is a fast protocol for client-server communication. It
allows local clients to send binary messages to the OpenPegasus server and
receive binary responses. This protocol is much faster than the default
XML protocol. When the binary protocol is enabled, local clients use it to
communicate with the OpenPegasus server. Examples of local clients include:
(1) out-of-process providers making up-calls, (2) any provider making a local
connection with the CIMClient class, (3) any local process making a local
connection with the CIMClient class.

Why use the Binary Protocol?
============================

The main reason to use the binary protocol is to improve performance of the
server and clients. Some operations execute as much as 4 times faster with
the binary protocol.

How big are binary messages?
============================

Binary messages are slightly larger than XML messages for two reasons:
(1) strings are transmitted using 2-byte characters and (2) objects are
aligned on 8-byte boundaries.

What are the basic rules for encoding messages?
===============================================

The encoding rules were designed for speed rather than size. The basic rules
are:

    (1) All basic types are aligned on 8-byte boundaries. For example, a uint32
        starts on an 8-byte boundary and is followed by 4 padding bytes so that
        the next type is aligned on an 8-byte boundary. This alignment has two
        advantages: (1) it allows any basic type to be dereferenced directly in
        the buffer without having to relocate it and (2) it is inexpensive to
        calculate the alignment of the next type.

        There are two other data alignment techniques we could have chosen:

            (*) Don't align at all. Just pack types into the buffer end-to-end.
                This technique yields smaller messages but sacrifices
                performance since types cannot be assigned directly to or
                from the data buffer (some operating systems generate data
                alignment errors).

            (*) Align types on their "natural boundaries". This means that
                a type should be aligned on boundaries divisible by its size.
                For example, a 2-byte integer should be aligned on a 2-byte
                boundary, and a 4-byte integer should be aligned on a 4-byte
                boundary. This alignment technique allows types to be directly
                assigned to or from the data buffer. However, aligning the data
                buffer for the next type is slightly more expensive since it
                requires a few more instructions to compute the alignment
                boundary than our approach does.

    (2) Types are serialized into the message in their native representations.
        We do not change the representation to either big endian or little
        endian. Instead, the recipient of the message is responsible for
        converting the data into its native representation. If both processes
        have the same native representation, then the reordering of bytes can
        be avoided (this is always the case with local processes). This policy
        has been referred to as "reader makes right". That is, the reader is
        responsible for making the incoming data into the "right"
        representation.

    (3) All arrays are represented by their size followed by their elements.
        The size is always 4 bytes with 4 extra bytes of padding. In this way,
        the elements always begin on an 8-byte boundary. The elements are packed
        end to end with no padding.

    (4) Strings are represented like arrays (size plus elements).

    (5) Booleans are represented as a single byte, either 0 or 1.

    (6) Complex objects that have optional elements or boolean elements often
        employ a single 4-byte bit mask that indicates which flags are true
        and which elements are present in the network buffer. For example,
        the CIMProperty representation has a bit mask that indicates whether
        the property:

            (*) is an array.
            (*) is propagated.
            (*) has qualifiers.
            (*) has a non-empty reference class.
            (*) has a non-empty class origin.

        This saves considerable space. For example, if there are no qualifiers,
        then we save 8 bytes that would be needed to represent an empty
        qualifier array.
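
To make rules (1), (3), (4) and (6) concrete, here is a minimal sketch of a
writer that follows them. It is not the CIMBuffer API: the class SketchWriter,
the function makePropertyMask, and the flag bit positions are made up for
illustration. Padding after the string data (rather than before the next item)
is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical writer for illustration only; not the CIMBuffer API.
class SketchWriter
{
public:
    // Rule (1): 4 data bytes followed by 4 padding bytes.
    void putUint32(std::uint32_t x)
    {
        putRaw(&x, sizeof(x)); // native byte order, see rule (2)
        align8();
    }

    // Rules (3) and (4): padded size, then 2-byte characters packed end to end.
    void putString(const std::u16string& s)
    {
        putUint32(static_cast<std::uint32_t>(s.size()));
        putRaw(s.data(), s.size() * sizeof(char16_t));
        align8(); // assumption: pad here so the next item starts aligned
    }

    const std::vector<std::uint8_t>& data() const { return _data; }

private:
    void putRaw(const void* p, std::size_t n)
    {
        const std::uint8_t* bytes = static_cast<const std::uint8_t*>(p);
        _data.insert(_data.end(), bytes, bytes + n);
    }

    // Round the buffer length up to the next multiple of 8 (zero filled).
    void align8()
    {
        _data.resize((_data.size() + 7) & ~static_cast<std::size_t>(7));
    }

    std::vector<std::uint8_t> _data;
};

// Rule (6): hypothetical bit positions; the real values are not given here.
enum SketchPropertyFlag
{
    SKETCH_IS_ARRAY = 1 << 0,
    SKETCH_IS_PROPAGATED = 1 << 1,
    SKETCH_HAS_QUALIFIERS = 1 << 2
};

// Pack the boolean properties of an object into one 4-byte mask.
std::uint32_t makePropertyMask(
    bool isArray, bool isPropagated, bool hasQualifiers)
{
    std::uint32_t mask = 0;
    if (isArray) mask |= SKETCH_IS_ARRAY;
    if (isPropagated) mask |= SKETCH_IS_PROPAGATED;
    if (hasQualifiers) mask |= SKETCH_HAS_QUALIFIERS;
    return mask;
}
```

With this sketch, writing a uint32 followed by the 3-character string "abc"
takes 8 + 8 + 8 = 24 bytes: the padded integer, the padded size, and 6 bytes
of characters rounded up to 8.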

What is the layout of a binary message?
=======================================

Binary messages consist of a header followed by a body. The header has
the following elements:

    (1) Magic number - contains 0xF00DFACE.
    (2) Version number - 1 for the first version.
    (3) Flags - flags used to represent boolean options of the message.
    (4) Message ID - same as the message ID in a CIM message.
    (5) Operation - an integer representing the CIM operation, given as follows:

            (*) Invalid = 1
            (*) GetClass = 2
            (*) GetInstance = 3
            (*) IndicationDelivery = 4 (binary version not implemented)
            (*) DeleteClass = 5
            (*) DeleteInstance = 6
            (*) CreateClass = 7
            (*) CreateInstance = 8
            (*) ModifyClass = 9
            (*) ModifyInstance = 10
            (*) EnumerateClasses = 11
            (*) EnumerateClassNames = 12
            (*) EnumerateInstances = 13
            (*) EnumerateInstanceNames = 14
            (*) ExecQuery = 15
            (*) Associators = 16
            (*) AssociatorNames = 17
            (*) References = 18
            (*) ReferenceNames = 19
            (*) GetProperty = 20
            (*) SetProperty = 21
            (*) GetQualifier = 22
            (*) SetQualifier = 23
            (*) DeleteQualifier = 24
            (*) EnumerateQualifiers = 25
            (*) InvokeMethod = 26
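
This FAQ does not say how a reader learns whether the sender used a different
byte order. One plausible approach, assumed here, is to compare the magic
number as read with 0xF00DFACE and with its byte-swapped form. The names
swap32, classifyMagic and SketchByteOrder below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Magic number from the header description above.
const std::uint32_t SKETCH_MAGIC = 0xF00DFACE;

// Reverse the byte order of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t x)
{
    return (x >> 24) |
           ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) |
           (x << 24);
}

enum SketchByteOrder
{
    SKETCH_SAME_ORDER,    // sender and reader agree; no swapping needed
    SKETCH_SWAPPED_ORDER, // reader must swap every multi-byte value
    SKETCH_NOT_BINARY     // not a binary protocol message
};

// "Reader makes right": the reader decides, from the magic number alone,
// whether the rest of the message needs byte swapping.
inline SketchByteOrder classifyMagic(std::uint32_t magicAsRead)
{
    if (magicAsRead == SKETCH_MAGIC)
        return SKETCH_SAME_ORDER;
    if (swap32(magicAsRead) == SKETCH_MAGIC)
        return SKETCH_SWAPPED_ORDER;
    return SKETCH_NOT_BINARY;
}
```

For local connections both processes share a byte order, so under this
assumption the first branch is always taken and no swapping occurs.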

Does the binary protocol use HTTP?
==================================

Yes. The binary protocol uses the existing OpenPegasus HTTP infrastructure.
It preserves the same headers as the conventional protocol.

Does the binary protocol define new HTTP headers?
=================================================

Yes. It defines two new header values:

    Content-Type: application/x-openpegasus
    Accept: application/x-openpegasus

The first header is borne by both binary requests and binary responses.
It indicates that the content (payload) is an OpenPegasus binary message.

The second header is sent with a request and indicates that the client can
handle OpenPegasus binary responses.

The client can combine these headers to achieve 4 different behaviors:

    (1) Binary request/binary response:

            Content-Type: application/x-openpegasus
            Accept: application/x-openpegasus

    (2) Binary request/XML response:

            Content-Type: application/x-openpegasus

    (3) XML request/binary response:

            Accept: application/x-openpegasus

    (4) XML request/XML response:

            (omit both headers)

Only (1) and (4) can be achieved as is; (2) and (3) require minor code
changes to OpenPegasus.
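
The four combinations can be summarized as a small function that returns the
binary-specific header lines for a request. This is only a sketch: the function
name binaryProtocolHeaders is hypothetical and it is not how OpenPegasus builds
its HTTP messages.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return the binary-protocol header lines a client would add to a request.
// binaryRequest:  the request payload is an OpenPegasus binary message.
// binaryResponse: the client can handle a binary response.
std::vector<std::string> binaryProtocolHeaders(
    bool binaryRequest, bool binaryResponse)
{
    std::vector<std::string> headers;

    if (binaryRequest)
        headers.push_back("Content-Type: application/x-openpegasus");

    if (binaryResponse)
        headers.push_back("Accept: application/x-openpegasus");

    // Both flags false: XML request/XML response, nothing to add.
    return headers;
}
```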

How does protocol versioning work?
==================================

Each binary message carries a version number in its header. This will be used
to support backwards compatibility with clients. The server must never be
modified to send version N+1 messages to version N clients.
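
One way to honor this rule, sketched below, is for the server to answer with
the lower of the request's version and the highest version it can produce.
This is an assumption, not a description of the OpenPegasus code; the constant
SKETCH_SERVER_MAX_VERSION (set to 2 purely for illustration) and the function
chooseResponseVersion are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical: the highest message version this server build can produce.
const std::uint32_t SKETCH_SERVER_MAX_VERSION = 2;

// Never answer with a version newer than the one the client sent.
inline std::uint32_t chooseResponseVersion(std::uint32_t requestVersion)
{
    return std::min(requestVersion, SKETCH_SERVER_MAX_VERSION);
}
```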

Does the binary protocol support remote communication?
======================================================

Yes, although there is no official SDK interface for enabling it. To enable
it, one must obtain the CIMClientRep from the CIMClient instance and set
the following data members to true.

    CIMClientRep::_binaryRequest
    CIMClientRep::_binaryResponse

The following code fragment shows how one might do this in a program, using
the corresponding setter methods.

    #include <Pegasus/Client/CIMClient.h>
    #include <Pegasus/Client/CIMClientRep.h>

    PEGASUS_USING_PEGASUS;

    // Both helpers rely on the CIMClientRep pointer being the first data
    // member of CIMClient, which is an implementation detail.
    static void _SetBinaryRequest(CIMClient& client, Boolean flag)
    {
        CIMClientRep* rep = *(reinterpret_cast<CIMClientRep**>(&client));
        rep->setBinaryRequest(flag);
    }

    static void _SetBinaryResponse(CIMClient& client, Boolean flag)
    {
        CIMClientRep* rep = *(reinterpret_cast<CIMClientRep**>(&client));
        rep->setBinaryResponse(flag);
    }

    ...

    CIMClient client;
    _SetBinaryRequest(client, true);
    _SetBinaryResponse(client, true);

    client.connect("localhost", 22000, String(), String());

This forces a remote binary connection.

What is on-demand de-serialization?
===================================

The binary protocol supports a feature we call "on-demand de-serialization".
When using out-of-process providers, data may be de-serialized unnecessarily.
Consider the following sequence of events.

    (1) The client sends an EnumerateInstances request to the server.
    (2) The server de-serializes the request.
    (3) The server serializes the request for the provider agent.
    (4) The provider agent de-serializes the request.
    (5) The provider agent obtains the response.
    (6) The provider agent serializes the response for the server.
    (7) The server de-serializes the response.
    (8) The server serializes the response for the client.
    (9) The client de-serializes the response.

The on-demand de-serialization feature eliminates the de-serialization of the
returned instances in step 7; the server keeps them in serialized form in a
data buffer instead. Then in step 8, the data buffer is sent to the client
unchanged. This optimization avoids one de-serialization and one
serialization. For the EnumerateInstances operation, this optimization alone
doubles the speed of servicing the operation.

On-demand de-serialization is implemented for the following operations:

    (1) EnumerateInstances
    (2) GetInstance
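
The idea can be sketched as a response object that keeps the provider agent's
bytes and decodes them only if somebody asks for the instances. The class
LazyResponse and its stand-in decoder (one byte per "instance") are
hypothetical and far simpler than the real server code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical holder for a response received from a provider agent.
class LazyResponse
{
public:
    explicit LazyResponse(const std::vector<std::uint8_t>& encoded)
        : _encoded(encoded), _decoded(false), _decodeCount(0)
    {
    }

    // Fast path (step 8): forward the serialized bytes untouched.
    const std::vector<std::uint8_t>& encoded() const { return _encoded; }

    // Slow path: de-serialize on first use only, then cache the result.
    const std::vector<std::string>& instances()
    {
        if (!_decoded)
        {
            _instances = decode(_encoded);
            _decoded = true;
            ++_decodeCount;
        }
        return _instances;
    }

    // How many times the decoder has actually run.
    int decodeCount() const { return _decodeCount; }

private:
    // Stand-in decoder: treats every byte as a one-character instance name.
    static std::vector<std::string> decode(
        const std::vector<std::uint8_t>& bytes)
    {
        std::vector<std::string> result;
        for (std::size_t i = 0; i < bytes.size(); ++i)
            result.push_back(std::string(1, static_cast<char>(bytes[i])));
        return result;
    }

    std::vector<std::uint8_t> _encoded;
    std::vector<std::string> _instances;
    bool _decoded;
    int _decodeCount;
};
```

A server that only forwards the response calls encoded() and never pays for
decoding; code that needs the objects calls instances() and pays once.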

Where is the binary encoding/decoding code located?
===================================================

The source code for encoding and decoding binary requests and responses is
all included in a single module. The source files are located here:

    pegasus/src/Pegasus/Common/BinaryCodec.h
    pegasus/src/Pegasus/Common/BinaryCodec.cpp

The source code that implements the encoding of objects themselves is located
here:

    pegasus/src/Pegasus/Common/CIMBuffer.h
    pegasus/src/Pegasus/Common/CIMBuffer.cpp

How do you build OpenPegasus with binary protocol support?
==========================================================

To build OpenPegasus with binary protocol support, define the following
environment variable first:

    $ export PEGASUS_ENABLE_PROTOCOL_BINARY=true

Are there further improvements that could be made to the protocol?
==================================================================

Yes, here are a few.

    (1) Excessive copying is required to convert CIMBuffer objects into
        Buffer objects. CIMBuffer could be reimplemented to use Buffer
        as its representation. Then it would be possible to "swap" their
        implementations rather than copying one to the other.

    (2) It might have been better to align objects on their natural boundaries
        rather than on 8-byte boundaries. This would probably reduce the
        message size by 25% or so.

    (3) The on-demand de-serialization should probably be extended to operations
        other than just GetInstance and EnumerateInstances. For example, the
        following operations would benefit the most.

            (*) EnumerateInstanceNames
            (*) ExecQuery
            (*) Associators
            (*) AssociatorNames
            (*) References
            (*) ReferenceNames
            (*) InvokeMethod

    (4) The binary protocol should be extended to support the pull operations
        whenever they are implemented for XML.
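
To see where the savings in item (2) would come from, the sketch below
computes the encoded size of a sequence of basic types under both alignment
schemes. The function names are hypothetical, and field sizes are assumed to
be 1, 2, 4 or 8 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round offset up to the next multiple of boundary.
inline std::size_t roundUp(std::size_t offset, std::size_t boundary)
{
    return (offset + boundary - 1) / boundary * boundary;
}

// Current scheme: every field starts on an 8-byte boundary and the
// message ends on one.
inline std::size_t sizeWithFixedAlignment(
    const std::vector<std::size_t>& fieldSizes)
{
    std::size_t offset = 0;
    for (std::size_t size : fieldSizes)
        offset = roundUp(offset, 8) + size;
    return roundUp(offset, 8);
}

// Alternative: every field starts on a boundary divisible by its own size.
inline std::size_t sizeWithNaturalAlignment(
    const std::vector<std::size_t>& fieldSizes)
{
    std::size_t offset = 0;
    for (std::size_t size : fieldSizes)
        offset = roundUp(offset, size) + size;
    return offset;
}
```

For fields of 4, 4, 2, 1 and 8 bytes, the sketch gives 40 bytes with 8-byte
alignment and 24 bytes with natural alignment. Real savings depend on the mix
of field sizes in a message.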