(file) Return to instances.txt CVS log (file) (dir) Up to [Pegasus] / pegasus / src / Pegasus / Repository

  1 mike  1.1 
  2 mike  1.2 /** Instance Repository
  3           
  4 mike  1.1 <h1>Synopsis</h1>
  5 mike  1.2     Describes the algorithms and architecture of the instances repository 
  6               (that part of the repository which deals with instances).
  7 mike  1.1 
  8           <h1>Layout</h1>
  9           
 10 mike  1.2 All files belonging to the instance repository are stored under the
 11 mike  1.3 repository/instances directory. For each class, two files are maintained:
 12           an index file (with a ".idx" extension) and an instance file which bears
 13 mike  1.1 the name of the class whose instances it contains. For example, suppose 
 14 mike  1.2 there is a class called "Zebra". Then two files are used to manage its 
 15           instances:
 16 mike  1.1 
 17               <pre>
 18 mike  1.2     repository/instances/Zebra.idx (called the index file)
 19               repository/instances/Zebra (called the instance file)
 20 mike  1.1     </pre>
 21           
 22           The first line of the index file indicates how many instances have been 
 23           modified or deleted since the last reorganization of the instance file.
 24           When this number--called a dirty count--reaches a configurable limit, the 
 25           index file and instance file are reorganized to reclaim unused space; 
 26 mike  1.3 unused space (or gaps) is left by delete and modify operations (discussed 
 27           later). The dirty count is expressed as eight hex digits (the number had to 
 28           have a fixed size so that it could be updated in place without having to 
 29           rewrite the index file).
 30 mike  1.1 
 31           Each subsequent line of the index file corresponds to an instance contained 
 32           in the instance file. Each line has the following fields:
 33           
 34               <ul>
 35           
 36               <li>
 37               deleted-flag - '1' if the corresponding instance was deleted, '0' otherwise.
 38           	When instances are deleted, the corresponding entry in the index file
 39           	is marked as deleted and a gap is left in the instance file until
 40           	reorganization time.
 41               </li>
 42           
 43               <li>
 44               hash-code - hash code for the key field below. This field is provided to
 45           	speed lookup of index entries. When looking up an entry, compute the 
 46           	hash code of the key and look for entries with the same hash code.
 47           	It is still necessary to compare the keys when the hash codes are
 48           	the same (since collisions are possible), but only when they are the
 49 mike  1.3 	same which is rare and hence this scheme saves many comparisons.
 50 mike  1.1     </li>
 51           
 52               <li>
 53 mike  1.3     offset - offset within the instance file to where the instance begins.
 54 mike  1.1     </li>
 55           
 56               <li>
 57               size - size in bytes of the instance itself (as it appears in the instance
 58           	file).
 59               </li>
 60           
 61               <li>
 62               key - the compound key of the instance (including all key binding pairs). 
 63           	This key is in standard form which we define as follows: all property 
 64           	names are shifted to lower case and the key-bindings are sorted by
 65           	property names (in this way it suffices to compare key expressions to
 66 mike  1.3 	determine if two compound keys refer to the same instance).
 67 mike  1.1     </li>
 68           
 69               </ul>
 70           
 71           Here is a sample index file:
 72           
 73               <pre>
 74               00000002
 75               0 02FB6B6C 0 1427 Y.key1=1001,key2="Hello World 1"
 76               0 50699A66 1427 1433 Y.key1=1002,key2="Hello World 2"
 77               1 EB45F85A 2860 1433 Y.key1=1004,key2="Hello World 4"
 78               1 38B42754 4293 1431 Y.key1=1005,key2="Hello World 5"
 79               0 F79B5B0A 8583 1427 Y.key1=1666,key2="Hello World N"
 80               0 38B42754 4293 1431 Y.key1=1005,key2="Hello World 5"
 81               </pre>
 82           
 83 mike  1.3 Notice that the dirty count is equal to two and that two entries are marked 
 84           as deleted (these quantities must be equal). This indicates that the instance 
 85           file has two instances which are no longer used. The space used by these 
 86           instances will be reclaimed during reorganization.
 87 mike  1.1 
 88           The layout of the instance file is trivial. Instances are always appended to
 89 mike  1.3 the the instance file. The instances are kept end-to-end in the file.
 90 mike  1.1 
 91           <h1>Operations</h1>
 92           
 93           There are three operations which may be performed on the instance repository.
 94           create, modify, and delete. This section describes how these operations affect 
 95           the index and instance file. Note that the process described below for 
 96           performing the three operations actually contains extra steps described in
 97 mike  1.3 the "Recovery" section.
 98 mike  1.1 
 99           Creation. During creation, the instance is appended to the instance file and
100           an entry is appended to the index file. If an instance with the same key is
101           found, then the operation is disallowed.
102           
103           Deletion. To delete an instance, the corresponding entry in the index file is 
104           marked as deleted (by changing the first column from '0' to '1'). And then the 
105           dirty count is incremented and updated. If the dirty count has reached the 
106 mike  1.3 configured threshold, the index and instance files are reogranized (see
107 mike  1.1 the section entitled "Reorganization" for an explanation of how this is 
108           done).
109           
110           Modification. To modify an instance, the new modified instance is appended to
111           the instance file. Next the old entry with the same key is marked as deleted.
112           Finally, a new entry is inserted into the index file.
113           
114           <h1>Reorganization</h1>
115           
116           To improve performance, reclamation of unused space in the instance file 
117 mike  1.2 (called gaps) is postponed until there are m gaps. Deletion and modification
118 mike  1.1 operations create gaps. A dirty-count is maintained in the index file as
119 mike  1.2 described above. When this dirty count reaches m, available space is reclaimed.
120 mike  1.1 Reorganization is expensive: the entire file must be rewritten. The time
121           complexity is O(n) where n is the number of instances in the file. By 
122           postponing reorganization, the time complexity may be reduced to O(1).
123           
124           Reorganization requires rewriting the instance file and the index file. All
125           gaps are deleted in the instance file. For each gap there is an entry in the 
126           index file which is marked as deleted. This entry indicates where the gap 
127           starts in the instance file and how long it is. The offsets in the index file
128           are adjusted accordingly. The index file is also updated: all entries marked
129           as deleted are removed from the index file.
130           
131           <h1>Recovery</h1>
132           
133           To avoid corruption of the instance repository, a simple recovery scheme is
134           provided. For all operations (described in the Operations section), the 
135           following algorithm is used to ensure recoverability.
136           
137               <ul>
138           
139               <li>Check to see if any rollback files exist for instances of the given
140           	class. If so then perform rollback (see recoverability algorithm
141 mike  1.1 	for details).
142               </li>
143           
144               <li>Create a rollback file for the instance file. The rollback file 
145           	contains the original size of the instance file.
146               </li>
147           
148               <li>Create a rollback file for the index file. The rollback file is a
149           	copy of the instance file (this can be optimized later).
150               </li>
151           
152               <li>Modify the instance file as described in the operations section.
153               </li>
154           
155               <li>Modify the index file as described in the operations section.
156               </li>
157           
158               <li>Perform reorganization as described in the reorganization section
159           	if the dirty-count has reached the threshold. Otherwise, increment
160           	the dirty count.
161               </li>
162 mike  1.1 
163               <li>Delete the rollback files.
164               </li>
165           
166               </ul>
167           
168           The recoverability algorithm itself works as follows:
169           
170               <ul>
171           
172               <li>Delete the index file.
173               </li>
174           
175               <li>Rename the index rollback file to have the same name as the
176           	index file.
177               </li>
178           
179               <li>Truncate the instance file to have the same number of bytes as
180           	indicated in the instance rollback file.
181               </li>
182           
183 mike  1.1     <li>Delete the rollback files.
184               </li>
185           
186               </ul>
187           
188 mike  1.2 */

No CVS admin address has been configured
Powered by
ViewCVS 0.9.2