Artifact d7cc99350c10134f358fe1a8997d9225b3f712b2:

File
ext/rbu/sqlite3rbu.h
— part of check-in
[1a1b69e8]
at
2016-03-18 18:56:45
on branch rbu-percent-progress
— Change the name of the new API on this branch to "sqlite3_bp_progress". Add tests and documentation for the same.
(user:
dan
size: 22635)
[more...]

1 /*
2 ** 2014 August 30
3 **
4 ** The author disclaims copyright to this source code. In place of
5 ** a legal notice, here is a blessing:
6 **
7 ** May you do good and not evil.
8 ** May you find forgiveness for yourself and forgive others.
9 ** May you share freely, never taking more than you give.
10 **
11 *************************************************************************
12 **
13 ** This file contains the public interface for the RBU extension.
14 */
15
16 /*
17 ** SUMMARY
18 **
19 ** Writing a transaction containing a large number of operations on
20 ** b-tree indexes that are collectively larger than the available cache
21 ** memory can be very inefficient.
22 **
23 ** The problem is that in order to update a b-tree, the leaf page (at least)
24 ** containing the entry being inserted or deleted must be modified. If the
25 ** working set of leaves is larger than the available cache memory, then a
26 ** single leaf that is modified more than once as part of the transaction
27 ** may be loaded from or written to the persistent media multiple times.
28 ** Additionally, because the index updates are likely to be applied in
29 ** random order, access to pages within the database is also likely to be in
30 ** random order, which is itself quite inefficient.
31 **
32 ** One way to improve the situation is to sort the operations on each index
33 ** by index key before applying them to the b-tree. This leads to an IO
34 ** pattern that resembles a single linear scan through the index b-tree,
35 ** and all but guarantees each modified leaf page is loaded and stored
36 ** exactly once. SQLite uses this trick to improve the performance of
37 ** CREATE INDEX commands. This extension allows it to be used to improve
38 ** the performance of large transactions on existing databases.
39 **
40 ** Additionally, this extension allows the work involved in writing the
41 ** large transaction to be broken down into sub-transactions performed
42 ** sequentially by separate processes. This is useful if the system cannot
43 ** guarantee that a single update process will run for long enough to apply
44 ** the entire update, for example because the update is being applied on a
45 ** mobile device that is frequently rebooted. Even after the writer process
46 ** has committed one or more sub-transactions, other database clients continue
47 ** to read from the original database snapshot. In other words, partially
48 ** applied transactions are not visible to other clients.
49 **
50 ** "RBU" stands for "Resumable Bulk Update". As in a large database update
51 ** transmitted via a wireless network to a mobile device. A transaction
52 ** applied using this extension is hence refered to as an "RBU update".
53 **
54 **
55 ** LIMITATIONS
56 **
57 ** An "RBU update" transaction is subject to the following limitations:
58 **
59 ** * The transaction must consist of INSERT, UPDATE and DELETE operations
60 ** only.
61 **
62 ** * INSERT statements may not use any default values.
63 **
64 ** * UPDATE and DELETE statements must identify their target rows by
65 ** non-NULL PRIMARY KEY values. Rows with NULL values stored in PRIMARY
66 ** KEY fields may not be updated or deleted. If the table being written
67 ** has no PRIMARY KEY, affected rows must be identified by rowid.
68 **
69 ** * UPDATE statements may not modify PRIMARY KEY columns.
70 **
71 ** * No triggers will be fired.
72 **
73 ** * No foreign key violations are detected or reported.
74 **
75 ** * CHECK constraints are not enforced.
76 **
77 ** * No constraint handling mode except for "OR ROLLBACK" is supported.
78 **
79 **
80 ** PREPARATION
81 **
82 ** An "RBU update" is stored as a separate SQLite database. A database
83 ** containing an RBU update is an "RBU database". For each table in the
84 ** target database to be updated, the RBU database should contain a table
85 ** named "data_<target name>" containing the same set of columns as the
86 ** target table, and one more - "rbu_control". The data_% table should
87 ** have no PRIMARY KEY or UNIQUE constraints, but each column should have
88 ** the same type as the corresponding column in the target database.
89 ** The "rbu_control" column should have no type at all. For example, if
90 ** the target database contains:
91 **
92 ** CREATE TABLE t1(a INTEGER PRIMARY KEY, b TEXT, c UNIQUE);
93 **
94 ** Then the RBU database should contain:
95 **
96 ** CREATE TABLE data_t1(a INTEGER, b TEXT, c, rbu_control);
97 **
98 ** The order of the columns in the data_% table does not matter.
99 **
100 ** Instead of a regular table, the RBU database may also contain virtual
101 ** tables or view named using the data_<target> naming scheme.
102 **
103 ** Instead of the plain data_<target> naming scheme, RBU database tables
104 ** may also be named data<integer>_<target>, where <integer> is any sequence
105 ** of zero or more numeric characters (0-9). This can be significant because
106 ** tables within the RBU database are always processed in order sorted by
107 ** name. By judicious selection of the the <integer> portion of the names
108 ** of the RBU tables the user can therefore control the order in which they
109 ** are processed. This can be useful, for example, to ensure that "external
110 ** content" FTS4 tables are updated before their underlying content tables.
111 **
112 ** If the target database table is a virtual table or a table that has no
113 ** PRIMARY KEY declaration, the data_% table must also contain a column
114 ** named "rbu_rowid". This column is mapped to the tables implicit primary
115 ** key column - "rowid". Virtual tables for which the "rowid" column does
116 ** not function like a primary key value cannot be updated using RBU. For
117 ** example, if the target db contains either of the following:
118 **
119 ** CREATE VIRTUAL TABLE x1 USING fts3(a, b);
120 ** CREATE TABLE x1(a, b)
121 **
122 ** then the RBU database should contain:
123 **
124 ** CREATE TABLE data_x1(a, b, rbu_rowid, rbu_control);
125 **
126 ** All non-hidden columns (i.e. all columns matched by "SELECT *") of the
127 ** target table must be present in the input table. For virtual tables,
128 ** hidden columns are optional - they are updated by RBU if present in
129 ** the input table, or not otherwise. For example, to write to an fts4
130 ** table with a hidden languageid column such as:
131 **
132 ** CREATE VIRTUAL TABLE ft1 USING fts4(a, b, languageid='langid');
133 **
134 ** Either of the following input table schemas may be used:
135 **
136 ** CREATE TABLE data_ft1(a, b, langid, rbu_rowid, rbu_control);
137 ** CREATE TABLE data_ft1(a, b, rbu_rowid, rbu_control);
138 **
139 ** For each row to INSERT into the target database as part of the RBU
140 ** update, the corresponding data_% table should contain a single record
141 ** with the "rbu_control" column set to contain integer value 0. The
142 ** other columns should be set to the values that make up the new record
143 ** to insert.
144 **
145 ** If the target database table has an INTEGER PRIMARY KEY, it is not
146 ** possible to insert a NULL value into the IPK column. Attempting to
147 ** do so results in an SQLITE_MISMATCH error.
148 **
149 ** For each row to DELETE from the target database as part of the RBU
150 ** update, the corresponding data_% table should contain a single record
151 ** with the "rbu_control" column set to contain integer value 1. The
152 ** real primary key values of the row to delete should be stored in the
153 ** corresponding columns of the data_% table. The values stored in the
154 ** other columns are not used.
155 **
156 ** For each row to UPDATE from the target database as part of the RBU
157 ** update, the corresponding data_% table should contain a single record
158 ** with the "rbu_control" column set to contain a value of type text.
159 ** The real primary key values identifying the row to update should be
160 ** stored in the corresponding columns of the data_% table row, as should
161 ** the new values of all columns being update. The text value in the
162 ** "rbu_control" column must contain the same number of characters as
163 ** there are columns in the target database table, and must consist entirely
164 ** of 'x' and '.' characters (or in some special cases 'd' - see below). For
165 ** each column that is being updated, the corresponding character is set to
166 ** 'x'. For those that remain as they are, the corresponding character of the
167 ** rbu_control value should be set to '.'. For example, given the tables
168 ** above, the update statement:
169 **
170 ** UPDATE t1 SET c = 'usa' WHERE a = 4;
171 **
172 ** is represented by the data_t1 row created by:
173 **
174 ** INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..x');
175 **
176 ** Instead of an 'x' character, characters of the rbu_control value specified
177 ** for UPDATEs may also be set to 'd'. In this case, instead of updating the
178 ** target table with the value stored in the corresponding data_% column, the
179 ** user-defined SQL function "rbu_delta()" is invoked and the result stored in
180 ** the target table column. rbu_delta() is invoked with two arguments - the
181 ** original value currently stored in the target table column and the
182 ** value specified in the data_xxx table.
183 **
184 ** For example, this row:
185 **
186 ** INSERT INTO data_t1(a, b, c, rbu_control) VALUES(4, NULL, 'usa', '..d');
187 **
188 ** is similar to an UPDATE statement such as:
189 **
190 ** UPDATE t1 SET c = rbu_delta(c, 'usa') WHERE a = 4;
191 **
192 ** Finally, if an 'f' character appears in place of a 'd' or 's' in an
193 ** ota_control string, the contents of the data_xxx table column is assumed
194 ** to be a "fossil delta" - a patch to be applied to a blob value in the
195 ** format used by the fossil source-code management system. In this case
196 ** the existing value within the target database table must be of type BLOB.
197 ** It is replaced by the result of applying the specified fossil delta to
198 ** itself.
199 **
200 ** If the target database table is a virtual table or a table with no PRIMARY
201 ** KEY, the rbu_control value should not include a character corresponding
202 ** to the rbu_rowid value. For example, this:
203 **
204 ** INSERT INTO data_ft1(a, b, rbu_rowid, rbu_control)
205 ** VALUES(NULL, 'usa', 12, '.x');
206 **
207 ** causes a result similar to:
208 **
209 ** UPDATE ft1 SET b = 'usa' WHERE rowid = 12;
210 **
211 ** The data_xxx tables themselves should have no PRIMARY KEY declarations.
212 ** However, RBU is more efficient if reading the rows in from each data_xxx
213 ** table in "rowid" order is roughly the same as reading them sorted by
214 ** the PRIMARY KEY of the corresponding target database table. In other
215 ** words, rows should be sorted using the destination table PRIMARY KEY
216 ** fields before they are inserted into the data_xxx tables.
217 **
218 ** USAGE
219 **
220 ** The API declared below allows an application to apply an RBU update
221 ** stored on disk to an existing target database. Essentially, the
222 ** application:
223 **
224 ** 1) Opens an RBU handle using the sqlite3rbu_open() function.
225 **
226 ** 2) Registers any required virtual table modules with the database
227 ** handle returned by sqlite3rbu_db(). Also, if required, register
228 ** the rbu_delta() implementation.
229 **
230 ** 3) Calls the sqlite3rbu_step() function one or more times on
231 ** the new handle. Each call to sqlite3rbu_step() performs a single
232 ** b-tree operation, so thousands of calls may be required to apply
233 ** a complete update.
234 **
235 ** 4) Calls sqlite3rbu_close() to close the RBU update handle. If
236 ** sqlite3rbu_step() has been called enough times to completely
237 ** apply the update to the target database, then the RBU database
238 ** is marked as fully applied. Otherwise, the state of the RBU
239 ** update application is saved in the RBU database for later
240 ** resumption.
241 **
242 ** See comments below for more detail on APIs.
243 **
244 ** If an update is only partially applied to the target database by the
245 ** time sqlite3rbu_close() is called, various state information is saved
246 ** within the RBU database. This allows subsequent processes to automatically
247 ** resume the RBU update from where it left off.
248 **
249 ** To remove all RBU extension state information, returning an RBU database
250 ** to its original contents, it is sufficient to drop all tables that begin
251 ** with the prefix "rbu_"
252 **
253 ** DATABASE LOCKING
254 **
255 ** An RBU update may not be applied to a database in WAL mode. Attempting
256 ** to do so is an error (SQLITE_ERROR).
257 **
258 ** While an RBU handle is open, a SHARED lock may be held on the target
259 ** database file. This means it is possible for other clients to read the
260 ** database, but not to write it.
261 **
262 ** If an RBU update is started and then suspended before it is completed,
263 ** then an external client writes to the database, then attempting to resume
264 ** the suspended RBU update is also an error (SQLITE_BUSY).
265 */
266
267 #ifndef _SQLITE3RBU_H
268 #define _SQLITE3RBU_H
269
270 #include "sqlite3.h" /* Required for error code definitions */
271
272 #ifdef __cplusplus
273 extern "C" {
274 #endif
275
276 typedef struct sqlite3rbu sqlite3rbu;
277
278 /*
279 ** Open an RBU handle.
280 **
281 ** Argument zTarget is the path to the target database. Argument zRbu is
282 ** the path to the RBU database. Each call to this function must be matched
283 ** by a call to sqlite3rbu_close(). When opening the databases, RBU passes
284 ** the SQLITE_CONFIG_URI flag to sqlite3_open_v2(). So if either zTarget
285 ** or zRbu begin with "file:", it will be interpreted as an SQLite
286 ** database URI, not a regular file name.
287 **
288 ** If the zState argument is passed a NULL value, the RBU extension stores
289 ** the current state of the update (how many rows have been updated, which
290 ** indexes are yet to be updated etc.) within the RBU database itself. This
291 ** can be convenient, as it means that the RBU application does not need to
292 ** organize removing a separate state file after the update is concluded.
293 ** Or, if zState is non-NULL, it must be a path to a database file in which
294 ** the RBU extension can store the state of the update.
295 **
296 ** When resuming an RBU update, the zState argument must be passed the same
297 ** value as when the RBU update was started.
298 **
299 ** Once the RBU update is finished, the RBU extension does not
300 ** automatically remove any zState database file, even if it created it.
301 **
302 ** By default, RBU uses the default VFS to access the files on disk. To
303 ** use a VFS other than the default, an SQLite "file:" URI containing a
304 ** "vfs=..." option may be passed as the zTarget option.
305 **
306 ** IMPORTANT NOTE FOR ZIPVFS USERS: The RBU extension works with all of
307 ** SQLite's built-in VFSs, including the multiplexor VFS. However it does
308 ** not work out of the box with zipvfs. Refer to the comment describing
309 ** the zipvfs_create_vfs() API below for details on using RBU with zipvfs.
310 */
311 sqlite3rbu *sqlite3rbu_open(
312 const char *zTarget,
313 const char *zRbu,
314 const char *zState
315 );
316
317 /*
318 ** Internally, each RBU connection uses a separate SQLite database
319 ** connection to access the target and rbu update databases. This
320 ** API allows the application direct access to these database handles.
321 **
322 ** The first argument passed to this function must be a valid, open, RBU
323 ** handle. The second argument should be passed zero to access the target
324 ** database handle, or non-zero to access the rbu update database handle.
325 ** Accessing the underlying database handles may be useful in the
326 ** following scenarios:
327 **
328 ** * If any target tables are virtual tables, it may be necessary to
329 ** call sqlite3_create_module() on the target database handle to
330 ** register the required virtual table implementations.
331 **
332 ** * If the data_xxx tables in the RBU source database are virtual
333 ** tables, the application may need to call sqlite3_create_module() on
334 ** the rbu update db handle to any required virtual table
335 ** implementations.
336 **
337 ** * If the application uses the "rbu_delta()" feature described above,
338 ** it must use sqlite3_create_function() or similar to register the
339 ** rbu_delta() implementation with the target database handle.
340 **
341 ** If an error has occurred, either while opening or stepping the RBU object,
342 ** this function may return NULL. The error code and message may be collected
343 ** when sqlite3rbu_close() is called.
344 **
345 ** Database handles returned by this function remain valid until the next
346 ** call to any sqlite3rbu_xxx() function other than sqlite3rbu_db().
347 */
348 sqlite3 *sqlite3rbu_db(sqlite3rbu*, int bRbu);
349
350 /*
351 ** Do some work towards applying the RBU update to the target db.
352 **
353 ** Return SQLITE_DONE if the update has been completely applied, or
354 ** SQLITE_OK if no error occurs but there remains work to do to apply
355 ** the RBU update. If an error does occur, some other error code is
356 ** returned.
357 **
358 ** Once a call to sqlite3rbu_step() has returned a value other than
359 ** SQLITE_OK, all subsequent calls on the same RBU handle are no-ops
360 ** that immediately return the same value.
361 */
362 int sqlite3rbu_step(sqlite3rbu *pRbu);
363
364 /*
365 ** Force RBU to save its state to disk.
366 **
367 ** If a power failure or application crash occurs during an update, following
368 ** system recovery RBU may resume the update from the point at which the state
369 ** was last saved. In other words, from the most recent successful call to
370 ** sqlite3rbu_close() or this function.
371 **
372 ** SQLITE_OK is returned if successful, or an SQLite error code otherwise.
373 */
374 int sqlite3rbu_savestate(sqlite3rbu *pRbu);
375
376 /*
377 ** Close an RBU handle.
378 **
379 ** If the RBU update has been completely applied, mark the RBU database
380 ** as fully applied. Otherwise, assuming no error has occurred, save the
381 ** current state of the RBU update appliation to the RBU database.
382 **
383 ** If an error has already occurred as part of an sqlite3rbu_step()
384 ** or sqlite3rbu_open() call, or if one occurs within this function, an
385 ** SQLite error code is returned. Additionally, *pzErrmsg may be set to
386 ** point to a buffer containing a utf-8 formatted English language error
387 ** message. It is the responsibility of the caller to eventually free any
388 ** such buffer using sqlite3_free().
389 **
390 ** Otherwise, if no error occurs, this function returns SQLITE_OK if the
391 ** update has been partially applied, or SQLITE_DONE if it has been
392 ** completely applied.
393 */
394 int sqlite3rbu_close(sqlite3rbu *pRbu, char **pzErrmsg);
395
396 /*
397 ** Return the total number of key-value operations (inserts, deletes or
398 ** updates) that have been performed on the target database since the
399 ** current RBU update was started.
400 */
401 sqlite3_int64 sqlite3rbu_progress(sqlite3rbu *pRbu);
402

403 /*
404 ** Obtain permyriadage (permyriadage is to 10000 as percentage is to 100)
405 ** progress indications for the two stages of an RBU update. This API may
406 ** be useful for driving GUI progress indicators and similar.
407 **
408 ** An RBU update is divided into two stages:
409 **
410 ** * Stage 1, in which changes are accumulated in an oal/wal file, and
411 ** * Stage 2, in which the contents of the wal file are copied into the
412 ** main database.
413 **
414 ** The update is visible to non-RBU clients during stage 2. During stage 1
415 ** non-RBU reader clients may see the original database.
416 **
417 ** If this API is called during stage 2 of the update, output variable
418 ** (*pnOne) is set to 10000 to indicate that stage 1 has finished and (*pnTwo)
419 ** to a value between 0 and 10000 to indicate the permyriadage progress of
420 ** stage 2. A value of 5000 indicates that stage 2 is half finished,
421 ** 9000 indicates that it is 90% finished, and so on.
422 **
423 ** If this API is called during stage 1 of the update, output variable
424 ** (*pnTwo) is set to 0 to indicate that stage 2 has not yet started. The
425 ** value to which (*pnOne) is set depends on whether or not the RBU
426 ** database contains an "rbu_count" table. The rbu_count table, if it
427 ** exists, must contain the same columns as the following:
428 **
429 ** CREATE TABLE rbu_count(tbl TEXT PRIMARY KEY, cnt INTEGER) WITHOUT ROWID;
430 **
431 ** There must be one row in the table for each source (data_xxx) table within
432 ** the RBU database. The 'tbl' column should contain the name of the source
433 ** table. The 'cnt' column should contain the number of rows within the
434 ** source table.
435 **
436 ** If the rbu_count table is present and populated correctly and this
437 ** API is called during stage 1, the *pnOne output variable is set to the
438 ** permyriadage progress of the same stage. If the rbu_count table does
439 ** not exist, then (*pnOne) is set to -1 during stage 1. If the rbu_count
440 ** table exists but is not correctly populated, the value of the *pnOne
441 ** output variable during stage 1 is undefined.
442 */
443 void sqlite3rbu_bp_progress(sqlite3rbu *pRbu, int *pnOne, int *pnTwo);

444
445 /*
446 ** Create an RBU VFS named zName that accesses the underlying file-system
447 ** via existing VFS zParent. Or, if the zParent parameter is passed NULL,
448 ** then the new RBU VFS uses the default system VFS to access the file-system.
449 ** The new object is registered as a non-default VFS with SQLite before
450 ** returning.
451 **
452 ** Part of the RBU implementation uses a custom VFS object. Usually, this
453 ** object is created and deleted automatically by RBU.
454 **
455 ** The exception is for applications that also use zipvfs. In this case,
456 ** the custom VFS must be explicitly created by the user before the RBU
457 ** handle is opened. The RBU VFS should be installed so that the zipvfs
458 ** VFS uses the RBU VFS, which in turn uses any other VFS layers in use
459 ** (for example multiplexor) to access the file-system. For example,
460 ** to assemble an RBU enabled VFS stack that uses both zipvfs and
461 ** multiplexor (error checking omitted):
462 **
463 ** // Create a VFS named "multiplex" (not the default).
464 ** sqlite3_multiplex_initialize(0, 0);
465 **
466 ** // Create an rbu VFS named "rbu" that uses multiplexor. If the
467 ** // second argument were replaced with NULL, the "rbu" VFS would
468 ** // access the file-system via the system default VFS, bypassing the
469 ** // multiplexor.
470 ** sqlite3rbu_create_vfs("rbu", "multiplex");
471 **
472 ** // Create a zipvfs VFS named "zipvfs" that uses rbu.
473 ** zipvfs_create_vfs_v3("zipvfs", "rbu", 0, xCompressorAlgorithmDetector);
474 **
475 ** // Make zipvfs the default VFS.
476 ** sqlite3_vfs_register(sqlite3_vfs_find("zipvfs"), 1);
477 **
478 ** Because the default VFS created above includes a RBU functionality, it
479 ** may be used by RBU clients. Attempting to use RBU with a zipvfs VFS stack
480 ** that does not include the RBU layer results in an error.
481 **
482 ** The overhead of adding the "rbu" VFS to the system is negligible for
483 ** non-RBU users. There is no harm in an application accessing the
484 ** file-system via "rbu" all the time, even if it only uses RBU functionality
485 ** occasionally.
486 */
487 int sqlite3rbu_create_vfs(const char *zName, const char *zParent);
488
489 /*
490 ** Deregister and destroy an RBU vfs created by an earlier call to
491 ** sqlite3rbu_create_vfs().
492 **
493 ** VFS objects are not reference counted. If a VFS object is destroyed
494 ** before all database handles that use it have been closed, the results
495 ** are undefined.
496 */
497 void sqlite3rbu_destroy_vfs(const char *zName);
498
499 #ifdef __cplusplus
500 } /* end of the 'extern "C"' block */
501 #endif
502
503 #endif /* _SQLITE3RBU_H */

This page was generated in about
0.011s by
Fossil 2.8 [246f249e5a] 2019-01-21 20:07:41