Organization: China Interactive Publishing Network (http://www.china-pub.com/)
RFC Document Chinese Translation Program (http://www.china-pub.com/compters/emook/aboutemook.htm)
E-mail: Ouyang@china-pub.com
Translator: kenen (kenen mail@eyou.com)
Translation time: 2001-6-5
Copyright: This Chinese translation copyright belongs to China Interactive Publishing Network. Can be used for non-commercial use free reprint, but must
Keep the translation and copyright information of this document.
Network Working Group Paul R. Johnson (BBN-Tenex)
RFC # 677 Robert H. Thomas (BBN-Tenex)
NIC # 31507 January 27, 1975
Double database maintenance
(The Maintenance of Duplicate Databases)
Foreword
This RFC document discusses questions that maintain a dual database on a similar Arpanet. It concisely
There is a problem of dual databases, and the solution to a particular type of dual database is described in detail. These concepts
Used to design a user ID database for TIP user authentication and account systems. We believe these concepts are generally distributed
Database problems are also applicable.
table of Contents
Introduction ................................................................................................ 1
Model ........................................................................................................................ 2
Database ...............................................................................................
Consistency ...................................................................................................
Time stamp .........................................................................................
Assignment .......................................................................................................................... 3
Creating .............................................................................................................
Delete ............................................................................................................. ... 4
Removal of deleted entrance .................................................................................... 5
Summary of the solution .....................................................................................
Conclusion .......................................................................................................................... 6
Introduction
There are many motivations to maintain redundant dual copies of databases in distributed database network environments. Two of them
The important motives are as follows:
- Increase data access reliability
Under redundant maintenance methods, the access to key data will inevitably increase. Data library for TIP login and account management
High reliability is obtained through redundant distribution.
- Ensure efficiency of data access
The data is very close to the access process, and can be accessed quickly and efficiently. Used in TIP user ID database in support TIP
Each site for landing services saves a copy to ensure fast and efficient access. (Reliability considerations show this number
According to libraries, it is redundant, and high-efficiency considerations indicate that there is a copy at each license site. )
Redundant dual database systems design is a challenging job because it needs to be processed on the database copy.
The delay between inter-communication, the real limitations in the real world, the wrong operation, the failure of communication, and so on.
This article will discuss some of our problems we encounter when designing such a system, and describing how to design a specific type
Database systems to solve these problems.
model
A database system that supports dual copying can be used to copy this way for each processor to copy this method by means of a group of independent database management processors (DBMPS). These processors are connected to each other on the network channel
letter. Each processor DBMP is fully controlled to its database copy, process all of the other processors for responding
Data inventory and modify operations. Although the processor DBMP is only processed, we often
To see they actually cause modification operations.
An important design problem is to consider errors in the communication channel between the processor DBMP. Therefore, one
The processor DBMP can make it interrupt between it and other processor DBMP, or must wait until it
The channel re-established with other processor DBMP communications can be communicated. This article makes a hypothesis for the communication channel,
That is, the information from one processor is passed in the same order in the sequence of sequence to another process: ArpaNet meets this strip
For the network without such a guaranteed, you can use the communication protocol between the processor DBMP to correctly align the letter.
interest.
For further discussion, it is necessary to determine the nature of the dual database and the operation thereof. One extreme is
A stable read-only database, the processor DBMP is relatively simple, and they simply respond to the value search.
begging. Another extreme is, a shared database allows processing, such as modification requests, such as X: = F (X, Y, Z) functions, and / or
It is necessary to completely limit the access entry when modified. In this case, shared all databases on a single-machine system
The problem has appeared (for example, a synchronous mechanism, potential deadlock problem), there are also more than a separate computer
A database copy issue has also appeared. For example, a general system must handle communication failure and cause network segmentation into
The possibility of two or more subnets; the solution relying on locking the database unit when synchronous modification must handle without communication
The processor in the subnet attempts to lock the possibility of the same unit, or they can do this, (but this violates the lock
Rules), or they must wait until the end of the division (but this may take a long time), or use centralized or hierarchical control
(However, this results in some processor DBMP dependencies to other processor DBMPs when modifying and accessing data).
database
The database type discussed herein appears in a collection of entities ---- (selection item, value). Each selection is only
First, the value is an atomic entity for the processor DBMP. This mechanism proposed can extend to handle more complex
Structured database ---- The value itself can be (select item, value) collection form --- But this extension will not be
Further discussion.
Allow four operations to the database:
1) Select: Give a selection, return the value that matches it.
2) Assignment: Given a selection item and value, this given value replaces the previous value associated with this given selection item.
3) Create: A new selection and an initial value, then a new entity (select item, value) to the database
in.
4) Delete: give a selection, existing entities (selection, value) from the database.
Note The modification of the value is limited to assignment operations. Function Modification Request ---- For example, "Reset X into Factorial (x) ----
It is outside the rules. If these requests are allowed to enforce the system synchronous interlock.
consistency
Another problem that must be considered is the degree of consistency of database copies. Due to the communication between the processor DBMP
Delay, so it is impossible to ensure that the database is exactly the same at any time. Our goal is not to ensure that the copy is finished.
All the same, but to ensure the consistency between the copy. In this case, we can think that it is stopped when any entry is stopped.
When modifying operation, there is sufficient communication time between the processor DBMP, then the state of the entry (its presence and value) is the same in all the copies in all databases.
Timestamp
We allow any processor DBMP for creating and maintaining the database to modify the database. Of course, one
Processor DBMP performs some changes to communicate with other processor DBMP. In order to ensure consistency, all
The instrument DBMP must make the same decision, that is, which modification to a particular entry will be the last result. We greete
Hope to choose the latest changes, however, because there is no commonly accented serial number generator (a network time standard)
It is sufficient to have an absolute method to determine the order of time in the distributed system, so "the most late" only
Can be approximation. We get this approximation by attaching a timestamp to each of the modifications of each entrance. Modified
The later timestamp will be set to the current timestamp (1).
--------- Description:
(1) Time is useful in the front-rear relationship because it has the reasonableness of the desired monotonic increased attributes and accuracy.
The degree of effectiveness. Any other sorting method with these attributes can also be used, you can choose "Time time",
Because it is easy to get. Its main defect is that it is often manually set (thus easy to generate errors), and it is in the system
The service is stopped. With the computer system adjustment to adapt to the network environment, use a separate time source will become
More common. This has already appeared in the Arpanet's Tenex site.
Since the uniqueness of the timestamp on a processor DBMP cannot be guaranteed, a "source processor DBMP)
"It is included in each timestamp. Arrange the processor DBMP through (arbitrarily), and we have the only order time
stamp. Each timestamp is used (T, D) indicates that t is time, and D is the identifier of the processor DBMP. For two time stamps
(T1, D1) and (T2, D2), we can get
(T1, D1)> (T2, D2) <=> (T1> T2) or (T1 = T2 and D1> D2)
(T1, D1) can be considered more late than (T2, D2).
If D1 = D2 and T1 = T2, these two modifications are considered to be two copies of the same modified request.
In order to ensure the uniqueness of the timestamp, each independent processor DBMP is attached to different modifications.
between. This is of course possible, even if the advantage of the time unit will limit the frequency of modification operations on a single DBMP site.
Every entrance to the database is now one step:
E :: = (s, v, t),
Here
S is the selected item mentioned above,
V is the value associated with S,
T is the timestamp of the atmosphere of the entrance (one mentioned above (T, D))
The task of each processor DBMP is to keep the database copy is the latest based on the received modified operation information.
At the same time, it must ensure that each of its portions and the entrance of all other DBMPs are consistent. This must be realized, although it receives
The modification order is different from the modification sequence received by other processor DBMP. In the back of this article, we will consider
The operation of the database and their impact on the processor DBMP.
Value
First, consider the case where an exit entry is assigned. When it is assigned, the processor DBMP involved in this
Make a modification, and create a column including the modified portal and the processor DBMP that must be sent.
Copy of the table. As this modification is passed to other processor DBMP, they are removed from the list until
There is no remaining DBMP, then remove the copy of this modified operation. This distributed mechanism must have no errors (for example
The reception of the modified operation must be confirmed before the reception list is removed). (2)
---------- Description:
(2) This same process (local modification and queuing for remote distribution) may of course be other possible databases
operating. How to modify local changes, how information is arranged, how to pass, etc., although it is important but here
Not discussed. Using a address table that delivers modification information is easy to implement, and in actual applications are not very
difficult.
When a DBMP receives an assignment modification (local or from another processor DBMP), it will modify
The timestamp is compared to the timestamp in its own database, and keep this modification operation according to the above
Definition is the most late. Thus, when the assignment of a given entry is distributed on all processor DBMP, it
We can guarantee the latest in the entry related to the value.
create
The creation and deletion of the entity will result in more than one problem. Note that the previous unknown entity needs a process
The DBMP can handle the assignment of unknown entities. For example, considering the emotion of the entity XYZ created by the processor DBMP A
The order of the event is as follows: The processor dbmp A tells the processor DBMP B new entity, then processor DBMP
B assign a new value to XYZ; processor DBMP B is acquired from the processor DBMP C from the processor DBMP A
Notify DBMP C before the interest rate Processor DBMP c must save XYZ's assignment until it knows the creation of entities, or
Simple hypothesis This creation will happen and immediately use this "new" entity. The latter tries to keep the database as possible as possible
New, but will result in loss of consistency.
delete
The deletion of the entity is the main problem of the database system. If you delete the entry, you will remove it from the database.
The problem. Consider the following situation:
XYZ is the entrance to all DBMP
XYZ delete in processor DBMP a
XYZ modification in processor DBMP B (before the processor dbmp B knows it before being deleted by the processor DBMP A)
Now consider the processor DBMP C, which may get the message of the processor DBMP B before the processor DBMP A,
In this case, XYZ will be deleted at the processor DBMP C. However, the processor DBMP C may also be in the processor DBMP
B get the message from the processor DBMP A before. In this case, if the processor DBMP C is removed from the database
XYZ, then XYZ initially assigned by the processor DBMP B will recreate the XYZ of the processor DBMP C. for
Prevent this situation, processor DBMP c must remember XYZ has been deleted until it is completely forgotted XYZ is not
Perform operations related to XYZ.
The solution to this problem is that it does not have to be removed from the database, deleting just in this entrance to make a deletion
Except marker. However, the problem of receiving the assignment of the deleted entry still exists. For example, processor DBMP A may
Receive a assignment from the processor DBMP B, and this inlet processor DBMP A has been marked as deleted. processor
DBmp A cannot distinguish whether the processor DBMP B did not hear the deleted message, or heard it but has received one
Re-created requests, this request processor dbmp A does not know. In order to make the processor DBMP A can solve this
In the case, we add another timestamp in all the entrance: the timestamp created in the entrance. Therefore, in the above example,
Processor DBMP A can compare the value of the value and the creation of the entity. The most late creation timestamp is retained. This
When a modification operation with an outdated timestamp will be ignored.
Now we use a five-way group to describe inlets and modifications:
E :: = (s, v, f, ct, t)
S is the selection item
V is related value
F is deleted / unsteaded flag
CT is created timestamp
T is the last revised timestamp
Note that a modified element F, the value of CT, and T is uniquely launched the type of modified operation. Therefore, it is only a
Select the new entry of the new entry, not the type of modification, need to communicate: f is not deleted, ct = t => creation
F is not deleted, CT
F Delete => Delete
The mechanism described above processes all the operations of the distributed database to ensure consistency of all copies. A processor
DBMP will take effect for the database's modification of the database as long as the processor DBMP starts modified.
This mechanism is inefficient that the deleted entry is not removed from the database. The following will be discussed to be deleted.
Method of "garbage collection".
Removal of the entrance
The basic limit of this problem is a processor DBMP until the assignment of any of the same options (s) or
The assignment of the outdated timestamp (CT) deletes the entry. If the operation fails, you can't distinguish the same selection.
Outdated assignment and newly created entry assignment. In order to be able to do this, each processor DBMP must know
No All other processor DBMP already knows the message to delete the entry. In order to achieve it, each processor DBMP is
When you hear the delete message, you can notify the other processor DBMP if these notifications follow the normal movement of the modified operation.
If the order is transmitted, then each processor DBMP receives a notification to confirm that the sender processor DBMP has been passed.
Send the assigned entry to the deleted entry, tag it to delete and will no longer produce any new assignment. Use this way
Track and exchange each independent deleted entry is much more complicated than a general requirement.
Considering the simplicity, let each processor DBMP passed all of its modifications in the order. We make all
The processor DBMP saves a timestamp that is received by the last modified timestamp received from other processor DBMP.
table. Any processor DBMP, such as the processor DBMP A guarantees that all of the other processes have been received.
The modification caused by DBMP, for example, from the processor DBMP B, this table is in the processor dbmp A, which has an earlier than
Or equal to the timestamp of the entrance of the processor DBMP B. If this table exchanges between processor DBMP, then all
The processor DBMP will have a table, the entrance (i, j) of the second N * N (N = processor DBMP) will be a processor
DBMP i receives the last modified timestamp from the processor DBMP J. So this table is in the processor DBMPA
When all the entrances of the row are late or equal to the timestamp of the deleted entry, the processor DBMP A can remove one
The entry that causes deleted in processor dbmpk. When a processor DBMP receives a delete modification, except
For the confirmation, it should also be sent to all other processor DBMP, which is passed by one.
The timestamp information arranged in the normal order of the modification operation information is sent.
We can improve the improvement of the number of exchange information and the size of the table, making DBMP only
Exchange the first sheet (consisting of the last modified timestamp received from other processor DBMP)
mouth. These will be saved in a table of 1 * n, replacing the table of N * N. A DBMP If the receiving timestamp is equal to or
It's obvious to the timestamp of the second table, it knows this modification has been dbmp by all other processors.
Confirmed. A deleted entry is thus removal when the timestamp meets the condition. A processor DBMP reception
After a delete modification, the last modified timestamp information is arranged.
Summary of the solution
One entry in the database is a five-way group:
(S, V, F, CT, T)
S is the selection item in all references in this entrance
V is its current value
F is a delete / unselected flag
CT is the timestamp created by this entrance
T is the timestamp of the modification operation of the current V (and / or) F.
A timestamp is a (T, D) pair, is the location generated by the processor DBMP, the processor DBM is
The sort of the arms is completely sorted by timestamp. A modification operation is a (processor DBMP collection, entry), processor DBMP collection is the entrance must be passed
The collection of DBMPs.
A modified timestamp node table remains in each processor DBMP. Processor DBMP periodically trying to pass
Dressive Removal Request Give the processor DBMP that is reserved in the modified processor DBMP collection. Modify the entrance
It has been passed to all processor DBMP times to delete it.
When a processor DBMP receives another processor DBMP modification request, it compares the time of the request
The timestamp of the stamp and the relevant entrance (if any) in the database, and determines the operation according to the comparison results or ignore this new
The entrance.
Each processor DBMP saves a time for the most late modification operation received from other processor DBMP
Stamp vector.
When a processor DBMP receives a delete modification, it sends a timestamp information to all other
The amount contains the processor DBMP of the last timestamp of the last modified operation. Each processor DBMP is saved from it
The second vector of the late timestamp received by DBMP.
A delete entry If it's timestamp (T), the second vector of the timestamp received from other processor DBMP
When there is a timestamp, remove it from the database.
in conclusion
This paper proposes various techniques that make many loosely coupled processors can maintain a double copy of a database, even
The way they communicate is unreliable. The copy of the database can be consistent, but abnormal behavior may also occur.
For example, one user can use a processor DBMP to assign a value, and then use another processor DBMP
Assign a new value but found that the first value remains in the database. This situation may appear if these two processes
The server DBMP uses the clock because the timestamp of the synchronous second assignment is before the first assignment. If the communication channel can be
Relying to reach a certain extent, and the clock used by the processor process is synchronized, then the possibility of illegal behavior
Sex is small. However, the distribution attribute of the system indicates that this possibility is definitely not 0.
The main description herein is clearly explicitly indicating the time series of the modified operation by modifying the operation and the timestamp of the inlet.
This makes each processor DBMP to select the latest modifications for an entry. Moreover, the timestamp makes the processor DBMP
Can determine when a deleted entry (eg, still active) is no longer referenced and redistributed. These technologies
The technique will have a wider range of applications in systems that build and model other parallel and collaborative operations.
RFC677 Internet RFC / STD / FYI / BCP Archives Double Database Maintenance
RFC Document Chinese Translation Scheme - 1 -