Technical Features of the Berkeley DB Embedded Database

zhaozj2021-02-16  102

Although client/server relational database systems such as MySQL represent the mainstream of current database applications, they do not meet the needs of every application. Sometimes all we need is a simple disk-based database system. This not only avoids installing a large database server, but also simplifies the design of the database application. Berkeley DB is built on this idea.

Introduction to Berkeley DB

Berkeley DB is an open source embedded database management system that provides high-performance data management services for applications. Programmers only need to call a few simple APIs to access and manage data. Unlike commonly used database management systems such as MySQL and Oracle, Berkeley DB has no concept of a database server. The application does not need to establish a network connection to a database service; instead, it saves, queries, modifies, and deletes data through the Berkeley DB library embedded in the program.

Berkeley DB provides a practical API for many programming languages, including C, C++, Java, Perl, Tcl, Python, and PHP. All database operations are handled uniformly by the Berkeley DB library, whether the database is being accessed at the same time by multiple processes in the system or by multiple threads in the same process. The underlying data locking, transaction logging, and storage management are implemented inside the library and are completely transparent to the application. As the saying goes, "small as the sparrow is, it has all its vital organs": the Berkeley DB library is only about 300 KB, yet it can manage up to 256 TB of data, and in many respects its performance can compete with commercial-grade database systems. Under concurrent load, Berkeley DB can support thousands of users accessing the same database at the same time. In addition, if you need database management on a resource-constrained embedded system, Berkeley DB may be the only suitable choice.

As an embedded database system, Berkeley DB has distinct advantages in several respects. First, because the application and the database management system run in the same process address space, data operations avoid cumbersome inter-process communication, so communication overhead is reduced to a very low level. Second, Berkeley DB performs all database operations through a simple function-call interface rather than the SQL language used by most database systems, which avoids the overhead of parsing and processing a structured query language.

Berkeley DB is a full-service embedded database system for use by software developers. It is distributed in source code form, and is compiled and linked directly into your application.

The sections below give a detailed description of Berkeley DB's important features. If you need additional detail, the complete documentation suite is available online at http://www.sleepycat.com/products/documentation.shtml.

Ease of Use

Berkeley DB is intended for use by software developers who need to embed reliable, high-performance data management services in their applications. It does not require mastery of database-specific query languages, like SQL. Instead, developers make function calls that operate directly on the database and the records that it manages.

Once deployed, Berkeley DB is simple to administer. Other databases require a trained database administrator to handle backups, recovery, performance tuning, and routine maintenance. Berkeley DB uses standard operating system services, and needs no special device access. Maintenance tasks such as backup and recovery can be handled by standard operating system tools. Our goal is that end users of applications that embed Berkeley DB never be aware that they are using a database.

Open Source Distribution

Berkeley DB is an open source product, meaning that it is freely available for download in source code form, and may be freely used without commercial license under certain conditions. For information on licensing, see our Product Licensing page.

Access to the source code is itself one reason the product is easier to use.

First, you are no longer dependent on an outside vendor for changes, performance tuning, or debugging of the software. Sleepycat offers a full range of support and consulting services for Berkeley DB, but the fact that you have the source code means you have more control over your product than any binary database product can give you.

Second, you can integrate Berkeley DB into your product's build environment in the most natural way for you.

Finally, wide distribution of the source code means that many thousands of software developers have reviewed it. Berkeley DB's public interfaces and internal interfaces have all been carefully examined by a huge number of engineers, and their suggestions have produced a smaller, simpler, more reliable, and easier-to-use package.

Small Footprint

Berkeley DB is a compact system. The full package, including all access methods and recoverability and transaction support, is roughly 375 KB of text space on common architectures.

Choice of Several Easy-to-Use APIs

Built for programmers, by programmers, Berkeley DB requires no special training in database access languages. Instead, the system provides an easy-to-use function-call interface for operating on databases and the records that they store. This interface supports simple record insertion and search, but also more complicated operations, including cursors, joins, management of duplicate values, and more.

The C/C++ and Java APIs and full documentation for their use are included in the distributed system. Programmers working in other languages may also choose among Perl, Python, Tcl, Ruby, and others. The language-specific interfaces make all the power and flexibility of Berkeley DB available in a way that is natural for the language of choice.
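To give a feel for what a function-call interface looks like compared with SQL, here is a minimal sketch. It is not the Berkeley DB API: it uses Python's standard-library dbm.dumb module as a stand-in embedded key/value store, and the function name, file name, and records are made up for illustration.

```python
import dbm.dumb
import os
import tempfile


def demo_embedded_store() -> dict:
    """Insert, look up, update, and delete records with plain function calls."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "employees")  # hypothetical database name
        db = dbm.dumb.open(path, "c")  # "c": create the database if missing
        try:
            db[b"1001"] = b"Alice"           # insert a record
            db[b"1002"] = b"Bob"
            results["lookup"] = db[b"1001"]  # search by key
            db[b"1002"] = b"Robert"          # update in place
            del db[b"1001"]                  # delete a record
            results["remaining"] = sorted(db.keys())
            results["updated"] = db[b"1002"]
        finally:
            db.close()
    return results
```

No server is started and no query string is parsed; the store lives in ordinary files next to the application, which is the property the text describes.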

Thread-Safe Library

Because Berkeley DB can be deployed in so many different ways, Sleepycat has been careful to provide the tools that developers require, without mandating their use. A good example is Berkeley DB's support for multi-threaded operation.

The library is entirely thread-safe. As Berkeley DB itself does not mandate the use of any particular threads package, you can use the one you like best or the one most natural to your application. You can build applications that are single-threaded or multi-threaded, as your application requires.

Berkeley DB works equally well when multiple processes operate on a single database. Whether sharing is among threads in a single process, among processes on a machine, or some hybrid of the two, the database software correctly handles caching, locking, and other core services. You can concentrate on your application without worrying about database architecture.

File system integration

Once your application is deployed and running at your customer's site, ongoing maintenance of the database is a major concern. Berkeley DB has been carefully designed to minimize and, in most cases, entirely eliminate database administration (DBA) tasks.

Other database systems require use of a dedicated disk (a "raw" device) for data storage. Berkeley DB uses the native file system on all platforms. Using the native file system has several important benefits.

First, since no special hardware configuration or support is required, your application will install and operate more easily. Your customer never needs to know that a database system is running.

Second, your customer never needs to dedicate storage space to your application. Since the file system is shared by all the applications running on a system, Berkeley DB can share space with other tools. You and your customers will never need to preallocate storage to your application.

Finally, ongoing administration of the database is much simpler. Berkeley DB uses the directory and file management services of the operating system. Moving databases from one location to another, or even from one machine to another, is simpler, since it only requires copying ordinary files.

Database Dump and Load Utilities

Since Berkeley DB stores data in the native OS file system, in many cases no special backup or recovery tools are required. Operating systems typically require that a file system be completely quiescent for backup. However, some database applications must run all the time.

Berkeley DB includes programmatic interfaces to identify files that need to be backed up. Because Berkeley DB uses the native file system, applications can simply open, read, and copy files, even while the database is active. As a result, programmers can embed support for backups and recovery directly in their applications. Again, your users need never know that a database is installed.

Power and flexibility

Berkeley DB is a powerful, flexible data manager. The system provides the same services as more expensive database systems in a smaller, less expensive, and easier-to-use package.

Support for Arbitrary Data Types

Most database systems are able to store and retrieve only a small set of data types. Berkeley DB can manage any data type that can be represented in a programming language. Simple scalar values or complex data structures can be used either as keys or as the values stored with each key.

Berkeley DB is able to store data using several different access methods. An application can use the storage structure and search strategy best suited to its needs. All of the access methods include default routines for operating on keys and values, so search and retrieval are easy to program. On the other hand, developers can override the defaults by providing management functions (for example, comparison or hash functions) specific to their data types. Using Berkeley DB, you can define your own keys and your own key ordering.
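A small sketch of why an application might supply its own comparison function. It assumes keys are stored as raw bytes and compared byte by byte unless told otherwise; the helper names are hypothetical and nothing here calls Berkeley DB itself. Integers packed in little-endian form sort incorrectly under plain byte order, and a custom comparison restores numeric order.

```python
import struct
from functools import cmp_to_key


def pack_key(n: int) -> bytes:
    """Encode an unsigned 32-bit integer as a little-endian byte-string key."""
    return struct.pack("<I", n)


def unpack_key(key: bytes) -> int:
    return struct.unpack("<I", key)[0]


def compare_numeric(a: bytes, b: bytes) -> int:
    """Application-supplied ordering: compare keys by decoded integer value."""
    x, y = unpack_key(a), unpack_key(b)
    return (x > y) - (x < y)


keys = [pack_key(n) for n in (1, 256, 2)]

# Default ordering: lexicographic comparison of the raw bytes.
default_order = [unpack_key(k) for k in sorted(keys)]

# Custom ordering: the comparison function decides the key order.
custom_order = [unpack_key(k) for k in sorted(keys, key=cmp_to_key(compare_numeric))]
```

Here default_order comes out as [256, 1, 2] because the low byte is compared first, while custom_order is [1, 2, 256].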

Keyed and Sequential Access To Records

Berkeley DB supports both keyed and sequential access to records. Keyed access permits fast searches for records that match part or all of a specific key. Sequential access allows programs to open a database and iterate over all its records, without regard to keys.
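The three access patterns just described (exact key, partial key, and a sequential scan) can be sketched with an in-memory sorted key list. This is an illustration only; the class and method names are invented and do not correspond to Berkeley DB functions.

```python
from bisect import bisect_left


class SortedStore:
    """Toy ordered key/value store with keyed and sequential access."""

    def __init__(self, items: dict):
        self._keys = sorted(items)   # keys kept in sorted order
        self._data = dict(items)

    def get(self, key):
        """Keyed access: exact match on the full key."""
        return self._data.get(key)

    def prefix_search(self, prefix: str):
        """Keyed access on part of a key: all records whose key starts with prefix."""
        start = bisect_left(self._keys, prefix)  # first key >= prefix
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append((key, self._data[key]))
        return matches

    def scan(self):
        """Sequential access: visit every record without naming any key."""
        return [(key, self._data[key]) for key in self._keys]
```

For example, with keys such as "smith:anna" and "smith:bob", prefix_search("smith:") returns both records without touching unrelated keys.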

Store into Application or Allocated Memory

Performance is the critical variable among embedded database systems. One important way that developers can control performance is by deciding whether to preallocate memory for operating on records, or to allow Berkeley DB to allocate memory for them. Function calls to fetch records, for example, allow programmers to pass in a buffer for the return value, or to rely on Berkeley DB to allocate it.

Partial-Record Data Storage and Retrieval

Berkeley DB is able to manage long records. Since the time required to fetch a record is proportional to its size, the system includes tools for operating on partial records. If only a few bytes of a multi-megabyte record are required, the application can request partial record retrieval.
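The idea behind partial retrieval is to name an offset and a length and read only that slice. The sketch below shows the idea with an ordinary file standing in for a large record; it is not Berkeley DB code, and the function and file names are hypothetical.

```python
import os
import tempfile


def read_partial(path: str, offset: int, length: int) -> bytes:
    """Return `length` bytes starting at `offset`, without reading the rest."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def demo_partial_read() -> bytes:
    # A record of roughly 2 MB of which only the last 4 bytes are wanted.
    record = b"HEADER" + b"\x00" * (2 * 1024 * 1024) + b"TAIL"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "record.bin")
        with open(path, "wb") as f:
            f.write(record)
        return read_partial(path, len(record) - 4, 4)
```

Only four bytes are copied into the application, regardless of how large the stored record is.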

Support for cursors

Cursors are a database abstraction. They allow a program to iterate over multiple rows in the database easily. Berkeley DB supports cursors for ordinary searches, and for operating on sets of duplicate keys in the database.

Support for logical joins

In database terminology, a join is a combination of related data from two or more databases. For example, one database may store information on employees by employee ID, and another may store information on departments by department ID. If each employee is assigned a department ID, then Berkeley DB can join the two databases, and fetch department-specific information via the employee database.
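The employee and department example can be sketched with two dictionaries standing in for the two databases. The records and the helper name are invented for illustration, and this shows only the logical relationship, not how Berkeley DB implements joins.

```python
# "Employee database": keyed by employee ID; each record carries a department ID.
employees = {
    101: {"name": "Alice", "dept_id": 10},
    102: {"name": "Bob", "dept_id": 20},
}

# "Department database": keyed by department ID.
departments = {
    10: {"name": "Engineering"},
    20: {"name": "Sales"},
}


def department_of(employee_id: int) -> dict:
    """Follow the employee's department ID into the department database."""
    employee = employees[employee_id]
    return departments[employee["dept_id"]]
```

Calling department_of(101) starts from the employee record and returns the matching department record.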

Secondary Indices

Many applications need to look up a single record by more than one key at different times. Berkeley DB provides a mechanism called secondary indices to make this easy. An application can declare that a set of tables are related, with one storing the primary record and the rest providing fast lookup by alternate keys. Berkeley DB will update all the tables automatically whenever a new record is added to the primary table.

When the application wants to search for the record by one of the alternate keys, it simply searches the secondary index and asks Berkeley DB to return the related record from the primary table.
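A minimal sketch of the mechanism, assuming one primary table and one secondary index whose key is derived from each record by a caller-supplied function. The class, its methods, and the sample fields are hypothetical, not Berkeley DB's interface.

```python
class IndexedTable:
    """Primary table plus one secondary index that is maintained automatically."""

    def __init__(self, key_extractor):
        self._primary = {}       # primary key -> record
        self._secondary = {}     # alternate key -> set of primary keys
        self._extract = key_extractor

    def put(self, primary_key, record):
        old = self._primary.get(primary_key)
        if old is not None:
            # Drop the stale index entry before re-indexing the new record.
            self._secondary[self._extract(old)].discard(primary_key)
        self._primary[primary_key] = record
        self._secondary.setdefault(self._extract(record), set()).add(primary_key)

    def get(self, primary_key):
        return self._primary.get(primary_key)

    def get_by_secondary(self, alt_key):
        """Search the index, then fetch the related records from the primary table."""
        return [self._primary[pk] for pk in sorted(self._secondary.get(alt_key, ()))]
```

The application only ever writes to the primary table through put; the index follows along, which mirrors the automatic update described above.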

Memory-mapped, read-only databases

Many of the operating systems on which Berkeley DB runs support memory-mapped operations on files. For applications that require read-only access to an existing database, using memory-mapped databases provides outstanding performance.
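As an illustration of the operating system facility involved, the sketch below maps a file read-only with Python's mmap module and reads a slice from it. The file contents and function names are made up, and the file is a stand-in for a database file rather than a real Berkeley DB database.

```python
import mmap
import os
import tempfile


def read_with_mmap(path: str, offset: int, length: int) -> bytes:
    """Map the file read-only and copy out a slice of the mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return bytes(view[offset:offset + length])


def demo_mmap() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.db")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"key1=alpha;key2=beta;")
        return read_with_mmap(path, 5, 5)
```

With a read-only mapping the operating system pages data in on demand, and the application reads it as if it were memory.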

Main-memory databases

As computer system memory grows, more applications can run entirely out of main memory, rather than off of disk. Berkeley DB includes special support for main-memory databases. Using this support, applications can get fast access to the data that they need.

Architecture-Independent Databases

Applications today must run on a variety of hardware platforms. Even during the life of a single application at a customer's site, demand for services may change, forcing the application to move to new, faster hardware.

To simplify migration across hardware platforms, Berkeley DB can support the same database from either big-endian or little-endian systems. This allows end users to copy databases from one hardware platform to another.

Scalability

Large storage devices and wide-area, high-speed networking demand that applications manage more data for more users than ever before. Applications built today will see exponential growth in disk and memory sizes over their lifetimes. As a result, developers need to plan for scalability up front.

Berkeley DB was designed to scale gracefully from low-volume, single-user data management to the management of enormous databases.

Databases Up to 256 Terabytes

Berkeley DB uses 48 bits to address individual bytes in a database. This means that the largest theoretical Berkeley DB database is 2^48 bytes, or 256 terabytes, in size. Berkeley DB is in regular production use today managing databases that are hundreds of gigabytes in size.

Keys and Values Up to 4 Gigabytes

New applications, including multimedia storage and playback systems, must manage individual data values that are large. Berkeley DB is able to store single keys and values as large as 2^32 bytes, or four gigabytes, in size.
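The two limits follow directly from the address widths. A quick check of the arithmetic, using binary units (1 terabyte taken as 2^40 bytes and 1 gigabyte as 2^30 bytes):

```python
TIB = 2 ** 40  # bytes per terabyte (binary)
GIB = 2 ** 30  # bytes per gigabyte (binary)

max_database_bytes = 2 ** 48  # 48-bit byte addresses within a database
max_item_bytes = 2 ** 32      # largest single key or value

max_database_tib = max_database_bytes // TIB  # 256
max_item_gib = max_item_bytes // GIB          # 4
```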

Support for multiple readers

Berkeley DB applications support concurrent access to data by multiple readers, from a single process or from multiple processes. The system uses shared memory for caching, so that all users can share the work of fetching data from disk.

Read-only applications can declare the fact that they will do no database updates, reducing overhead and improving performance.

Support for Multiple Readers and Writers

Most database applications require simultaneous access by many users, some of whom need to update records, and others who need only to view them. Berkeley DB includes support for concurrent access by readers and writers to a single database.

Users may access the database from a single process or from multiple processes. Caches are shared among all users, and Berkeley DB uses native O/S locking support on all platforms to guarantee that readers and writers are able to work without interfering with one another.

Fine-grained Locking

Fine-grained page locking allows many readers and writers to be active in the database at the same time. For high-concurrency workloads, this dramatically improves throughput. Coarse-grained, database-wide locking allows many readers to access the database at the same time.

Before- and after-image logging

Berkeley DB includes support for write-ahead logging, a database management technique that provides the ability to make many changes to the database at the same logical instant, while preserving the ability to back out erroneous changes later. This logging facility simplifies transaction commit and abort and makes it possible to recover from catastrophic failures, including application or system crashes.
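The following toy sketch shows how before- and after-images make it possible to back out changes. It keeps everything in memory, assumes stored values are never None (None marks "key did not exist"), and assumes transactions do not interleave updates to the same key. All names are hypothetical and this is not how Berkeley DB lays out its log.

```python
class LoggedStore:
    """Toy key/value store that logs before- and after-images of every change."""

    def __init__(self):
        self.data = {}
        self.log = []  # append-only: (txn_id, key, before_image, after_image)

    def put(self, txn_id, key, value):
        before = self.data.get(key)
        # Write-ahead rule: record the change in the log before applying it.
        self.log.append((txn_id, key, before, value))
        self.data[key] = value

    def abort(self, txn_id):
        """Undo a transaction by restoring before-images, newest change first."""
        for logged_txn, key, before, _after in reversed(self.log):
            if logged_txn != txn_id:
                continue
            if before is None:
                self.data.pop(key, None)  # key did not exist before the change
            else:
                self.data[key] = before
```

Aborting uses the before-images; replaying after-images from the log in order is what would allow committed changes to be redone after a crash.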

Group commit

Berkeley DB supports group commit, a strategy for improving the performance of applications with a very high degree of write concurrency. Under group commit, if multiple transactions complete at close to the same time, Berkeley DB will automatically combine the operations in a single synchronous file system call. This lets multiple transactions take advantage of a single interaction with the operating system. Group commit can dramatically reduce the time spent committing any single transaction.

Group commit works automatically in Berkeley DB. The software developer does not need to take any special steps to turn it on in situations where it would help, and there is no overhead incurred in low-concurrency systems where group commit would not help.
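A simplified sketch of the batching idea: several commit records are collected and made durable with one synchronous call instead of one call each. The class is hypothetical and single-threaded; a real implementation would also make each committing transaction wait until the shared flush has finished.

```python
import os
import tempfile


class GroupCommitLog:
    """Collects commit records and writes them with a single fsync."""

    def __init__(self, path: str):
        self._file = open(path, "ab")
        self._pending = []
        self.sync_calls = 0  # number of synchronous file system calls made

    def commit(self, record: bytes) -> None:
        """Queue a transaction's commit record for the next group flush."""
        self._pending.append(record)

    def flush(self) -> int:
        """Write all queued records and sync once; return how many were flushed."""
        if not self._pending:
            return 0
        self._file.write(b"".join(r + b"\n" for r in self._pending))
        self._file.flush()
        os.fsync(self._file.fileno())
        self.sync_calls += 1
        flushed = len(self._pending)
        self._pending.clear()
        return flushed

    def close(self) -> None:
        self.flush()
        self._file.close()


def demo_group_commit():
    with tempfile.TemporaryDirectory() as tmp:
        log = GroupCommitLog(os.path.join(tmp, "txn.log"))
        for i in range(3):
            log.commit(f"txn-{i}".encode())
        flushed = log.flush()
        syncs = log.sync_calls
        log.close()
        return flushed, syncs
```

In the demo, three transactions share one fsync, so the cost of the synchronous call is spread across all of them.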

Load balancing

Applications that take advantage of Berkeley DB's replication service can support extremely high query loads and can scale up easily by adding new servers as necessary. With replication, updates go to a single master server, and the master distributes them to as many replicas as desired. Each of the replicas can answer read queries during normal processing. The ability to direct readers to any of a large number of replicas makes it simple to balance the query load for high-concurrency applications.

Reliability and Availability

Database systems must provide reliable, on-demand access to the information that they manage. Berkeley DB meets those requirements through a combination of careful design and solid implementation.

Berkeley DB is a small-footprint data manager that includes no extraneous features. By leaving out features that programmers do not need, Berkeley DB is smaller and simpler than products from other vendors. Simplicity improves performance, because code paths are shorter, and reliability, because code review and testing are more likely to find any problems that exist.

Recovery from System or Application Failure

Berkeley DB uses write-ahead logging and checkpointing to log changes. Applications that need disaster recovery can use the logging system. After a failure, a database can be restored to its last transaction-consistent state by restoring the database file and rolling the log forward ...

Transparent fail-over

Berkeley DB's replication service allows applications to run on a collection of cooperating server machines.

Replication requires that all updates go to a single master server, which distributes them to as many replicas as the application needs. Each of the replicas can handle read queries during normal processing. In the event that the master system goes down for any reason, one of the replicas is chosen to take its place. From that point on, updates go to the new master. The application can continue to run with no interruption of service to the end user.
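The flow described above (single master, replicas serving reads, a replica promoted on failure) can be modeled in a few lines. This is only an in-memory model with invented names. In particular, picking the replacement master by sorted name is a placeholder for whatever election the real replication service performs.

```python
class ReplicationGroup:
    """Toy model: one master accepts updates and copies them to every replica."""

    def __init__(self, names):
        self.nodes = {name: {} for name in names}  # node name -> its data copy
        self.master = names[0]

    def write(self, key, value):
        """All updates go to the master, which distributes them to the replicas."""
        self.nodes[self.master][key] = value
        for name, store in self.nodes.items():
            if name != self.master:
                store[key] = value

    def read(self, name, key):
        """Any node, master or replica, can answer a read query."""
        return self.nodes[name].get(key)

    def fail_master(self):
        """Remove the failed master and promote one of the remaining replicas."""
        del self.nodes[self.master]
        self.master = sorted(self.nodes)[0]  # placeholder election rule
        return self.master
```

After fail_master, writes continue through the promoted node, and the surviving replicas keep answering reads.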

Hot Backups

Some applications must operate twenty-four hours a day, seven days a week. For those applications, Berkeley DB includes support for on-line, or "hot," backups.

Hot backups allow system administrators to back up the database and the log while users are running applications. Berkeley DB allows developers to open, read, and copy database files, even while the database is in active use. As a result, developers can build support for on-line backups directly into their applications.

When reposting, please cite the original address: https://www.9cbs.com/read-15791.html
