10.1 Optimization Overview
The most important factor in making a system fast is, of course, its basic design. You also need to know what kinds of things your system will be doing, and where your bottlenecks are.
The most common bottlenecks are:
Disk seeks. It takes time for the disk to find a piece of data. With modern disks in 1999, the mean time for this is usually lower than 10ms, so we can in theory do about 1000 seeks a second. This time improves slowly with new disks and is very hard to optimize for a single table. The way to optimize seek time is to distribute the data across more than one disk.
Disk reading/writing. When the disk is at the correct position, we need to read the data. With modern disks in 1999, one disk delivers something like 10-20MB/s. This is easier to optimize than seeks, because you can read in parallel from multiple disks.
CPU cycles. When we have the data in main memory (or if it was already there), we need to process it to get our result. This is the most common limiting factor when tables are small compared to memory, but then, with small tables, speed is usually not the problem.
Memory bandwidth. When the CPU needs more data than fits in the CPU cache, main memory bandwidth becomes a bottleneck. This is an uncommon bottleneck for most systems, but one should be aware of it.
10.2 System/Compile-Time and Startup Parameters
We start with system-level things, because some of these decisions have to be made very early. In other cases, a quick look at this section may suffice, because it is not that important for the big gains; however, it is always nice to have a feeling for how much can be gained at this level.
The default OS you use really is important! To get the most out of multiple-CPU machines, you should use Solaris (because its threads work really well) or Linux (because the 2.2 kernel has really good SMP support). Also note that on 32-bit machines, Linux by default has a 2G file-size limit. Hopefully this will be fixed soon, when new file systems (XFS) are released.
Because we have not run production MySQL on that many platforms, we advise you to test your intended platform before choosing it, if possible.
Other suggestions:
If you have enough RAM, you could remove all swap devices. Some operating systems will use a swap device in some contexts even if you have free memory.
Use the --skip-locking MySQL option to avoid external locking. Note that this will not affect MySQL's functionality as long as you run only one server. Just remember to take down the server (or lock the relevant parts) before you run myisamchk; a short shell sketch follows below. On some systems this switch is mandatory, because external locking does not work in any case. The --skip-locking option is on by default when compiling with MIT-pthreads, because flock() isn't fully supported by MIT-pthreads on all platforms. The only case when you can't use --skip-locking is if you run multiple MySQL servers (not clients) on the same data, or if you run myisamchk on a table without first flushing and locking the mysqld server tables. You can still use LOCK TABLES / UNLOCK TABLES even if you are using --skip-locking.
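A minimal sketch of the safe myisamchk pattern mentioned above (the data directory path and the use of safe_mysqld are assumptions; adapt them to your installation):
shell> mysqladmin shutdown
shell> myisamchk --silent --force /path/to/datadir/*/*.MYI
shell> safe_mysqld &
Taking the server down first gives myisamchk exclusive access to the table files, which is what makes this safe even with --skip-locking.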
10.2.1 How compiling and linking affects the speed of MySQL
Most of the following tests were done on Linux with the MySQL benchmarks, but they should give some indication for other operating systems and workloads.
You get the fastest executable when you link with -static. Using Unix sockets rather than TCP/IP to connect to a database also gives better performance.
On Linux, you will get the fastest code when compiling with pgcc and -O6. To compile `sql_yacc.cc' with these options, you need about 200M of memory, because gcc/pgcc needs a lot of memory to make all functions inline. You should also set CXX=gcc when configuring MySQL to avoid inclusion of the libstdc++ library (it is not needed).
Just by using a better compiler and/or better compiler options, you can get a 10-30% speed increase in your application. This is particularly important if you compile the SQL server yourself! On Intel, you should, for example, use pgcc or the Cygnus CodeFusion compiler to get maximum speed. We have tested the new Fujitsu compiler, but it is not yet bug-free enough to compile MySQL with optimizations.
Here are some of the measurements we have made:
If you use pgcc and compile everything with -O6, the mysqld server is 11% faster than with gcc (string 99 version). If you link dynamically (without -static), the result is 13% slower. Note that you can still use a dynamically linked MySQL client library; it is only the server that is critical for performance. If you connect using TCP/IP rather than Unix sockets, the result is 7.5% slower. On a Sun SparcStation 10, gcc 2.7.3 is 13% faster than Sun Pro C 4.2. On Solaris 2.5.1, MIT-pthreads is 8-12% slower than Solaris native threads on a single processor. With more load/CPUs, the difference should get bigger.
The MySQL-Linux distribution provided by TcX is compiled with pgcc and linked statically.
10.2.2 Disk issues
As mentioned earlier, disk seeks are a big performance bottleneck. This problem gets more and more apparent as the data starts to grow. For large databases, where you access data more or less randomly, you can be sure that you will need at least one disk seek to read and a couple of disk seeks to write things. To minimize this problem, use disks with low seek times. To increase the number of available disk spindles (and thereby reduce the seek overhead), it is possible to either symlink files to different disks or stripe the disks.
Using symbolic links
This means that you symlink the index and/or data files from the normal data directory to another disk (which may also be striped). This makes both the seek and read times better (assuming the disk is not used for other things as well). See 10.2.2.1 Using symbolic links for databases and tables.
Striping
Striping means that you have many disks and put the first block on the first disk, the second block on the second disk, and the Nth block on the (N mod number_of_disks)'th disk, and so on. This means that if your normal data size is smaller than the stripe size (or perfectly aligned), you will get much better performance. Note that striping is very dependent on the OS and the stripe size, so benchmark your application with different stripe sizes. See 10.8 Using your own benchmarks. The speed difference from striping depends heavily on the parameters: depending on how you set the striping parameters and the number of disks, you may get differences measured in orders of magnitude. Note that you have to choose whether to optimize for random or sequential access.
For reliability, you may want to use RAID 0+1 (striping plus mirroring), but in this case you will need 2*N drives to hold N drives of data. This is probably the best option if you have the money for it! You may, however, also have to invest in some volume-management software to handle it efficiently. A good option is to keep semi-important data (that can be regenerated) on a RAID 0 disk, while storing really important data (like host information and log files) on a RAID 0+1 or RAID N disk. RAID N can be a problem if you have many writes, because of the time needed to update the parity bits. You can also set parameters for the file system that the database uses. One easy change is to mount the file system with the noatime option. That makes it skip updating the last access time in the inode, which avoids some disk seeks.
10.2.2.1 Using symbolic links for databases and tables
You can move tables and databases from the database directory to other locations and replace them with symbolic links to the new locations. You might want to do this, for example, to move a database to a file system with more free space.
If MySQL notices that a table is a symbolic link, it will resolve the link and use the table it actually points to. This works on all systems that support the realpath() call (at least Linux and Solaris do). On systems that don't support realpath(), you should not access the table through both the real path and the symbolic link at the same time! If you do, the table will be inconsistent after any update. MySQL doesn't support linking of databases by default. Everything will work fine as long as you don't make symbolic links between databases. Suppose you have a database db1 under the MySQL data directory and make a symbolic link db2 pointing to db1:
shell> cd /path/to/datadir
shell> ln -s db1 db2
Now, any table tbl_a in db1 also appears as a table tbl_a in db2. If one thread updates db1.tbl_a and another thread updates db2.tbl_a, there will be problems.
If you really need this, you must change the following code in `mysys/mf_format.c':
if (!lstat(to,&stat_buff))  /* Check if it's a symbolic link */
    if (S_ISLNK(stat_buff.st_mode) && realpath(to,buff))
Change the code to this:
if (realpath(to,buff))
10.2.3 Adjusting Server Parameters
You can get the default buffer sizes used by the mysqld server with this command:
shell> mysqld --help
This command produces a list of all mysqld options and configurable variables. The output includes the default values and looks something like this:
Possible variables for option --set-variable (-O) are:
back_log                current value: 5
connect_timeout         current value: 5
delayed_insert_timeout  current value: 300
delayed_insert_limit    current value: 100
delayed_queue_size      current value: 1000
flush_time              current value: 0
interactive_timeout     current value: 28800
join_buffer_size        current value: 131072
key_buffer_size         current value: 1048540
lower_case_table_names  current value: 0
long_query_time         current value: 10
max_allowed_packet      current value: 1048576
max_connections         current value: 100
max_connect_errors      current value: 10
max_delayed_threads     current value: 20
max_heap_table_size     current value: 16777216
max_join_size           current value: 4294967295
max_sort_length         current value: 1024
max_tmp_tables          current value: 32
max_write_lock_count    current value: 4294967295
net_buffer_length       current value: 16384
query_buffer_size       current value: 0
record_buffer           current value: 131072
sort_buffer             current value: 2097116
table_cache             current value: 64
thread_concurrency      current value: 10
tmp_table_size          current value: 1048576
thread_stack            current value: 131072
wait_timeout            current value: 28800
If there is a mysqld server currently running, you can see what variable values it actually uses by executing this command:
shell> mysqladmin variables
Each option is described below. Values for buffer sizes, lengths, and stack sizes are given in bytes. You can specify values with a suffix of `K' or `M' to indicate kilobytes or megabytes; for example, 16M indicates 16 megabytes. The case of suffix letters does not matter: 16M and 16m are equivalent.
You can also see some statistics from a running server with the SHOW STATUS command. See 7.21 SHOW syntax (get information about tables, columns).
back_log
The number of outstanding connection requests MySQL can have. This comes into play when the main MySQL thread gets very many connection requests in a very short time. The main thread then takes some time (although very little) to check each connection and start a new thread. The back_log value says how many requests can be stacked during this short time before MySQL momentarily stops answering new requests. You need to increase this only if you expect a large number of connections in a short period of time. In other words, this value is the size of the listen queue for incoming TCP/IP connections. Your operating system has its own limit on the size of this queue; the manual page for the Unix listen(2) system call should have more details. Check your OS documentation for the maximum value for this variable. Attempting to set back_log higher than your operating system limit will have no effect.
connect_timeout
The number of seconds the mysqld server waits for a connect packet before responding with Bad handshake.
delayed_insert_timeout
How long an INSERT DELAYED thread should wait for INSERT statements before terminating.
delayed_insert_limit
After inserting delayed_insert_limit rows, the INSERT DELAYED handler checks whether any SELECT statements are pending. If so, it lets them execute before continuing.
delayed_queue_size
How big a queue (in rows) should be allocated for handling INSERT DELAYED. If the queue becomes full, any client that does an INSERT DELAYED will wait until there is room in the queue again.
flush_time
If this is set to a non-zero value, then every flush_time seconds all tables will be closed (to free up resources and sync things to disk).
interactive_timeout
The number of seconds the server waits for activity on an interactive connection before closing it. An interactive client is defined as a client that uses the CLIENT_INTERACTIVE option to mysql_real_connect(). See also wait_timeout.
join_buffer_size
The size of the buffer used for full joins (joins that do not use indexes). The buffer is allocated once for each full join between two tables. Increase this value to get a faster full join when adding indexes is not possible. (Normally the best way to get fast joins is to add indexes.)
key_buffer_size
Index blocks are buffered and shared by all threads. key_buffer_size is the size of the buffer used for index blocks. Increase this to get better index handling (for all reads and multiple writes), to as much as you can afford. If you make it too big, however, the system will start to page and really slow down. Remember that since MySQL does not cache data reads, you have to leave some room for the OS file system cache. To get even more speed when writing many rows at the same time, use LOCK TABLES. See 7.24 LOCK TABLES/UNLOCK TABLES syntax.
long_query_time
If a query takes longer than this (in seconds), the slow_queries counter will be incremented.
max_allowed_packet
The maximum size of one packet. The message buffer is initialized to net_buffer_length bytes, but can grow up to max_allowed_packet bytes when needed. This value is small by default, to catch big (and possibly wrong) packets. You must increase this value if you are using big BLOB columns. It should be as big as the biggest BLOB you want to use.
max_connections
The number of simultaneous clients allowed. Increasing this value increases the number of file descriptors that mysqld requires. See the comments below on file descriptor limits. See 18.2.4 Too many connections error.
max_connect_errors
If there are more than this number of interrupted connections from a host, that host will be blocked from further connections. You can unblock a host with the FLUSH HOSTS command.
max_delayed_threads
Don't start more than this number of threads to handle INSERT DELAYED statements. If you try to insert data after all INSERT DELAYED threads are in use, the row will be inserted as if the DELAYED attribute had not been specified.
max_join_size
A join that would read more than max_join_size records returns an error. Set this value if your users tend to perform joins without a WHERE clause that take a very long time and return millions of rows.
max_sort_length
The number of bytes to use when sorting BLOB or TEXT values (only the first max_sort_length bytes of each value are used; the rest are ignored).
max_tmp_tables
(This option doesn't yet do anything.) The maximum number of temporary tables a client can keep open at the same time.
net_buffer_length
The communication buffer is reset to this size between queries. Normally this should not be changed, but if you have very little memory, you can set it to the expected size of a query (that is, the expected length of SQL statements sent by clients). If statements exceed this length, the buffer is automatically enlarged, up to max_allowed_packet bytes.
record_buffer
Each thread that does a sequential scan allocates a buffer of this size for each table it scans. If you do many sequential scans, you may want to increase this value.
sort_buffer
Each thread that needs to do a sort allocates a buffer of this size. Increase this value for faster ORDER BY or GROUP BY operations. See 18.5 Where MySQL stores temporary files.
table_cache
The number of open tables for all threads. Increasing this value increases the number of file descriptors that mysqld requires. MySQL needs two file descriptors for each unique open table; see the comments below on file descriptor limits. For information about how the table cache works, see 10.2.4 How MySQL opens and closes tables.
tmp_table_size
If an in-memory temporary table exceeds this size, MySQL generates an error of the form The table tbl_name is full. Increase the value of tmp_table_size if you do many advanced GROUP BY queries.
thread_stack
The stack size for each thread. Many of the limits detected by the crash-me test depend on this value. The default is large enough for normal operation. See 10.8 Using your own benchmarks.
wait_timeout
The number of seconds the server waits for activity on a connection before closing it. See also interactive_timeout.
MySQL uses algorithms that are very scalable, so you can usually run it with very little memory, or give MySQL more memory to get better performance.
If you have much memory and many tables, and want maximum performance with a moderate number of clients, you should use something like this:
shell> safe_mysqld -O key_buffer=16M -O table_cache=128 \
           -O sort_buffer=4M -O record_buffer=1M &
If you have less memory and lots of connections, use something like this:
shell> safe_mysqld -O key_buffer=512k -O sort_buffer=100k \
           -O record_buffer=100k &
Or even:
shell> safe_mysqld -O key_buffer=512k -O sort_buffer=16k \
           -O table_cache=32 -O record_buffer=8k -O net_buffer=1k &
If there are very many connections, "swapping problems" may occur unless mysqld has been configured to use very little memory for each connection. Of course, mysqld performs better if you have enough memory for all connections.
Note that if you change an option to mysqld, it stays in effect only for that instance of the server.
To see the effect of a changed parameter, do something like this:
shell> mysqld -O key_buffer=32M --help
Make sure that the --help option is last; otherwise, the effect of any options listed after it on the command line will not be reflected in the output.
10.2.4 How MySQL opens and closes tables
table_cache, max_connections, and max_tmp_tables affect the maximum number of files the server keeps open. If you increase one or more of these values, you may run up against a limit imposed by your operating system on the number of open file descriptors per process. However, you can increase this limit on many systems. Consult your OS documentation to find out how to do this, because the method for changing the limit varies widely from system to system.
table_cache is related to max_connections. For example, for 200 concurrent running connections, you should have a table cache of at least 200 * n, where n is the maximum number of tables in a join. A worked example follows below.
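For example (the numbers here are purely illustrative): with max_connections=200 and joins that touch at most 4 tables, a reasonable starting point would be 200 * 4 = 800 table-cache entries:
shell> safe_mysqld -O max_connections=200 -O table_cache=800 &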
The cache of open tables can grow to a maximum of table_cache (default 64; this can be changed with the -O table_cache=# option to mysqld). A table is never closed, except when the cache is full and another thread tries to open a table, or if you use mysqladmin refresh or mysqladmin flush-tables.
When the table cache fills up, the server uses the following procedure to locate a cache entry to use:
Tables not currently in use are released, in least-recently-used (LRU) order. If the cache is full and no tables can be released, but a new table needs to be opened, the cache is temporarily extended as necessary. If the cache is in a temporarily-extended state and a table goes from in-use to not-in-use, the table is closed and released from the cache.
A table is opened for each concurrent access. This means that if two threads access the same table, or one thread accesses the table twice in the same query (with AS), the table needs to be opened twice. The first open of any table takes two file descriptors; each additional use of the table takes only one. The extra descriptor from the first open is for the index file; this descriptor is shared among all threads.
10.2.5 Drawbacks to creating large numbers of tables in the same database
If you have many files in a directory, open, close, and create operations will be slow. If you execute SELECT statements on many different tables, there will be a little overhead when the table cache is full, because for every table that has to be opened, another must be closed. You can reduce this overhead by making the table cache larger.
10.2.6 Why are there so many open tables?
When you run mysqladmin status, you will see something like this:
Uptime: 426 Running threads: 1 Questions: 11082 Reloads: 1 Open tables: 12
This can be somewhat perplexing if you have only 6 tables.
MySQL is multithreaded, so it may have many queries on the same table simultaneously. To minimize the problem of two threads having different states on the same file, the table is opened independently by each concurrent thread. This takes some memory and one extra file descriptor for the data file. The index file descriptor is shared among all threads.
10.2.7 How MySQL uses memory
The list below indicates some of the ways the mysqld server uses memory. Where applicable, the name of the server variable relevant to the memory use is given.
The key buffer (variable key_buffer_size) is shared by all threads; the other buffers used by the server are allocated as needed. See 10.2.3 Adjusting Server Parameters.
Each connection uses some thread-specific space: a stack (default 64K, variable thread_stack), a connection buffer (variable net_buffer_length), and a result buffer (variable net_buffer_length). The connection buffer and result buffer are dynamically enlarged up to max_allowed_packet when needed. While a query is running, a copy of the current query string is also allocated.
All threads share the same base memory.
Nothing is memory-mapped yet (except compressed tables, but that's another story). This is because the 4GB 32-bit memory space is not large enough for most of the bigger databases. When systems with a 64-bit address space become more common, we may add general support for memory mapping.
Each request doing a sequential scan of a table allocates a read buffer (variable record_buffer).
All joins are done in one pass, and most joins can be done without even using a temporary table. Most temporary tables are memory-based (HEAP) tables. Temporary tables with a large record length (calculated as the sum of all column lengths) or that contain BLOB columns are stored on disk. A problem in MySQL versions before 3.23.2 is that if a HEAP table exceeds the size of tmp_table_size, you get the error The table tbl_name is full. In newer versions, this is handled by automatically converting the in-memory (HEAP) table to a disk-based table as necessary. To work around this problem, you can increase the temporary table size by setting the tmp_table_size option to mysqld, or by setting the SQL option SQL_BIG_TABLES in the client program. See 7.25 SET OPTION syntax. In MySQL 3.20, the maximum size of a temporary table was record_buffer*16, so if you are using that version, you have to increase the value of record_buffer instead. You can also start mysqld with the --big-tables option to always store temporary tables on disk; however, this will affect the speed of many complicated queries.
Most requests doing a sort allocate a sort buffer and one or two temporary files. See 18.5 Where MySQL stores temporary files.
Almost all parsing and calculating is done in a local memory store. No memory overhead is needed for small items, and the normal slow memory allocation and freeing is avoided. Memory is allocated only for unexpectedly large strings (this is done with malloc() and free()).
Each index file is opened once, and the data file is opened once for each concurrently running thread. For each concurrent thread, a table structure, column structures for each column, and a buffer of size 3*n are allocated (where n is the maximum row length, not counting BLOB columns). A BLOB uses 5 to 8 bytes plus the length of the BLOB data. For each table having BLOB columns, a buffer is enlarged dynamically to read in larger BLOB values. If you scan a table, a buffer as large as the largest BLOB value is allocated.
Table handlers for all in-use tables are saved in a cache and managed as a FIFO. Normally the cache has 64 entries. If a table has been used by two running threads at the same time, the cache contains two entries for the table. See 10.2.4 How MySQL opens and closes tables.
A mysqladmin flush-tables command closes all tables that are not in use and marks all in-use tables to be closed when the currently executing thread finishes. This effectively frees most in-use memory.
ps and other system status programs may report that mysqld uses a lot of memory. This may be caused by thread stacks on different memory addresses. For example, the Solaris version of ps counts the unused memory between stacks as used memory. You can verify this by checking available swap with swap -s. We have tested mysqld with commercial memory-leakage detectors, so there should be no memory leaks.
10.2.8 How MySQL locks tables
All locking in MySQL is deadlock-free. This is managed by always requesting all needed locks at once at the beginning of a query, and always locking the tables in the same order. The locking method MySQL uses for WRITE locks works as follows:
If there is no lock on the table, put a write lock on it. Otherwise, put the lock request in the write lock queue.
The locking method MySQL uses for READ locks works as follows:
If there is no write lock on the table, put a read lock on it. Otherwise, put the lock request in the read lock queue.
When a lock is released, the lock is first made available to the threads in the write lock queue, then to the threads in the read lock queue.
This means that if you have many updates on a table, SELECT statements will wait until there are no more updates.
To work around the problem of many INSERT and SELECT operations on the same table, you can insert rows into a temporary table and update the real table with the records from the temporary table once in a while.
This can be done with the following code:
mysql> LOCK TABLES real_table WRITE, insert_table WRITE;
mysql> INSERT INTO real_table SELECT * FROM insert_table;
mysql> DELETE FROM insert_table;
mysql> UNLOCK TABLES;
If you can distinguish some cases where the inserted data is not time-critical, you can use the LOW_PRIORITY option for INSERT in those cases. See 7.14 INSERT syntax. A short sketch follows below.
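For instance, reusing the tables from the example above (a sketch only; the LOW_PRIORITY option itself is documented in 7.14 INSERT syntax):
mysql> INSERT LOW_PRIORITY INTO real_table SELECT * FROM insert_table;
The insert then waits until no other clients are reading real_table, rather than making them wait for it.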
You could also change the locking code in `mysys/thr_lock.c' to use a single queue. In that case, write locks and read locks would have the same priority, which might help some applications.
10.2.9 Table locking issues
MySQL's table locking code is deadlock-free.
MySQL uses table-level locking (instead of row-level or column-level locking) to achieve a very high locking speed. For large tables, table-level locking is much better than row-level locking for most applications, but there are, of course, some pitfalls.
In MySQL 3.23.7 and later, a client can insert rows into a MyISAM table while other threads are reading from the table. Note that currently this only works if there are no deleted rows in the table.
Table-level locking enables many threads to read from a table at the same time; but if a thread wants to write to a table, it must first get exclusive access. During the update, all other threads that want to access that particular table wait until the update is done.
Because database changes are normally considered more important than SELECT statements, all statements that update a table have higher priority than statements that retrieve information from a table. This should ensure that updates are not "starved" just because someone issues a lot of heavy queries against a specific table.
Starting with MySQL 3.23.7, you can use the max_write_lock_count variable to force MySQL to let waiting SELECT statements run after a specific number of write locks on a table.
One main problem with this scheme is the following:
A client issues a SELECT that takes a long time to run. Another client then issues an UPDATE on the same table; this client waits until the SELECT is finished. A third client issues another SELECT statement on the same table; because UPDATE has higher priority than SELECT, this SELECT waits for the UPDATE to finish. It also waits for the first SELECT to finish!
Some possible solutions to this problem are:
Try to get the SELECT statements to run faster. You may have to create some summary tables to do this.
Start mysqld with --low-priority-updates. This gives all statements that update (modify) a table lower priority than SELECT statements. In this case, the last SELECT statement in the scenario above would execute before the UPDATE statement.
You can give a specific INSERT, UPDATE, or DELETE statement lower priority with the LOW_PRIORITY attribute.
Start mysqld with a low value for max_write_lock_count, to give READ locks a chance after a certain number of WRITE locks.
You can specify that all updates from a specific thread should be done with low priority by using the SQL command SET SQL_LOW_PRIORITY_UPDATES=1. See 7.25 SET OPTION syntax.
You can specify that a specific SELECT is very important with the HIGH_PRIORITY attribute. See 7.12 SELECT syntax.
If you have problems with INSERT combined with SELECT, switch to the new MyISAM tables, as these support concurrent SELECTs and INSERTs.
If you mainly mix INSERT and SELECT statements, the DELAYED attribute to INSERT will probably solve your problems. See 7.14 INSERT syntax.
If you have problems with SELECT and DELETE, the LIMIT option to DELETE may help. See 7.11 DELETE syntax.
10.3 Make your data as small as possible
One of the most basic optimizations is to make your data (and indexes) take as little space on disk (and in memory) as possible. This can give huge improvements, because disk reads are faster and less main memory is normally used as well. Indexing also takes fewer resources if done on smaller columns.
You can use the following techniques to get better table performance and minimize storage space:
Use the most efficient (smallest) types possible. MySQL has many specialized types that save disk space and memory. For example, use the smaller integer types if possible: MEDIUMINT is often better than INT.
Declare columns to be NOT NULL if possible. It makes everything faster and you save one bit per column. Note that if you really do need NULL in your application, you should by all means use it; just avoid having it on all columns by default.
If you don't have any variable-length columns (VARCHAR, TEXT, or BLOB columns), a fixed-size record format is used. This is faster but, unfortunately, may waste some space. See 10.6 Choosing a table type.
The primary index of a table should be as short as possible. This makes identification of one row easy and efficient.
For each table, you have to decide which storage/index method to use. See 9.4 MySQL table types. See also 10.6 Choosing a table type.
Create only the indexes that you really need. Indexes are good for retrieval but bad when you need to store things quickly.
If you mostly access a table by searching on a combination of columns, make one index on that combination. The first index part should be the most-used column. If you always use many columns, you should put the column with more duplicates first, to get better compression of the index.
If it is very likely that a column has a unique prefix on the first several characters, it is better to index only this prefix; MySQL supports an index on part of a character column (see the example after this list). Shorter indexes are faster, not only because they take less disk space, but also because they give you more hits in the index cache and thus fewer disk seeks. See 10.2.3 Adjusting Server Parameters.
In some circumstances, it can be beneficial to split a table that is scanned very often into two. This is especially true if it is a dynamic-format table and a smaller static-format table can be used to find the relevant rows when scanning.
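A small illustration of indexing only a column prefix (the customer table and name column here are hypothetical):
mysql> CREATE INDEX part_of_name ON customer (name(10));
Only the first 10 characters of name go into the index, which keeps it small while remaining selective if most names differ within their first 10 characters.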
10.4 How MySQL uses indexes
Indexes are used to find rows with a specific value of one column quickly. Without an index, MySQL has to start with the first record and read through the whole table until it finds the relevant rows. The bigger the table, the more this costs. If the table has an index for the column in question, MySQL can quickly seek to a position in the middle of the data file without having to look at all the data. If a table has 1000 rows, this is at least 100 times faster than reading sequentially. Note that if you need to access almost all 1000 rows, sequential reading is faster, because it avoids disk seeks. All MySQL indexes are stored in B-trees. Strings are automatically prefix- and end-space compressed. See 7.27 CREATE INDEX syntax.
Indexes are used to:
Quickly find the rows matching a WHERE clause.
Retrieve rows from other tables when performing joins.
Find the MAX() or MIN() value for a specific indexed column.
Sort or group a table when the sorting or grouping is done on a leftmost prefix of a usable key (for example, ORDER BY key_part_1, key_part_2). The key is read in reverse order if all key parts are followed by DESC.
In some cases, retrieve values without consulting the data file. If all used columns for some table are numeric and form a leftmost prefix of some key, the values may be retrieved from the index tree for greater speed.
Suppose you issue the following SELECT statement:
mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
If a multiple-column index exists on col1 and col2, the appropriate rows can be fetched directly. If separate single-column indexes exist on col1 and col2, the optimizer decides which index will find fewer rows and uses that more restrictive index to fetch the rows.
If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3).
MySQL can't use a partial index if the columns don't form a leftmost prefix of the index. Suppose you have the SELECT statements shown below:
mysql> SELECT * FROM tbl_name WHERE col1=val1;
mysql> SELECT * FROM tbl_name WHERE col2=val2;
mysql> SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;
If an index exists on (col1, col2, col3), only the first query shown above uses the index. The second and third queries do involve indexed columns, but (col2) and (col2, col3) are not leftmost prefixes of (col1, col2, col3).
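You can check which index (if any) the optimizer picks with EXPLAIN (the table and values are placeholders):
mysql> EXPLAIN SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
The possible_keys and key columns of the output show the candidate indexes and the one actually chosen.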
MySQL also uses indexes for LIKE comparisons if the argument to LIKE is a constant string that doesn't start with a wildcard character. For example, the following SELECT statements use indexes:
mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Patrick%";
mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Pat%_ck%";
In the first statement, only rows with "Patrick" <= key_col < "Patricl" are considered. In the second statement, only rows with "Pat" <= key_col < "Pau" are considered.
The following SELECT statements will not use indexes:
mysql> SELECT * FROM tbl_name WHERE key_col LIKE "%Patrick%";
mysql> SELECT * FROM tbl_name WHERE key_col LIKE other_col;
In the first statement, the LIKE value begins with a wildcard character. In the second statement, the LIKE value is not a constant.
A search using column_name IS NULL will use an index if column_name is indexed.
MySQL normally uses the index that finds the smallest number of rows. An index is used for columns that you compare with the following operators: =, >, >=, <, <=, BETWEEN, and a LIKE with a non-wildcard prefix such as 'something%'.
Any index that does not span all AND levels in the WHERE clause is not used to optimize the query.
The following WHERE clauses use indexes:
... WHERE index_part1=1 AND index_part2=2
... WHERE index=1 OR A=10 AND index=2      /* index = 1 OR index = 2 */
... WHERE index_part1='hello' AND index_part_3=5
          /* optimized like "index_part1='hello'" */
These WHERE clauses do not use indexes:
... WHERE index_part2=1 AND index_part3=2  /* index_part_1 is not used */
... WHERE index=1 OR A=10                  /* no index */
... WHERE index_part1=1 OR index_part2=10  /* no index spans all rows */
10.5 Speed of queries that access or update data
First, one thing that affects all queries: the more complex your permission setup, the more overhead you get.
If you do not have any GRANT statements in effect, MySQL will optimize the permission checking somewhat. So if you have a very high query volume, it may be worth the time to avoid grants; otherwise, more permission checks mean more overhead.
If your problem is with some specific MySQL expression or function, you can time it from the mysql client using the BENCHMARK() function:
mysql> SELECT BENCHMARK(1000000,1+1);
+------------------------+
| BENCHMARK(1000000,1+1) |
+------------------------+
|                      0 |
+------------------------+
1 row in set (0.32 sec)
The above shows that MySQL can execute 1,000,000 + expressions in 0.32 seconds on a 400MHz Pentium II.
All MySQL functions should be highly optimized, but there may be some exceptions, and BENCHMARK(loop_count, expression) is an excellent tool for finding out whether this is a problem with your query.
10.5.1 Estimating query performance
In most cases, you can estimate performance by counting disk seeks. For small tables, you can usually find a row in one disk seek (since the index is probably cached). For bigger tables, you can estimate (using B-tree indexes) that you will need this many seeks to find a row:
log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1
In MySQL, an index block is usually 1024 bytes and the data pointer is usually 4 bytes. For a 500,000-row table with an index length of 3 (a medium integer), this gives:
log(500,000) / log(1024/3*2 / (3+4)) + 1 = 4 seeks.
The index above would require about 500,000 * 7 * 3/2 = 5.2M of storage (assuming that index buffers are filled to 2/3, which is typical), so you will probably have most of the index in memory, and you will probably need only one or two calls to read data from the OS to find the row.
For writes, however, you will need four seek requests (as above) to find where to place the new index value, and normally two seeks to update the index and write the row.
Note that the above does not mean your application will slowly degenerate by N log N! As long as everything is cached by the OS or the SQL server, things become only marginally slower as the table grows. After the data gets too big to be cached, things start to go much slower, until your application is bound only by disk seeks (which increase by N log N). To avoid this, increase the index cache as the data grows. See 10.2.3 Adjusting Server Parameters.
10.5.2 Speed of SELECT queries
In general, when you want to make a slower SELECT ... WHERE faster, the first thing to examine is whether you can add an index. See 10.4 How MySQL uses indexes. All references between different tables should usually be done with indexes. You can use the EXPLAIN command to determine which indexes are used for a SELECT. See 7.22 EXPLAIN syntax (get information about a SELECT).
Some general recommendations:
To help MySQL optimize queries better, run myisamchk --analyze on a table after it has been loaded with data. This updates, for each index part, a value indicating the average number of rows that have the same value. (For unique indexes, this is always 1, of course.)
To sort an index and the data according to an index, use myisamchk --sort-index --sort-records=1 (if you want to sort on index 1). If you have a unique index from which you want to read all records in order according to that index, this is a good way to make reading faster. Note that this sorting is not written optimally, and it will take a long time for a large table! Example invocations follow below.
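For example (the path is a placeholder; take the server down, or flush and lock the tables, before running myisamchk on its files):
shell> myisamchk --analyze /path/to/db/tbl_name
shell> myisamchk --sort-index --sort-records=1 /path/to/db/tbl_name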
10.5.3 How MySQL optimizes WHERE clauses
The WHERE optimizations are described here under SELECT because they are mostly used there, but the same optimizations apply to WHERE in DELETE and UPDATE statements.
Also note that this section is incomplete. MySQL does many optimizations, and we have not had time to document them all.
Some of the optimizations performed by MySQL are listed below:
Removal of unnecessary parentheses:
   ((a AND b) AND c OR (((a AND b) AND (c AND d))))
-> (a AND b AND c) OR (a AND b AND c AND d)
Constant folding:
   (a<b AND b=c) AND a=5
-> b>5 AND b=c AND a=5
Constant condition removal (needed because of constant folding):
   (b>=5 AND b=5) OR (b=6 AND 5=5) OR (b=7 AND 5=6)
-> b=5 OR b=6
Constant expressions used by indexes are evaluated only once.
COUNT(*) on a single table without a WHERE is retrieved directly from the table information. This is also done for any NOT NULL expression when used with only one table.
Early detection of invalid constant expressions. MySQL quickly detects that some SELECT statements are impossible and returns no rows.
HAVING is merged with WHERE if you don't use GROUP BY or group functions (COUNT(), MIN(), ...).
For each sub-join, a simpler WHERE is constructed to get a fast WHERE evaluation for the sub-join and to skip records as soon as possible.
All constant tables are read first, before any other tables in the query. A constant table is:
An empty table or a table with one row.
A table that is used with a WHERE clause on a UNIQUE index or a PRIMARY KEY, where all index parts are compared with constant expressions and the index parts are defined as NOT NULL.
All the following tables are used as constant tables:
mysql> SELECT * FROM t WHERE primary_key=1;
mysql> SELECT * FROM t1, t2
           WHERE t1.primary_key=1 AND t2.primary_key=t1.id;
The best join combination for joining the tables is found by trying all possibilities. If all columns in ORDER BY and GROUP BY come from the same table, that table is preferred first when joining.
If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created.
If you use SQL_SMALL_RESULT, MySQL will use an in-memory temporary table.
Because DISTINCT is converted to a GROUP BY on all columns, DISTINCT combined with ORDER BY will in many cases also need a temporary table.
Each table's indexes are queried, and an index that spans fewer than 30% of the rows is used. If no such index can be found, a quick table scan is used instead.
In some cases, MySQL can read rows from the index without even consulting the data file. If all columns used from the index are numeric, only the index tree is used to resolve the query.
Before each record is output, rows that do not match the HAVING clause are skipped.
Here are some examples of queries that are very fast:
mysql> SELECT COUNT(*) FROM tbl_name;
mysql> SELECT MIN(key_part1), MAX(key_part1) FROM tbl_name;
mysql> SELECT MAX(key_part2) FROM tbl_name
           WHERE key_part_1=constant;
mysql> SELECT ... FROM tbl_name
           ORDER BY key_part1, key_part2, ... LIMIT 10;
mysql> SELECT ... FROM tbl_name
           ORDER BY key_part1 DESC, key_part2 DESC, ... LIMIT 10;
The following queries are resolved using only the index tree (assuming the indexed columns are numeric):
mysql> SELECT key_part1, key_part2 FROM tbl_name WHERE key_part1=val;
mysql> SELECT COUNT(*) FROM tbl_name
           WHERE key_part1=val1 AND key_part2=val2;
mysql> SELECT key_part2 FROM tbl_name GROUP BY key_part1;
The following queries use the index to retrieve the rows in sorted order, without a separate sorting pass:
mysql> SELECT ... FROM tbl_name ORDER BY key_part1, key_part2, ...
mysql> SELECT ... FROM tbl_name ORDER BY key_part1 DESC, key_part2 DESC, ...
10.5.4 How MySQL optimizes LEFT JOIN
In MySQL, A LEFT JOIN B is implemented as follows:
Table B is set to depend on table A.
Table A is set to depend on all tables (except B) that are used in the LEFT JOIN condition.
All LEFT JOIN conditions are moved to the WHERE clause.
All standard join optimizations are performed, with the exception that a table is always read after all tables it depends on. If there is a circular dependency, MySQL issues an error.
All standard WHERE optimizations are performed.
If there is a row in A that matches the WHERE clause, but no row in B matches the LEFT JOIN condition, an extra B row with all columns set to NULL is generated.
If you use LEFT JOIN to find rows that don't exist in some table, and you have the test column_name IS NULL in the WHERE part, where column_name is a column declared as NOT NULL, then MySQL stops searching for more rows (for a particular key combination) after it has found one row matching the LEFT JOIN condition. A sketch follows below.
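A sketch of that last point, finding rows in t1 that have no counterpart in t2 (both tables are hypothetical, and t2.id is assumed to be declared NOT NULL):
mysql> SELECT t1.id FROM t1 LEFT JOIN t2 ON t1.id=t2.id
           WHERE t2.id IS NULL;
As soon as one matching t2 row is found for a given t1.id, MySQL knows the IS NULL test can never succeed for that key and stops scanning t2 for it.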
10.5.5 How MySQL optimizes LIMIT
In some cases, MySQL handles a query differently when you use LIMIT # and don't use HAVING:
If you select only a few rows with LIMIT, MySQL will use indexes in some cases when it would normally prefer to do a full table scan.
If you use LIMIT # with ORDER BY, MySQL will end the sorting as soon as it has found the first # rows, instead of sorting the whole table.
When combining LIMIT # with DISTINCT, MySQL will stop as soon as it finds # unique rows.
In some cases, a GROUP BY can be resolved by reading the key in order (or doing a sort on the key) and then calculating summaries until the key value changes. In this case, LIMIT # will not calculate any unnecessary GROUP BYs.
As soon as MySQL has sent the first # rows to the client, it abandons the query.
LIMIT 0 always quickly returns an empty set. This is useful for checking a query and for getting the column types of the result columns (see the example after this list).
The size of temporary tables is calculated using LIMIT # to work out how much space is needed to resolve the query.
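For example, to validate a query and obtain the types of its result columns without fetching any data (tbl_name is a placeholder):
mysql> SELECT * FROM tbl_name LIMIT 0;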
10.5.6 Speed of INSERT queries
The time to insert a record consists approximately of the following:
Connecting:               (3)
Sending query to server:  (2)
Parsing query:            (2)
Inserting record:         (1 x size of record)
Inserting indexes:        (1 x number of indexes)
Closing:                  (1)
where the numbers are somewhat proportional to the overall time. This does not take into consideration the initial overhead of opening tables (which is done once for each concurrently running query).
The size of the table slows down the insertion of indexes by N log N (for B-trees).
Some ways to speed up inserts:
If you are inserting many rows from the same client at the same time, use multiple-value INSERT statements. This is much faster (several times faster, in some cases) than using separate INSERT statements.
If you are inserting a lot of rows from different clients, you can get higher speed by using the INSERT DELAYED statement. See 7.14 INSERT syntax.
Note that with MyISAM you can insert rows at the same time SELECTs are running, as long as there are no deleted rows in the table.
When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using a lot of INSERT statements. See 7.16 LOAD DATA INFILE syntax.
With some extra work, it is possible to make LOAD DATA INFILE run even faster when the table has many indexes. Use the following procedure (a shell sketch follows at the end of this section):
1. Optionally create the table with CREATE TABLE, for example using mysql or Perl-DBI.
2. Execute a FLUSH TABLES statement or the shell command mysqladmin flush-tables.
3. Run myisamchk --keys-used=0 -rq /path/to/db/tbl_name. This removes all usage of all indexes from the table.
4. Insert the data into the table with LOAD DATA INFILE. This will not update any indexes and is therefore very fast.
5. If you have myisampack and want to compress the table, run myisampack on it. See 10.6.3 Compressed table characteristics.
6. Re-create the indexes with myisamchk -r -q /path/to/db/tbl_name. This creates the index tree in memory before writing it to disk, which is much faster because it avoids lots of disk seeks. The resulting index tree is also perfectly balanced.
7. Execute a FLUSH TABLES statement or the shell command mysqladmin flush-tables.
This procedure will be built into LOAD DATA INFILE in some future version of MySQL.
You can lock your tables to speed up insertion:
mysql> LOCK TABLES a WRITE;
mysql> INSERT INTO a VALUES (1,23),(2,34),(4,33);
mysql> INSERT INTO a VALUES (8,26),(6,29);
mysql> UNLOCK TABLES;
The main speed difference is that the index buffer is flushed to disk only once, after all INSERT statements have completed. Normally there would be as many index buffer flushes as there are different INSERT statements. Locking is not needed if you can insert all rows with a single statement. Locking also lowers the total time of multiple-connection tests, but the maximum wait time for some threads goes up (because they wait for locks). For example:
Thread 1 does 1000 inserts
Threads 2, 3, and 4 do 1 insert
Thread 5 does 1000 inserts
If you don't use locking, threads 2, 3, and 4 will finish before 1 and 5. If you use locking, threads 2, 3, and 4 probably won't finish before 1 or 5, but the overall time should be about 40% faster. Because INSERT, UPDATE, and DELETE operations are very fast in MySQL, you will obtain better overall performance by adding locks around anything that does more than about 5 inserts or updates in a row. If you do very many inserts in a row, you could do a LOCK TABLES followed by an UNLOCK TABLES once in a while (about every 1000 rows) to allow other threads access to the table. This would still result in good performance. Of course, LOAD DATA INFILE is still much faster for loading data.
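A minimal end-to-end sketch of the fast-load procedure described above (all paths and names are placeholders, and the table is assumed not to be in use while myisamchk runs):
shell> mysqladmin flush-tables
shell> myisamchk --keys-used=0 -rq /path/to/db/tbl_name
shell> mysql -e "LOAD DATA INFILE '/tmp/tbl_name.txt' INTO TABLE tbl_name" db_name
shell> myisamchk -r -q /path/to/db/tbl_name
shell> mysqladmin flush-tables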
To get some more speed for both LOAD DATA INFILE and INSERT, enlarge the key buffer. See 10.2.3 Adjusting Server Parameters.
10.5.7 Speed of UPDATE queries
Update queries are optimized as SELECT queries, with the additional overhead of a write. The speed of the write depends on the size of the data being updated and the number of indexes that are updated.
Another way to get fast updates is to postpone updates and then do many updates in a row later. Doing many updates in a row is much quicker than doing one at a time if you lock the table. Note that, with the dynamic record format, updating a record to a longer total length may split the record. So if you do this often, it is very important to run OPTIMIZE TABLE from time to time, as shown below. See 7.9 OPTIMIZE TABLE syntax.
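For example (tbl_name is a placeholder):
mysql> OPTIMIZE TABLE tbl_name;
This defragments the data file after many updates that changed row lengths.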
10.5.8 Speed of DELETE queries
The time to delete a record is exactly proportional to the number of indexes. To delete records more quickly, you can increase the size of the index cache. See 10.2.3 Adjusting Server Parameters.
It is also much faster to remove all rows from a table than to delete a large part of the rows.
10.6 Choosing a table type
With MySQL you can currently (version 3.23.5) choose between four usable table formats, from a speed point of view:
Static MyISAM
This format is the simplest and most secure format. It is also the fastest of the on-disk formats. The speed comes from the easy way data can be found on disk: when looking something up with an index in a static-format table, the row position is simply the row length multiplied by the row number. Also, when scanning a table, it is easy to read a constant number of records with each disk read. The security comes from the fact that if your computer crashes while writing to a fixed-size MyISAM file, myisamchk can easily figure out where each row starts and ends, so it can usually reclaim all records except the partially written one. Note that in MySQL, all indexes can always be reconstructed.
Dynamic MyISAM
This format is a little more complex, because each row has to have a header that says how long it is. A record can also end up in more than one location when it is made longer by an update. You can use OPTIMIZE TABLE or myisamchk to defragment a table. If you have static data that you access/change a lot in the same table as some VARCHAR or BLOB columns, it might be a good idea to move the dynamic columns to other tables, just to avoid fragmentation.
Compressed MyISAM
This is a read-only type that is generated with the optional myisampack tool.
Memory (HEAP)
This table format is extremely useful for small or medium-size lookup tables. Copying a commonly used lookup table (used in joins) into a (possibly temporary) HEAP table may speed up joins over many tables. Suppose we want to do the following join, perhaps several times with the same data:
SELECT tab1.a, tab3.a FROM tab1, tab2, tab3
    WHERE tab1.a=tab2.a AND tab2.a=tab3.a AND tab2.c != 0;
To speed this up, we can create a temporary table with the join of tab2 and tab3, since they are looked up with the same column (tab1.a). Here is the command to create that table, and the resulting selects:
CREATE TEMPORARY TABLE test TYPE=HEAP
    SELECT
        tab2.a AS a2, tab3.a AS a3
    FROM
        tab2, tab3
    WHERE
        tab2.a=tab3.a AND tab2.c != 0;
SELECT tab1.a, test.a3 FROM tab1, test WHERE tab1.a=test.a2;
SELECT tab1.b, test.a3 FROM tab1, test WHERE tab1.a=test.a2 AND something;
10.6.1 Static (fixed-length) table characteristics
This is the default format. It is used when the table contains no VARCHAR, BLOB, or TEXT columns.
All CHAR, NUMERIC, and DECIMAL columns are space-padded to the column width.
Very quick.
Easy to cache.
Easy to reconstruct after a crash, because records are located in fixed positions.
Doesn't have to be reorganized (with myisamchk) unless a huge number of records are deleted and you want to return free disk space to the operating system.
Usually requires more disk space than dynamic tables.
10.6.2 Dynamic table characteristics
This format is used if the table contains any VARCHAR, BLOB, or TEXT columns.
All string columns are dynamic (except those with a length less than 4).
Each record is preceded by a bitmap indicating which string columns are empty ('') and which numeric columns contain zero (this is not the same as columns containing NULL values). If a string column has zero length after removal of trailing whitespace, or a numeric column has a zero value, it is marked in the bitmap and not saved to disk. Non-empty strings are saved as a length byte plus the string contents.
Usually takes much less disk space than fixed-length tables.
Each record uses only as much space as required. If a record becomes larger, it is split into as many pieces as needed, which results in fragmentation.
If you update a row with information that extends the row length, the row becomes fragmented. In this case, you may have to run myisamchk -r from time to time to get better performance. Use myisamchk -ei tbl_name for some statistics.
Not as easy to reconstruct after a crash, because a record may be fragmented into many pieces and a link (fragment) may be missing.
The expected row length for dynamic-size records is (a rough worked example follows after this list):
3
+ (number of columns + 7) / 8
+ (number of char columns)
+ packed size of numeric columns
+ length of strings
+ (number of NULL columns + 7) / 8
There is a penalty of 6 bytes for each link. A dynamic record is linked whenever an update causes an enlargement of the record. Each new link is at least 20 bytes, so the next enlargement will probably go in the same link. If not, another link is created. You can check how many links there are with myisamchk -ed. All links can be removed with myisamchk -r.
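A rough illustrative calculation under stated assumptions: for a hypothetical table with one CHAR(10) column, one INT column packed as 4 bytes, and one VARCHAR column whose values average 10 characters, with no NULL columns, the expected row length is about 3 + (3+7)/8 + 1 + 4 + (10+1), or roughly 20 bytes (the +1 on the string is its length byte).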
10.6.3 Compressed table characteristics
A read-only table made with the myisampack utility. All customers with extended MySQL email support are entitled to a copy of myisampack for internal use.
The uncompress code exists in all MySQL distributions, so that even those who don't have myisampack can read tables that were compressed with it.
Takes very little disk space. Minimizes disk usage.
Each record is compressed separately (very little access overhead). The header of a record is a fixed length (1-3 bytes), depending on the biggest record in the table. Each column is compressed differently. Some of the compression types are:
There is usually a different Huffman table for each column.
Suffix space compression.
Prefix space compression.
Numbers with the value 0 are stored using 1 bit.
If values in an integer column have a small range, the column is stored using the smallest possible type. For example, a BIGINT column (8 bytes) may be stored as a TINYINT column (1 byte) if all values are in the range 0 to 255.
If a column has only a small set of possible values, the column type is converted to ENUM.
A column may use a combination of the above compressions.
Can handle fixed- or dynamic-length records, but not BLOB or TEXT columns.
Can be uncompressed with myisamchk.
MySQL supports different index types, but the normal type is ISAM. This is a B-tree index, and you can roughly calculate the size of the index file as (key_length+4)*0.67, summed over all keys. (This is the worst case, when all keys are inserted in sorted order.)
String indexes are space-compressed. If the first index part is a string, it is also prefix-compressed. Space compression makes the index file smaller if the string column has a lot of trailing space or is a VARCHAR column that is not always used to its full length. Prefix compression helps if many strings have the same prefix.
10.6.4 Memory (HEAP) table characteristics
HEAP tables exist only in memory, so they are lost if mysqld is taken down or crashes. But since they are very fast, they are useful anyway.
The HEAP tables inside MySQL use 100% dynamic hashing without overflow areas, and there are no problems with deletes. You can only access things by equality using an index (usually the = operator) with a HEAP table.
The downsides of HEAP tables are:
You need enough extra memory for all HEAP tables that you want to use at the same time.
You can't search on a part of an index.
You can't search for the next entry in order (that is, you can't use the index to do an ORDER BY).
MySQL also cannot find out approximately how many rows there are between two values. This is used by the optimizer to decide which index to use; but on the other hand, HEAP tables don't even need disk seeks.
10.7 Other Optimization Tips
Unsorted tips for faster systems:
Use persistent connections to the database to avoid the connection overhead.
Always check that all your queries really use the indexes you have created in the tables. In MySQL, you can do this with the EXPLAIN command. See 7.22 EXPLAIN syntax (get information about a SELECT).
Try to avoid complex SELECT queries on tables that are updated a lot. This avoids problems related to table locking.
In some cases, it can make sense to introduce a column that is "hashed" based on information from other columns. If this column is short and reasonably unique, it can be much faster than a big index on many columns. In MySQL, it is very easy to use such an extra column: SELECT * FROM table_name WHERE hash='calculated hash on col1 and col2' AND col_1='constant' AND col_2='constant' AND ... (a sketch follows after this list).
For tables that change a lot, you should try to avoid all VARCHAR or BLOB columns. You get dynamic row length as soon as you use a single VARCHAR or BLOB column. See 9.4 MySQL table types.
It is not normally useful to split a table into different tables just because the rows get "big". To access a row, the biggest performance hit is the disk seek to find the first byte of the row. After finding the data, most new disks can read the whole row fast enough for most applications. The only cases where splitting really matters are if it is a dynamic-row-size table (see above) that can be changed to a fixed row size, or if you very often need to scan the table without needing most of the columns. See 9.4 MySQL table types.
If you very often need to calculate things based on information from a lot of rows (such as counts), it is probably much better to introduce a new table and update the counter in real time. An update of the type UPDATE table SET count=count+1 WHERE index_column=constant is very fast! This is really important when you use a database like MySQL that has only table-level locking (multiple readers / single writer). This also gives better performance with most databases, because the lock manager has less to do in this case.
If you need to collect statistics from big log tables, use summary tables instead of scanning the whole table. Maintaining the summaries should be much faster than trying to compute statistics "live". It is much faster to regenerate new summary tables from the logs when things change (depending on business decisions) than to have to change the running application!
If possible, reports should be classified as "live" or "statistical", where the data needed for statistical reports is generated only from summary tables that are in turn generated from the actual data.
Take advantage of the fact that columns have default values. Insert values explicitly only when the value to be inserted differs from the default. This reduces the parsing MySQL needs to do and improves insert speed.
In some cases, it is convenient to pack and store data in a BLOB. You then have to add extra code in your application to pack/unpack things in the BLOB, but this can save a lot of accesses at some stage. This is practical when you have data that doesn't conform to a static table structure.
Normally you should try to keep all data non-redundant (what is called third normal form in database theory), but you should not be afraid of duplicating things or creating summary tables if you need those for speed.
Stored procedures or UDFs (user-defined functions) may be a good way to get more performance. However, if you use a database that doesn't support them, you should always have another (slower) way to do the same thing.
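A sketch of the hashed-column idea (the customer table, its columns, and the choice of MD5() are illustrative assumptions):
mysql> ALTER TABLE customer ADD hash CHAR(32) NOT NULL, ADD INDEX (hash);
mysql> UPDATE customer SET hash=MD5(CONCAT(first_name,last_name));
mysql> SELECT * FROM customer
           WHERE hash=MD5(CONCAT('John','Smith'))
           AND first_name='John' AND last_name='Smith';
The extra equality tests on the real columns guard against hash collisions.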
- You can always gain something by caching queries/answers in your application and by trying to do many inserts/updates at the same time. If your database supports lock tables (like MySQL and Oracle), this should help to ensure that the index cache is flushed only once, after all the updates.
- Use INSERT /*! DELAYED */ when you do not need to know when your data is written. This speeds things up because many records can be written with a single disk write.
- Use INSERT /*! LOW_PRIORITY */ when you want your selects to be more important.
- Use SELECT /*! HIGH_PRIORITY */ to get selects that jump the queue; that is, the select is done even if someone is waiting to do a write.
- Use the multi-line INSERT statement to store many rows with one SQL command (many SQL servers support this).
- Use LOAD DATA INFILE to load larger amounts of data. This is faster than normal inserts and will become even faster when myisamchk is integrated in mysqld.
- Use AUTO_INCREMENT columns to make unique values.
- Use OPTIMIZE TABLE once in a while to avoid fragmentation when using a dynamic table format. See 7.9 OPTIMIZE TABLE syntax.
- Use HEAP tables to get more speed where possible. See 9.4 MySQL table types.
- With a normal Web server setup, images should be stored as files; that is, store only a file reference in the database. The main reason for this is that a normal Web server is much better at caching files than database contents, so it is much easier to get a fast system if you use files.
- Use in-memory tables for non-critical data that is accessed often (such as information about the last shown banner for users without cookies).
- Columns with identical information in different tables should be declared identical and have identical names. Before Version 3.23 you otherwise got slow joins. Try to keep the names simple (use name instead of customer_name in the customer table). To make your names portable to other SQL servers, you should keep them shorter than 18 characters.
- If you need really high speed, you should look at the low-level interfaces for data storage that the different SQL servers support! For example, by accessing MySQL's MyISAM tables directly, you could get a speed increase of two to five times compared to using the SQL interface. To be able to do this, the data must be on the same machine as the application, and usually it should be accessed by only one process (because external file locking is really slow). These problems could be eliminated by introducing low-level MyISAM commands in the MySQL server (this could be one easy way to get more performance when needed). With a carefully designed database interface, it should be quite easy to support this kind of optimization.
- In many cases it is faster to access data from a database (using a live connection) than from a text file, just because the database is likely to be more compact than the text file (if you use numerical data), and this involves fewer disk accesses. You also save code because you do not have to parse your text files to find line and column boundaries.
- You can also use replication to speed things up. See 19.1 Database replication.
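To make a couple of these tips concrete (the extra "hashed" column, the real-time counter table, and the multi-line INSERT), here is a minimal sketch; the table and column names are invented for illustration, and MD5() is just one possible hash function:

    -- Hypothetical lookup table with an extra "hashed" column that can be
    -- shorter (and faster) than a big index over (company, department).
    CREATE TABLE lookup (
      company    CHAR(40) NOT NULL,
      department CHAR(40) NOT NULL,
      hash       CHAR(32) NOT NULL,  -- MD5(CONCAT(company, department))
      INDEX (hash)
    );

    SELECT * FROM lookup
    WHERE hash=MD5(CONCAT('MySQL AB','Support'))
      AND company='MySQL AB' AND department='Support';

    -- Real-time counter: an indexed UPDATE of this type is very fast.
    UPDATE counter_table SET count=count+1 WHERE counter_id=12;

    -- Multi-line INSERT: many rows stored with one SQL command.
    INSERT INTO counter_table (counter_id, count) VALUES (1,0),(2,0),(3,0);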
10.8 Using your own benchmarks
You should definitely benchmark your application and database to find out where the bottlenecks are. By fixing a bottleneck (or by replacing it with a "dummy module"), you can then easily identify the next bottleneck, and so on. Even if the overall performance of your application is currently "good enough", you should at least make a plan for each bottleneck, so that you know how to solve it if someone someday "really needs it fixed".
For examples of portable benchmark programs, look at the MySQL benchmark suite. See 11 The MySQL benchmark suite. You can take any program from this suite and modify it for your own needs. By doing this, you can try different solutions to your problem and test which one is really the fastest for you.
Problems that appear only under heavy system load are very common, and we have had many customers contact us who have a (tested) system in production and are hitting load problems. In every one of these cases so far, the problem has been either basic design (table scans do not perform well under high load) or OS/library issues. Most of these problems would have been a lot easier to correct if the systems were not already in production.
To avoid problems like this, you should put some effort into testing your whole application under the worst possible load!
10.9 Design Choices
MySQL keeps row data and index data in separate files. Many (almost all) other databases mix row and index data in the same file. We believe that the MySQL choice is better for a very wide range of modern systems. Another way to store row data is to keep the information for each column in a separate area (examples are SDBM and Focus). This causes a performance hit for every query that accesses more than one column, and because this degenerates quickly as more columns are accessed, we believe this model is not good for general-purpose databases.
The more common case is that the index and data are stored together (as in Oracle/Sybase). In this case you find the row information in the leaf pages of the index. The advantage of this layout is that in many cases it saves a disk read (depending on how well the index is cached). The disadvantages of this layout are:
- Table scanning is much slower, because you have to read through the indexes to get at the data.
- You lose a lot of space, because you must duplicate indexes from the nodes (you cannot store the rows in the nodes).
- Deletes degenerate the table over time (because indexes in nodes are usually not updated on delete).
- You cannot use only the index data to resolve a query.
- The index data is harder to cache on its own.
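One consequence of keeping indexes in a separate file is worth sketching: when all the columns a query needs are part of a single index, MySQL can resolve the query from the index alone. A minimal illustration with hypothetical table and column names (EXPLAIN should report "Using index" in its Extra column in this case):

    -- All columns needed by the query are in the index (a,b), so MySQL
    -- can answer it from the index file without touching the data file.
    CREATE TABLE t (a INT NOT NULL, b INT NOT NULL, INDEX ab (a,b));
    EXPLAIN SELECT a,b FROM t WHERE a=1;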
10.10 MySQL Design Limitations/Trade-offs
Because MySQL uses extremely fast table locking (multiple readers / single writer), the biggest remaining problem is mixing a steady stream of inserts with slow selects on the same table.
We believe that for a huge number of systems, the extremely fast performance in all other cases makes this choice a winner. This case can usually be solved by keeping multiple copies of the table, but that takes more effort and hardware; a sketch of this approach follows.
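As a rough sketch of the multiple-copies approach, assuming a hypothetical live_table that receives the insert stream while slow reports run against a periodically regenerated copy:

    -- Regenerate a read-only copy for the slow reporting selects, so they
    -- do not hold a table lock on the table receiving the insert stream.
    DROP TABLE IF EXISTS report_copy;
    CREATE TABLE report_copy SELECT * FROM live_table;
    -- Slow reports now run against report_copy instead of live_table.
    SELECT COUNT(*) FROM report_copy WHERE customer_id=42;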
We are also developing some extension features to solve this problem for some common application niches.
10.11 Portability
Because all SQL servers implement different parts of SQL, it takes work to write portable SQL applications. For very simple selects and inserts it is easy; the more features you need, the harder it gets. And if you want the application to be fast with many databases, it becomes even harder!
To make a complex application portable, you need to choose the set of SQL servers it should work with.
You can use MySQL's crash-me program (http://www.mysql.com/crash-me-choose.htmy) to find the functions, types, and limits you can use with your selection of database servers. Crash-me is far from testing everything possible, but it is still comprehensive, with about 450 things tested.
For example, you should not have column names longer than 18 characters if you want to be able to use Informix or DB2.
Both the MySQL benchmark programs and crash-me are very database-independent. By looking at how we have handled this, you can get a feeling for what you have to do to make your own applications database-independent. The benchmarks themselves can be found in the 'sql-bench' directory of the MySQL source distribution. They are written in Perl, using the DBI database interface (which solves the access part of the problem).
See http://www.mysql.com/benchmark.html for the results of these benchmarks.
As you can see from these results, all databases have some weak points. That is, they have made different design compromises that lead to different behavior.
If you strive for database independence, you need to get a good feeling for each SQL server's bottlenecks. MySQL is very fast at retrieving and updating, but has a problem mixing slow readers and writers on the same table. Oracle, on the other hand, has a big problem when you try to access rows you have recently updated (until they are flushed to disk). Transactional databases in general are not very good at generating summary tables from log tables, because in this case row-level locking is almost useless.
To make your application really database-independent, you should define an easily extendable interface through which you manipulate your data. Because C++ is available on most systems, it makes sense to use a C++ class interface to the databases.
If you use a feature specific to some database (such as the REPLACE command in MySQL), you should code a method for the other SQL servers to implement the same feature (but more slowly). With MySQL you can use the /*! */ syntax to add MySQL-specific keywords to a query; the code inside /**/ is treated as a comment (ignored) by most other SQL servers.
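A small sketch of this idea, using a hypothetical counter_table (assumed to have a unique key on counter_id): MySQL's REPLACE, a portable fallback, and the /*! */ syntax:

    -- MySQL-specific: REPLACE deletes any old row with the same unique key
    -- and inserts the new row, all in a single statement.
    REPLACE INTO counter_table (counter_id, count) VALUES (12, 0);

    -- Portable (slower) way to get the same effect on other SQL servers:
    DELETE FROM counter_table WHERE counter_id=12;
    INSERT INTO counter_table (counter_id, count) VALUES (12, 0);

    -- /*! */ hides a MySQL-specific keyword from other SQL servers:
    INSERT /*! DELAYED */ INTO counter_table (counter_id, count) VALUES (13, 0);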
If really high performance is more important than exactness, as in some Web applications, one possibility is to create an application layer that caches all results, to give you even higher performance. By letting old results 'expire' after a short while, you can keep the cache reasonably fresh. This works quite well under extremely high load; in that case you can dynamically increase the cache and set the expiration timeout higher until things get back to normal.
In this case the table-creation information should contain the initial size of the cache and how often the table should normally be refreshed.
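A minimal sketch of such a cache table; all names and the 10-minute expiration are purely illustrative assumptions:

    -- Hypothetical result cache: the application serves the stored result
    -- if it is fresh enough and regenerates it otherwise.
    CREATE TABLE result_cache (
      query_id INT NOT NULL PRIMARY KEY,
      result   BLOB,
      created  TIMESTAMP
    );

    SELECT result FROM result_cache
    WHERE query_id=42
      AND created > DATE_SUB(NOW(), INTERVAL 10 MINUTE);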
10.12 What have we used MySQL for?
During the initial development of MySQL, its features were made to fit our largest customer, who handles data warehousing for a couple of the biggest retailers in Sweden.
From all stores, we get weekly summaries of all bonus card transactions, and we are expected to provide useful information to the store owners to help them find out how their campaigns are affecting their customers.
The data is quite huge (about 7 million summary transactions per month), and we have data for 4-10 years that we need to present to the users. We got weekly requests from the customers, who want 'instant' access to new reports from this data.
We solved this by storing all information per month in compressed 'transaction' tables. We have a set of simple macros/scripts that generate summary tables grouped by different criteria (product group, customer ID, store ...) from the transaction tables. The reports are Web pages dynamically generated by a small Perl script that parses a Web page, executes the SQL statements in it, and inserts the results. We would have used PHP or mod_perl instead, but they were not available at the time.
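A rough sketch of the kind of summary generation described above; the table and column names here are invented for illustration, not the actual schema:

    -- Build a monthly summary grouped by product group and store from a
    -- per-month compressed transaction table.
    CREATE TABLE summary_199901_product_store
    SELECT product_group, store_id, COUNT(*) AS trans, SUM(amount) AS total
    FROM transactions_199901
    GROUP BY product_group, store_id;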
For graphical data, we wrote a simple tool in C that produces GIFs based on the results of an SQL query (with some processing of the results); it, too, is dynamically executed from the Perl script that parses the HTML files.
In most cases, a new report can be made simply by copying an existing script and modifying the SQL query in it. In some cases we need to add more fields to an existing summary table or generate a new one, but this is also quite simple, because we keep all the transaction tables on disk. (Currently we have at least 50G of transaction tables and 200G of other customer data.)
We also let our customers access the summary tables directly via ODBC, so that advanced users can experiment with the data themselves.
We handle all of this without any problems on quite modest Sun Ultra SPARCstation hardware (2x200 MHz). We recently upgraded one of our servers to a 2-CPU 400 MHz UltraSPARC, and we now plan to start handling transactions at the product level, which will mean a tenfold increase in data. We think we can keep up with this just by adding more disks to our systems.
We are also experimenting with Intel-Linux to be able to get more CPU power more cheaply. Now that we have the binary portable database format (new in Version 3.23), we will start to use it for some parts of the application.
Our initial feeling is that Linux performs better under low-to-moderate load, but that Solaris behaves better when you start to get high load, because of extreme disk I/O; we have nothing conclusive about this yet. After some discussion with Linux kernel developers, this may be a side effect of Linux giving batch processes so many resources that interactive performance becomes very low. This makes the machine feel very slow and unresponsive while large batches are running. Hopefully this will be handled better in future Linux kernels.