MySQL Çorbası: InnoDB - Storage Engine

Giriş

Açıklaması şöyle. MySQL 5.5'ten sonra varsayılan depolama motoru (storage engine) InnoDB oldu

MySQL can integrate with many different storage engines. Each storage engine has its own set of pros and cons for different use cases. In other words, storage engine can be considered as an interface, and can have different underlying implementations. For example, there are InnoDB, MyISAM, Memory, CSV, Archive, Merge, Blackhole.

InnoDB is certainly the most widely used. It is the default ever since MySQL 5.5 version.

Engine - InnoDB
Tablonun kullanacağı engine şöyle belirtilir.

CREATE TABLE myTable (...) ENGINE=InnoDB;

Şöyle yaparız.

CREATE TABLE my_table (...)
ENGINE=InnoDB AUTO_INCREMENT=4500000 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin

Engine - MyISAM

MyISAM artık kullanılmıyor.

Eğer engine MyIsam ise tablo 3 dosyada tutulur. tableName.MYI (indeksler için ) and tableName.MYD (veri için)

Eğer engine InnoDb ise tablo tek dosyada tutulur.

InnoDB Veriyi Nerede Saklar?

InnoDB veriyi hem bellekte hem de diskte saklar. Şeklen şöyle

1. Bellek

İki kısımdan oluşur

1. Buffer pool

2 Log buffer

1.1 Buffer pool

Açıklaması şöyle. Veriyi sayfalar (page) şeklinde bellekte saklar

Buffer pool is very important for InnoDB. MySQL usually is very fast in processing queries, and the reason is that the data is actually stored and served from memory (in most cases NOT from disk, contrary to what many people may think). This memory component is the buffer pool.

1.1.1 Sayfalar

- Her sayfa Doubly Linked List şeklinde birbirine bağlıdır

- Sayfada User Records bulunur. Bunlar Single Linked List şeklinde birbirine bağlıdır

- Yeni satır en sona eklenir

1.1.2 Infimum ve Supremum

Açıklaması şöyle. Yani bir sayfadaki en büyük ve en küçük primary key değerlerini saklarlar

The two fields represents the largest and smallest row in a page, respectively. In other words, the two form a min-max filter. By checking these two fields in O(1) time, InnoDB can decide whether the row to look for is stored in this particular page.

For example, say our primary key is numerical, and we have supremum = 1, infimum = 99 for a particular page. If we are trying to look for a row with primary key = 101, then clearly in O(1) time, InnoDB decides that it is not in this page, and will go to another page.

1.1.3 Page Directory

Açıklaması şöyle.

Now, if the target row is not in the page, InnoDB skips the page. However, if the row is actually in the page, does InnoDB still traverse the entire list? The answer is no, and the “other fields” again come handy to help.

Page directory
As the name suggests, it is like the “table of contents” of a book.

Page directory stores a pointer to a block of rows. Say we have row 1 to n stored in User Records, page directory will have pointers to row1, row7, row 13 …, with a block size of 6. This design is very similar to Skip List.

As you can probably imagine now, InnoDB first checks the page directory, and quickly determines which block to look into. Then utilizes the pointers to jump to the corresponding row in the singly linked list and start traversing from there. In this way, InnoDB avoids traversing the entire list, and needs to traverse a much smaller sublist.

The block we mentioned above is officially called Slot. Page directory has multiple slots, and each slot has a pointer to a row in the User Records.

Every time when a new row is inserted to User Records, InnoDB updates the Page Directory as well to make the two consistent with each other. InnoDB creates a slot for every 6 rows.

1.2 Change Buffer

Açıklaması şöyle.

Let’s say we updated some fields of the row id = 100. If a secondary index is built on one of the field, then the secondary index needs to be updated at the same time. However, if the page containing the secondary index is not in the Buffer Pool, then does InnoDB loads the page to Buffer Pool as well?

If the secondary index is not going to be used later, but every time eagerly gets loaded to Buffer Pool whenever a related field is updated, there will be a lot of random disk I/O operations, which will inevitably slow down InnoDB.

This is why Change Buffer is designed to “lazily” update secondary indexes.

When a secondary index needs to be updated, and the page containing it is not in the Buffer Pool, the update will be temporarily stored in Change Buffer. At a later time, if the page gets loaded into Buffer Pool (due to updates to primary index), InnoDB will merge the temporary changes in Change Buffer to pages in Buffer Pool.

There is a drawback to this Change Buffer design though. If the Change Buffer has accumulated a large amount of temporary changes, and say we want to merge it all at once with Buffer Pool, it may take hours to complete!! And during this merging, there will be a lot of disk I/O, which occupies CPU cycles, and affects the overall performance of MySQL.

So some tradeoff are further made. We earlier mentioned that merge is triggered when the page containing the secondary index gets loaded back to Buffer Pool. In InnoDB design, the merge can also be triggered by other events to avoid large merges that may take a few hours. These events include transactions, server shut-down, and sever restart.

MySQL Çorbası

14 Mart 2023 Salı

InnoDB - Storage Engine

Hiç yorum yok:

Yorum Gönder

LIMIT ve Covering Index + Subquery

Kötüye Kullanım Bildir

Etiketler