MySQL Document Store: Managing Collections and Using Indexes

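MySQL Document Store offers a convenient way to manage JSON documents. This article covers how to work with collections and indexes in MySQL. It first shows how to create a collection with create_collection() and inspect its structure with SHOW CREATE TABLE. It then demonstrates, in Python, how to retrieve a single collection or all collections with get_collection() and get_collections(). Finally, it explains how to drop a collection with drop_collection() and how to create and drop different kinds of indexes (plain, spatial, and composite) to improve query performance.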

Collection Management and Indexing in the MySQL Document Store

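The MySQL Document Store provides a flexible way to store and work with JSON documents. In MySQL, these documents are stored in collections, which play a role similar to tables in a traditional database. This article looks at how to create, retrieve, and manage collections in the MySQL Document Store, and how to create and use indexes.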

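Creating a Collection

In the MySQL Document Store, a collection is created with the create_collection() method. It creates a new collection in the specified schema, provided a collection with that name does not already exist.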

import mysqlx
from config import connect_args

# Open the database session
db = mysqlx.get_session(**connect_args)

# Reset the py_test_db schema
db.drop_schema("py_test_db")
schema = db.create_schema("py_test_db")

# Create the my_docs collection
docs = schema.create_collection("my_docs")

# Close the database session
db.close()

Code walkthrough:

This example creates a collection named my_docs. It first opens a database session and resets the py_test_db schema. It then calls create_collection() to create my_docs in py_test_db, and finally closes the session.
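A script that runs more than once may hit an error if it calls create_collection() for a name that already exists. The sketch below shows one hedged way to guard the call, using only get_collection(), exists_in_database(), and create_collection(). The helper name get_or_create_collection is made up for this illustration, and schema is assumed to be a schema object such as the one returned by create_schema() above.

```python
def get_or_create_collection(schema, name):
    """Return the collection `name`, creating it only if it is missing.

    Hypothetical helper: `schema` is assumed to be a mysqlx schema object.
    """
    collection = schema.get_collection(name)
    # exists_in_database() asks the server whether the collection is really there.
    if not collection.exists_in_database():
        collection = schema.create_collection(name)
    return collection
```

With the session from the example above, docs = get_or_create_collection(schema, "my_docs") would return the existing collection on a second run instead of trying to create it again.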

Viewing the Collection Definition

Once the collection exists, you can inspect its definition with the SHOW CREATE TABLE statement.

mysql> SHOW CREATE TABLE py_test_db.my_docs\G
*************************** 1. row ***************************
Table: my_docs
Create Table: CREATE TABLE `my_docs` (
 `doc` json DEFAULT NULL,
 `_id` varbinary(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`, _utf8mb4'$._id'))) STORED NOT NULL,
 PRIMARY KEY (`_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

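Output explained:

This output shows the default table definition behind a collection in the MySQL Document Store. The documents are stored in the doc column, which uses the JSON data type. The _id column is a generated column that extracts the _id member from each document and serves as the primary key.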
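To make the generated-column expression more concrete, here is a rough Python analogue of json_unquote(json_extract(doc, '$._id')) applied to a single document. It is an illustration only: the real value is computed by the MySQL server, the function name generated_id is hypothetical, and the sample document is invented.

```python
import json


def generated_id(doc_text):
    """Approximate json_unquote(json_extract(doc, '$._id')) for one document.

    Illustration only: the real value is computed by the MySQL server.
    """
    doc = json.loads(doc_text)
    if "_id" not in doc:
        # The generated column is NOT NULL, so at the table level
        # every stored document needs an _id member.
        raise ValueError("document has no _id member")
    value = doc["_id"]
    # json_unquote strips the quotes from a JSON string; other JSON values
    # are returned here as their JSON text.
    return value if isinstance(value, str) else json.dumps(value)


# Invented sample document, used only to exercise the function
print(generated_id('{"_id": "00005af018430000000000000001", "Name": "Ann"}'))
```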

  flowchart TD
 A[Create collection] --> B[View collection definition]
 B --> C[Analyze collection structure]
 C --> D[Create indexes]

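Diagram explained:

This diagram shows the workflow for managing a collection in the MySQL Document Store. First, create the new collection. Next, view its definition to understand its structure. Then analyze that structure to decide whether indexes are needed. Finally, create the indexes to improve query performance.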

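Retrieving Collections

MySQL provides two methods for retrieving collections: get_collection() and get_collections().

Retrieving a Single Collection

Use get_collection() to retrieve a specific collection by name.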

import mysqlx
from config import connect_args

# Open the database session
db = mysqlx.get_session(**connect_args)

# Get the py_test_db schema
schema = db.get_schema("py_test_db")

# Retrieve the my_docs collection
docs = schema.get_collection("my_docs")

# Print the collection name
print("Name of collection: {0}".format(docs.name))

# Close the database session
db.close()

Code walkthrough:

This example retrieves the collection named my_docs with get_collection(). It first opens a database session and gets the py_test_db schema, then calls get_collection() to retrieve my_docs and prints its name.
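As far as I am aware, get_collection() does not by default verify that the collection actually exists on the server, so a typo in the name may only surface later. The sketch below assumes you would rather fail early. The helper name require_collection is hypothetical, and the choice of LookupError is an arbitrary one made for this illustration.

```python
def require_collection(schema, name):
    """Return the collection `name`, or raise if the server does not have it.

    Hypothetical helper: `schema` is assumed to be a mysqlx schema object.
    """
    collection = schema.get_collection(name)
    if not collection.exists_in_database():
        raise LookupError("Collection {0!r} does not exist".format(name))
    return collection
```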

Retrieving All Collections

Use get_collections() to retrieve every collection in a schema.

import mysqlx
from config import connect_args

# Open the database session
db = mysqlx.get_session(**connect_args)

# Reset the py_test_db schema
db.drop_schema("py_test_db")
schema = db.create_schema("py_test_db")

# Create two collections
schema.create_collection("employees")
schema.create_collection("customers")

# Retrieve all collections
collections = schema.get_collections()

# Print the name of every collection
for collection in collections:
    print("Collection name: {0}".format(collection.name))

# Close the database session
db.close()

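Code walkthrough:

This example retrieves all collections in the py_test_db schema with get_collections(). It first resets py_test_db and creates two collections, employees and customers. It then calls get_collections() to retrieve every collection and prints each name.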
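When only the names are needed, for example to check what a setup script created, the loop above can be folded into a small helper. This is a minimal sketch that relies only on get_collections() and the name property used in the example. The function name collection_names is hypothetical.

```python
def collection_names(schema):
    """Return the names of all collections in `schema`, sorted alphabetically.

    Hypothetical helper: `schema` is assumed to be a mysqlx schema object.
    """
    return sorted(collection.name for collection in schema.get_collections())
```

For the py_test_db schema built above, this should give a list containing customers and employees, in that order because of the sort.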

Dropping a Collection

Use drop_collection() to remove a collection that is no longer needed.

import mysqlx
from config import connect_args

# Open the database session
db = mysqlx.get_session(**connect_args)

# Get the py_test_db schema
schema = db.get_schema("py_test_db")

# Drop the collections
schema.drop_collection("my_docs")
schema.drop_collection("employees")
schema.drop_collection("customers")

# Drop the schema
db.drop_schema("py_test_db")

# Close the database session
db.close()

Code walkthrough:

This example drops collections with drop_collection(). It first gets the py_test_db schema, then drops the my_docs, employees, and customers collections. Finally, it drops the py_test_db schema itself.
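If the goal is to empty a schema without listing each collection by hand, get_collections() and drop_collection() can be combined. The following is a minimal sketch under that assumption. The helper name drop_all_collections is hypothetical, and it removes data irreversibly, so it is only suitable for test schemas like py_test_db.

```python
def drop_all_collections(schema):
    """Drop every collection in `schema` and return the names that were dropped.

    Hypothetical helper: `schema` is assumed to be a mysqlx schema object.
    """
    # Collect the names first so the list is not affected by the drops.
    names = [collection.name for collection in schema.get_collections()]
    for name in names:
        schema.drop_collection(name)
    return names
```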

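Creating and Using Indexes in the MySQL Document Store

In the MySQL Document Store, indexes are essential for query performance. Creating one is more involved than for a traditional SQL table: besides the index definition itself, you also have to specify how each value is extracted from the document and what type that value represents.

How to Create an Index

Use the create_index() method of the collection object to define an index. The following example creates a collection and adds three indexes to it: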

import mysqlx
from config import connect_args

# Open the database session
db = mysqlx.get_session(**connect_args)

# Reinitialize the employees collection in the py_test_db schema
schema = db.create_schema("py_test_db")
schema.drop_collection("employees")
employees = schema.create_collection("employees")

# Define the three fields that the indexes will use
field_name = {
    "field": "$.Name",
    "type": "TEXT(60)",
    "required": True,
    "collation": "utf8mb4_0900_ai_ci",
}

field_office_location = {
    "field": "$.Office.Location",
    "type": "GEOJSON",
    "required": True,
    "options": 1,
    "srid": 4326,
}

field_birthday = {
    "field": "$.Birthday",
    "type": "DATE",
    "required": False,
}

# Create a plain index on the employee name
index_name = "employee_name"
index_def = {
    "fields": [field_name],
    "type": "INDEX",
}
index = employees.create_index(index_name, index_def)
index.execute()
print("Index created: {0}".format(index_name))

# Create a spatial index on the employee's office location
index_name = "employee_office_location"
index_def = {
    "fields": [field_office_location],
    "type": "SPATIAL",
}
employees.create_index(index_name, index_def).execute()
print("Index created: {0}".format(index_name))

# Create a composite index on the employee's birthday and name
index_name = "employee_birthday_name"
index_def = {
    "fields": [field_birthday, field_name],
    "type": "INDEX",
}
index = employees.create_index(index_name, index_def)
index.execute()
print("Index created: {0}".format(index_name))

# Close the database session
db.close()

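Code analysis:

This example creates indexes in the MySQL Document Store. It first reinitializes the employees collection so that every run starts from the same state. It then defines the three fields the indexes will use: field_name, field_office_location, and field_birthday, which correspond to the employee's name, office location, and birthday.

The example creates three indexes: employee_name, employee_office_location, and employee_birthday_name. employee_name is a plain index on the employee's name, employee_office_location is a spatial index on the office location, and employee_birthday_name is a composite index that combines the employee's birthday and name.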
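Because two of the fields are marked "required": True, documents added to the collection need to contain those members. The sketch below is a client-side check, written under the assumption that only simple dotted paths such as $.Office.Location are used. The helper names lookup and missing_required_fields are hypothetical, the sample employee is invented, and the field list repeats only the keys the check needs.

```python
def lookup(doc, path):
    """Resolve a simple '$.a.b' path in a dict; return None if it is absent."""
    current = doc
    for key in path[2:].split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def missing_required_fields(doc, fields):
    """List the required field paths that `doc` does not provide."""
    return [
        field["field"]
        for field in fields
        if field.get("required") and lookup(doc, field["field"]) is None
    ]


# Trimmed copies of the field definitions from the example above
fields = [
    {"field": "$.Name", "required": True},
    {"field": "$.Office.Location", "required": True},
    {"field": "$.Birthday", "required": False},
]

# Invented sample document with a GeoJSON point for the office location
employee = {
    "Name": "Ann Example",
    "Office": {"Location": {"type": "Point", "coordinates": [151.2, -33.9]}},
}

print(missing_required_fields(employee, fields))
```

The optional Birthday member is absent from the sample document, yet the check reports nothing missing because only required fields are considered.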

  flowchart TD
 A[Start creating indexes] --> B{Choose index type}
 B -->|Plain index| C[Create employee_name index]
 B -->|Spatial index| D[Create employee_office_location index]
 B -->|Composite index| E[Create employee_birthday_name index]
 C --> F[Index creation complete]
 D --> F
 E --> F

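Diagram analysis:

This diagram shows the index creation workflow. First, decide which type of index to create (plain, spatial, or composite). The matching index is then created for the chosen type. The workflow ends once all indexes have been created.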

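Index Management and CRUD Operations in the MySQL Document Store

The MySQL Document Store provides a complete CRUD (create, read, update, delete) interface that lets developers manage document data flexibly. Before moving on to CRUD operations, it is important to understand how indexes are created and dropped.

Dropping an Index

Dropping an index is much simpler than creating one. Call the drop_index() method of the collection object and pass the name of the index to drop. MySQL handles the removal automatically, and if the generated columns behind the index are no longer used by any other index, MySQL removes those generated columns as well.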

import mysqlx
from config import connect_args

# Open the database session
db = mysqlx.get_session(**connect_args)

# Get the py_test_db schema
schema = db.get_schema("py_test_db")

# Get the employees collection
employees = schema.get_collection("employees")

# Drop the employee_name index
index_name = "employee_name"
employees.drop_index(index_name)
print("Index {0} has been dropped".format(index_name))

# Close the database session
db.close()

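Code analysis:

First, import the required modules and open a database session.
Get the target schema with get_schema().
Get the collection object with get_collection().
Call drop_index() to drop the specified index.
Finally, close the database session.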
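The steps above combine naturally with create_index() when an index definition changes and the index has to be replaced. The following is a minimal sketch of that pattern. The helper name rebuild_index is hypothetical, collection is assumed to be a collection object like employees, and the index named index_name is assumed to exist already.

```python
def rebuild_index(collection, index_name, index_def):
    """Drop `index_name` and create it again from `index_def`.

    Hypothetical helper: assumes the index currently exists.
    """
    # Remove the old definition first so the name can be reused.
    collection.drop_index(index_name)
    # create_index() only prepares the statement; execute() sends it to the server.
    collection.create_index(index_name, index_def).execute()
```

Called as rebuild_index(employees, "employee_name", index_def), this would replace the plain index on the employee name with whatever index_def now describes.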

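From an architectural point of view, the MySQL Document Store uses collections and indexes to bring NoSQL-style document handling to a relational database. The create_collection(), get_collection(), and drop_collection() methods keep collection management simple, while create_index() and drop_index() give fine-grained control over indexes, so developers can create plain, spatial, and composite indexes to match different query needs.

The main performance bottleneck remains querying and updating complex JSON documents, especially when suitable indexes are missing. Applications that need high throughput and low latency should evaluate their indexing strategy and query design carefully. As MySQL continues to optimize its JSON document handling, it may become more competitive for mixed workloads. For organizations that want to bring unstructured data into a relational database, the MySQL Document Store is a pragmatic choice that balances performance and flexibility, and teams should focus on mastering index creation and use to get the most out of it.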

玄貓

Tech enthusiast who shares notes on software development, cloud technologies, and AI applications.