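Python Data Structures and List Comprehensions in Practice

Python's List, Tuple, and Dictionary offer efficient ways to organize and manipulate data, and they are foundational to data science work. A List is mutable and ordered, which suits collections that change often. A Tuple is immutable, which suits fixed data and protects it from accidental modification. A Dictionary stores key-value pairs and supports fast lookup by key. Knowing these properties lets you pick the right structure for each situation and avoid unnecessary work at run time. List comprehensions add a compact syntax for building a List, which often makes code easier to read. Combined with file reading and writing, these tools cover much of the day-to-day data handling that data science requires.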

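Python Data Structures: Working with List, Tuple, and Dictionary

Python provides several data structures for handling different kinds of data. List, Tuple, and Dictionary are the three used most often. This article looks at how to use each of them and walks through examples that show their properties and typical use cases.

List

A List is one of the most basic and most frequently used data structures in Python. It is an ordered collection that can hold elements of different types, and it supports indexing, slicing, and adding or removing elements.

Basic List operations

The following example creates a List and performs a range of operations on it: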

import numpy as np

if __name__ == "__main__":
    ls = ['orange', 'banana', 10, 'leaf', 77.009, 'tree', 'cat']
    print('list length:', len(ls), 'items')
    print('cat count:', ls.count('cat'), ',', 'cat index:', ls.index('cat'))

    # Modify the List in place
    cat = ls.pop(6)
    print('cat:', cat, ', list:', ls)
    ls.insert(0, 'cat')
    ls.append(99)
    print(ls)
    ls[7] = '11'
    print(ls)
    ls.pop(1)
    print(ls)
    ls.pop()
    print(ls)

    # Slice the List
    print('\nslice list:')
    print('1st 3 elements:', ls[:3])
    print('last 3 elements:', ls[-3:])
    print('2nd through 5th elements:', ls[1:5])
    print('2nd through next-to-last elements:', ls[1:-1])

    # Build a new List from another List
    fruit = ['orange']
    more_fruit = ['apple', 'kiwi', 'pear']
    fruit.append(more_fruit)
    print('appended:', fruit)
    fruit.pop(1)
    fruit.extend(more_fruit)
    print('extended:', fruit)

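    # Combine the two Lists into a NumPy array. They differ in length (6 and 4),
    # so dtype=object is required; the result is a 1-D array holding two lists.
    matrix = np.array([ls, fruit], dtype=object)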
    print(matrix)
    print('1st row:', matrix[0])
    print('2nd row:', matrix[1])

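Code walkthrough:

len(ls): returns the length of List ls, that is, the number of elements.
ls.count('cat'): counts how many times 'cat' appears in List ls.
ls.index('cat'): returns the index of the first 'cat' in List ls.
ls.pop(6): removes and returns the element at index 6 of List ls.
ls.insert(0, 'cat'): inserts the element 'cat' at index 0 of List ls.
ls.append(99): adds the element 99 to the end of List ls.
ls[7] = '11': replaces the element at index 7 of List ls with '11'.
ls.pop(1) and ls.pop(): remove the element at index 1 and the last element of List ls, respectively.
ls[:3], ls[-3:], ls[1:5], ls[1:-1]: slice List ls to get the first three elements, the last three elements, the second through fifth elements, and the second through next-to-last elements.
fruit.append(more_fruit) and fruit.extend(more_fruit): both add List more_fruit to List fruit. append adds the whole of more_fruit as a single element, while extend adds the elements of more_fruit one by one.
np.array([ls, fruit]): builds a NumPy array from the two Lists. The two Lists have different lengths at this point (6 and 4 elements), so the result is not a true two-dimensional matrix. NumPy 1.24 and later raise ValueError for such ragged input unless dtype=object is passed.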
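
The difference between append and extend, and the fact that a slice is a new List, are easy to check in isolation. The following is a minimal sketch; the sample values and the names nested, flat, and head are chosen here for illustration and are not part of the example above.

```python
# Hypothetical sample data, chosen only for illustration.
fruit = ['orange']
more_fruit = ['apple', 'kiwi', 'pear']

nested = fruit.copy()
nested.append(more_fruit)   # adds the whole list as ONE element

flat = fruit.copy()
flat.extend(more_fruit)     # adds each element separately

print(len(nested), nested)
print(len(flat), flat)

# A slice returns a new list; changing it leaves the original untouched.
head = flat[:2]
head[0] = 'lemon'
print(head, flat)
```

Working on copies keeps fruit unchanged, so the two results can be compared side by side.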

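Tuple

A Tuple is another ordered data structure. Unlike a List, a Tuple is immutable: once created, its contents cannot be changed.

Basic Tuple operations

The following example creates a Tuple and performs a range of operations on it: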

import numpy as np

if __name__ == "__main__":
    tup = ('orange', 'banana', 'grape', 'apple', 'grape')
    print('tuple length:', len(tup))
    print('grape count:', tup.count('grape'))

    # Slice the Tuple
    print('\nslice tuple:')
    print('1st 3 elements:', tup[:3])
    print('last 3 elements:', tup[-3:])
    print('2nd through 5th elements:', tup[1:5])
    print('2nd through next-to-last elements:', tup[1:-1])

    # Build a matrix from a Tuple and a List
    fruit = ['pear', 'grapefruit', 'cantaloupe', 'kiwi', 'plum']
    matrix = np.array([tup, fruit])
    print(matrix)

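Code walkthrough:

len(tup): returns the length of Tuple tup.
tup.count('grape'): counts how many times 'grape' appears in Tuple tup.
tup[:3], tup[-3:], tup[1:5], tup[1:-1]: slice Tuple tup to get the first three elements, the last three elements, the second through fifth elements, and the second through next-to-last elements.
np.array([tup, fruit]): uses NumPy to combine Tuple tup and List fruit into a two-dimensional array (matrix). This works because both have five elements.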
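
The example above only reads from the Tuple, so immutability never actually shows up. A short sketch of what happens on an attempted write may help. The variable names and the price data here are assumptions made for illustration.

```python
tup = ('orange', 'banana', 'grape')

try:
    tup[0] = 'kiwi'          # item assignment is not allowed on a tuple
    mutated = True
except TypeError:
    mutated = False
print('tuple was modified:', mutated)

# "Changing" a tuple means building a new one; the original is untouched.
longer = tup + ('apple',)
print(tup, longer)

# A tuple of hashable elements can serve as a dictionary key (a list cannot).
price = {('orange', 'kg'): 2.5}   # hypothetical data
print(price[('orange', 'kg')])
```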

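Dictionary

A Dictionary stores key-value pairs. Since Python 3.7 it preserves insertion order (earlier versions made no ordering guarantee), but its elements are accessed by key rather than by position. It is one of the most important and most frequently used data structures, especially when processing data.

Basic Dictionary operations

The following example creates a Dictionary and performs a range of operations on it: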

if __name__ == "__main__":
    audio = {'amp': 'Linn', 'preamp': 'Luxman', 'speakers': 'Energy',
             'ic': 'Crystal Ultra', 'pc': 'JPS', 'power': 'Equi-Tech',
             'sp': 'Crystal Ultra', 'cdp': 'Nagra', 'up': 'Esoteric'}
    
    # Delete an element from the Dictionary
    del audio['up']
    print('dict "deleted" element;')
    print(audio, '\n')

    # Add an element to the Dictionary
    audio['up'] = 'Oppo'
    print('dict "added" element;')
    print(audio, '\n')

    # Access an element of the Dictionary
    print('universal player:', audio['up'], '\n')

    # Put Dictionaries in a List and iterate over it
    dict_ls = [audio]
    video = {'tv': 'LG 65C7 OLED', 'stp': 'DISH', 'HDMI': 'DH Labs',
             'cable': 'coax'}
    dict_ls.append(video)
    
    for i, row in enumerate(dict_ls):
        print('row', i, ':')
        print(row)

Code walkthrough:

del audio['up']: deletes the element stored under key 'up' in Dictionary audio.
audio['up'] = 'Oppo': adds key 'up' with value 'Oppo' to Dictionary audio, or updates the value if the key already exists.
audio['up']: reads the value stored under key 'up' in Dictionary audio.
dict_ls = [audio] and dict_ls.append(video): put Dictionaries audio and video into List dict_ls.
for i, row in enumerate(dict_ls): iterates over List dict_ls and prints the contents of each Dictionary.
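
Two details of the operations above are worth seeing on their own: re-adding a deleted key places it at the end, and reading a missing key with square brackets raises KeyError. The sketch below uses a shortened version of the audio Dictionary. The names player and pairs are introduced here for illustration, and the ordering shown assumes Python 3.7 or later.

```python
audio = {'amp': 'Linn', 'preamp': 'Luxman', 'cdp': 'Nagra'}

# Deleting and re-adding a key moves it to the end (insertion order is kept).
del audio['amp']
audio['amp'] = 'Linn'
print(list(audio))

# audio['up'] would raise KeyError here; get() returns a default instead.
player = audio.get('up', 'none')
print('universal player:', player)
print('cdp' in audio, 'up' in audio)

# items() yields (key, value) pairs in the same order.
pairs = list(audio.items())
print(pairs)
```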

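Data Handling and List Comprehensions in Data Science

Reading and writing data is a basic and important skill in data science. Data is stored in many formats, and plain text files and comma-separated values (CSV) files are among the most common. This part of the article shows how to read and write text and CSV files with Python and then introduces list comprehensions.

Reading and writing data

First, let's look at how to read and write text files and CSV files. The following Python example shows these operations.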

import csv

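def read_txt(path):
    # Read a text file and strip surrounding whitespace from every line.
    with open(path, 'r') as fh:
        return [line.strip() for line in fh]

def conv_csv(txt_path, csv_path):
    # Split each text line on whitespace and write it as one CSV row.
    data = read_txt(txt_path)
    with open(csv_path, 'w', newline='') as csv_file:
        writer = csv.writer(csv_file)
        for line in data:
            writer.writerow(line.split())

def read_csv(path):
    # Return the CSV contents as a list of rows (each row is a list of strings).
    with open(path, 'r', newline='') as fh:
        return list(csv.reader(fh))

def read_dict(path, headers):
    # Return every row as a dictionary keyed by the given headers.
    # The rows are read fully so the file can be closed before returning.
    with open(path, 'r', newline='') as fh:
        return list(csv.DictReader(fh, fieldnames=headers))

def od_to_d(od):
    # Convert an OrderedDict row (Python 3.6/3.7) to a plain dict;
    # on Python 3.8+ the rows are already plain dicts.
    return dict(od)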

if __name__ == "__main__":
    f = 'data/names.txt'
    data = read_txt(f)
    print('Text file data sample:')
    for i, row in enumerate(data):
        if i < 3:
            print(row)

    csv_f = 'data/names.csv'
    conv_csv(f, csv_f)
    r_csv = read_csv(csv_f)
    print('\nText to CSV sample:')
    for i, row in enumerate(r_csv):
        if i < 3:
            print(row)

    headers = ['first', 'last']
    r_dict = read_dict(csv_f, headers)
    dict_ls = []
    print('\nCSV to ordered dictionary sample:')
    for i, row in enumerate(r_dict):
        r = od_to_d(row)
        dict_ls.append(r)
        if i < 3:
            print(row)

    print('\nList of dictionary elements sample:')
    for i, row in enumerate(dict_ls):
        if i < 3:
            print(row)

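Code walkthrough:

read_txt function: reads a text file and strips leading and trailing whitespace from each line.
conv_csv function: converts the text file to a CSV file and saves it to disk.
read_csv function: reads a CSV file and returns its contents as a list.
read_dict function: reads a CSV file and gives back each row as a dictionary keyed by the supplied field names. csv.DictReader produced OrderedDict rows in Python 3.6 and 3.7; since Python 3.8 it produces plain dict rows.
od_to_d function: converts an ordered dictionary row to a regular dictionary. On Python 3.8 and later this is effectively a copy.
The main block first reads a text file and cleans the data. It then converts the text file to a CSV file and reads that CSV file back from disk. Finally, it reads the rows as dictionaries and collects them in a list of regular dictionaries.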
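
The example above depends on data/names.txt existing on disk. To try the same round trip without any files, the steps can be sketched in memory with io.StringIO. The three names below are hypothetical sample data, and this is only an illustration of the flow, not a replacement for the functions above.

```python
import csv
import io

# Hypothetical sample standing in for data/names.txt: "first last" per line.
text = "Ada Lovelace\nAlan Turing\nGrace Hopper\n"

# Split each line on whitespace and write it out as one CSV row.
buffer = io.StringIO()
writer = csv.writer(buffer)
for line in text.splitlines():
    writer.writerow(line.strip().split())
csv_text = buffer.getvalue()

# Read the CSV back twice: as lists, then as dictionaries keyed by headers.
rows = list(csv.reader(io.StringIO(csv_text)))
headers = ['first', 'last']
records = [dict(r) for r in csv.DictReader(io.StringIO(csv_text), fieldnames=headers)]

print(rows[0])
print(records[0])
```

Because fieldnames is passed explicitly, csv.DictReader treats the first line as data rather than as a header row, which matches how read_dict is called above.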

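List comprehensions

A list comprehension is a compact way to build a list. Its logic sits inside square brackets: an expression, followed by a for clause, optionally extended with further for or if clauses.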
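
To make that structure concrete, here is a minimal sketch that places a comprehension next to the equivalent explicit loop and then shows two for clauses together. The data and variable names are made up for illustration.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Expression (n * n) + for clause + if clause.
even_squares = [n * n for n in numbers if n % 2 == 0]

# The same result written as an explicit loop.
loop_result = []
for n in numbers:
    if n % 2 == 0:
        loop_result.append(n * n)

# Two for clauses behave like nested loops, with the leftmost outermost.
pairs = [(x, y) for x in [1, 2] for y in ['a', 'b']]

print(even_squares)
print(loop_result)
print(pairs)
```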

The following example uses list comprehensions in several ways.

if __name__ == "__main__":
    # Convert each distance with a single expression
    miles = [100, 10, 9.5, 1000, 30]
    kilometers = [x * 1.60934 for x in miles]
    print('miles to kilometers:')
    for i, row in enumerate(kilometers):
        print('{:>4} {:>8}{:>8} {:>2}'.format(miles[i], 'miles is', round(row, 2), 'kilometers'))

    pet = ['cat', 'dog', 'rabbit', 'parrot', 'guinea pig', 'fish']
    print('\npets:')
    print(pet)

    # Conditional expression: pluralize everything except 'fish'
    pets = [x + 's' if x != 'fish' else x for x in pet]
    print('\npets (plural):')
    print(pets)

    # Trailing if clause: keep only the elements that pass the filter
    subset = [x for x in pets if x != 'fish' and x != 'rabbits' and x != 'parrots' and x != 'guinea pigs']
    print('\nmost common pets:')
    print(subset[1], 'and', subset[0])

    # Chained conditional expressions: 0 below 10000, 2% up to 20000, 3% above
    sales = [9000, 20000, 50000, 100000]
    print('\nbonuses:')
    bonus = [0 if x < 10000 else x * 0.02 if x >= 10000 and x <= 20000 else x * 0.03 for x in sales]
    print(bonus)

    # Pair each person with the bonus at the same position
    people = ['dave', 'sue', 'al', 'sukki']
    d = {}
    for i, row in enumerate(people):
        d[row] = bonus[i]
    print('\nbonus dictionary:')
    print(d)
    print('\n{:<8} {:<5}'.format('employee', 'bonus'))
    for k, y in d.items():
        print('{:<8} {:>6}'.format(k, y))

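Code walkthrough:

Miles to kilometers: a list comprehension converts each value from miles to kilometers.
Pet list processing: a comprehension with a conditional expression adds the plural form to each pet name except 'fish'.
Filtering common pets: a comprehension with an if clause selects the common pets from the processed list.
Calculating bonuses: a comprehension with chained conditional expressions computes the bonus for each sales amount.
Building the bonus dictionary: a loop pairs each employee name with the matching bonus to build a dictionary.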
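
The chained conditional expression in the bonus comprehension is compact but can be hard to read. One possible alternative, sketched below, moves the tiers into a helper function and pairs names with sales through zip inside a dict comprehension. The helper bonus_for and the name bonus_by_person are assumptions introduced here, and the original example uses neither zip nor a dict comprehension.

```python
def bonus_for(amount):
    """Same tiers as the example: none below 10000, 2% up to 20000, 3% above."""
    if amount < 10000:
        return 0
    if amount <= 20000:
        return amount * 0.02
    return amount * 0.03

people = ['dave', 'sue', 'al', 'sukki']
sales = [9000, 20000, 50000, 100000]

# zip pairs each name with the sales figure at the same position.
bonus_by_person = {name: bonus_for(amount) for name, amount in zip(people, sales)}

for name, amount in bonus_by_person.items():
    print('{:<8} {:>8.2f}'.format(name, amount))
```

Naming the tiers in a function makes each threshold testable on its own, at the cost of a few more lines than the one-line comprehension.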

玄貓 BlackCat

A technology enthusiast who shares notes and insights on software development, cloud technology, and AI applications.