Handling Two-Dimensional Data with Pandas DataFrame (1)

Q: What is a Pandas DataFrame?

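Many companies and merchants now analyze the user data they collect to spot trends and business opportunities. Data analysis is getting more and more attention, so knowing how to use data analysis tools matters a great deal.

The article on handling one-dimensional data with Pandas Series covered practical ways to use the Series data structure for one-dimensional datasets. This article introduces another very important data structure in the Pandas package: the DataFrame.

A DataFrame is mainly used for two-dimensional data, that is, tabular datasets with rows and columns. It is therefore often used to read CSV files, web page tables, or databases and then analyze or process the data.

Whereas a Pandas Series handles one-dimensional, single-column data, a Pandas DataFrame handles two-dimensional, multi-column data. It resembles an Excel table, with a data index (rows) and column headers (columns).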
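
As a rough illustration of the row and column structure described here, the sketch below builds a small table and inspects its shape, row index, and column headers. The sample values and the variable names (table, math_column) are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data: two columns, three rows
table = pd.DataFrame({
    "name": ["Mike", "Sherry", "Cindy"],
    "math": [80, 75, 93],
})

print(table.shape)          # (number of rows, number of columns)
print(list(table.index))    # row labels, integers by default
print(list(table.columns))  # column headers

# Selecting a single column gives back a one-dimensional Series
math_column = table["math"]
print(type(math_column).__name__)
```

Selecting one column yields a Series, which is one way to see how the two data structures relate.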

Q: How do you create a Pandas DataFrame?

To store two-dimensional data in a Pandas DataFrame, first create a DataFrame object. The syntax is:

my_dataframe = pandas.DataFrame(dictionary_or_array_data)

Example:

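import pandas as pd

# Build a DataFrame from a dictionary: keys become column names
grades = {
    "name": ["Mike", "Sherry", "Cindy", "John"],
    "math": [80, 75, 93, 86],
    "chinese": [63, 90, 85, 70]
}
df = pd.DataFrame(grades)
print("DataFrame created from a dictionary:")
print(df)

# Build a DataFrame from a nested list: each inner list is one row
grades = [
    ["Mike", 80, 63],
    ["Sherry", 75, 90],
    ["Cindy", 93, 85],
    ["John", 86, 70]
]
new_df = pd.DataFrame(grades)
print("DataFrame created from an array:")
print(new_df)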

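The output shows that, for the same data, when you use a Python dictionary the keys become the column names of the DataFrame and the values become the data in each column. When you use an array (a nested list), each inner list is simply one row of data, and because no column names are given, the columns are labeled with the default integers 0, 1, 2.

To customize the row index and column names of a DataFrame, assign to its index and columns attributes respectively, as in the following example: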
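
If the array form is preferred but named columns are still wanted, one option is the columns parameter of the DataFrame constructor. The following is a minimal sketch; the shortened data and the variable names (rows, unnamed, named) are illustrative and not part of the original example.

```python
import pandas as pd

# Hypothetical shortened data: each inner list is one row
rows = [
    ["Mike", 80, 63],
    ["Sherry", 75, 90],
]

# Without column names, pandas labels the columns with integers
unnamed = pd.DataFrame(rows)
print(list(unnamed.columns))

# The columns parameter assigns names when the DataFrame is created
named = pd.DataFrame(rows, columns=["name", "math", "chinese"])
print(named)
```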

import pandas as pd

grades = {
    "name": ["Mike", "Sherry", "Cindy", "John"],
    "math": [80, 75, 93, 86],
    "chinese": [63, 90, 85, 70]
}
df = pd.DataFrame(grades)

df.index = ["s1", "s2", "s3", "s4"]  # custom row index labels
df.columns = ["student_name", "math_score", "chinese_score"]  # custom column names
print(df)
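
Assigning to df.columns replaces every column name at once. As a possible alternative, the sketch below sets the row labels through the constructor's index parameter and renames a single column with rename(). The shortened data and the name renamed are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical shortened data
grades = {"name": ["Mike", "Sherry"], "math": [80, 75]}

# index= sets the row labels when the DataFrame is created
df = pd.DataFrame(grades, index=["s1", "s2"])

# rename() returns a new DataFrame and changes only the columns listed
renamed = df.rename(columns={"math": "math_score"})
print(renamed)
```

Because rename() returns a new object here, the original df keeps its column names.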

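Q: How do you modify data in a Pandas DataFrame?

Use the DataFrame's at[] and iat[] accessors to select the single value you want to change, then assign the new value. at[] takes a row label and a column name, while iat[] takes integer row and column positions, as in the following example: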

import pandas as pd

grades = {
    "name": ["Mike", "Sherry", "Cindy", "John"],
    "math": [80, 75, 93, 86],
    "chinese": [63, 90, 85, 70]
}
df = pd.DataFrame(grades)
print("Original df:")
print(df)

df.at[1, "math"] = 100  # set the math column of the row with index label 1
df.iat[1, 0] = "Larry"  # set the first column of the row at position 1
print("Modified df:")
print(df)
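
With the default integer index, the row label and the row position happen to be the same number, so the difference between at[] and iat[] is easy to miss. The sketch below uses a custom string index to make it visible. The shortened data and the labels "s1" and "s2" are illustrative assumptions.

```python
import pandas as pd

# Hypothetical shortened data with custom row labels
df = pd.DataFrame(
    {"name": ["Mike", "Sherry"], "math": [80, 75]},
    index=["s1", "s2"],
)

df.at["s2", "math"] = 100  # label-based: row label "s2", column "math"
df.iat[0, 0] = "Larry"     # position-based: first row, first column
print(df)
```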

To learn more about Python, visit Learn Code With Mike ( https://www.learncodewithmike.com/2020/11/python-pandas-dataframe-tutorial.html ) for more tutorials.
