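CSV stands for Character Separated Values, also known as Comma Separated Values.
The delimiter can be any character, though; it does not have to be a comma.
Ruby ships with a standard library for handling CSV, so `require 'csv'` is all you need.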
See the official CSV documentation for the full list of available methods.
Suppose we have a CSV file, `students.csv`, with the following contents:
name | attendance | GPA | comment |
---|---|---|---|
Bob | 144 | 4.0 | Great at everything! |
Alice | 124 | 3.9 | Great, very talented! |
Steve | 119 | 3.5 | He is "good", but could be better |
David | 100 | 3.0 | He needs to attend more classes! |
How is this table written as CSV?
Bob,144,4.0,Great at everything!
Alice,124,3.9,"Great, very talented!"
Steve,119,3.5,"He is ""good"", but could be better"
David,100,3.0,He needs to attend more classes!
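Two rows in the comment column deserve attention:
- Alice's comment is wrapped in a pair of double quotes because it contains a comma.
- Steve's comment is wrapped in double quotes, and each literal double quote inside it is written as two double quotes.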
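The quoting rules above can be checked with a small sketch. It assumes only the standard `csv` library; `CSV.generate_line` produces a CSV line from an array and `CSV.parse_line` does the reverse. The variable names are for illustration only.

```ruby
require 'csv'

# Fields as plain Ruby strings, before any CSV quoting is applied.
alice = ['Alice', '124', '3.9', 'Great, very talented!']
steve = ['Steve', '119', '3.5', 'He is "good", but could be better']

# CSV.generate_line wraps a field in double quotes when it contains a comma
# or a double quote, and doubles every embedded double quote.
alice_line = CSV.generate_line(alice)
steve_line = CSV.generate_line(steve)
puts alice_line
puts steve_line

# CSV.parse_line undoes the quoting and returns the original fields.
p CSV.parse_line(steve_line) == steve
```

The two printed lines should match the Alice and Steve rows shown in the CSV listing.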
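To process a CSV file, you first have to read it in. There are two ways to do that:
- Read the whole file at once and keep it in memory.
- Read one row at a time.
Whichever way you choose, Ruby treats each row as an array. Open Pry or IRB and give it a try.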
> require 'csv'
> students = CSV.read 'students.csv'
=> [["Bob", "144", "4.0", "Great at everything!"],
["Alice", "124", "3.9", "Great, very talented!"],
["Steve", "119", "3.5", "He is \"good\", but could be better"],
["David", "100", "3.0", "He needs to attend more classes!"]]
> CSV.foreach('students.csv') do |row|
> p row
> end
["Bob", "144", "4.0", "Great at everything!"]
["Alice", "124", "3.9", "Great, very talented!"]
["Steve", "119", "3.5", "He is \"good\", but could be better"]
["David", "100", "3.0", "He needs to attend more classes!"]
=> nil
Parsing a string:
> CSV.parse "Ruby,1995,Rails,2007"
=> [["Ruby", "1995", "Rails", "2007"]]
Parsing with a block:
> CSV.parse("Ruby,1995,Rails,2007") { |row| p row }
["Ruby", "1995", "Rails", "2007"]
=> nil
As the examples above show, passing only a string behaves like `CSV.read`, while passing a block behaves like `CSV.foreach`.
In fact,
CSV.read('students.csv')
# is equivalent to
CSV.parse(File.read('students.csv'))
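A minimal sketch to confirm that equivalence, assuming a throwaway file created with `Tempfile` (the file contents and variable names here are illustrative, not part of the original example):

```ruby
require 'csv'
require 'tempfile'

via_read = via_parse = via_each = nil

# Write a small temporary file so the comparison is self-contained.
Tempfile.create(['students', '.csv']) do |file|
  file.write(%Q(Bob,144,4.0,Great at everything!\nAlice,124,3.9,"Great, very talented!"\n))
  file.flush

  via_read  = CSV.read(file.path)              # whole file at once
  via_parse = CSV.parse(File.read(file.path))  # parse the file's text
  via_each  = CSV.foreach(file.path).to_a      # row by row, collected
end

p via_read == via_parse
p via_read == via_each
```

All three approaches should yield the same array of rows; the difference is only whether the whole file is held in memory at once.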
What if the fields are separated by something other than a comma?
# students_col.csv
Ruby;1995;First appear
Rails;2007;2.0
Perl6;2048;"Who;Knows;When"
That is easy: specify the delimiter with the `:col_sep` option when reading:
# Pass options as keyword arguments; a positional hash raises ArgumentError on Ruby 3.
new_students = CSV.read('students_col.csv', col_sep: ';')
new_students = CSV.foreach('students_col.csv', col_sep: ';').to_a
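The same option works with `CSV.parse`, which makes it easy to try without a file. This sketch inlines the sample data as a heredoc (the `data` and `rows` names are illustrative):

```ruby
require 'csv'

# Same content as students_col.csv, inlined for a self-contained example.
data = <<~CSV_TEXT
  Ruby;1995;First appear
  Rails;2007;2.0
  Perl6;2048;"Who;Knows;When"
CSV_TEXT

# col_sep tells the parser to split on semicolons instead of commas.
rows = CSV.parse(data, col_sep: ';')
rows.each { |row| p row }
```

Note that the semicolons inside the quoted last field are kept as part of the field rather than treated as delimiters.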
Besides `:col_sep`, see the following for the other available options: http://ruby-doc.org/stdlib-2.1.2/libdoc/csv/rdoc/CSV.html#method-c-new
When CSV reads data in, every field is read as a string and stored in an array. Going back to the earlier example:
> str_arr = CSV.parse "Ruby,1995,Rails,2007"
> str_arr[0][1].class
=> String
Now suppose some of the fields are numbers and we need to do arithmetic on them. Strings cannot be used for arithmetic, so what do we do?
> fixnum_arr = CSV.parse "Ruby,1995,Rails,2007", converters: :numeric
> fixnum_arr[0][1].class
=> Integer # Fixnum on Ruby versions before 2.4
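As a further sketch, the `:numeric` converter handles integers and floats together and leaves everything else as a string. The row below is borrowed from the students example; the `row` variable is illustrative.

```ruby
require 'csv'

# :numeric tries an integer conversion first, then a float conversion;
# fields that are neither remain strings.
row = CSV.parse_line('Bob,144,4.0,Great at everything!', converters: :numeric)

p row
p row.map(&:class)
```

If only one kind of conversion is wanted, the built-in `:integer` and `:float` converters can be used instead.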
For example, take this table of weekly expenses:
day | expense |
---|---|
Mon | 100 |
Tue | 120 |
Wed | 130 |
Thu | 140 |
Fri | 220 |
Sat | 320 |
Sun | 100 |
We want to work out how much was spent this week:
# week_expense.csv
Mon,100
Tue,120
Wed,130
Thu,140
Fri,220
Sat,320
Sun,100
Collect the second field of each row into an array, then sum it up.
CSV.foreach('week_expense.csv', converters: :numeric).inject([]) do |acc, row|
  acc << row[1]
end.reduce(:+)
=> 1130
Without the conversion to numbers, the values are concatenated as strings instead, because `String#+` is used:
CSV.foreach('week_expense.csv').inject([]) do |acc, row|
  acc << row[1]
end.reduce(:+)
=> "100120130140220320100"
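A shorter variant of the same calculation, as a sketch: it assumes Ruby 2.4 or later for `Enumerable#sum` and inlines the weekly data so that no file is needed (`data` and `total` are illustrative names).

```ruby
require 'csv'

# Same content as week_expense.csv, inlined for a self-contained example.
data = <<~CSV_TEXT
  Mon,100
  Tue,120
  Wed,130
  Thu,140
  Fri,220
  Sat,320
  Sun,100
CSV_TEXT

# Convert numeric fields while parsing, then add up the second column.
total = CSV.parse(data, converters: :numeric).sum { |row| row[1] }
p total
```

Using `sum` with a block skips the intermediate array that the `inject` version builds before reducing.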
Let's read the original `students.csv` in, add one new row:
Obie,144,3.8,Great great!
and then write it back.
First, read the file in:
students = CSV.read('students.csv')
students << 'Obie,144,3.8,Great great!'.split(',')
Then write each row (as an array) back:
CSV.open('students.csv', 'w') do |csv|
  students.each { |student| csv << student }
end
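When the goal is only to add a row, an alternative sketch is to open the file in append mode (`'a'`) rather than reading and rewriting everything. The temporary directory and the one-row starting file below are assumptions made to keep the example self-contained.

```ruby
require 'csv'
require 'tmpdir'

rows = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'students.csv')
  File.write(path, "Bob,144,4.0,Great at everything!\n")

  # 'a' opens the file for appending, so existing rows are left untouched.
  # Passing an array avoids splitting a string on commas by hand.
  CSV.open(path, 'a') do |csv|
    csv << ['Obie', '144', '3.8', 'Great great!']
  end

  rows = CSV.read(path)
end

p rows
```

Appending an array also sidesteps a limitation of `split(',')`, which would break a comment that itself contains a comma.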
(To be continued)